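Nginx 502 Bad Gateway Troubleshooting in Practice: From Log Analysis to Fix

June 14, 2026 · about a 26-minute read

1. The Symptom: That Annoying 502

If you work in operations or development, you have almost certainly run into this:

You open the site and the browser suddenly shows:
502 Bad Gateway
or, more specifically:
502 Bad Gateway - Connection reset by peer

And your reaction is probably something like:

😱 Another 502?
🤔 It was working a minute ago!
😤 Is this Nginx's fault or the backend's?

Don't panic! This article walks through a systematic way to diagnose and fix Nginx 502 errors.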

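2. What Is 502 Bad Gateway?

2.1 Technical Definition

502 Bad Gateway is an HTTP status code that means:

The gateway or proxy server (here, Nginx) received an invalid response from the upstream server (the backend service, such as PHP-FPM, Node.js, or Tomcat).

Put simply:

User → Nginx (gateway) → backend service (PHP-FPM/Node.js, etc.)
         ↓
      502 error = Nginx says: "The response I got from the backend is broken!"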

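2.2 Common 502 Error Messages

In the Nginx error log you are likely to see these keywords:

Error message	Meaning
`Connection reset by peer`	The backend closed the connection abruptly
`Connection refused`	The backend service is not running or the port is not listening
`upstream timed out`	The backend took too long to respond (Nginx normally answers this with 504 Gateway Timeout, but it turns up in the same investigations)
`recv() failed`	Reading the backend's response failed
`SSL: error`	The SSL handshake failed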
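
The keyword table above can be turned into a small lookup, which is handy when skimming a long log. This is a minimal sketch: `classify_upstream_error` is a hypothetical helper name, and the cause strings simply restate the table rather than anything Nginx itself reports.

```bash
#!/usr/bin/env bash
# Map one Nginx error-log line to the likely cause from the table above.
# classify_upstream_error is a hypothetical helper, not part of Nginx.
classify_upstream_error() {
    local line="$1"
    case "$line" in
        *"Connection refused"*)       echo "backend down or port not listening" ;;
        *"Connection reset by peer"*) echo "backend closed the connection" ;;
        *"upstream timed out"*)       echo "backend response timed out" ;;
        *"SSL"*)                      echo "SSL handshake failed" ;;
        *"recv() failed"*)            echo "failed to read backend response" ;;
        *)                            echo "unknown" ;;
    esac
}

# Example use against a real log (path assumed to be the default one):
#   tail -n 200 /var/log/nginx/error.log |
#     while IFS= read -r l; do classify_upstream_error "$l"; done |
#     sort | uniq -c | sort -rn
```

The more specific keywords are tested first, so a line such as `recv() failed (104: Connection reset by peer)` is reported as a reset rather than as a generic read failure.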

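3. First Step: Check the Logs!

Remember: logs are your primary source of evidence when troubleshooting.

3.1 Where the Nginx Error Log Lives

By default, the Nginx error log is at:

/var/log/nginx/error.log

If your Nginx uses a custom configuration, the location may differ. Check the configuration:

nginx -t  # test the configuration and print the config file path
grep error_log /etc/nginx/nginx.conf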

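3.2 Watching the Log in Real Time

# Follow the error log live
tail -f /var/log/nginx/error.log

# Show only the last 50 lines
tail -n 50 /var/log/nginx/error.log

# Filter the upstream errors behind a 502 (the error log records the cause, not the status code "502")
grep -iE "connection refused|connection reset|upstream timed out|upstream prematurely closed" /var/log/nginx/error.log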

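3.3 Analyzing Typical Log Entries

Let's look at a few real log entries (wrapped here for readability; in the log file each entry is a single line):

Case 1: PHP-FPM is not running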

2026/06/14 10:30:45 [error] 12345#0: *123456 connect() failed 
(111: Connection refused) while connecting to upstream, 
client: 192.168.1.100, server: example.com, request: "GET / HTTP/1.1", 
upstream: "fastcgi://127.0.0.1:9000"

Keyword: Connection refused → the backend service is not running or the port is not listening

Case 2: Not enough PHP-FPM processes

2026/06/14 10:35:20 [error] 12345#0: *123456 recv() failed 
(104: Connection reset by peer) while reading response header from upstream, 
client: 192.168.1.100, server: example.com, request: "POST /api/login HTTP/1.1", 
upstream: "fastcgi://127.0.0.1:9000"

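Keyword: Connection reset by peer → the backend closed the connection abruptly; an exhausted process pool is one possible cause, and a crashed or killed worker is another

Case 3: The backend response times out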

2026/06/14 10:40:15 [error] 12345#0: *123456 upstream timed out 
(110: Connection timed out) while reading response header from upstream, 
client: 192.168.1.100, server: example.com, request: "GET /report HTTP/1.1", 
upstream: "fastcgi://127.0.0.1:9000"

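Keyword: upstream timed out → the backend took too long and Nginx gave up waiting (in this case Nginx usually answers the client with 504 rather than 502)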
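
When the log holds hundreds of such entries, grouping them by errno and upstream address shows at a glance which backend is failing and how. The following is a sketch under the assumption that entries follow the layout of the three cases above, one entry per line; `summarize_upstream_errors` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Read Nginx error-log lines on stdin and print a count per
# "errno message -> upstream" combination, most frequent first.
# summarize_upstream_errors is a hypothetical helper.
summarize_upstream_errors() {
    sed -nE 's/.*\(([0-9]+): ([^)]*)\).*upstream: "([^"]*)".*/\1 \2 -> \3/p' \
        | sort | uniq -c | sort -rn
}

# Example use (default log path assumed):
#   summarize_upstream_errors < /var/log/nginx/error.log
```

Lines that do not carry both an errno and an `upstream:` field are skipped silently, so unrelated warnings do not distort the counts.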

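4. Common Causes and Fixes

4.1 Cause 1: The Backend Service Is Not Running

Symptom: the log shows Connection refused

Troubleshooting steps:

# 1. Check the PHP-FPM status
systemctl status php-fpm
# or
systemctl status php8.1-fpm  # adjust to your version

# 2. Check whether the port is listening
netstat -tlnp | grep 9000
# or
ss -tlnp | grep 9000

# 3. Test TCP connectivity to the port (PHP-FPM speaks FastCGI, not HTTP, so curl cannot get a valid response from it)
nc -zv 127.0.0.1 9000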

Fix:

# Start PHP-FPM
systemctl start php-fpm

# Enable it at boot
systemctl enable php-fpm

# If it fails to start, check the PHP-FPM log
tail -n 50 /var/log/php-fpm/error.log
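
The port check from step 2 can be scripted so that it returns an exit status instead of text to eyeball. A minimal sketch, assuming the backend listens on a TCP port (PHP-FPM can also be configured with a Unix socket, which this does not cover); `port_is_listening` is a hypothetical helper that reads `ss -tln` output on stdin.

```bash
#!/usr/bin/env bash
# Succeed (exit 0) if the `ss -tln` output on stdin contains a listening
# socket on the given port, fail (exit 1) otherwise.
# port_is_listening is a hypothetical helper.
port_is_listening() {
    local port="$1"
    awk -v p="$port" '
        $1 == "LISTEN" && $4 ~ (":" p "$") { found = 1 }
        END { exit !found }
    '
}

# Example use:
#   ss -tln | port_is_listening 9000 || echo "nothing is listening on 9000"
```

Matching on `":" port "$"` against the local-address column avoids the false positives of a plain `grep 9000`, which would also match port 19000 or a peer address.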

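4.2 Cause 2: Not Enough PHP-FPM Processes

Symptom: the log shows Connection reset by peer, with frequent 502s under high concurrency

Troubleshooting steps:

# 1. Count the running PHP-FPM worker processes (the [p] keeps grep from counting itself)
ps aux | grep -c "[p]hp-fpm: pool"

# 2. Inspect the PHP-FPM pool settings (on Debian/Ubuntu the file is typically /etc/php/<version>/fpm/pool.d/www.conf)
grep -E "pm =|pm.max_children|pm.start_servers|pm.min_spare_servers|pm.max_spare_servers" /etc/php-fpm.d/www.conf

# 3. Check the PHP-FPM slow log (if enabled)
tail -f /var/log/php-fpm/www-slow.log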

A typical misconfiguration:

; Too small for high concurrency
pm = dynamic
pm.max_children = 5        ; at most 5 processes, far too few!
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3

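Fix: tune the PHP-FPM configuration

# Edit the configuration file
vim /etc/php-fpm.d/www.conf

Change it to:

; Size this to the server's memory; a PHP-FPM process typically uses 50-100 MB
pm = dynamic
pm.max_children = 50        ; raised to 50 (adjust to available memory)
pm.start_servers = 10       ; create 10 at startup
pm.min_spare_servers = 5    ; keep at least 5 idle
pm.max_spare_servers = 20   ; keep at most 20 idle
pm.max_requests = 500       ; recycle each process after 500 requests to contain memory leaks

Restart PHP-FPM:

systemctl restart php-fpm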
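
Before raising `pm.max_children`, it is worth confirming that the pool actually hit its limit. PHP-FPM logs a warning of the form `server reached pm.max_children setting (5), consider raising it` when that happens. The sketch below counts those warnings per pool; `max_children_hits` is a hypothetical helper, and the log path in the usage comment is the one used earlier in this article.

```bash
#!/usr/bin/env bash
# Read a PHP-FPM log on stdin and print, per pool, how many times the
# pm.max_children limit was reached and what the limit was.
# Output columns: count, pool name, configured limit.
# max_children_hits is a hypothetical helper.
max_children_hits() {
    sed -nE 's/.*\[pool ([^]]+)\] server reached pm\.max_children setting \(([0-9]+)\).*/\1 \2/p' \
        | sort | uniq -c
}

# Example use:
#   max_children_hits < /var/log/php-fpm/error.log
```

If this prints nothing while 502s keep occurring, the pool size is probably not the bottleneck and a crashed or killed worker is the more likely explanation.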

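4.3 Cause 3: The Backend Response Times Out

Symptom: the log shows upstream timed out, and requests fail during slow operations (large file uploads, complex report generation). Nginx normally returns 504 Gateway Timeout here rather than 502, but the diagnosis and the fix follow the same path.

Troubleshooting steps:

# 1. Check the Nginx timeout settings
grep -n "timeout" /etc/nginx/nginx.conf
grep -n "timeout" /etc/nginx/conf.d/*.conf

# 2. Measure the response time through Nginx (port 9000 speaks FastCGI, so curl cannot query PHP-FPM directly)
time curl -s -o /dev/null http://127.0.0.1/some-slow-page.php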

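Fix: raise the timeouts

Add to the Nginx configuration:

http {
    # ... other settings

    # For PHP-FPM behind fastcgi_pass: raise the timeouts (adjust as needed)
    fastcgi_connect_timeout 60s;
    fastcgi_send_timeout 120s;
    fastcgi_read_timeout 120s;

    # For HTTP backends behind proxy_pass (Node.js, Tomcat, etc.)
    proxy_connect_timeout 60s;
    proxy_send_timeout 120s;
    proxy_read_timeout 120s;
}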

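For one specific slow endpoint, configure it separately:

location /api/report {
    fastcgi_read_timeout 300s;  # allow this endpoint 5 minutes
    # ... other settings
}

Reload Nginx (after checking the syntax):

nginx -t && nginx -s reload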
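
After a reload it helps to confirm which timeout values are actually in effect, since they can be set in several included files. A minimal sketch: `list_timeouts` is a hypothetical helper that reads configuration text on stdin and prints the `fastcgi_*` and `proxy_*` connect, send, and read timeouts it finds, ignoring commented-out lines.

```bash
#!/usr/bin/env bash
# Print every fastcgi_/proxy_ connect, send and read timeout found in the
# Nginx configuration text on stdin, as directive=value.
# list_timeouts is a hypothetical helper.
list_timeouts() {
    sed -E 's/#.*//' | awk '
        $1 ~ /^(fastcgi|proxy)_(connect|send|read)_timeout$/ {
            sub(/;$/, "", $2)      # drop the trailing semicolon
            print $1 "=" $2
        }
    '
}

# Example use (nginx -T dumps the full configuration, includes resolved):
#   nginx -T 2>/dev/null | list_timeouts
```

A directive that does not appear in the output is running at its built-in default, which is 60 seconds for all six of these.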

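4.4 Cause 4: Nginx Misconfiguration

Symptom: 502s appear right after a configuration change

Troubleshooting steps:

# 1. Test the Nginx configuration (this catches syntax errors only; a wrong port or address still passes)
nginx -t

# 2. Review the site configuration
cat /etc/nginx/conf.d/your-site.conf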

Common configuration mistakes:

Mistake 1: wrong upstream definition

# Wrong: the port is mistyped
upstream php_backend {
    server 127.0.0.1:9001;  # PHP-FPM actually listens on 9000
}

# Correct
upstream php_backend {
    server 127.0.0.1:9000;
}

Mistake 2: wrong fastcgi_pass setup

# Wrong: proxy_pass is used where fastcgi_pass is required
location ~ \.php$ {
    proxy_pass http://127.0.0.1:9000;  # wrong: PHP-FPM does not speak HTTP
}

# Correct
location ~ \.php$ {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
}
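
Because `nginx -t` accepts a mistyped port without complaint, one way to catch both mistakes above is to list every backend address the configuration points at and compare the list with what is actually listening. A minimal sketch; `list_backend_targets` is a hypothetical helper that reads configuration text on stdin.

```bash
#!/usr/bin/env bash
# Print every backend target found in the Nginx configuration on stdin:
# fastcgi_pass / proxy_pass destinations and `server` entries inside
# upstream blocks (a `server {` virtual-host line is skipped).
# list_backend_targets is a hypothetical helper.
list_backend_targets() {
    sed -E 's/#.*//' | awk '
        $1 == "fastcgi_pass" || $1 == "proxy_pass" ||
        ($1 == "server" && NF >= 2 && $2 != "{") {
            sub(/;$/, "", $2)      # drop the trailing semicolon
            print $1, $2
        }
    '
}

# Example use:
#   nginx -T 2>/dev/null | list_backend_targets | sort -u
#   ss -tln    # compare the ports above with what is really listening
```

A `proxy_pass` line pointing at the PHP-FPM port, or a port that nothing listens on, stands out immediately in the output.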

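4.5 Cause 5: Blocked by the Firewall or SELinux

Symptom: every service is running, yet you still see Connection refused (when SELinux is the culprit, the log usually shows (13: Permission denied) instead)

Troubleshooting steps:

# 1. Check the firewall status
systemctl status firewalld
# or
ufw status

# 2. Check the SELinux mode
getenforce

# 3. Look for SELinux denials involving nginx
grep "denied" /var/log/audit/audit.log | grep nginx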

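Fix:

# Temporarily stop the firewall to test (start it again afterwards)
systemctl stop firewalld

# Or open the port (only needed when the backend runs on another host; do not expose PHP-FPM to untrusted networks)
firewall-cmd --add-port=9000/tcp --permanent
firewall-cmd --reload

# Temporarily switch SELinux to permissive mode to test (restore with setenforce 1)
setenforce 0

# Or set the SELinux boolean that lets Nginx open network connections
setsebool -P httpd_can_network_connect 1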
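
To see which ports SELinux is actually blocking before changing a boolean, the audit log can be summarized per destination port. This is a rough sketch that assumes AVC records carry `comm="..."` and `dest=<port>` fields, as is usual for `name_connect` denials; `denied_ports_for` is a hypothetical helper, and the exact record layout may differ between distributions.

```bash
#!/usr/bin/env bash
# Read an audit log on stdin and print, for the given command name, how
# many AVC denials were recorded per destination port.
# Output columns: count, port.
# denied_ports_for is a hypothetical helper; the field names are assumed.
denied_ports_for() {
    local comm="$1"
    grep -E 'avc: +denied' \
        | grep -F "comm=\"$comm\"" \
        | sed -nE 's/.* dest=([0-9]+).*/\1/p' \
        | sort -n | uniq -c
}

# Example use (requires root to read the audit log):
#   denied_ports_for nginx < /var/log/audit/audit.log
```

If port 9000 shows up here, SELinux rather than the firewall is blocking the connection, and the `setsebool` command above is the relevant fix.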

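5. Case Study: The Full Path from 502 to Fix

Let's walk through a real case to demonstrate the complete workflow.

5.1 The Problem

The site frequently returns 502 under high concurrency; a refresh sometimes brings it back.

5.2 Troubleshooting Steps

Step 1: Check the Nginx error log

tail -f /var/log/nginx/error.log

Found: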

2026/06/14 11:00:00 [error] 12345#0: *123456 recv() failed 
(104: Connection reset by peer) while reading response header from upstream

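Keyword: Connection reset by peer → the backend closed the connection

Step 2: Check the PHP-FPM status

systemctl status php-fpm
# running, no problem

ps aux | grep -c "[p]hp-fpm: pool"
# only 5 worker processes

Step 3: Check the PHP-FPM configuration

grep pm.max_children /etc/php-fpm.d/www.conf
# pm.max_children = 5

Root cause found: PHP-FPM is capped at 5 processes, which is not enough under high concurrency!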

Step 4: Adjust the configuration

vim /etc/php-fpm.d/www.conf

Change it to:

pm.max_children = 50
pm.start_servers = 10
pm.min_spare_servers = 5
pm.max_spare_servers = 20
pm.max_requests = 500

Step 5: Restart PHP-FPM

systemctl restart php-fpm

Step 6: Verify the fix

# Simulate a high-concurrency load
ab -n 1000 -c 100 http://example.com/

# Check the results
# The 502 errors are gone!
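
Rather than reading the `ab` report by eye, the verification step can check the number of non-2xx responses directly. `ab` prints a `Non-2xx responses:` line only when there were any. A minimal sketch; `ab_non2xx` is a hypothetical helper that reads the report on stdin.

```bash
#!/usr/bin/env bash
# Read an ApacheBench (ab) report on stdin and print the number of
# non-2xx responses; prints 0 when ab omitted the line.
# ab_non2xx is a hypothetical helper.
ab_non2xx() {
    awk -F': *' '/^Non-2xx responses/ { n = $2 } END { print n + 0 }'
}

# Example use:
#   ab -n 1000 -c 100 http://example.com/ | ab_non2xx
```

Note that the count covers every non-2xx status, including redirects and 4xx responses, so a non-zero result should be cross-checked against the access log before concluding that 502s remain.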

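6. Prevention: How to Avoid 502s

6.1 Monitoring and Alerting

# 1. Monitor the number of PHP-FPM processes
# 2. Monitor the Nginx error log
# 3. Alert on 502 responses

#!/bin/bash
# A simple monitoring script. The 502 status code is recorded in the access log,
# not the error log; this assumes the default "combined" log format, where the
# status code is the 9th field.
ERROR_COUNT=$(awk '$9 == 502' /var/log/nginx/access.log | wc -l)
if [ "$ERROR_COUNT" -gt 10 ]; then
    echo "Too many 502 errors, please check!" | mail -s "Nginx 502 alert" admin@example.com
fi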
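
A script like the one above counts every 502 in the whole file each time it runs, so once the threshold is crossed it keeps alerting until the log rotates. One way around that, sketched here, is to remember how far the log was read last time and count only the new lines. `count_new_502` and the state-file path are hypothetical, and the default "combined" access-log format (status code in the 9th field) is assumed.

```bash
#!/usr/bin/env bash
# Print the number of 502 responses appended to an access log since the
# previous call. The byte offset reached is stored in a state file.
# count_new_502 is a hypothetical helper.
count_new_502() {
    local log="$1" state="$2" offset=0 size
    size=$(( $(wc -c < "$log") ))      # arithmetic expansion strips padding
    if [ -f "$state" ]; then
        offset=$(cat "$state")
    fi
    # A file smaller than last time means it was rotated or truncated.
    if [ "$offset" -gt "$size" ]; then
        offset=0
    fi
    tail -c "+$((offset + 1))" "$log" \
        | head -c "$((size - offset))" \
        | awk '$9 == 502 { n++ } END { print n + 0 }'
    echo "$size" > "$state"
}

# Example use from cron (state path is an assumption):
#   n=$(count_new_502 /var/log/nginx/access.log /var/tmp/nginx-502.offset)
#   [ "$n" -gt 10 ] && echo "$n new 502s" | mail -s "Nginx 502 alert" admin@example.com
```

Limiting the read to the size measured at the start keeps lines written during the run from being counted twice on the next call.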

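6.2 Sensible Configuration

Set values that fit the server's resources:

; PHP-FPM reference configuration (4 cores, 8 GB RAM)
pm.max_children = 80        ; 8 GB / 100 MB per process ≈ 80 (an upper bound; lower it to leave memory for the OS, Nginx, and the database)
pm.start_servers = 20
pm.min_spare_servers = 10
pm.max_spare_servers = 30
pm.max_requests = 1000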
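
The memory arithmetic behind `pm.max_children` can be done with measured numbers instead of the 100 MB rule of thumb. The sketch below assumes you reserve a fixed amount of memory for everything that is not PHP; `avg_rss_mb` and `suggest_max_children` are hypothetical helpers, and the 2048 MB reservation in the example is an illustrative figure, not a recommendation.

```bash
#!/usr/bin/env bash
# Average resident memory in MB, given RSS values in KB on stdin
# (one per line, as printed by `ps -o rss=`).
avg_rss_mb() {
    awk '{ sum += $1; n++ } END { if (n) printf "%d\n", sum / n / 1024; else print 0 }'
}

# Suggested pm.max_children: memory left after the reservation, divided
# by the per-process footprint (integer division, rounds down).
# Arguments: total MB, reserved MB, MB per PHP-FPM process.
suggest_max_children() {
    local total_mb="$1" reserved_mb="$2" per_proc_mb="$3"
    echo $(( (total_mb - reserved_mb) / per_proc_mb ))
}

# Example use (the process name may be php-fpm8.1 or similar on your system):
#   per_proc=$(ps -C php-fpm -o rss= | avg_rss_mb)
#   suggest_max_children 8192 2048 "$per_proc"
```

With 8192 MB total, 2048 MB reserved, and 100 MB per process this yields 61, noticeably below the ceiling of about 80 that ignores everything else running on the machine.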

6.3 Performance Optimization

# 1. Enable OPcache (PHP)
# 2. Cache with Redis
# 3. Optimize the database (indexes, slow-query tuning)
# 4. Serve static assets through a CDN

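7. Summary: 502 Troubleshooting Cheat Sheet

Symptom	Likely cause	Diagnostic command	Fix
`Connection refused`	Backend not running	`systemctl status php-fpm`	Start the service
`Connection reset by peer`	Not enough processes	`ps aux | grep -c "[p]hp-fpm: pool"`	Tune the configuration
`upstream timed out`	Response timeout	`time curl ...`	Raise the timeouts
502 after a config change	Misconfiguration	`nginx -t`	Correct the configuration
Services running but still 502	Firewall/SELinux	`getenforce`	Adjust the rules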

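8. Final Thoughts

A 502 is nothing to fear; what matters is a sound troubleshooting method:

Check the logs: always the first step
Pinpoint the cause: let the log keywords guide you
Fix the specific problem: do not restart blindly
Prevention first: monitoring, optimization, sensible configuration

I hope this article helps you resolve 502 errors quickly! If you have other troubleshooting tips, feel free to share them in the comments.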
