A server optimization log (502 errors and a blocked running state)

 

 

Preface

A friend's website stopped loading, and he asked me to help find out why.

 

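Symptoms

  1. Opening the website shows 502 Bad Gateway.
  2. After logging in to the 宝塔 (BaoTa) panel, the CPU is maxed out and the running status shows as blocked.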

      

 

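Analysis

Step 1: Log in to the server

  Check the running processes with the htop and top commands.

The output suggests that PHP is probably stuck.

Step 2: Analyze the logs

  The 宝塔 logs are under /www/server/php/72/var/log; the two relevant files are php-fpm.log and slow.log.

       php-fpm.log (excerpt)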

[17-Jul-2022 11:38:00] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 0 idle, and 17 total children
[17-Jul-2022 11:38:01] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 22 total children
[17-Jul-2022 11:38:02] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 27 total children
[17-Jul-2022 11:38:03] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 32 total children
[17-Jul-2022 11:38:04] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 37 total children
[17-Jul-2022 11:38:05] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 42 total children
[17-Jul-2022 11:38:06] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 47 total children
[17-Jul-2022 11:38:07] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 52 total children
[17-Jul-2022 11:38:08] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 57 total children
[17-Jul-2022 11:38:09] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 62 total children
[17-Jul-2022 11:38:10] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 67 total children
[17-Jul-2022 11:38:11] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 72 total children
[17-Jul-2022 11:38:12] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 77 total children
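
To see how long a burst like the one above lasts, the warnings can be counted per minute. This is only a minimal sketch: the function name busy_per_minute is made up for illustration, and it assumes the log lines start with a timestamp in the same [DD-Mon-YYYY HH:MM:SS] form as the excerpt.

```shell
# Count php-fpm "seems busy" warnings per minute.
# busy_per_minute is a hypothetical helper name, not part of php-fpm.
# Assumes each line starts with "[DD-Mon-YYYY HH:MM:SS]" as in the excerpt.
busy_per_minute() {
    grep -F 'seems busy' "$1" \
        | awk '{ print substr($1, 2) " " substr($2, 1, 5) }' \
        | sort | uniq -c
}
```

It would be called as busy_per_minute /www/server/php/72/var/log/php-fpm.log; each output line is a count followed by the date and the hour:minute.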

  slow.log (excerpt)


[17-Jul-2022 10:10:40]  [pool www] pid 8998
script_filename = /www/wwwroot/xxx.com/app/index.php
[0x00007fd9d5826bb0] curl_exec() /www/wwwroot/xxx.com/framework/function/communication.func.php:15
[0x00007fd9d5826720] ihttp_request() /www/wwwroot/xxx.com/framework/function/communication.func.php:58
[0x00007fd9d58266b0] ihttp_get() /www/wwwroot/xxx.com/addons/xxxx/inc/mobile/sync.inc.php:3

[17-Jul-2022 10:15:40]  [pool www] pid 9001
script_filename = /www/wwwroot/xxx.com/app/index.php
[0x00007fd9d5826b40] curl_exec() /www/wwwroot/xxx.com/framework/function/communication.func.php:15
[0x00007fd9d58266b0] ihttp_request() /www/wwwroot/xxx.com/addons/xxx/inc/mobile/sync.inc.php:3

[17-Jul-2022 11:09:02]  [pool www] pid 27667
script_filename = /www/wwwroot/xxx.com/web/index.php
[0x00007fd9d581da20] session_start() /www/wwwroot/xxx.com/web/source/utility/code.ctrl.php:23

[17-Jul-2022 11:17:53]  [pool www] pid 27735
script_filename = /www/wwwroot/xxx.com/web/index.php
[0x00007fd9d581da20] session_start() /www/wwwroot/xxx.com/web/source/utility/code.ctrl.php:23

[17-Jul-2022 11:42:34]  [pool www] pid 2083
script_filename = /www/wwwroot/xxx.com/web/index.php
[0x00007fb9aec1da20] session_start() /www/wwwroot/xxx.com/web/source/utility/code.ctrl.php:23

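  Nothing in slow.log stood out to me at the time (the traces end in curl_exec() and session_start()); the php-fpm.log warnings look more like there were not enough worker processes.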
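
When slow.log is long, it can help to tally which function sits at the top of each trace rather than reading the entries one by one. The sketch below assumes the layout shown in the excerpt (a script_filename line followed by stack frames starting with [0x...]); slowlog_top_frames is a hypothetical helper name.

```shell
# Tally the top stack frame of each slow.log entry.
# slowlog_top_frames is a hypothetical helper name.
# Assumes each entry has a "script_filename = ..." line followed by
# frames of the form "[0x...] function() file:line", as in the excerpt.
slowlog_top_frames() {
    awk '
        /^script_filename/ { want = 1; next }      # the next frame is the top of the stack
        want && /^\[0x/    { print $2; want = 0 }  # $2 is the function name
    ' "$1" | sort | uniq -c | sort -rn
}
```

Run as slowlog_top_frames /www/server/php/72/var/log/slow.log; the most frequent top frame is listed first.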

Step 3: Increase the number of php-fpm processes

  In the 宝塔 panel, open the PHP settings and change them as shown in the figures below.

[Figures: parameters before the change; parameters after the change]

  Reboot the server (restarting the service should also work) and monitor for a while; the status was as follows.
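
A common rule of thumb for choosing the maximum process count is to divide the memory you are willing to give PHP by the average size of one worker. The sketch below is not what was done here, only an illustration of that arithmetic: both function names are hypothetical, it assumes a Linux ps and that the workers are named php-fpm, and the memory budget is a number you have to pick yourself.

```shell
# Average resident memory (MB) of the running php-fpm processes.
# Assumes Linux procps and a process name of "php-fpm" (an assumption;
# adjust if the build uses a different name).
avg_fpm_rss_mb() {
    ps -C php-fpm -o rss= \
        | awk '{ s += $1; n++ } END { if (n) printf "%d\n", s / n / 1024; else print 0 }'
}

# Rule of thumb: max children = memory budget (MB) / average worker size (MB).
# Prints 0 when the worker size is not a positive number.
estimate_max_children() {
    budget_mb=$1
    worker_mb=$2
    if [ "$worker_mb" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( budget_mb / worker_mb ))
}
```

For example, estimate_max_children 4096 "$(avg_fpm_rss_mb)" gives a rough upper bound when about 4 GB can be set aside for PHP. The result is only a starting point and should be checked against real load.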

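Step 4: CPU usage still seemed too high, so I continued the analysis while traffic was low.

  1. Running ps aux | sort -k3nr | head -n 10 showed that php-fpm was still the top CPU consumer.

The output is not available; I did not save it yesterday.

  2. Trace a worker with strace -frt -e trace=network -p 4889, where 4889 is the process ID.

  The log was not saved.

       The process was blocking in accept, socket, and sendto, and the output contained strings such as ms_core_cache and ims_users_la.

  3. Go to the site's code root directory and search: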

grep -rn ms_core_cache
grep -rn ims_users_la

  Analysis: the cache is probably the problem.
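
Since the strace output from this step was not saved, here is a hedged sketch of how the same trace could be kept and filtered next time. It assumes strace is run with -T (which appends the time spent in each call as <seconds>) and -o to write to a file; the helper name slow_syscalls and the path /tmp/trace.txt are made up for illustration.

```shell
# Capture (run separately, stop with Ctrl-C):
#   strace -f -T -e trace=network -p 4889 -o /tmp/trace.txt
#
# Print the calls that took at least THRESHOLD seconds.
# slow_syscalls is a hypothetical helper name.
# Usage: slow_syscalls THRESHOLD FILE
slow_syscalls() {
    threshold=$1
    awk -v t="$threshold" '
        # strace -T ends each completed call with "<seconds>"
        match($0, /<[0-9]+\.[0-9]+>$/) {
            d = substr($0, RSTART + 1, RLENGTH - 2)
            if (d + 0 >= t + 0) print d, $0
        }
    ' "$2"
}
```

For example, slow_syscalls 0.5 /tmp/trace.txt lists every traced call that took half a second or more, with the duration repeated at the start of the line.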

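Step 5: Cache analysis

  1. The site runs on 微擎 (WeEngine). A check showed that it was configured to use MySQL as the cache by default, and the cache held a lot of data, so this may be where it got stuck.

  2. Checking the CPU-heavy processes again showed that mysql usage was also quite high at times.

  3. Configure Redis: open the data/config.php file under the site root.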

$config['setting']['cache'] = 'redis';        // previously 'mysql'; changed to 'redis'

// The entries below did not exist before and need to be added.
$config['setting']['redis']['server'] = '127.0.0.1';    // IP address of the Redis server
$config['setting']['redis']['port'] = 6379;             // Redis port
$config['setting']['redis']['pconnect'] = 1;
$config['setting']['redis']['timeout'] = 1;
$config['setting']['redis']['auth'] = 'xxxx';           // Redis password
$config['setting']['redis']['requirepass'] = 'xxxx';    // Redis password

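  4. Reboot the server and monitor for a while; the status was as follows.

Step 6: Further monitoring

  After running for 16 hours, the status was as follows.

Conclusion

Causes and fixes

1. When the server was first deployed, sustained PHP traffic was not taken into account.

==> (Fix) Raise the php-fpm maximum number of dynamic worker processes (20 => 100) and increase the minimum number of processes.

2. The site used MySQL for caching, which is fairly heavy on system resources.

==> (Fix) Switch the cache backend to Redis.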

 

In the end, the root cause was probably the cache backend. The php-fpm process count likely did not need to be changed, but I did not test this.

 

posted @ 2022-07-18 12:31  鹰翱