liunx服务器某个Java运行服务CPU占用率过高问题排查及修复
进入服务器 用 top 命令查看
top
top - 09:57:55 up 40 days, 22:05, 9 users, load average: 4.44, 4.03, 3.85
Tasks: 741 total, 3 running, 738 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.8 us, 5.9 sy, 0.0 ni, 88.1 id, 0.0 wa, 1.0 hi, 1.1 si, 0.0 st
MiB Mem : 75612.2 total, 8957.1 free, 35217.4 used, 31437.6 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 39580.4 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2664227 museum 20 0 14.4g 867896 43196 S 78.6 1.1 798:35.65 java
86521 redis 20 0 333628 56776 2820 S 14.1 0.1 4467:55 redis-server
3370157 museum 20 0 14.4g 840884 41688 S 4.6 1.1 3:16.61 java
2797 mysql 20 0 16.2g 6.6g 18564 S 4.3 8.9 6379:50 mysqld
1203554 museum 20 0 14.3g 795724 41332 S 3.9 1.0 52:52.93 java
1144857 root 20 0 245420 6992 4236 S 3.3 0.0 39:53.81 sshd
3258067 root 20 0 244884 6676 4452 S 3.3 0.0 30:24.07 sshd
392412 museum 20 0 16.3g 933680 63456 S 2.6 1.2 2388:37 java
89655 museum 20 0 15.5g 1.3g 27228 S 2.0 1.7 1896:47 java
25420 1000 20 0 753840 70152 27396 S 1.6 0.1 344:25.97 kiali
找出占用CPU最高的进程的pid:2664227 的进程 命令 JPS
[museum@museum-test txn]$ jps
2262626 cztech-museum-pac-mss-service.jar
1317781 cztech-service-gateway.jar
4009151 cztech-museum-ticket-check-service.jar
1402193 cztech-manage-gateway.jar
966342 cztech-platform-cashier-service.jar
867906 cztech-platform-common-service.jar
885471 cztech-museum-outer-portal-service.jar
482323 cztech-museum-pac-boss-service.jar
3469280 cztech-museum-manage-portal-service.jar
1468736 cztech-platform-form-service.jar
392412 BrokerStartup
108961 NamesrvStartup
1461618 cztech-museum-official-manage-service.jar
2037945 Jps
667244 cztech-platform-bdm-service.jar
3800095 cztech-common-human-service.jar
2172873 cztech-platform-sso-service.jar
198761 xxl-job-admin.jar
1404089 cztech-visitor-gateway.jar
119278 cztech-platform-bpm-service.jar
**2664227 cztech-platform-transaction-service.jar**
1203554 cztech-museum-guide-service.jar
114485 cztech-common-material-service.jar
3901069 cztech-museum-ticket-service.jar
89655 nacos-server.jar
1284840 cztech-museum-openapi-service.jar
946997 cztech-museum-collection-service.jar
确定该程序的pid是这个: ps -ef|grep
[museum@museum-test txn]$ ps -ef|grep cztech-platform-transaction-service.jar
museum 2664227 1 85 7月15 ? 13:18:14 java -Dnacos.server-addr=nacos.cqcztech.cn:8848 -Dreactor.netty.http.server.accessLogEnabled=true -Dspring.cloud.nacos.config.username=nacos -Dspring.cloud.nacos.config.password=%tgb6YHN -Dspring.cloud.nacos.discovery.username=nacos -Dspring.cloud.nacos.discovery.password=%tgb6YHN -Dproject=museum -server -Dlogs.home=/home/museum/logs -Dspring.profiles.active=test -Dfile.encoding=UTF-8 -Djava.awt.headless=true -Djava.io.tmpdir=/home/museum/tmp/ -DsqlQueryMaxLength=50000 -server -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djava.net.preferIPv4Stack=true -Xms256m -Xmx512m -Xmn64m -XX:PermSize=64m -Xss256k -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:LargePageSizeInBytes=128m -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -Dloader.path=/home/museum/program/lib,/home/museum/program/config -jar /home/museum/program/platform/txn/cztech-platform-transaction-service.jar
通过Arthas分析该Java运行的服务的进程
[museum@museum-test arthas]$ java -jar arthas-boot.jar
[INFO] JAVA_HOME: /usr/lib/java/jdk1.8.0_144/jre
[INFO] arthas-boot version: 3.7.2
[INFO] Process 2664227 already using port 3658
[INFO] Process 2664227 already using port 8563
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 2664227 /home/museum/program/platform/txn/cztech-platform-transaction-service.jar
[2]: 2262626 /home/museum/program/pac/mss/cztech-museum-pac-mss-service.jar
[3]: 1317781 /home/museum/program/gateway/service/cztech-service-gateway.jar
[4]: 1402193 /home/museum/program/gateway/manage/cztech-manage-gateway.jar
[5]: 2290469 /home/museum/program/vistor/video/cztech-museum-videolive-service.jar
[6]: 966342 /home/museum/program/platform/cashier/cztech-platform-cashier-service.jar
[7]: 867906 /home/museum/program/platform/common/cztech-platform-common-service.jar
[8]: 885471 /home/museum/program/vistor/portal/cztech-museum-outer-portal-service.jar
[9]: 482323 /home/museum/program/pac/boss/cztech-museum-pac-boss-service.jar
[10]: 3469280 /home/museum/program/manage/portal/cztech-museum-manage-portal-service.jar
[11]: 1468736 /home/museum/program/platform/form/cztech-platform-form-service.jar
[12]: 392412 org.apache.rocketmq.broker.BrokerStartup
[13]: 3370157 /home/museum/program/vistor/ticket/cztech-museum-ticket-service.jar
[14]: 108961 org.apache.rocketmq.namesrv.NamesrvStartup
[15]: 1461618 /home/museum/program/manage/official/cztech-museum-official-manage-service.jar
[16]: 667244 /home/museum/program/platform/bdm/cztech-platform-bdm-service.jar
[17]: 3800095 /home/museum/program/manage/human/cztech-common-human-service.jar
[18]: 2172873 /home/museum/program/platform/sso/cztech-platform-sso-service.jar
[19]: 198761 /home/museum/middleware/xxl-job-admin/bin/xxl-job-admin.jar
[20]: 1404089 /home/museum/program/gateway/visitor/cztech-visitor-gateway.jar
[21]: 119278 /home/museum/program/platform/bpm/cztech-platform-bpm-service.jar
[22]: 1203554 /home/museum/program/vistor/guide/cztech-museum-guide-service.jar
[23]: 114485 /home/museum/program/manage/material/cztech-common-material-service.jar
[24]: 89655 /home/museum/middleware/nacos/target/nacos-server.jar
[25]: 2088486 /home/museum/program/manage/ticket-check/cztech-museum-ticket-check-service.jar
[26]: 1284840 /home/museum/program/manage/openapi/cztech-museum-openapi-service.jar
[27]: 946997 /home/museum/program/collection/collection/cztech-museum-collection-service.jar
[28]: 1257580 /home/museum/program/vistor/member/cztech-museum-member-service.jar
通过Arthas看资源的耗用情况看板 : dashboard
ID NAME GROUP PRIORITY STATE %CPU DELTA_TIME TIME INTERRUPTED DAEMON
95 lettuce-epollEventLoop-6-1 main 5 RUNNABLE 41.84 2.091 358:35.676 false true
166 pool-4-thread-1 main 5 TIMED_WAITING 33.61 1.680 273:19.935 false false
103 lettuce-eventExecutorLoop-1-4 main 5 WAITING 1.02 0.051 5:30.768 false true
173 lettuce-eventExecutorLoop-1-18 main 5 WAITING 0.98 0.048 5:27.252 false true
207 Thread-43 main 10 TIMED_WAITING 0.97 0.048 4:50.993 false false
175 lettuce-eventExecutorLoop-1-20 main 5 WAITING 0.87 0.043 5:28.749 false true
163 lettuce-eventExecutorLoop-1-15 main 5 WAITING 0.83 0.041 5:32.722 false true
162 lettuce-eventExecutorLoop-1-14 main 5 WAITING 0.78 0.038 5:31.926 false true
105 lettuce-eventExecutorLoop-1-5 main 5 WAITING 0.66 0.033 5:35.152 false true
108 lettuce-eventExecutorLoop-1-6 main 5 WAITING 0.65 0.032 5:27.778 false true
58 lettuce-eventExecutorLoop-1-1 main 5 TIMED_WAITING 0.63 0.031 5:42.399 false true
109 lettuce-eventExecutorLoop-1-7 main 5 WAITING 0.63 0.031 5:29.607 false true
Memory used total max usage GC
heap 138M 249M 505M 27.37% gc.parnew.count 8149
par_eden_space 9M 51M 51M 18.66% gc.parnew.time(ms) 109122
par_survivor_space 1M 6M 6M 17.58% gc.concurrentmarksweep.count 11
cms_old_gen 127M 192M 448M 28.50% gc.concurrentmarksweep.time(ms) 1416
nonheap 221M 231M -1 95.78%
code_cache 77M 77M 240M 32.12%
metaspace 129M 137M -1 94.27%
compressed_class_space 15M 16M 1024M 1.50%
direct 1M 1M - 100.00%
mapped 0K 0K - 0.00%
Runtime
分析是哪一个线程导致的CPU占用率较高 : thread -n 3
命令:查看占用CPU率最高的前三个线程--> thread -n 3
"C1 CompilerThread0" [Internal] cpuUsage=1.63% deltaTime=3ms time=1170ms
"arthas-command-execute" Id=23 cpuUsage=0.11% deltaTime=0ms time=401ms RUNNABLE
at java.management@11.0.7/sun.management.ThreadImpl.dumpThreads0(Native Method)
at java.management@11.0.7/sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:466)
at com.taobao.arthas.core.command.monitor200.ThreadCommand.processTopBusyThreads(ThreadCommand.java:199)
at com.taobao.arthas.core.command.monitor200.ThreadCommand.process(ThreadCommand.java:122)
at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl.process(AnnotatedCommandImpl.java:82)
at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl.access$100(AnnotatedCommandImpl.java:18)
at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl$ProcessHandler.handle(AnnotatedCommandImpl.java:111)
at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl$ProcessHandler.handle(AnnotatedCommandImpl.java:108)
at com.taobao.arthas.core.shell.system.impl.ProcessImpl$CommandProcessTask.run(ProcessImpl.java:385)
at java.base@11.0.7/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base@11.0.7/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.7/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base@11.0.7/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base@11.0.7/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base@11.0.7/java.lang.Thread.run(Thread.java:834)
"VM Periodic Task Thread" [Internal] cpuUsage=0.07% deltaTime=0ms time=584ms
定位到线程使用的代码行
private ScheduledExecutorService service = Executors.newSingleThreadScheduledExecutor();
public void asyncCreateOrderFromQueue() {
Runnable task = () -> {
try {
Set<String> keys = redisTemplate.keys(STREAM_PREFIX + "*");
//检查队列是否有消息
if (keys == null || keys.isEmpty()) {
return;
}
for (String key : keys) {
//检查队列是否有数据
MapRecord<String, String, JSONObject> orderRecord = redisQueue.getOrderBodyFromRedis(key);
if (orderRecord == null) { //没有消息则不消费
continue;
}
Map orderRecordValue
= orderRecord.getValue();
PreOrderBusinessOrderRequest request = JSON.parseObject((String) orderRecordValue.get("payload"),
PreOrderBusinessOrderRequest.class);
// 调用下单接口
MerchantGoodsMapping mapping = merchantGoodsMappingRepository.getByGoodsType(request.getMerchantGoodsType());
//消息确认及删除消息
redisQueue.ackAndDelete(request.getGoodsNo(), orderRecord.getId().getValue());
Result<Object> createOrderResult = null;
try {
createOrderResult = doCreateOrder(request, mapping);
} catch (Exception e) {
log.error("订单创建异常:doCreateOrder", e);
} finally {
redisTemplate.opsForZSet().remove(RedisQueue.generateZSortName(request.getGoodsNo()), request.getUserId().toString());
}
//从zsort 中删除 ,删除该用户的请求
redisTemplate.opsForZSet().remove(RedisQueue.generateZSortName(request.getGoodsNo()), request.getUserId().toString());
//保存订单数据到缓存中
redisQueue.putCreateOrderResult(createOrderResult, request.getGoodsNo(), orderRecord.getId().getValue());
//发送小程序通知
}
} catch (Exception e) {
log.error("消费订单队列异常:asyncCreateOrderFromQueue--->" + e.getMessage(), e);
}
};
service.scheduleAtFixedRate(task, 0, redisQueue.getHandleInterval(), TimeUnit.MILLISECONDS);
}
分析原因:ScheduledExecutorService.scheduleAtFixedRate 方法导致的
这不就是一个定时调度线程吗:作用 100ms调度一次 定时一直调度使用 会导致CPU占用很大吗? 百度一下原因
ScheduledExecutorService.scheduleAtFixedRate 方法本身在正常使用情况下不应该导致CPU占用过高。然而,如果你遇到了CPU占用过高的问题,可能有以下几个原因和解决方法:
任务执行时间过长:
如果你传递给 scheduleAtFixedRate 的任务执行时间过长,或者任务中包含了密集的计算或IO操作,这可能导致调度器无法及时启动下一个任务。这样会导致任务重叠,从而占用更多的CPU资源。
解决方法:确保每个任务在预定的执行周期内能够及时完成。如果任务执行时间不确定或者非常长,考虑使用 scheduleWithFixedDelay 方法,并在任务结束后再次调度。
异常或错误处理不当:
如果任务中发生了异常而没有得到适当处理,可能会导致任务执行线程一直处于忙碌状态,从而导致CPU占用过高。
解决方法:在任务中添加适当的异常处理机制,确保异常情况下能够优雅地处理任务并释放资源。
任务数量过多:
如果同时有大量任务被调度,并且这些任务的执行时间有重叠,可能会导致线程池中的线程数增多,从而增加CPU的负载。
解决方法:评估任务的执行频率和资源消耗,根据实际情况调整线程池的大小或者优化任务的调度策略。
线程池配置不当:
如果线程池的大小配置不当,可能会导致任务排队时间过长,或者线程频繁被创建和销毁,进而影响CPU的使用效率。
解决方法:根据任务的性质和数量,适当地配置线程池的核心线程数、最大线程数以及队列类型(如有界队列或无界队列)。
JVM和操作系统调优:
如果以上方法都无法解决问题,可能需要对JVM和操作系统进行调优,以确保能够更好地处理高负载情况。
解决方法:调整JVM的堆内存大小(通过 -Xmx 和 -Xms 参数),以及操作系统的资源限制(如文件句柄、线程数等),以满足应用程序的需求。
综上所述,调用 ScheduledExecutorService.scheduleAtFixedRate 方法本身不应该导致性能瓶颈或CPU占用过高。问题很可能源于任务本身的执行逻辑或者线程池配置,需要综合考虑以上几个方面来进行排查和优化。
替换成别的方式:还是老实new线程或者使用自定义线程池,然后通过睡眠来做到定期执行策略吧
作者:隔壁老郭
个性签名:独学而无友,则孤陋而寡闻。做一个灵魂有趣的人!
如果觉得这篇文章对你有小小的帮助的话,记得在右下角点个“推荐”哦,博主在此感谢!
Java入门到入坟
万水千山总是情,打赏一分行不行,所以如果你心情还比较高兴,也是可以扫码打赏博主,哈哈哈(っ•̀ω•́)っ✎⁾⁾!