
Analyzing 100% CPU Usage on the Client

Posted on 2015-05-07 00:29  No.40

Problem description:

After the test had been running for a while, CPU usage on the test client reached 100% and errors were reported in the LoadRunner UI.

Analysis process:

Capture the thread stacks.

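The dump showed more than 1,000 threads, most of them in the BLOCKED state. The threads that were actually active were almost all NIO threads, and nothing there looked wrong at first.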
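
For a quick view of how threads are distributed across states, the counting done by hand on the dump can also be sketched in-process. This is a minimal sketch, not the tool used above: the class name ThreadStateSummary is made up for illustration, and it reports java.lang.Thread.State values for the JVM it runs in, so labels such as IN_NATIVE from the dump will not appear.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, not part of the original test code.
class ThreadStateSummary {

    // Count the live threads of the current JVM, grouped by Thread.State.
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints one line per state that has at least one thread.
        countByState().forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```

A large BLOCKED count next to a handful of busy threads is the same pattern that stood out in the dump.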

Next, search the dump for the test code's class names to see whether the test code itself is causing the problem.

Several test-code threads turned out to have the following stack:

Thread 1134: (state = IN_NATIVE)

 - java.net.NetworkInterface.getAll() @bci=0 (Compiled frame; information may be imprecise)

 - java.net.NetworkInterface.getNetworkInterfaces() @bci=0, line=334 (Compiled frame)

 - com.alibaba.rocketmq.remoting.common.RemotingUtil.getLocalAddress() @bci=0, line=112 (Compiled frame)

 - com.alibaba.rocketmq.client.ClientConfig.<init>() @bci=19, line=32 (Compiled frame)

 - com.alibaba.rocketmq.client.producer.DefaultMQProducer.<init>(java.lang.String, com.alibaba.rocketmq.remoting.RPCHook) @bci=1, line=95 (Compiled frame)

 - com.alibaba.rocketmq.client.producer.DefaultMQProducer.<init>(java.lang.String) @bci=3, line=86 (Compiled frame)

 - ********************MQProducer.<init>(java.lang.String, java.lang.String) @bci=71, line=62 (Compiled frame)

 - ********************.RocketMQ.sendMessage() @bci=76, line=119 (Compiled frame)     // 119 is the source line number

 - ********************.RocketMQ$1$1.safeRun() @bci=7, line=53 (Compiled frame)

 - ********************.SafeRunnable.run() @bci=1, line=13 (Compiled frame)

 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1145 (Compiled frame)

 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)

 - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)

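The stack ends in java.net.NetworkInterface.getAll(), a method that is relatively CPU-intensive; I had run into a similar case before. The next step was to work out why these threads were stuck here.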
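
To get a feel for how expensive the interface enumeration is on a given machine, it can be timed directly. This is a rough sketch that assumes nothing about RocketMQ internals beyond the stack above: InterfaceScanCost and timeScans are hypothetical names, and the numbers depend heavily on the OS and on how many interfaces the host has.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;

// Hypothetical helper for measuring the cost of enumerating network interfaces.
class InterfaceScanCost {

    // Enumerate all interfaces `rounds` times and return the elapsed nanoseconds.
    static long timeScans(int rounds) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            try {
                // Same JDK entry point that appears in the stack above.
                Enumeration<NetworkInterface> nis = NetworkInterface.getNetworkInterfaces();
                if (nis != null) {
                    Collections.list(nis); // walk the whole enumeration
                }
            } catch (SocketException e) {
                // Enumeration failed on this host; the time spent still counts.
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int rounds = 100;
        long nanos = timeScans(rounds);
        System.out.println("average per scan (microseconds): " + (nanos / rounds / 1000));
    }
}
```

Whatever the per-call cost turns out to be, it is paid once for every producer that gets constructed, which is why repeated construction matters here.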

Below is the problematic code; line 119 is the position marked (1).

public void sendMessage() {

        try {

            //

            t = UserManager.getTransManager().CreateTransaction("Performace-client", trace);

            Message msg = new Message("Performace", msgContent.getBytes("UTF-8"));

            if (producer == null) {

                producer = new MQProducer("Performace", "192.168.143.135:9876");      // (1)

                producer.start();

                rst = producer.product(msg);

            } else {

                rst = producer.product(msg);

            }

   

            //

   

        } catch (Exception e) {

            //

            if (producer != null) {

                producer.shutdown();

                producer = null;

            }

        }

    }

Root cause analysis

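The sendMessage method is registered at random times on a timer thread pool, so several invocations may run at the same moment or within a very short interval of each other.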

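producer.product(msg); sends the message to the remote end. If it throws an Exception because of a network problem or some other unknown cause, the catch block sets producer to null.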

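The next call to sendMessage then re-initializes producer. If several threads happen to run sendMessage concurrently, the producer may be initialized repeatedly, and other concurrency problems can follow. Each extra initialization goes through the expensive NetworkInterface.getAll() call seen in the stack, so the result is a vicious circle.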
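
The check-then-create race described above can be reproduced in isolation. In this sketch FakeProducer is a hypothetical stand-in for MQProducer (its constructor sleeps to imitate a slow initialization), and the producer field is static only to keep the example short; none of these names come from the real test code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reproduction of unsynchronized lazy initialization.
class UnsafeLazyInitDemo {
    static final AtomicInteger created = new AtomicInteger();

    // Stand-in for the real producer; construction is slow on purpose.
    static class FakeProducer {
        FakeProducer() {
            created.incrementAndGet();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    static FakeProducer producer; // shared and unguarded

    static void sendMessage() {
        if (producer == null) {            // check ...
            producer = new FakeProducer(); // ... then act: the two steps are not atomic
        }
    }

    // Start `threads` workers at the same moment and return how many producers were built.
    static int run(int threads) {
        created.set(0);
        producer = null;
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                sendMessage();
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("producers created: " + run(8));
    }
}
```

On a multi-core machine run(8) usually reports more than one construction, although the exact number varies from run to run.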

After the fix

public void sendMessage() {

        try {

            //

            t = UserManager.getTransManager().CreateTransaction("Performace-client", trace);

            Message msg = new Message("Performace", msgContent.getBytes("UTF-8"));

            // Only one thread at a time may create and start the producer

            synchronized (this) {

                if (producer == null) {

                    producer = new MQProducer("Performace", "192.168.143.135:9876");

                    producer.start();

                }

            }

            rst = producer.product(msg);

   

            //

   

        } catch (Exception e) {

            //

            if (producer != null) {

                producer.shutdown();

                producer = null;

            }

        }

    }

With the synchronized block added, the problem was resolved.
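
One point worth noting about the fix: the catch block still resets producer outside the synchronized block. A possible tightening, sketched under assumptions rather than taken from the real test code, is to guard the reset with the same lock and to send through a local reference. FakeProducer, acquire and discard are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder that guards both creation and reset with one lock.
class GuardedProducerHolder {
    static final AtomicInteger created = new AtomicInteger();

    // Stand-in for the real producer.
    static class FakeProducer {
        FakeProducer() {
            created.incrementAndGet();
        }

        void send(String msg) {
            if (msg == null) {
                throw new IllegalArgumentException("msg must not be null");
            }
        }

        void shutdown() {
            // nothing to release in this sketch
        }
    }

    private final Object lock = new Object();
    private FakeProducer producer; // guarded by lock

    // Create the producer at most once and hand back a stable local reference.
    private FakeProducer acquire() {
        synchronized (lock) {
            if (producer == null) {
                producer = new FakeProducer();
            }
            return producer;
        }
    }

    // Drop the producer only if it is still the instance that failed.
    private void discard(FakeProducer failed) {
        synchronized (lock) {
            if (producer == failed) {
                producer = null;
            }
        }
        failed.shutdown();
    }

    void sendMessage(String msg) {
        FakeProducer p = acquire();
        try {
            p.send(msg);
        } catch (RuntimeException e) {
            discard(p);
        }
    }

    // Run `threads` concurrent senders and return how many producers were built.
    static int run(int threads) {
        created.set(0);
        GuardedProducerHolder holder = new GuardedProducerHolder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> holder.sendMessage("hello"));
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("producers created: " + run(8));
    }
}
```

Here run(8) returns 1, because every thread passes through the same lock before the null check and no send fails.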

   

Author: No.40

Blog: http://www.cnblogs.com/no40