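Motan's thread protection strategy
Background:
The production ne-account service handles a high call volume and a high QPS. During a deployment, the motan log printed the following error: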
2018-09-14 12:18:19 [ERROR] ThreadProtectedRequestRouter reject request: request_method=XXXXXX request_counter=76 =76 max_thread=100
The author's reply on GitHub: https://github.com/weibocom/motan/issues/551
Reproduction:
Configure a motan service, MotanDemoService, that exposes 4 methods; the handler for hello1 simply sleeps for 1s.
2018-09-14 13:58:27 [INFO] add method sign:hell817ff3733269, methodinfo:MethodInfo [group=motan-demo-rpc, interfaceName=com.weibo.motan.demo.service.MotanDemoService, methodName=hello, paramtersDesc=java.lang.String, version=1.0]
2018-09-14 13:58:27 [INFO] add method sign:helld1f0f2c9182d, methodinfo:MethodInfo [group=motan-demo-rpc, interfaceName=com.weibo.motan.demo.service.MotanDemoService, methodName=hello2, paramtersDesc=java.lang.String, version=1.0]
2018-09-14 13:58:27 [INFO] add method sign:hell52228017a74a, methodinfo:MethodInfo [group=motan-demo-rpc, interfaceName=com.weibo.motan.demo.service.MotanDemoService, methodName=hello4, paramtersDesc=java.lang.String, version=1.0]
2018-09-14 13:58:27 [INFO] add method sign:hellaea7504ac806, methodinfo:MethodInfo [group=motan-demo-rpc, interfaceName=com.weibo.motan.demo.service.MotanDemoService, methodName=hello3, paramtersDesc=java.lang.String, version=1.0]
Service configuration:
<motan:protocol id="demoMotan" default="true" name="motan" maxServerConnection="80000" maxContentLength="1048576" maxWorkerThread="100" minWorkerThread="100" threads="100" />
The motan service is exposed on port 8002. Start one client and send 1000 requests to the server.
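The client used for the test is not shown here, so the following is only a minimal sketch of what such a driver could look like, assuming a fixed-size thread pool sets the concurrency. The class name `LoadClientSketch` and the stub call in `main` are hypothetical; in a real run the `Callable` would wrap the call on the MotanDemoService referer.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical load driver: runs `total` calls with at most `concurrency`
// in flight and counts how many return normally versus throw.
final class LoadClientSketch {

    // Returns a two-element array: {success count, error count}.
    static int[] run(int concurrency, int total, Callable<?> call) {
        AtomicInteger success = new AtomicInteger();
        AtomicInteger error = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        for (int i = 0; i < total; i++) {
            pool.submit(() -> {
                try {
                    call.call();
                    success.incrementAndGet();
                } catch (Exception e) {
                    // Any exception from the call (e.g. a rejected request) counts as an error.
                    error.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for calls to finish", e);
        }
        return new int[] {success.get(), error.get()};
    }

    public static void main(String[] args) {
        // Stand-in for the real RPC: every 4th call throws, the rest succeed.
        AtomicInteger seq = new AtomicInteger();
        int[] result = run(80, 1000, () -> {
            if (seq.incrementAndGet() % 4 == 0) {
                throw new IllegalStateException("simulated rejection");
            }
            return null;
        });
        System.out.println("success: " + result[0] + " error: " + result[1]);
    }
}
```

Changing the first argument of `run` between 75 and 80 is what distinguishes the two scenarios; the stub only demonstrates the counting and does not model the server's rejection logic.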
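Summary of results:
- Test scenario 1
75 concurrent requests against the server, 1000 requests in total. Result:
motan demo is finish. success: 1000 error: 0
- Test scenario 2
80 concurrent requests against the server, 1000 requests in total. Result:
motan demo is finish. success: 150 error: 850
Server-side error messages: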
2018-09-14 14:58:03 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=76 =76 max_thread=100
2018-09-14 14:58:03 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=76 =76 max_thread=100
2018-09-14 14:58:03 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=76 =76 max_thread=100
2018-09-14 14:58:04 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=76 =76 max_thread=100
2018-09-14 14:58:04 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=78 =78 max_thread=100
2018-09-14 14:58:04 [ERROR] ThreadProtectedRequestRouter reject request: request_method=com.weibo.motan.demo.service.MotanDemoService.hello request_counter=77 =77 max_thread=100
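Summary: this matches the author's description. With 4 methods exposed, once the number of concurrent requests exceeds 3/4 * 100 (the thread count) = 75, motan's protection (circuit-breaking) mechanism kicks in and rejects a large share of the requests. The client reports the following error: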
com.weibo.api.motan.exception.MotanServiceException: error_message: RoundRobinLoadBalance No available referers for call : referers_size=1 requestId=1611565139528515715 interface=com.weibo.motan.demo.service.MotanDemoService method=hello(java.lang.String), status: 503, error_code: 10001,r=null
    at com.weibo.api.motan.cluster.loadbalance.AbstractLoadBalance.selectToHolder(AbstractLoadBalance.java:84)
    at com.weibo.api.motan.cluster.ha.FailoverHaStrategy.selectReferers(FailoverHaStrategy.java:90)
    at com.weibo.api.motan.cluster.ha.FailoverHaStrategy.call(FailoverHaStrategy.java:53)
    at com.weibo.api.motan.cluster.support.ClusterSpi.call(ClusterSpi.java:73)
    at com.weibo.api.motan.proxy.RefererInvocationHandler.invoke(RefererInvocationHandler.java:132)
    at com.sun.proxy.$Proxy10.hello(Unknown Source)
    at com.weibo.motan.demo.client.DemoRpcCli
Code:
Class ProviderProtectedMessageRouter
protected boolean isAllowRequest(int requestCounter, int totalCounter, int maxThread, Request request) {
    // Only one method is exposed: always allow.
    if (methodCounter.get() == 1) {
        return true;
    }

    // First in-flight request for this method: allow directly.
    if (requestCounter == 1) {
        return true;
    }

    // Don't simply check requestCounter > (maxThread / 2): if 2 or 3 methods are exposed
    // but only one of them is heavily called while the others are idle, a single method
    // is allowed to go up to the limit of maxThread * 3 / 4.
    if (requestCounter > (maxThread / 2) && totalCounter > (maxThread * 3 / 4)) {
        return false;
    }

    // If the total number of busy threads exceeds maxThread * 3 / 4 and many methods are
    // exposed, the overall pressure is high; in that case, reject any single method
    // that exceeds maxThread * 1 / 4.
    return !(methodCounter.get() >= 4 && totalCounter > (maxThread * 3 / 4) && requestCounter > (maxThread * 1 / 4));
}
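To see how these thresholds play out in the two test scenarios, here is a minimal standalone sketch of the same check. It is not motan's class: the `methodCount` parameter stands in for `methodCounter.get()`, the unused `Request` argument is dropped, and the class name `ThreadProtectionSketch` is made up for illustration.

```java
// Standalone re-statement of the admission check above, for experimenting
// with the thresholds. Not motan code: methodCount replaces methodCounter.get().
final class ThreadProtectionSketch {

    static boolean isAllowRequest(int methodCount, int requestCounter, int totalCounter, int maxThread) {
        // A single exposed method, or the first in-flight request of a method, always passes.
        if (methodCount == 1 || requestCounter == 1) {
            return true;
        }
        // One hot method: reject once it holds more than half the threads
        // while the server as a whole is above 3/4.
        if (requestCounter > (maxThread / 2) && totalCounter > (maxThread * 3 / 4)) {
            return false;
        }
        // Four or more methods: under the same overall pressure, a method
        // holding more than 1/4 of the threads is rejected.
        return !(methodCount >= 4 && totalCounter > (maxThread * 3 / 4) && requestCounter > (maxThread / 4));
    }

    public static void main(String[] args) {
        // Scenario 1: 75 in-flight calls to one of 4 methods, maxThread = 100.
        System.out.println(isAllowRequest(4, 75, 75, 100));   // true
        // Scenario 2: the 76th in-flight call crosses the 3/4 line.
        System.out.println(isAllowRequest(4, 76, 76, 100));   // false
        // Same pressure when only a single method is exposed.
        System.out.println(isAllowRequest(1, 100, 100, 100)); // true
    }
}
```

With four methods and 100 threads, the 76th concurrent call is the first one rejected, which lines up with the request_counter=76 entries in the server log.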
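Solutions:
The author suggested three measures:
1. Improve the efficiency of the business logic to reduce the handling time of each request;
2. Add server nodes to lower the QPS on each server;
3. If the server has performance headroom, increase the number of worker threads, for example to 1500.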
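All three measures act on the same quantity: the number of requests in flight on one server. By Little's law, that number is roughly the per-server QPS multiplied by the average handling time, and rejection can start once it passes 3/4 of the worker thread count. The sketch below does only that arithmetic. The QPS and latency figures are invented for illustration and are not measurements from ne-account, and the per-method conditions of the real check are ignored.

```java
// Back-of-the-envelope estimate: average in-flight requests per server
// (Little's law: concurrency = arrival rate * time in system) compared with
// the 3/4 * maxThread line used by the protection check.
final class CapacitySketch {

    // Average in-flight requests on one server, using integer arithmetic (rounds down).
    static long inFlight(long totalQps, int servers, long avgLatencyMillis) {
        return totalQps * avgLatencyMillis / (1000L * servers);
    }

    // True if the estimate is above the overall 3/4 threshold.
    static boolean overThreshold(long inFlight, int maxThread) {
        return inFlight > (maxThread * 3 / 4);
    }

    public static void main(String[] args) {
        // Hypothetical baseline: 800 QPS, 1 server, 100 ms per request, 100 threads.
        long base = inFlight(800, 1, 100);
        System.out.println(base + " over=" + overThreshold(base, 100));

        // Measure 1: halve the handling time.
        long faster = inFlight(800, 1, 50);
        System.out.println(faster + " over=" + overThreshold(faster, 100));

        // Measure 2: add a second server.
        long moreNodes = inFlight(800, 2, 100);
        System.out.println(moreNodes + " over=" + overThreshold(moreNodes, 100));

        // Measure 3: raise the worker thread count to 1500.
        System.out.println(base + " over=" + overThreshold(base, 1500));
    }
}
```

This is an average, so real traffic with bursts will cross the threshold earlier than the estimate suggests.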
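Beyond those three measures, for interfaces that are hard to change, the high-traffic interface needs to be split out onto its own protocol, as follows: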
<motan:protocol id="demoMotan" default="true" name="motan" maxServerConnection="80000" maxContentLength="1048576" maxWorkerThread="100" minWorkerThread="100" threads="100" />
<motan:protocol id="demoMotanSingleMethod" default="true" name="motan" maxServerConnection="80000" maxContentLength="1048576" maxWorkerThread="100" minWorkerThread="100" threads="100" />
The goal is for the interface to sustain 100 concurrent requests. The test result is as follows:
motan demo is finish. success: 1000 error: 0