Graduation Project Dev Log 2017-12-01: Scan Timeout
【Preface】
This post describes a scan timeout problem I ran into during development.
【Problem】
I had just finished defining and creating the index table, and both inserts and scans against it passed the unit tests. A full refresh of the index table, however, has to keep updating it while scanning the weather table, and every time the update reached roughly the 100th row it failed with a scan timeout. The same thing happened even when updating only a single column of one row (and fetching only the head of each block, which greatly reduces the amount scanned), which proved the problem was not the scan volume. The exact error was:
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hbase.client.ScannerTimeoutException: 1255382ms passed since the last invocation, timeout is currently set to 60000
    at org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:97)
    at com.zxc.fox.dao.IndexDao.updateAll(IndexDao.java:118)
    at com.zxc.fox.dao.IndexDao.main(IndexDao.java:38)
Caused by: org.apache.hadoop.hbase.client.ScannerTimeoutException: 1255382ms passed since the last invocation, timeout is currently set to 60000
    at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:417)
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:332)
    at org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:94)
    ... 2 more
Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: 533, already closed?
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2017)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
    at java.lang.Thread.run(Thread.java:745)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:313)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:241)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:310)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:291)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownScannerException): org.apache.hadoop.hbase.UnknownScannerException: Name: 533, already closed?
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2017)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
    at java.lang.Thread.run(Thread.java:745)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1199)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:31889)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:200)
    ... 9 more
【Troubleshooting】
I first checked the DataNode error logs and then followed two posts (reference blog 01, reference blog 02): setting the timeout in the client code did not work, and modifying the Hadoop configuration files had no effect either. After thinking harder about the root cause and comparing this code with methods I had written earlier, I finally noticed that I had nested one scan inside another, so the outer scanner sat idle for longer than its lease while the inner work ran. I rewrote that part: a single scan first collects all the city IDs into a list, and the update loop then iterates over that list instead of running the previous scan. In practice this added no noticeable time cost, and it eliminated the scan timeout.
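A minimal sketch of the restructured update, with the HBase calls left as comments so the pattern itself stays self-contained. The city IDs, table names, and helper names here are all hypothetical, not the actual IndexDao code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IndexUpdateSketch {

    // Phase 1: one short-lived scan collects all city IDs up front.
    // In the real code this would iterate a ResultScanner over the weather
    // table and close it immediately afterwards.
    static List<String> collectCityIds() {
        List<String> cityIds = new ArrayList<>();
        // for (Result r : weatherTable.getScanner(scan)) {
        //     cityIds.add(Bytes.toString(r.getRow()));
        // }
        cityIds.addAll(Arrays.asList("city-001", "city-002", "city-003"));
        return cityIds;
    }

    // Phase 2: iterate the in-memory list. The slow per-row updates now
    // happen while no scanner is open, so no scanner lease can expire.
    static int updateAll() {
        List<String> cityIds = collectCityIds(); // scanner opened and closed here
        int updated = 0;
        for (String cityId : cityIds) {
            // indexTable.put(buildIndexRow(cityId)); // slow per-row work
            updated++;
        }
        return updated;
    }

    public static void main(String[] args) {
        System.out.println("updated " + updateAll() + " rows");
    }
}
```

The trade-off is that the ID list must fit in client memory, which is fine for a list of cities but would not be for arbitrarily large key sets.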
【Takeaways】
This problem can have several causes, so check the following in turn:
1. Is your per-row processing inside the scan loop slow? -> Optimize the handler and tune the scan settings (e.g. setCacheBlocks(false)).
2. Is the scan caching set too large? -> Reduce caching (start from the default of 100).
3. Is it a network or machine-load problem? -> Check the cluster.
4. Is HBase itself overloaded? -> Check the RegionServer logs.
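If the per-row work genuinely needs more time, the scanner lease timeout itself can also be raised. A sketch of the relevant hbase-site.xml property, assuming HBase 1.x (the value below is only an illustration; the setting must also be applied on the RegionServer side, since the lease in the error above is held by the server):

```xml
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <!-- milliseconds; the default 60000 matches the
       "timeout is currently set to 60000" in the error -->
  <value>120000</value>
</property>
```

Raising the timeout only buys headroom, though; restructuring the code so that no scanner stays idle, as above, removes the problem outright.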