Fixing the YARN exception YarnException: Failed while publishing entity

Version: HDP 3.0
When a MapReduce job is submitted, the job itself finishes, but its container cannot shut down and keeps waiting for five minutes:

INFO[Thread-100] org.apache.hadoop.yarn.event.AsyncDispatcher:Waiting for AsyncDispatcher to drain.Thread state is :WAITING
INFO[Thread-100] org.apache.hadoop.yarn.event.AsyncDispatcher:Waiting for AsyncDispatcher to drain.Thread state is :WAITING
INFO[Thread-100] org.apache.hadoop.yarn.event.AsyncDispatcher:Waiting for AsyncDispatcher to drain.Thread state is :WAITING
INFO[Thread-100] org.apache.hadoop.yarn.event.AsyncDispatcher:Waiting for AsyncDispatcher to drain.Thread state is :WAITING
INFO[Thread-100] org.apache.hadoop.yarn.event.AsyncDispatcher:Waiting for AsyncDispatcher to drain.Thread state is :WAITING
INFO[Thread-100] org.apache.hadoop.yarn.event.AsyncDispatcher:Waiting for AsyncDispatcher to drain.Thread state is :WAITING
INFO[Thread-100] org.apache.hadoop.yarn.event.AsyncDispatcher:Waiting for AsyncDispatcher to drain.Thread state is :WAITING

After five minutes, an exception is thrown:

org.apache.hadoop.yarn.exceptions.YarnException:Failed while publishing entity
...
Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
...
Caused by: java.net.SocketTimeoutException: Read timed out
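
To check whether a saved job log shows this symptom, one can search it for the two messages quoted above. The following is a minimal sketch, not part of the original fix: the function name `check_ats_symptom` and the idea of saving the log to a file first (for example with `yarn logs -applicationId <application id>`) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reports whether a saved job log contains the
# symptom described above. Usage: check_ats_symptom /path/to/job.log
check_ats_symptom() {
    log_file="$1"
    # Count the repeated "drain" messages logged while the container waits.
    drain_count=$(grep -c 'Waiting for AsyncDispatcher to drain' "$log_file")
    # Look for the exception raised when publishing to the timeline service fails.
    if grep -q 'Failed while publishing entity' "$log_file"; then
        echo "symptom found: $drain_count drain waits, publish failure present"
        return 0
    fi
    echo "symptom not found"
    return 1
}
```

The function returns 0 only when the publish failure is present, so it can be used in an `if` test before deciding to reset the ATSv2 HBase.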

This happens because the embedded HBase used by ATSv2 (YARN Timeline Service v2) has crashed.

To fix this, reset the embedded HBase database of ATSv2:
1. Stop the YARN service

Ambari -> Yarn-Actions -> Stop

2. Delete the ATSv2 znode in ZooKeeper

zookeeper-client -server zookeeper-quorum-servers
rmr /atsv2-hbase-unsecure    (or, on a Kerberized cluster: rmr /atsv2-hbase-secure)

3. Move the timeline server's embedded HBase database out of the way on HDFS

hdfs dfs -mv /atsv2/hbase /tmp/

4. Start the YARN service

Ambari -> Yarn-Actions -> Start
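
Steps 2 and 3 can also be sketched as a small script to run while YARN is stopped. This is only an illustration under stated assumptions: the variables `ZK_QUORUM`, `SECURE` and `DRY_RUN` and the function names are made up for this example; it assumes `zookeeper-client` accepts a trailing command the same way `zkCli.sh` does; and it assumes the move in step 3 takes `/atsv2/hbase` as the source and `/tmp/` as the destination. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of steps 2 and 3. Run only after YARN has been stopped in Ambari.
# ZK_QUORUM, SECURE and DRY_RUN are hypothetical variables for this example.
ZK_QUORUM="${ZK_QUORUM:-zookeeper-quorum-servers}"  # placeholder from step 2
SECURE="${SECURE:-no}"       # set to "yes" on a Kerberized cluster
DRY_RUN="${DRY_RUN:-yes}"    # default: print the commands without running them

# Choose the znode name given in step 2.
atsv2_znode() {
    if [ "$1" = "yes" ]; then
        echo "/atsv2-hbase-secure"
    else
        echo "/atsv2-hbase-unsecure"
    fi
}

# Print a command; execute it only when DRY_RUN is not "yes".
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "yes" ]; then
        "$@"
    fi
}

reset_atsv2_hbase() {
    znode=$(atsv2_znode "$SECURE")
    # Step 2: delete the ATSv2 znode in ZooKeeper.
    run zookeeper-client -server "$ZK_QUORUM" rmr "$znode"
    # Step 3: move the embedded HBase data out of the way (assumed src and dst).
    run hdfs dfs -mv /atsv2/hbase /tmp/
}
```

Calling `reset_atsv2_hbase` with the defaults prints the two commands for review; setting `DRY_RUN=no` before the call would actually execute them.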

After resubmitting the job, it runs normally and the problem is resolved.

posted @ 2019-02-25 17:44  sssuperMario