Some problems encountered with an Ambari cluster

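1. org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=EXECUTE

This error shows up when Spark creates a table or when the Hive JDBC driver is used.

Solution:

Turn off HDFS permission checking: in hdfs-site.xml, set the dfs.permissions property (named dfs.permissions.enabled in current Hadoop releases) to false. The drawback is that permission checks are then disabled for every user and every path, which lowers the security of HDFS. In Ambari, just search for the property in the HDFS config page.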

2. failed strict managed table checks due to the following reason: Table is marked as a managed table but is not transactional.

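3. After importing data into a Hive table with Sqoop, the row count read by Spark does not match the row count Hive reports

Solution: turn off Hive's ACID configuration.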

4. Azkaban fails with azkaban.utils.UndefinedPropertyException: Missing required property 'azkaban.native.lib':

(1) Open the plugins/jobtypes directory, which sits at the same level as the executor's bin directory, edit the properties file in it, and add azkaban.native.lib=false

(2) Edit conf/azkaban.properties, find azkaban.jobtype.plugin.dir, and set it to the absolute path of the jobtypes directory

(3) Restart the executor and the server.
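
Steps (1) and (2) are both plain key=value edits, so they can be scripted. Below is a minimal Python sketch, assuming the files are simple Java-style properties files with one key=value per line. The helper name set_property and the path /opt/azkaban/plugins/jobtypes are made up for illustration, and the demo writes to a throwaway temp file rather than a real Azkaban install.

```python
import tempfile
from pathlib import Path


def set_property(path, key, value):
    """Set key=value in a properties file, replacing an existing entry if present."""
    p = Path(path)
    lines = p.read_text(encoding="utf-8").splitlines() if p.exists() else []
    out, found = [], False
    for line in lines:
        stripped = line.strip()
        # Leave comments alone; match only on the text before the first '='.
        if not stripped.startswith("#") and stripped.split("=", 1)[0].strip() == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}={value}")
    p.write_text("\n".join(out) + "\n", encoding="utf-8")


# Demo on a temp file; real file locations depend on the Azkaban install.
demo_file = Path(tempfile.mkdtemp()) / "azkaban.properties"
demo_file.write_text(
    "# executor settings\nazkaban.jobtype.plugin.dir=plugins/jobtypes\n",
    encoding="utf-8",
)

# Step (2): replace the relative path with an absolute one (hypothetical path).
set_property(demo_file, "azkaban.jobtype.plugin.dir", "/opt/azkaban/plugins/jobtypes")
# Step (1): add the missing property (it normally goes into the jobtypes properties file).
set_property(demo_file, "azkaban.native.lib", "false")

print(demo_file.read_text(encoding="utf-8"))
```

The executor and the server still have to be restarted afterwards, as in step (3).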

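5. An HDFS DataNode will not start and reports: Cannot set priority of datanode process. In this case, set the permissions of the directories configured for HDFS to 777. If that still does not help, HDFS has to be reformatted with hdfs namenode -format, which erases all existing data in HDFS.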

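6. If the NameNode cannot leave safe mode at startup, run hdfs dfsadmin -safemode leave (hadoop dfsadmin -safemode leave is the older, deprecated form)

If instead port 8020 of the NameNode cannot be reached, do the following:

(1) Stop ambari-server and the ambari-agent on the NameNode host

(2) Run hdfs namenode -format

(3) Set the permissions of the HDFS directories to 777

(4) Restart ambari-server and the agent

After this, restarting HDFS no longer fails to connect to port 8020 of the NameNode. However, if not all ambari-agents were restarted, the clusterID of the DataNodes may no longer match the NameNode's, and the DataNodes exit right after starting. In that case, copy the clusterID from the NameNode's VERSION file into the VERSION file of every DataNode.

It may also just recover after a while... let the bullets fly for a bit...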
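
Copying the clusterID by hand is error-prone when there are many DataNodes, so here is a minimal Python sketch of that edit. It assumes the VERSION files are plain key=value text with a clusterID= line (they usually live under current/ in the NameNode and DataNode storage directories). The function names and the sample IDs are invented for the demo, which works on temp files only.

```python
import re
import tempfile
from pathlib import Path


def read_cluster_id(version_file):
    """Return the clusterID value from a VERSION file."""
    for line in Path(version_file).read_text().splitlines():
        if line.startswith("clusterID="):
            return line.split("=", 1)[1].strip()
    raise ValueError(f"no clusterID line in {version_file}")


def write_cluster_id(version_file, cluster_id):
    """Overwrite the clusterID line in a VERSION file, leaving other lines untouched."""
    p = Path(version_file)
    new_text, count = re.subn(
        r"(?m)^clusterID=.*$",
        lambda m: "clusterID=" + cluster_id,
        p.read_text(),
    )
    if count == 0:
        raise ValueError(f"no clusterID line in {version_file}")
    p.write_text(new_text)


# Demo with made-up VERSION contents in a temp directory.
tmp = Path(tempfile.mkdtemp())
nn_version = tmp / "namenode_VERSION"
dn_version = tmp / "datanode_VERSION"
nn_version.write_text("namespaceID=1\nclusterID=CID-new\nstorageType=NAME_NODE\n")
dn_version.write_text("storageID=DS-1\nclusterID=CID-old\nstorageType=DATA_NODE\n")

# Take the NameNode's ID and stamp it into the DataNode's file.
write_cluster_id(dn_version, read_cluster_id(nn_version))
print(dn_version.read_text())
```

In a real cluster this would be run once per DataNode storage directory, with the DataNode stopped, and the DataNode restarted afterwards.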

7. Version: HDP 3.0

When a MapReduce job is submitted, the job itself finishes, but the container does not shut down and keeps waiting for five minutes

INFO [Thread-100] org.apache.hadoop.yarn.event.AsyncDispatcher: Waiting for AsyncDispatcher to drain. Thread state is :WAITING

(the same line is logged over and over)

After five minutes an exception is thrown:

org.apache.hadoop.yarn.exceptions.YarnException: Failed while publishing entity

...

Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out

...

Caused by: java.net.SocketTimeoutException: Read timed out

This happens because the embedded HBase used by ATSv2 has crashed.

The fix is to reset the embedded HBase database of ATSv2:

1. Stop the YARN service

Ambari -> Yarn-Actions -> Stop

2. Delete the ATSv2 znode in ZooKeeper

zookeeper-client -server zookeeper-quorum-servers

rmr /atsv2-hbase-unsecure, or rmr /atsv2-hbase-secure on a Kerberized cluster

3. Move the timeline server's embedded HBase database out of the way in HDFS

hdfs dfs -mv /atsv2/hbase /tmp/

4. Start the YARN service

Ambari -> Yarn-Actions -> Start

Resubmit the job. It now runs normally and the problem is solved.

8. Superset CSV export with Chinese characters: switch the encoding to GBK

Edit superset/config.py

CSV_EXPORT = {
    'encoding': 'gbk',
}
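
To see what this setting changes, here is a small standard-library sketch. As far as I know, Superset passes the CSV_EXPORT dict to pandas' DataFrame.to_csv as keyword arguments. The code below does not call Superset or pandas; it only imitates the effect of the encoding option on Chinese text. The export_csv helper and the sample rows are invented for illustration.

```python
import csv
import io

# Same setting as in superset/config.py.
CSV_EXPORT = {"encoding": "gbk"}


def export_csv(rows, encoding):
    """Serialize rows to CSV text, then encode it with the given codec."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue().encode(encoding)


rows = [["城市", "数量"], ["北京", 3]]

gbk_bytes = export_csv(rows, CSV_EXPORT["encoding"])
utf8_bytes = export_csv(rows, "utf-8")

# The bytes differ per codec; a reader must decode with the matching one.
print(len(gbk_bytes), len(utf8_bytes))
print(gbk_bytes.decode("gbk"))
```

The text is the same in both cases, only the bytes differ. A spreadsheet program that assumes GBK on a Chinese-locale system shows garbled characters for the UTF-8 bytes, which is why switching the export to gbk helps there.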

posted @ 2020-03-10 14:08 fbiswt
