Problems encountered when running the Hadoop YARN SLS

You may run into the following problems when running the SLS (Scheduler Load Simulator):

Command:

sh $HADOOP_HOME/share/hadoop/tools/sls/bin/slsrun.sh --input-sls=/home/c/sls/output2/sls-jobs.json --nodes=/home/c/sls/output2/sls-nodes.json --output-dir=/home/c/sls/output1 --print-simulation

It is best to give absolute paths for the --input-sls and --nodes files. If you pass only a file name, the file is looked up in the current directory.
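One way to avoid relative-path surprises is to resolve the file names to absolute paths before calling slsrun.sh. The helper below is a minimal sketch; `abs_path` is a hypothetical name introduced here, not part of Hadoop, and it assumes the parent directory of the file already exists.

```shell
#!/bin/sh
# abs_path (hypothetical helper): print the absolute form of a path.
# Assumes the parent directory of the path exists.
abs_path() {
    dir=$(cd "$(dirname "$1")" && pwd) || return 1
    case "$dir" in
        /) printf '/%s\n' "$(basename "$1")" ;;
        *) printf '%s/%s\n' "$dir" "$(basename "$1")" ;;
    esac
}

# Example usage (commented out so that sourcing this snippet has no side effects):
# sh "$HADOOP_HOME/share/hadoop/tools/sls/bin/slsrun.sh" \
#     --input-sls="$(abs_path sls-jobs.json)" \
#     --nodes="$(abs_path sls-nodes.json)" \
#     --output-dir="$(abs_path output1)" --print-simulation
```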

1. Error:

Exception in thread "main" java.lang.RuntimeException: java.lang.NullPointerException
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
     at org.apache.hadoop.yarn.sls.SLSRunner.startAMFromSLSTraces(SLSRunner.java:313)
     at org.apache.hadoop.yarn.sls.SLSRunner.startAM(SLSRunner.java:248)
     at org.apache.hadoop.yarn.sls.SLSRunner.start(SLSRunner.java:145)
     at org.apache.hadoop.yarn.sls.SLSRunner.main(SLSRunner.java:528)
Caused by: java.lang.NullPointerException
     at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:123)
     ... 4 more

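Cause: sls-runner.xml cannot be found. Only XML configuration files under /hadoop/etc/hadoop are picked up, but in the current Hadoop version sls-runner.xml lives in /hadoop/share/hadoop/tools/sls/sample-conf. Copying sls-runner.xml into /hadoop/etc/hadoop fixes the problem.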
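The copy step can be sketched as a small shell function. This assumes the install root (written as /hadoop in this post) is passed as the first argument, typically $HADOOP_HOME; `install_sls_runner_conf` is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# install_sls_runner_conf (hypothetical helper): copy sls-runner.xml from the
# SLS sample-conf directory into the Hadoop configuration directory.
# $1 is the Hadoop install root (assumption: the same layout as in this post).
install_sls_runner_conf() {
    src="$1/share/hadoop/tools/sls/sample-conf/sls-runner.xml"
    dest_dir="$1/etc/hadoop"
    if [ ! -f "$src" ]; then
        echo "sls-runner.xml not found at $src" >&2
        return 1
    fi
    cp "$src" "$dest_dir/"
}

# Example usage (commented out so that sourcing this snippet has no side effects):
# install_sls_runner_conf "$HADOOP_HOME"
```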


2. Error:

java.lang.NullPointerException
     at org.apache.hadoop.yarn.sls.web.SLSWebApp.<init>(SLSWebApp.java:86)

Cause: the html directory cannot be found. It is located under /hadoop/share/hadoop/tools/sls, so change into that directory and run the slsrun.sh script from there.
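A minimal sketch of running the script from the SLS directory, so that the html directory is found relative to the working directory. `run_sls` is a hypothetical wrapper name; the first argument is assumed to be the Hadoop install root, and the remaining arguments are passed through to slsrun.sh unchanged.

```shell
#!/bin/sh
# run_sls (hypothetical wrapper): run slsrun.sh with the SLS tool directory
# as the working directory. $1 is the Hadoop install root; the rest of the
# arguments are forwarded to slsrun.sh.
run_sls() {
    sls_dir="$1/share/hadoop/tools/sls"
    shift
    if [ ! -d "$sls_dir/html" ]; then
        echo "html directory not found under $sls_dir" >&2
        return 1
    fi
    # A subshell keeps the caller's working directory unchanged.
    ( cd "$sls_dir" && sh bin/slsrun.sh "$@" )
}

# Example usage (commented out so that sourcing this snippet has no side effects):
# run_sls "$HADOOP_HOME" --input-sls=/home/c/sls/output2/sls-jobs.json \
#     --nodes=/home/c/sls/output2/sls-nodes.json \
#     --output-dir=/home/c/sls/output1 --print-simulation
```

Because the working directory changes, the input, nodes, and output paths should be absolute when using a wrapper like this.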


3. Error:

18/07/11 16:58:48 WARN capacity.CapacityScheduler: Couldn't find application application_1531299523163_0001
18/07/11 16:58:48 WARN resourcemanager.RMAuditLogger: USER=jenkins	OPERATION=Application Finished - Failed	TARGET=RMAppManager	RESULT=FAILURE	DESCRIPTION=App failed with state: FAILED	PERMISSIONS=Application application_1531299523163_0001 submitted by user jenkins to unknown queue: sls_queue_1	APPID=application_1531299523163_0001
18/07/11 16:58:48 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1531299523163_0001,name=N/A,user=jenkins,queue=sls_queue_1,state=FAILED,trackingUrl=N/A,appMasterHost=N/A,startTime=1531299528010,finishTime=1531299528035,finalStatus=FAILED

The container fails to start.

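Cause: yarn-site.xml is not configured properly. There is an empty yarn-site.xml under /hadoop/etc/hadoop, and the system uses that file by default, hence the error. Besides the sls-runner.xml mentioned above, the sls/sample-conf directory also contains a yarn-site.xml prepared specifically for the SLS example. Replace the yarn-site.xml in /hadoop/etc/hadoop with this file.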
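The replacement can be sketched as follows. The backup step is an addition that the original fix does not mention, and `use_sls_yarn_site` is a hypothetical name; the first argument is assumed to be the Hadoop install root.

```shell
#!/bin/sh
# use_sls_yarn_site (hypothetical helper): replace etc/hadoop/yarn-site.xml
# with the sample yarn-site.xml shipped for the SLS example.
# $1 is the Hadoop install root (assumption: the same layout as in this post).
use_sls_yarn_site() {
    src="$1/share/hadoop/tools/sls/sample-conf/yarn-site.xml"
    dest="$1/etc/hadoop/yarn-site.xml"
    if [ ! -f "$src" ]; then
        echo "sample yarn-site.xml not found at $src" >&2
        return 1
    fi
    # Keep a copy of the existing file (an extra precaution, not in the post).
    if [ -f "$dest" ]; then
        cp "$dest" "$dest.bak"
    fi
    cp "$src" "$dest"
}

# Example usage (commented out so that sourcing this snippet has no side effects):
# use_sls_yarn_site "$HADOOP_HOME"
```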


posted on 2018-07-11 17:27 by sichenzhao
