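xgboost (1.0) on YARN (with personal notes for CDH 5.14)
Personal notes (CDH 5.14; these notes supplement the reposted steps below):
config.mk for CDH 5.14
Change the settings in config.mk to the following: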
USE_HDFS = 1
HDFS_LIB_PATH = /home/user/xgboost/xgboost-package/libhdfs/lib
HADOOP_HOME = /opt/cloudera/parcels/CDH
HADOOP_HDFS_HOME = /opt/cloudera/parcels/CDH
Environment variables
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HADOOP_HOME=/opt/cloudera/parcels/CDH
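Before launching a job, it can help to confirm that both variables are visible to the Python process that runs the tracker. The following is a minimal sketch, not part of xgboost or dmlc-core; the helper name missing_hadoop_env is hypothetical, and it only checks that the variables are set to non-empty values, not that the paths exist.

```python
import os

# Variables exported in the step above.
REQUIRED_VARS = ("HADOOP_CONF_DIR", "HADOOP_HOME")


def missing_hadoop_env(env=None):
    """Return the required variable names that are unset or empty in env."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_hadoop_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Hadoop environment variables are set.")
```

Run it in the same shell session in which the export lines were executed, since variables exported elsewhere are not inherited.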
Modifying yarn.py
Edit xgboost-package/dmlc-core/tracker/dmlc_tracker/yarn.py
and change line 48 to:
out = out.decode('utf-8').split('\n')[0].split()
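The decode call is presumably needed because, under Python 3, a subprocess pipe returns bytes rather than str, and calling split('\n') on bytes raises a TypeError. Below is a minimal sketch of the parsing this line takes part in, assuming the first line of the `hadoop version` output has the form "Hadoop <version>". The function name parse_hadoop_major_version and the sample bytes are illustrative and are not taken from yarn.py.

```python
def parse_hadoop_major_version(raw):
    """Parse the major version number from raw `hadoop version` output (bytes)."""
    # Decode first: in Python 3, bytes.split('\n') raises TypeError.
    out = raw.decode('utf-8').split('\n')[0].split()
    if len(out) < 2 or out[0] != 'Hadoop':
        raise ValueError('cannot parse hadoop version string')
    # out[1] looks like '2.6.0-cdh5.14.0'; keep only the leading major number.
    return int(out[1].split('.')[0])


# Illustrative sample only; real input comes from running `hadoop version`.
sample = b"Hadoop 2.6.0-cdh5.14.0\n(remaining lines omitted)\n"
print(parse_hadoop_major_version(sample))  # prints 2
```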
Building dmlc-yarn.jar
The "No FileSystem for scheme: hdfs" error
Modify the following file under the xgboost directory: /xgboost-package/dmlc-core/tracker/yarn/src/main/java/org/apache/hadoop/yarn/dmlc/Client.java