Hadoop cluster slave nodes fail to start: spark-slave1: ssh: Could not resolve hostname spark-slave1: Name or service not known
Error message:
./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-master.out
spark-slave1: ssh: Could not resolve hostname spark-slave1: Name or service not known
spark-slave2: ssh: Could not resolve hostname spark-slave2: Name or service not known
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-root-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn--resourcemanager-master.out
spark-slave2: ssh: Could not resolve hostname spark-slave2: Name or service not known
spark-slave1: ssh: Could not resolve hostname spark-slave1: Name or service not known
Analysis: the error means the hostnames spark-slave1 and spark-slave2 cannot be resolved, yet my slave nodes are clearly named node1 and node2. After searching for a long time I finally found the cause.
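A quick way to confirm this kind of diagnosis is to check name resolution on the master with getent (the hostnames below are the ones from the error and my actual slave names; what getent prints depends on your /etc/hosts):
getent hosts spark-slave1   # prints nothing -> the name is not known to the system
getent hosts node1          # prints the slave's IP if the mapping exists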
Looking at the slaves file:
vim /usr/local/hadoop/etc/hadoop/slaves
spark-slave1
spark-slave2
It still contained the default slave hostnames; changing them to your own slave nodes fixes the problem:
vim /usr/local/hadoop/etc/hadoop/slaves
node1
node2
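Note that node1 and node2 must themselves be resolvable from the master, typically via /etc/hosts on every node. A minimal sketch, assuming hypothetical 192.168.1.x addresses (replace them with your cluster's real IPs):
vim /etc/hosts
192.168.1.100 master
192.168.1.101 node1
192.168.1.102 node2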
Then restart Hadoop:
./stop-all.sh    # stop
./start-all.sh   # start
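As the startup log itself points out, start-all.sh is deprecated; the same restart can be done with the split scripts in the same sbin directory:
./stop-dfs.sh && ./stop-yarn.sh   # equivalent to ./stop-all.sh
./start-dfs.sh                    # HDFS: NameNode, DataNodes, SecondaryNameNode
./start-yarn.sh                   # YARN: ResourceManager, NodeManagers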
After that the errors were gone and the slave nodes started successfully:
@master:/usr/local/hadoop/sbin# jps
5698 ResourceManager
6403 Jps
5547 SecondaryNameNode
5358 NameNode

@node1:~# jps
885 Jps
744 NodeManager
681 DataNode

@node2:~# jps
914 Jps
773 NodeManager
710 DataNode
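Besides jps, you can check that both DataNodes have actually registered with the NameNode using the standard hdfs dfsadmin -report command (the IPs and port below are only illustrative; the output should report two live datanodes):
hdfs dfsadmin -report | grep -E "Live datanodes|Name:"
# Live datanodes (2):
# Name: 192.168.1.101:50010 (node1)
# Name: 192.168.1.102:50010 (node2)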
Summary: the official Hadoop documentation describes the slaves file like this: one machine in the cluster is designated as the NameNode and a second, different machine as the JobTracker; these machines are the masters. The remaining machines each act as both a DataNode and a TaskTracker; these are the slaves. List the hostname or IP address of every slave in the slaves file, one per line. In other words, the file holds the slave nodes' hostnames or IPs, and plain IP addresses such as 172.x.x.x work just as well.
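For example, a slaves file that lists IP addresses instead of hostnames would look like this (the addresses are placeholders; use your own slave IPs):
172.16.0.11
172.16.0.12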