hadoop-env.sh [Translation]
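Note: One day I read through the comments in hadoop-env.sh and learned a great deal from them, so I decided to write a post to tell everyone that the documentation is the most reliable source. My skills are limited and the translation is only approximate, so please go easy on me; if you have suggestions, leave a comment.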
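Summary:
1. JAVA_HOME is the only variable in this file that you must set; all the others have default values. PS: it is best to write the actual path for this variable rather than ${JAVA_HOME}.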
# Set Hadoop-specific environment variables here.
This file is where Hadoop-specific environment variables are set.
# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.
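JAVA_HOME is the only required environment variable here; all the others are optional. When running a distributed configuration,
it is best to set JAVA_HOME in this file so that it is also defined correctly on the remote nodes.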
# The java implementation to use. Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
The Java implementation to use. This setting is required.
export JAVA_HOME="<path to your JDK installation>"
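Because a wrong path here only shows up when a daemon fails to start, a quick sanity check can help. The following is a minimal sketch, not part of Hadoop: the function name `check_java_home` is hypothetical, and it only verifies that an executable `bin/java` exists under the given directory.

```shell
# Hypothetical helper (not part of Hadoop): succeeds only if the given
# directory contains an executable bin/java.
check_java_home() {
  dir="$1"
  if [ -z "$dir" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$dir/bin/java" ]; then
    echo "no executable found at $dir/bin/java" >&2
    return 1
  fi
  return 0
}

# Example: check the current value without aborting the shell.
check_java_home "${JAVA_HOME:-}" || echo "fix JAVA_HOME in hadoop-env.sh"
```

Running the same check on each remote node (for example over ssh) would show whether the path is valid there as well.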
# Extra Java CLASSPATH elements. Optional.
# export HADOOP_CLASSPATH=
# The maximum amount of heap to use, in MB. Default is 1000.
# export HADOOP_HEAPSIZE=2000
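As a rough illustration of what this setting does, the number ends up as a JVM `-Xmx` flag. The sketch below mirrors that idea under stated assumptions: `heap_flag` is a hypothetical function name, and the actual logic in your Hadoop version's launcher script may differ.

```shell
# Hypothetical sketch: derive the JVM max-heap flag from HADOOP_HEAPSIZE (MB),
# falling back to the documented default of 1000 when it is unset or empty.
heap_flag() {
  echo "-Xmx${HADOOP_HEAPSIZE:-1000}m"
}

HADOOP_HEAPSIZE=2000
heap_flag    # prints -Xmx2000m
```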
# Extra Java runtime options. Empty by default. (No extra options are passed unless you set this.)
# export HADOOP_OPTS=-server
# Command specific options appended to HADOOP_OPTS when specified. (Each variable below applies only to the named daemon or command.)
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
# export HADOOP_TASKTRACKER_OPTS=
# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
# export HADOOP_CLIENT_OPTS
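To illustrate what "appended to HADOOP_OPTS" means in practice: the variable matching the command being started is added after the general options. This is a simplified sketch under that assumption; `opts_for` is a hypothetical function, and the real launcher script handles more commands and details.

```shell
# Hypothetical sketch: pick the command-specific options and append them
# to the general HADOOP_OPTS. Unset variables are treated as empty.
opts_for() {
  cmd="$1"
  case "$cmd" in
    namenode)           extra="${HADOOP_NAMENODE_OPTS:-}" ;;
    secondarynamenode)  extra="${HADOOP_SECONDARYNAMENODE_OPTS:-}" ;;
    datanode)           extra="${HADOOP_DATANODE_OPTS:-}" ;;
    balancer)           extra="${HADOOP_BALANCER_OPTS:-}" ;;
    jobtracker)         extra="${HADOOP_JOBTRACKER_OPTS:-}" ;;
    tasktracker)        extra="${HADOOP_TASKTRACKER_OPTS:-}" ;;
    fs|dfs|fsck|distcp) extra="${HADOOP_CLIENT_OPTS:-}" ;;
    *)                  extra="" ;;
  esac
  echo "${HADOOP_OPTS:-} $extra"
}

HADOOP_OPTS="-server"
HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote"
opts_for namenode   # prints: -server -Dcom.sun.management.jmxremote
```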
# Extra ssh options. Empty by default. (No extra ssh options are used unless you set this.)
# export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"
# Where log files are stored. $HADOOP_HOME/logs by default.
# export HADOOP_LOG_DIR=${HADOOP_HOME}/logs
# File naming remote slave hosts. $HADOOP_HOME/conf/slaves by default. (This file lists the hostnames of the slave nodes.)
# export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves
# host:path where hadoop code should be rsync'd from. Unset by default.
# export HADOOP_MASTER=master:/home/$USER/src/hadoop
# Seconds to sleep between slave commands. Unset by default. This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HADOOP_SLAVE_SLEEP=0.1
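The number of seconds to sleep between slave commands; unset by default. This is useful in large clusters, where, for example, rsync requests from the slaves could otherwise arrive faster than the master can serve them.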
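The pause between slave commands can be pictured as a loop over the slaves file. The sketch below is an illustration only, not Hadoop's own script: `for_each_slave` and the `RUNNER` variable are hypothetical, and `RUNNER` defaults to `echo` so the example is safe to run (a real script would use ssh). Fractional values such as 0.1 depend on your system's `sleep` accepting them.

```shell
# Hypothetical sketch: run a command for every host listed in a file,
# sleeping HADOOP_SLAVE_SLEEP seconds between hosts when it is set.
for_each_slave() {
  slaves_file="$1"
  shift
  while read -r host; do
    # Skip blank lines and comment lines.
    case "$host" in
      ''|'#'*) continue ;;
    esac
    # stdin is redirected so the runner cannot swallow the host list.
    ${RUNNER:-echo} "$host" "$@" < /dev/null
    if [ -n "${HADOOP_SLAVE_SLEEP:-}" ]; then
      sleep "$HADOOP_SLAVE_SLEEP"
    fi
  done < "$slaves_file"
}

# Example with a throwaway host list; prints one line per host.
list=$(mktemp)
printf 'node1\nnode2\n' > "$list"
for_each_slave "$list" uptime
rm -f "$list"
```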
# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
# the users that are going to run the hadoop daemons. Otherwise there is
# the potential for a symlink attack.
# export HADOOP_PID_DIR=/var/hadoop/pids
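The directory where the PID files are stored; /tmp by default.
Note: set this to a directory that can be written to only by the users who run the Hadoop daemons. Otherwise a symlink attack is possible.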
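One way to follow this advice is to create the PID directory with owner-only write permission before starting the daemons. A minimal sketch, assuming you run it as the user that starts Hadoop; `make_pid_dir` is a hypothetical helper, not part of Hadoop.

```shell
# Hypothetical helper: create a PID directory that only its owner can write to,
# and refuse to use a path that is itself a symbolic link.
make_pid_dir() {
  dir="$1"
  if [ -L "$dir" ]; then
    echo "refusing to use symlink: $dir" >&2
    return 1
  fi
  mkdir -p "$dir" && chmod 755 "$dir"
}

# Example (needs sufficient privileges for /var):
# make_pid_dir /var/hadoop/pids
```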
# A string representing this instance of hadoop. $USER by default.
# export HADOOP_IDENT_STRING=$USER
# The scheduling priority for daemon processes. See 'man nice'. (This is a nice value, not a process: a higher number means a lower priority.)
# export HADOOP_NICENESS=10
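For readers unfamiliar with `nice`, the setting amounts to starting the daemon through the `nice` utility. A small sketch of that idea; `run_niced` is a hypothetical wrapper, and how the Hadoop scripts actually apply the value may differ by version.

```shell
# Hypothetical wrapper: start a command with the niceness adjustment given in
# HADOOP_NICENESS (0, i.e. no adjustment, when the variable is unset).
run_niced() {
  nice -n "${HADOOP_NICENESS:-0}" "$@"
}

HADOOP_NICENESS=10
run_niced echo "started with lower priority"
```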
export JAVA_HOME=/usr/java/jdk1.7.0_76