Values and descriptions of common DFS configuration properties
Copied verbatim from the web:
name | value | description |
dfs.namenode.logging.level | info | The logging level for dfs namenode. Other values are "dir" (trace namespace mutations), "block" (trace block under/over replications and block creations/deletions), or "all". |
dfs.secondary.http.address | 0.0.0.0:50090 | The secondary namenode http server address and port. If the port is 0 then the server will start on a free port. |
dfs.datanode.address | 0.0.0.0:50010 | The address the datanode server will listen on. If the port is 0 then the server will start on a free port. |
dfs.datanode.http.address | 0.0.0.0:50075 | The datanode http server address and port. If the port is 0 then the server will start on a free port. |
dfs.datanode.ipc.address | 0.0.0.0:50020 | The datanode ipc server address and port. If the port is 0 then the server will start on a free port. |
dfs.datanode.handler.count | 3 | The number of server threads for the datanode. |
dfs.http.address | 0.0.0.0:50070 | The address and the base port on which the dfs namenode web ui will listen. If the port is 0 then the server will start on a free port. |
dfs.https.enable | false | Decide if HTTPS(SSL) is supported on HDFS |
dfs.https.need.client.auth | false | Whether SSL client certificate authentication is required |
dfs.https.server.keystore.resource | ssl-server.xml | Resource file from which ssl server keystore information will be extracted |
dfs.https.client.keystore.resource | ssl-client.xml | Resource file from which ssl client keystore information will be extracted |
dfs.datanode.https.address | 0.0.0.0:50475 | |
dfs.https.address | 0.0.0.0:50470 | |
dfs.datanode.dns.interface | default | The name of the Network Interface from which a data node should report its IP address. |
dfs.datanode.dns.nameserver | default | The host name or IP address of the name server (DNS) which a DataNode should use to determine the host name used by the NameNode for communication and display purposes. |
dfs.replication.considerLoad | true | Decide if chooseTarget considers the target's load or not |
dfs.default.chunk.view.size | 32768 | The number of bytes to view for a file on the browser. |
dfs.datanode.du.reserved | 0 | Reserved space in bytes per volume. Always leave this much space free for non dfs use. |
dfs.name.dir | ${hadoop.tmp.dir}/dfs/name | Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy. |
dfs.name.edits.dir | ${dfs.name.dir} | Determines where on the local filesystem the DFS name node should store the transaction (edits) file. If this is a comma-delimited list of directories then the transaction file is replicated in all of the directories, for redundancy. Default value is same as dfs.name.dir |
dfs.web.ugi | webuser,webgroup | The user account used by the web interface. Syntax: USERNAME,GROUP1,GROUP2, ... |
dfs.permissions | true | If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories. |
dfs.permissions.supergroup | supergroup | The name of the group of super-users. |
dfs.data.dir | ${hadoop.tmp.dir}/dfs/data | Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored. |
dfs.replication | 3 | Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time. |
dfs.replication.max | 512 | Maximal block replication. |
dfs.replication.min | 1 | Minimal block replication. |
dfs.block.size | 67108864 | The default block size for new files. |
dfs.df.interval | 60000 | Disk usage statistics refresh interval in msec. |
dfs.client.block.write.retries | 3 | The number of retries for writing blocks to the data nodes, before we signal failure to the application. |
dfs.blockreport.intervalMsec | 3600000 | Determines block reporting interval in milliseconds. |
dfs.blockreport.initialDelay | 0 | Delay for first block report in seconds. |
dfs.heartbeat.interval | 3 | Determines datanode heartbeat interval in seconds. |
dfs.namenode.handler.count | 10 | The number of server threads for the namenode. |
dfs.safemode.threshold.pct | 0.999f | Specifies the percentage of blocks that should satisfy the minimal replication requirement defined by dfs.replication.min. Values less than or equal to 0 mean not to start in safe mode. Values greater than 1 will make safe mode permanent. |
dfs.safemode.extension | 30000 | Determines extension of safe mode in milliseconds after the threshold level is reached. |
dfs.balance.bandwidthPerSec | 1048576 | Specifies the maximum amount of bandwidth that each datanode can utilize for the balancing purpose in terms of the number of bytes per second. |
dfs.hosts | | Names a file that contains a list of hosts that are permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, all hosts are permitted. |
dfs.hosts.exclude | | Names a file that contains a list of hosts that are not permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, no hosts are excluded. |
dfs.max.objects | 0 | The maximum number of files, directories and blocks dfs supports. A value of zero indicates no limit to the number of objects that dfs supports. |
dfs.namenode.decommission.interval | 30 | Namenode periodicity in seconds to check if decommission is complete. |
dfs.namenode.decommission.nodes.per.interval | 5 | The number of nodes the namenode checks for decommission completion in each dfs.namenode.decommission.interval. |
dfs.replication.interval | 3 | The periodicity in seconds with which the namenode computes replication work for datanodes. |
dfs.access.time.precision | 3600000 | The access time for an HDFS file is precise up to this value. The default value is 1 hour. Setting a value of 0 disables access times for HDFS. |
dfs.support.append | false | Does HDFS allow appends to files? This is currently set to false because there are bugs in the "append code" and it is not supported in any production cluster. |
docs/hdfs-default.html
This page documents the meaning of the HDFS parameters.
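In Hadoop releases that use these property names, defaults like the ones above are normally overridden per site in a file such as conf/hdfs-site.xml, using `<property>` elements with `<name>` and `<value>` children. The Python sketch below renders such a file from a dict. The helper name `render_hdfs_site` and the example values are assumptions made for illustration only, not recommended settings.

```python
import xml.etree.ElementTree as ET


def render_hdfs_site(overrides):
    """Render a dict of property overrides as Hadoop-style configuration XML.

    Hypothetical helper for illustration; it only builds the XML text and
    does not validate property names or values.
    """
    root = ET.Element("configuration")
    for name, value in overrides.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Example values are assumptions chosen only to show the units in the table:
# dfs.block.size and dfs.datanode.du.reserved are given in bytes.
example_overrides = {
    "dfs.replication": 2,
    "dfs.block.size": 128 * 1024 * 1024,
    "dfs.datanode.du.reserved": 10 * 1024 ** 3,
}

if __name__ == "__main__":
    print(render_hdfs_site(example_overrides))
```

Only the properties that differ from the defaults need to appear in the site file; everything else keeps the value listed in the table.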
Two of these parameters are worth noting:
dfs.replication.min
Minimum number of replicas.
dfs.safemode.threshold.pct
Threshold ratio.
Specifies the percentage of blocks that should satisfy the minimal replication requirement defined by dfs.replication.min. Values less than or equal to 0 mean not to start in safe mode. Values greater than 1 will make safe mode permanent.
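In other words, it sets the fraction of blocks that must meet the minimum replica count. A value of 0 or less means safe mode is never entered; a value greater than 1 means safe mode is never left.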
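dfs.replication.min defines the minimum number of replicas a block must have.
dfs.safemode.threshold.pct defines the fraction of blocks that must reach that minimum: while the fraction of sufficiently replicated blocks is below the threshold, the system stays in safe mode. The value should therefore lie between 0 and 1 and represents the smallest replicated fraction at which you consider the system safe to run. A value greater than 1 keeps the system in safe mode permanently, so it cannot provide normal service to clients. If the value is set too low, you have to ask whether the stored data is really safe. It is best to leave this setting alone and use the default of 99.9%.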
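The interaction between the two parameters can be sketched as a small Python model. This is a simplification based only on the description quoted above, not the NameNode's actual code: it ignores dfs.safemode.extension and any other conditions the real implementation checks, and the function names `count_safe_blocks` and `in_safe_mode` are hypothetical.

```python
def count_safe_blocks(replica_counts, min_replication=1):
    """Count blocks that have at least dfs.replication.min reported replicas."""
    return sum(1 for count in replica_counts if count >= min_replication)


def in_safe_mode(safe_blocks, total_blocks, threshold_pct=0.999):
    """Return True while this simplified model would stay in safe mode.

    threshold_pct plays the role of dfs.safemode.threshold.pct.
    """
    if threshold_pct <= 0:
        return False  # values <= 0: safe mode is never entered
    if threshold_pct > 1:
        return True   # values > 1: the threshold is unreachable, so permanent
    if total_blocks == 0:
        return False  # assumption: an empty namespace has nothing to wait for
    return safe_blocks / total_blocks < threshold_pct


if __name__ == "__main__":
    # Replica counts reported so far for four blocks (made-up numbers).
    reported = [3, 0, 1, 2]
    safe = count_safe_blocks(reported, min_replication=1)
    print(safe, in_safe_mode(safe, len(reported)))
```

With the default threshold, a cluster of 1000 blocks needs more than 998 of them at or above the minimum replica count before this model reports that safe mode can end.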