十四:Using CGroups with YARN
Cgroups可以控制linux 上应用程序的资源(内存、CPU)使用,yarn可以使用Cgroups来CPU使用。Cgroups的配置,在yarn-site.xml中设置:
1)启用Cgroups:
Configuration Name | Description |
---|---|
yarn.nodemanager.container-executor.class | This should be set to “org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor”. CGroups is a Linux kernel feature and is exposed via the LinuxContainerExecutor. |
yarn.nodemanager.linux-container-executor.resources-handler.class | This should be set to “org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler”. Using the LinuxContainerExecutor doesn’t force you to use CGroups. If you wish to use CGroups, the resource-handler-class must be set to CGroupsLCEResourceHandler. |
yarn.nodemanager.linux-container-executor.cgroups.hierarchy | The cgroups hierarchy under which to place YARN proccesses(cannot contain commas). If yarn.nodemanager.linux-container-executor.cgroups.mount is false (that is, if cgroups have been pre-configured), then this cgroups hierarchy must already exist |
yarn.nodemanager.linux-container-executor.cgroups.mount | Whether the LCE should attempt to mount cgroups if not found - can be true or false. |
yarn.nodemanager.linux-container-executor.cgroups.mount-path | Where the LCE should attempt to mount cgroups if not found. Common locations include /sys/fs/cgroup and /cgroup; the default location can vary depending on the Linux distribution in use. This path must exist before the NodeManager is launched. Only used when the LCE resources handler is set to the CgroupsLCEResourcesHandler, and yarn.nodemanager.linux-container-executor.cgroups.mount is true. A point to note here is that the container-executor binary will try to mount the path specified + “/” + the subsystem. In our case, since we are trying to limit CPU the binary tries to mount the path specified + “/cpu” and that’s the path it expects to exist. |
yarn.nodemanager.linux-container-executor.group | The Unix group of the NodeManager. It should match the setting in “container-executor.cfg”. This configuration is required for validating the secure access of the container-executor binary. |
2)用Cgroups控制资源:
Configuration Name | Description |
---|---|
yarn.nodemanager.resource.percentage-physical-cpu-limit | This setting lets you limit the cpu usage of all YARN containers. It sets a hard upper limit on the cumulative CPU usage of the containers. For example, if set to 60, the combined CPU usage of all YARN containers will not exceed 60%. |
yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage | CGroups allows cpu usage limits to be hard or soft. When this setting is true, containers cannot use more CPU usage than allocated even if spare CPU is available. This ensures that containers can only use CPU that they were allocated. When set to false, containers can use spare CPU if available. It should be noted that irrespective of whether set to true or false, at no time can the combined CPU usage of all containers exceed the value specified in “yarn.nodemanager.resource.percentage-physical-cpu-limit”. |
注意:
1)要想使用Cgroup container,则linux 主机必须启用Cgroups
2)当前yarn并不限制CPU使用;yarn通过线程控制来限制应用使用的内存量;
参考资源:
http://developer.51cto.com/art/201401/426616.htm
(6)yarn.scheduler.maximum-allocation-mb
单个任务可申请的最多物理内存量,默认是8192(MB)。
默认情况下,YARN采用了线程监控的方法判断任务是否超量使用内存,一旦发现超量,则直接将其杀死。由于Cgroups对内存的控制缺乏灵活性 (即任务任何时刻不能超过内存上限,如果超过,则直接将其杀死或者报OOM),而Java进程在创建瞬间内存将翻倍,之后骤降到正常值,这种情况下,采用 线程监控的方式更加灵活(当发现进程树内存瞬间翻倍超过设定值时,可认为是正常现象,不会将任务杀死),因此YARN未提供Cgroups内存隔离机制。