YARN Notes: Summary of Technical Points - netoxi


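Overview

1. YARN: Yet Another Resource Negotiator, a unified platform for cluster resource management and scheduling.

2. Relation to MRv1: YARN grew out of MRv1 and addresses MRv1's poor reliability, poor scalability, low resource utilization, and inability to support heterogeneous computing resources.

3. YARN and computing frameworks: the whole of Hadoop is centered on YARN, and the computing frameworks are all pluggable.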

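Principles

Types of resource scheduler

1. Monolithic scheduler

a) Principle: a single central scheduler runs for the whole cluster.

b) Characteristics: prone to becoming a performance bottleneck when many jobs run concurrently.

c) Example: MRv1.

2. Two-level scheduler

a) Principle: the central scheduler manages all resources in the cluster and hands them out in coarse-grained units to each framework scheduler according to some policy (e.g. FIFO, Fair, Capacity, Delay, Dominant Resource Fair); each framework then assigns the resources it received, at fine granularity and according to the characteristics of its jobs, to containers that run the actual computing tasks.

b) Characteristics: the second-level schedulers greatly reduce the load on the central scheduler, improving concurrency and resource utilization.

c) Examples: Apache YARN, Apache Mesos.

3. Shared-state scheduler

a) Origin: Google's Omega paper.

b) Characteristics: not yet mature.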
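
The two-level scheme in item 2 above can be sketched roughly as follows. This is a toy model, not YARN code: the names `central_allocate` and `FrameworkScheduler` are hypothetical, memory is the only resource, and the central policy is assumed to be plain first-come-first-served.

```python
from dataclasses import dataclass, field


@dataclass
class FrameworkScheduler:
    """Second-level scheduler: owns a coarse grant and splits it among its tasks."""
    name: str
    granted_mb: int = 0
    placements: dict = field(default_factory=dict)

    def place_tasks(self, task_demands_mb):
        """Fine-grained step: first-fit each task into the grant, in the order given."""
        free = self.granted_mb
        for task, need in task_demands_mb.items():
            if need <= free:
                self.placements[task] = need
                free -= need
        return self.placements


def central_allocate(cluster_mb, requests_mb):
    """Coarse-grained step: serve framework requests in order until memory runs out."""
    grants, free = {}, cluster_mb
    for framework, want in requests_mb.items():
        grant = min(want, free)
        grants[framework] = grant
        free -= grant
    return grants


# The central scheduler only decides how much each framework gets ...
grants = central_allocate(8192, {"mapreduce": 6144, "spark": 4096})
# ... and each framework decides which of its own tasks run inside that grant.
mr = FrameworkScheduler("mapreduce", granted_mb=grants["mapreduce"])
placed = mr.place_tasks({"map-0": 2048, "map-1": 2048, "reduce-0": 4096})
print(grants)
print(placed)
```

The central scheduler never sees individual tasks, which is why its load stays low even when frameworks run many of them.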

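YARN architecture

1. Architecture diagram

2. Daemons

Name	Number in cluster	Role
ResourceManager	1	Unified management and scheduling of all cluster resources
NodeManager	Multiple (at least 1)	Manages a single compute node, manages the container life cycle, and tracks node health

3. Architecture and main processes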

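ResourceManager

1. Role: the central scheduler of the two-level scheduler.

2. Scheduling process: when several jobs are submitted at the same time, the ResourceManager weighs priorities among the competing jobs and arbitrates between them; once resources have been allocated, it no longer concerns itself with how an application distributes them internally or with the state of each application. This lightens the load on the ResourceManager and improves scalability.

3. Architecture

a) YarnScheduler: performs resource scheduling based on the applications' resource requests. It can currently schedule CPU cores and memory, and supports schedulers such as FIFO, Capacity, adaptive, self-learning, and dynamic-priority. It corresponds to the central scheduler of the two-level scheduler.

b) ApplicationsManager: manages the set of submitted applications. After an application is submitted, it checks that the ApplicationMaster's resource request is valid, then confirms that no other submitted application uses the same ID.

c) ApplicationMasterService: responds to requests from all ApplicationMasters. This includes registering new ApplicationMasters, accepting termination or unregistration requests from any ApplicationMaster that is finishing, authenticating all requests from the different ApplicationMasters, and receiving Container allocation and release requests from all running ApplicationMasters. It ensures that, at any time, only one thread of any given ApplicationMaster can send requests to the ResourceManager; that is, RPC requests from an ApplicationMaster are serialized, which gives the central scheduler pessimistic concurrency control.

d) ResourceTrackerService: handles interaction with NodeManagers. This includes responding to the heartbeat RPCs that NodeManagers send periodically, registering new nodes, accepting heartbeats from previously registered nodes, and ensuring that only legitimate nodes can communicate with the ResourceManager.

e) ClientService: handles RPC communication from clients to the ResourceManager, including job submission, job termination, and queries for application, queue, cluster statistics, and user ACL information.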
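
As an illustration of the two admission checks described in item b) above (valid ApplicationMaster resource request, then no duplicate ID), here is a minimal sketch. `ApplicationRegistry`, `SubmissionRejected`, and the limit parameters are hypothetical names invented for this example; they are not YARN classes or configuration keys.

```python
class SubmissionRejected(Exception):
    """Raised when a submission fails one of the admission checks."""


class ApplicationRegistry:
    """Hypothetical stand-in for the set of submitted applications."""

    def __init__(self, max_am_memory_mb, max_am_vcores):
        self.max_am_memory_mb = max_am_memory_mb
        self.max_am_vcores = max_am_vcores
        self.apps = {}

    def submit(self, app_id, am_memory_mb, am_vcores):
        # Check 1: the ApplicationMaster's resource request must be valid.
        if not 0 < am_memory_mb <= self.max_am_memory_mb:
            raise SubmissionRejected(f"invalid AM memory: {am_memory_mb} MB")
        if not 0 < am_vcores <= self.max_am_vcores:
            raise SubmissionRejected(f"invalid AM vcores: {am_vcores}")
        # Check 2: no other submitted application may use the same ID.
        if app_id in self.apps:
            raise SubmissionRejected(f"duplicate application ID: {app_id}")
        self.apps[app_id] = (am_memory_mb, am_vcores)
        return app_id


registry = ApplicationRegistry(max_am_memory_mb=4096, max_am_vcores=4)
registry.submit("app_0001", am_memory_mb=1024, am_vcores=1)
```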

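NodeManager

1. Role: manages a single compute node in the YARN cluster.

2. Architecture

a) NodeStatusUpdater: when the NodeManager starts, it registers with the ResourceManager and reports the node's available resources; in subsequent communication with the ResourceManager it periodically reports newly started Containers, running Containers, status updates, and completed Containers. The ResourceManager can also use NodeStatusUpdater to kill running Containers.

b) ContainerManager: one of the NodeManager's core components. It is made up of many subcomponents, each responsible for one part of container management, which together manage all containers on the node.

c) RPC Server: the only channel for communication between ApplicationMasters and the NodeManager.

d) ResourceLocalizationService: localizes the resources a Container needs, downloading the required files (such as JAR files) from HDFS using the correct URIs.

e) AuxService: allows users to extend functionality by configuring auxiliary services, so that each node can run services required by a particular framework (the "yarn.nodemanager.aux-services" parameter in yarn-site.xml).

f) ContainerLauncher: maintains a thread pool to carry out Container operations (such as starting and stopping Containers) in parallel.

g) ContainerMonitor: monitors the resource usage of Containers.

h) LogHandler: a pluggable component that lets users customize how Container logs are stored.

i) ContainerExecutor: interacts with the underlying operating system to securely place the files and directories a Container needs, and then starts and cleans up the Container's processes in a secure way.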
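
Two of the components above, ContainerLauncher (a thread pool for container operations) and ContainerMonitor (tracking resource usage), lend themselves to a small sketch. Everything here is an assumption made for illustration: `start_container` is a placeholder that launches nothing, and the usage samples are hard-coded rather than read from the operating system.

```python
from concurrent.futures import ThreadPoolExecutor


def start_container(container_id):
    """Placeholder for a real launch; it only reports the new state."""
    return f"{container_id}: RUNNING"


def launch_all(container_ids, workers=4):
    """Run container start operations in parallel on a fixed-size thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the order of the inputs.
        return list(pool.map(start_container, container_ids))


def over_limit(samples):
    """Return IDs of containers whose sampled memory exceeds their allocation.

    samples maps container ID -> (used_mb, allocated_mb).
    """
    return [cid for cid, (used, limit) in samples.items() if used > limit]


states = launch_all(["c1", "c2", "c3"])
offenders = over_limit({"c1": (900, 1024), "c2": (1500, 1024), "c3": (1024, 1024)})
print(states)
print(offenders)
```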

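ApplicationMaster

1. ApplicationMaster: every application has its own dedicated ApplicationMaster, and each computing framework implements it differently.

2. Responsibilities

a) Requests resources from the ResourceManager;

b) Starts Containers on the corresponding NodeManagers to run tasks, and monitors the Containers' state;

c) Acts as the second-level scheduler of the two-level scheduler.

Container

1. Container: the unit of dynamic resource allocation. It encapsulates multi-dimensional resources such as CPU, memory, disk, and network (currently only CPU and memory are supported).

2. Request: when an ApplicationMaster requests resources from the ResourceManager, the resources returned are expressed as Containers.

3. Task execution: each task runs in exactly one Container and can use only the amount of resources that Container represents.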
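
To make the idea of a Container as a bundle of resources concrete, here is a minimal sketch. The `Resource` and `Container` classes and their field names are hypothetical and only model the two dimensions the text says are supported (memory and CPU).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    memory_mb: int
    vcores: int

    def fits_in(self, other):
        """True if this demand is covered by `other` in every dimension."""
        return self.memory_mb <= other.memory_mb and self.vcores <= other.vcores


@dataclass(frozen=True)
class Container:
    container_id: str
    node: str
    resource: Resource


def can_run(task_demand, container):
    """A task may only use what its single Container represents."""
    return task_demand.fits_in(container.resource)


c = Container("container_01", "node-a", Resource(memory_mb=2048, vcores=1))
print(can_run(Resource(1024, 1), c))
print(can_run(Resource(1024, 2), c))
```

The second call fails on the CPU dimension even though memory fits, which is the point of treating the allocation as multi-dimensional.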

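YARN workflow

1. The client submits an application to the ResourceManager.

2. The ResourceManager instructs a NodeManager to start a Container for the application and to launch the ApplicationMaster in it.

3. The ApplicationMaster registers with the ResourceManager.

4. The ApplicationMaster requests resources from the ResourceManager's YarnScheduler by polling.

5. The ApplicationMaster contacts the NodeManagers that own the granted resources and asks them to start computing tasks.

6. Each NodeManager starts the task in a Container according to the resource size and the required runtime environment.

7. Each task reports its status and progress to the ApplicationMaster.

8. When the application finishes, the ApplicationMaster unregisters from the ResourceManager and shuts itself down.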
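
The eight steps above can be traced with a small simulation that only records who talks to whom. It is a sketch under simplifying assumptions: `run_application` is a hypothetical function, each polling round is assumed to grant at most one container per node, and every task is assumed to finish immediately.

```python
def run_application(app_id, task_count, nodes):
    """Return the message trace of one application, numbered by workflow step."""
    log = [
        f"1 client->RM submit {app_id}",
        f"2 RM->NM({nodes[0]}) start AM container",
        "3 AM->RM register",
    ]
    pending, task_no = task_count, 0
    while pending > 0:
        # Step 4 is a poll: the AM asks repeatedly until all tasks are placed.
        batch = nodes[:pending]
        log.append(f"4 AM->RM poll: granted {len(batch)} container(s)")
        for node in batch:
            log.append(f"5 AM->NM({node}) start task-{task_no}")
            log.append(f"6 NM({node}) launches task-{task_no} in its container")
            log.append(f"7 task-{task_no}->AM status: done")
            task_no += 1
        pending -= len(batch)
    log.append("8 AM->RM unregister and exit")
    return log


trace = run_application("app_0001", task_count=3, nodes=["n1", "n2"])
print("\n".join(trace))
```

Note that after step 3 the ResourceManager appears only in the polling lines; task start-up and progress reporting happen between the ApplicationMaster, the NodeManagers, and the tasks.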

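YARN resource scheduling

1. YARN resource management mechanism

a) Resource pools: resources are organized into resource pools, each corresponding to a queue; when submitting a job, the user specifies the queue with the "-queue" parameter.

b) Queues: queues are organized as a tree, with root representing the resources of the whole cluster. A job submitted without a queue goes to the default queue under root (root.default).

1. FIFO Scheduler: selects the job to run first by priority (highest first), then by arrival time (earliest first).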
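
The FIFO ordering rule amounts to a two-key sort. In this sketch the `Job` class is hypothetical, and it is an assumption that a larger `priority` number means a more urgent job.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int  # assumption: larger value = higher priority
    arrival: int   # arrival timestamp; smaller = earlier


def fifo_order(jobs):
    """Order jobs by priority (high first), then by arrival time (early first)."""
    return sorted(jobs, key=lambda j: (-j.priority, j.arrival))


jobs = [Job("report", 1, 10), Job("etl", 5, 30), Job("backfill", 5, 20)]
order = [j.name for j in fifo_order(jobs)]
print(order)
```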

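2. Capacity Scheduler

a) Principle: resources are divided among queues. Each queue can be given a guaranteed minimum and a maximum limit, and each user can be given a usage limit; when a queue has idle resources, the surplus can be shared with other queues.

b) Scheduling process: starting from root, child queues are visited in ascending order of resource usage ratio until a leaf queue is reached, and that leaf queue is chosen for allocation; this ensures the least-loaded queue is selected. Next, the job submitted earliest is chosen. Then the Container request with the highest priority is chosen, where the priority order is Node Local (same node), Rack Local (same rack), No Local (across racks). Finally, scheduling inside the queue is FIFO.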
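
The selection steps in b) can be sketched as three small functions. This is a simplified model, not the Capacity Scheduler's implementation: the `Queue` class, the usage ratio defined as used share divided by guaranteed share, and the locality labels are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    name: str
    capacity: float  # guaranteed share of the cluster, assumed > 0
    used: float = 0.0  # share of the cluster currently in use
    children: list = field(default_factory=list)
    apps: list = field(default_factory=list)  # (submit_time, app_name) pairs

    def usage_ratio(self):
        return self.used / self.capacity


def pick_leaf(queue):
    """Descend from the root, always into the child with the lowest usage ratio."""
    while queue.children:
        queue = min(queue.children, key=Queue.usage_ratio)
    return queue


def pick_app(leaf):
    """Inside the leaf queue, take the application submitted earliest."""
    return min(leaf.apps)[1] if leaf.apps else None


LOCALITY_ORDER = ["NODE_LOCAL", "RACK_LOCAL", "OFF_RACK"]  # hypothetical labels


def pick_request(requests):
    """Prefer node-local requests, then rack-local, then cross-rack."""
    return min(requests, key=lambda r: LOCALITY_ORDER.index(r[0]))


adhoc = Queue("adhoc", 0.2, 0.0, apps=[(12, "job-b"), (5, "job-a")])
etl = Queue("etl", 0.4, 0.3)
prod = Queue("prod", 0.6, 0.3, children=[etl, adhoc])
dev = Queue("dev", 0.4, 0.3)
root = Queue("root", 1.0, 0.6, children=[prod, dev])

leaf = pick_leaf(root)
app = pick_app(leaf)
req = pick_request([("RACK_LOCAL", "r1"), ("NODE_LOCAL", "r2"), ("OFF_RACK", "r3")])
print(leaf.name, app, req)
```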

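3. Fair Scheduler

a) Fair Scheduler: developed by Facebook; similar to Capacity Scheduler.

b) Differences from Capacity Scheduler

i. Multiple scheduling policies inside a queue: a queue can use the FIFO, Fair, or DRF policy internally.

ii. Resource preemption: a queue's idle resources can be shared with other queues; when a new application is submitted to that queue, the scheduler forcibly reclaims them.

iii. Load balancing: balances load by task count, i.e. spreads tasks as evenly as possible across nodes.

iv. Better response time for small applications: small applications obtain resources quickly, which avoids starvation.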
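
Point iv follows from how a fair policy divides a queue's resources. The sketch below uses max-min fairness on a single resource as a stand-in; it is an assumption that this captures the spirit of the Fair policy, and `max_min_fair_share` is a hypothetical helper, not Fair Scheduler code.

```python
def max_min_fair_share(total, demands):
    """Split `total` among apps so that small demands are met in full first."""
    alloc = {app: 0.0 for app in demands}
    remaining = dict(demands)
    capacity = float(total)
    while remaining and capacity > 1e-9:
        share = capacity / len(remaining)
        satisfied = {a: d for a, d in remaining.items() if d <= share}
        if not satisfied:
            # Nobody fits within an equal share: split what is left evenly.
            for a in remaining:
                alloc[a] += share
            break
        for a, d in satisfied.items():
            # Fully satisfy small demands and return their unused share.
            alloc[a] += d
            capacity -= d
            del remaining[a]
    return alloc


shares = max_min_fair_share(10, {"small": 2, "medium": 4, "large": 10})
print(shares)
```

The small application receives everything it asked for immediately, while the large one absorbs whatever is left over instead of blocking the others.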

Operations

Overview

1. Yarn commands are invoked by the bin/yarn script. Running the yarn script without any arguments prints the description for all commands.

yarn [--config confdir] COMMAND

2. Yarn has an option parsing framework that employs parsing generic options as well as running classes.

COMMAND_OPTIONS	Description
--config confdir	Overwrites the default Configuration directory. Default is ${HADOOP_PREFIX}/conf.
COMMAND COMMAND_OPTIONS	Various commands with their options are described in the following sections. The commands have been grouped into User Commands and Administration Commands.

User Commands

1. jar

a) Runs a jar file. Users can bundle their Yarn code in a jar file and execute it using this command.

b) Format.

yarn jar <jar> [mainClass] args...

2. application

a) Prints application(s) report/kill application.

b) Format.

yarn application <options>

c) Options.

COMMAND_OPTIONS	Description
-list	Lists applications from the RM. Supports optional use of -appTypes to filter applications based on application type, and -appStates to filter applications based on application state.
-appStates States	Works with -list to filter applications based on input comma-separated list of application states. The valid application state can be one of the following: ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED
-appTypes Types	Works with -list to filter applications based on input comma-separated list of application types.
-status ApplicationId	Prints the status of the application.
-kill ApplicationId	Kills the application.
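
When scripting these commands, it can help to build the argument list in code. The helpers below are a sketch: the function names are hypothetical, they use only the options listed in the table above, and they return the argument list without running anything.

```python
VALID_APP_STATES = {
    "ALL", "NEW", "NEW_SAVING", "SUBMITTED", "ACCEPTED",
    "RUNNING", "FINISHED", "FAILED", "KILLED",
}


def application_list_cmd(states=(), types=()):
    """Build `yarn application -list` with optional state and type filters."""
    unknown = [s for s in states if s not in VALID_APP_STATES]
    if unknown:
        raise ValueError(f"unknown application state(s): {unknown}")
    cmd = ["yarn", "application", "-list"]
    if states:
        cmd += ["-appStates", ",".join(states)]
    if types:
        cmd += ["-appTypes", ",".join(types)]
    return cmd


def application_kill_cmd(app_id):
    """Build `yarn application -kill <ApplicationId>`."""
    return ["yarn", "application", "-kill", app_id]


list_cmd = application_list_cmd(states=["RUNNING", "ACCEPTED"], types=["MAPREDUCE"])
print(" ".join(list_cmd))
```

On a host where the yarn script is on the PATH, the returned list can be passed to `subprocess.run(cmd, check=True)`.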

3. node

a) Prints node report(s).

b) Format.

yarn node <options>

c) Options.

COMMAND_OPTIONS	Description
-list	Lists all running nodes. Supports optional use of -states to filter nodes based on node state, and -all to list all nodes.
-states States	Works with -list to filter nodes based on input comma-separated list of node states.
-all	Works with -list to list all nodes.
-status NodeId	Prints the status report of the node.

4. logs

a) Dump the container logs.

b) Format.

yarn logs -applicationId <application ID> <options>

c) Options.

COMMAND_OPTIONS	Description
-applicationId <application ID>	Specifies an application id
-appOwner AppOwner	AppOwner (assumed to be current user if not specified)
-containerId ContainerId	ContainerId (must be specified if node address is specified)
-nodeAddress NodeAddress	NodeAddress in the format nodename:port (must be specified if container id is specified)
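
The pairing rule in the table above (container ID and node address must be given together) is easy to get wrong in scripts, so here is a sketch of a builder that enforces it. `logs_cmd` is a hypothetical helper, and the IDs in the example are made-up placeholders.

```python
def logs_cmd(application_id, app_owner=None, container_id=None, node_address=None):
    """Build a `yarn logs` command line from the options in the table above."""
    # Per the table, -containerId and -nodeAddress are only valid as a pair.
    if (container_id is None) != (node_address is None):
        raise ValueError("containerId and nodeAddress must be given together")
    cmd = ["yarn", "logs", "-applicationId", application_id]
    if app_owner is not None:
        cmd += ["-appOwner", app_owner]
    if container_id is not None:
        cmd += ["-containerId", container_id, "-nodeAddress", node_address]
    return cmd


whole_app = logs_cmd("application_0001")
one_container = logs_cmd(
    "application_0001",
    app_owner="alice",
    container_id="container_0001_01",
    node_address="node-a:8041",
)
print(" ".join(one_container))
```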

5. classpath

a) Prints the class path needed to get the Hadoop jar and the required libraries.

b) Format.

yarn classpath

6. version

a) Prints the version.

b) Format.

yarn version

Administration Commands

Refer to the official documentation.

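Author: netoxi
Source: http://www.cnblogs.com/netoxi
Copyright of this article is shared by the author and cnblogs (博客园). Reposting is welcome; unless agreed otherwise, this notice must be retained and a link to the original article placed prominently on the page. Corrections and discussion are welcome.

Posted on 2017-07-28 09:31 by netoxi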
