Understanding Spark Kernel Scheduling

As the figure above shows, Spark's execution mechanism can be understood in the following steps.

1. Create DAG of RDDs to represent computation

2. Create logical execution plan for DAG

1) Pipeline as many operations as possible

2) Split into "stages" wherever the data has to be reorganized (a shuffle)

3. Schedule and execute individual tasks

1) Split each stage into tasks

2) A task is data + computation: one partition plus the operations to run on it

3) Execute all tasks within a stage before moving on to the next stage
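
The three steps above can be sketched with a toy model in plain Python. This is only an illustration of the idea, not Spark's API or its actual scheduler: `ToyRDD`, `build_stages`, `shuffle`, and `run` are hypothetical names made up for this example, the lineage is assumed to be a simple linear chain, and tasks run sequentially instead of on executors.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyRDD:
    """Hypothetical stand-in for an RDD: one node in a linear lineage."""
    name: str
    parent: Optional["ToyRDD"] = None
    func: Optional[Callable[[list], list]] = None  # applied to one partition
    wide: bool = False  # True when the step needs the data reorganized first

    def map(self, f):
        return ToyRDD("map", self, lambda part: [f(x) for x in part])

    def filter(self, f):
        return ToyRDD("filter", self, lambda part: [x for x in part if f(x)])

    def reduce_by_key(self, f):
        def combine(part):
            acc = {}
            for k, v in part:
                acc[k] = f(acc[k], v) if k in acc else v
            return sorted(acc.items())
        return ToyRDD("reduce_by_key", self, combine, wide=True)


def lineage(rdd):
    """Step 1: the DAG (here a simple chain), listed from source to result."""
    chain = []
    while rdd is not None:
        chain.append(rdd)
        rdd = rdd.parent
    return chain[::-1]


def build_stages(rdd):
    """Step 2: pipeline narrow steps, start a new stage at each wide step."""
    stages = [[]]
    for node in lineage(rdd)[1:]:  # skip the source node, which has no func
        if node.wide:
            stages.append([])
        stages[-1].append(node)
    return stages


def shuffle(partitions, n):
    """Reorganize (key, value) records so equal keys share a partition."""
    out = [[] for _ in range(n)]
    for part in partitions:
        for k, v in part:
            out[hash(k) % n].append((k, v))
    return out


def run(rdd, source_partitions):
    """Step 3: one task per partition; finish a stage before the next one."""
    partitions = source_partitions
    for stage in build_stages(rdd):
        if stage and stage[0].wide:
            partitions = shuffle(partitions, len(partitions))
        # A task = one partition of data + the stage's pipelined functions.
        tasks = [(part, [node.func for node in stage]) for part in partitions]
        results = []
        for data, funcs in tasks:
            for f in funcs:
                data = f(data)
            results.append(data)
        partitions = results
    return partitions


# Word count over two partitions, dropping the word "the".
source = ToyRDD("source")
counts = (source
          .map(lambda w: (w, 1))
          .filter(lambda kv: kv[0] != "the")
          .reduce_by_key(lambda a, b: a + b))

stage_names = [[n.name for n in s] for s in build_stages(counts)]
print(stage_names)  # [['map', 'filter'], ['reduce_by_key']]

output = run(counts, [["the", "cat", "dog"], ["dog", "the", "dog"]])
flat = sorted(kv for part in output for kv in part)
print(flat)  # [('cat', 1), ('dog', 3)]
```

In this sketch `map` and `filter` end up in the same stage because each can be applied partition by partition without moving data, while `reduce_by_key` starts a new stage because records with the same key must first be brought together.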

posted on 2015-03-05 15:41 Ai_togic

