Spark Source Code Walkthrough (2): Creating and Initializing the SparkUI During SparkContext Initialization
Creating and Initializing the SparkUI
SparkContext Creates the SparkUI
_ui =
  if (conf.getBoolean("spark.ui.enabled", true)) {
    Some(SparkUI.createLiveUI(this, _conf, listenerBus, _jobProgressListener,
      _env.securityManager, appName, startTime = startTime))
  } else {
    // For tests, do not enable the UI
    None
  }
// Bind the UI before starting the task scheduler to communicate
// the bound port to the cluster manager properly
// Bind the SparkUI to its port
_ui.foreach(_.bind())
As the code shows, if the SparkUI service is not needed, it can be turned off by setting the property spark.ui.enabled to false.
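The Option-based switch above can be sketched without any Spark dependency. This is only an illustration: DemoUI, UiToggleDemo, createUi, and the plain Map used as configuration are hypothetical stand-ins, and the port number is just a placeholder value.

```scala
// Hypothetical stand-in for SparkUI; only the bind step is modeled.
final class DemoUI(val port: Int) {
  var bound: Boolean = false
  def bind(): Unit = { bound = true }
}

object UiToggleDemo {
  // Mimics conf.getBoolean("spark.ui.enabled", true): a missing key means "enabled".
  def createUi(conf: Map[String, String]): Option[DemoUI] = {
    val enabled = conf.get("spark.ui.enabled").forall(_.toBoolean)
    if (enabled) Some(new DemoUI(port = 4040)) else None
  }

  def main(args: Array[String]): Unit = {
    val on = createUi(Map.empty)
    val off = createUi(Map("spark.ui.enabled" -> "false"))
    on.foreach(_.bind())  // a UI exists, so it gets bound
    off.foreach(_.bind()) // no UI was created, so this does nothing
    println(s"default: ${on.map(_.bound)}, disabled: ${off.map(_.bound)}")
  }
}
```

Wrapping the UI in an Option means later code such as _ui.foreach(_.bind()) needs no null check. In a real application the property is normally set through SparkConf.set("spark.ui.enabled", "false") or with --conf spark.ui.enabled=false on spark-submit.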
Creating the SparkUI
def createLiveUI(
    sc: SparkContext,
    conf: SparkConf,
    listenerBus: SparkListenerBus,
    jobProgressListener: JobProgressListener,
    securityManager: SecurityManager,
    appName: String,
    startTime: Long): SparkUI = {
  create(Some(sc), conf, listenerBus, securityManager, appName,
    jobProgressListener = Some(jobProgressListener), startTime = startTime)
}
SparkUI's createLiveUI method in turn calls the create method, whose code is as follows:
private def create(
    sc: Option[SparkContext],
    conf: SparkConf,
    listenerBus: SparkListenerBus,
    securityManager: SecurityManager,
    appName: String,
    basePath: String = "",
    jobProgressListener: Option[JobProgressListener] = None,
    startTime: Long): SparkUI = {

  // Reuse the listener passed in by the caller; otherwise create and register one
  val _jobProgressListener: JobProgressListener = jobProgressListener.getOrElse {
    val listener = new JobProgressListener(conf)
    listenerBus.addListener(listener)
    listener
  }

  val environmentListener = new EnvironmentListener
  val storageStatusListener = new StorageStatusListener(conf)
  val executorsListener = new ExecutorsListener(storageStatusListener, conf)
  val storageListener = new StorageListener(storageStatusListener)
  val operationGraphListener = new RDDOperationGraphListener(conf)

  listenerBus.addListener(environmentListener)
  listenerBus.addListener(storageStatusListener)
  listenerBus.addListener(executorsListener)
  listenerBus.addListener(storageListener)
  listenerBus.addListener(operationGraphListener)

  new SparkUI(sc, conf, securityManager, environmentListener, storageStatusListener,
    executorsListener, _jobProgressListener, storageListener, operationGraphListener,
    appName, basePath, startTime)
}
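As the code shows, apart from the JobProgressListener, which is passed in from outside (when none is passed in, create builds one and registers it itself), the create method adds several more SparkListener implementations:
(1) EnvironmentListener, which monitors JVM parameters, Spark properties, Java system properties, the classpath, and so on.
(2) StorageStatusListener, which maintains the storage status of the executors.
(3) ExecutorsListener, which prepares executor information for display on the ExecutorsTab.
(4) StorageListener, which prepares executor-related storage information for display on the BlockManagerUI.
(5) RDDOperationGraphListener, which builds the DAG (directed acyclic graph) of RDD operations.
These five SparkListener implementations are added to the listenerBus's list of listeners, and finally the SparkUI is created. By default the SparkUI allows jobs and stages to be killed from the web pages; setting the spark.ui.killEnabled parameter to false turns this off. SparkUI's initialize method organizes the display and layout of the tabs and pages of the front end. The code is as follows: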
def initialize() {
  val jobsTab = new JobsTab(this)
  attachTab(jobsTab)
  val stagesTab = new StagesTab(this)
  attachTab(stagesTab)
  attachTab(new StorageTab(this))
  attachTab(new EnvironmentTab(this))
  attachTab(new ExecutorsTab(this))
  attachHandler(createStaticHandler(SparkUI.STATIC_RESOURCE_DIR, "/static"))
  attachHandler(createRedirectHandler("/", "/jobs/", basePath = basePath))
  attachHandler(ApiRootResource.getServletHandler(this))
  // These should be POST only, but, the YARN AM proxy won't proxy POSTs
  attachHandler(createRedirectHandler(
    "/jobs/job/kill", "/jobs/", jobsTab.handleKillRequest, httpMethods = Set("GET", "POST")))
  attachHandler(createRedirectHandler(
    "/stages/stage/kill", "/stages/", stagesTab.handleKillRequest,
    httpMethods = Set("GET", "POST")))
}
initialize()
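Layout and Display of the SparkUI Pages
In the previous section, initialize created a JobsTab and attached it to the SparkUI; the JobsTab in turn attaches the AllJobsPage and JobPage classes to itself. AllJobsPage is the page that lists all jobs when you open the SparkUI, and JobPage is the page you are taken to when you click a single job. Both pages are bound to the JobsTab by calling the attachPage method that JobsTab inherits from WebUITab.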
private[ui] class JobsTab(parent: SparkUI) extends SparkUITab(parent, "jobs") {
  val sc = parent.sc
  val killEnabled = parent.killEnabled
  val jobProgresslistener = parent.jobProgressListener
  val executorListener = parent.executorsListener
  val operationGraphListener = parent.operationGraphListener

  def isFairScheduler: Boolean =
    jobProgresslistener.schedulingMode == Some(SchedulingMode.FAIR)

  def getSparkUser: String = parent.getSparkUser

  // Register the two pages that belong to this tab
  attachPage(new AllJobsPage(this))
  attachPage(new JobPage(this))
  // ... (the rest of the class is omitted in this excerpt)
}
After the JobsTab is created, the attachTab method adds it to the SparkUI's ArrayBuffer[WebUITab]. The attachPage method generates an org.eclipse.jetty.servlet.ServletContextHandler for each page, and attachHandler finally binds each ServletContextHandler to the SparkUI, that is, adds it to handlers: ArrayBuffer[ServletContextHandler] and to the rootHandler (a ContextHandlerCollection) of the ServerInfo case class.
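The tab, page, and handler bookkeeping described above can be pictured with a small standalone sketch. Everything here is a hypothetical stand-in (DemoHandler, DemoPage, DemoTab, DemoWebUI, TabDemo), Jetty is not involved, and the "/<tab>/<page>" path scheme is an assumption made for illustration rather than a statement about Spark's exact URL construction.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a Jetty ServletContextHandler: it only records a path.
final case class DemoHandler(path: String)

class DemoPage(val prefix: String)

class DemoTab(val prefix: String) {
  val pages = ArrayBuffer[DemoPage]()
  def attachPage(page: DemoPage): Unit = pages += page
}

class DemoWebUI {
  val tabs = ArrayBuffer[DemoTab]()
  val handlers = ArrayBuffer[DemoHandler]()

  // Attaching a tab registers one handler per page, then remembers the tab.
  def attachTab(tab: DemoTab): Unit = {
    tab.pages.foreach { page =>
      val path = ("/" + tab.prefix + "/" + page.prefix).stripSuffix("/")
      attachHandler(DemoHandler(path))
    }
    tabs += tab
  }

  def attachHandler(handler: DemoHandler): Unit = handlers += handler
}

object TabDemo {
  def build(): DemoWebUI = {
    val ui = new DemoWebUI
    val jobs = new DemoTab("jobs")
    jobs.attachPage(new DemoPage(""))    // plays the role of AllJobsPage
    jobs.attachPage(new DemoPage("job")) // plays the role of JobPage
    ui.attachTab(jobs)
    ui
  }

  def main(args: Array[String]): Unit =
    build().handlers.foreach(h => println(h.path))
}
```

Keeping a flat list of handlers next to the list of tabs lets the server route requests without knowing about tabs at all, which is the separation the paragraph above describes.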
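The SparkUI's listenerBus
Spark defines a trait, ListenerBus, that receives events and delivers each one to the listeners registered for it. ListenerBus consists of three main parts:
(1) An event blocking queue: of type LinkedBlockingQueue[SparkListenerEvent], with a fixed capacity of 10000.
(2) A listener array: of type ArrayBuffer[SparkListener], which holds the various SparkListener instances.
(3) A thread that matches events to listeners: this Thread keeps pulling events from the LinkedBlockingQueue, iterates over the listeners, and calls the corresponding listener methods. Every event stays in the LinkedBlockingQueue for some time and is removed once the Thread has processed it.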
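The three parts listed above (a bounded queue, a list of listeners, and a consumer thread) can be sketched in plain Scala as follows. This is a minimal illustration under stated assumptions, not Spark's implementation: DemoEvent, JobStart, Shutdown, DemoListener, DemoListenerBus, and ListenerBusDemo are all hypothetical names, and using a Shutdown marker event to stop the thread is a choice made for this sketch.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ArrayBuffer

// Hypothetical event and listener types; Spark's real ones are much richer.
sealed trait DemoEvent
final case class JobStart(jobId: Int) extends DemoEvent
case object Shutdown extends DemoEvent

trait DemoListener {
  def onEvent(event: DemoEvent): Unit
}

class DemoListenerBus(capacity: Int = 10000) {
  private val queue = new LinkedBlockingQueue[DemoEvent](capacity)
  private val listeners = ArrayBuffer[DemoListener]()

  // Consumer thread: take one event, hand it to every listener, repeat.
  private val thread = new Thread("demo-listener-bus") {
    setDaemon(true)
    override def run(): Unit = {
      var running = true
      while (running) {
        queue.take() match {
          case Shutdown => running = false
          case event => listeners.synchronized { listeners.foreach(_.onEvent(event)) }
        }
      }
    }
  }

  def addListener(l: DemoListener): Unit = listeners.synchronized { listeners += l }
  def start(): Unit = thread.start()
  // offer returns false instead of blocking when the queue is full
  def post(event: DemoEvent): Boolean = queue.offer(event)
  def stop(): Unit = { queue.put(Shutdown); thread.join() }
}

object ListenerBusDemo {
  def run(): List[Int] = {
    val seen = ArrayBuffer[Int]()
    val bus = new DemoListenerBus()
    bus.addListener(new DemoListener {
      def onEvent(event: DemoEvent): Unit = event match {
        case JobStart(id) => seen.synchronized { seen += id }
        case _ => ()
      }
    })
    bus.start()
    (1 to 3).foreach(i => bus.post(JobStart(i)))
    bus.stop() // Shutdown is queued last, so the earlier events are delivered first
    seen.synchronized { seen.toList }
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Because posting only enqueues the event, the code that produces events never waits for listeners to finish; the single consumer thread also guarantees that listeners see events in the order they were posted.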
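(Work in progress...)
References
《深入理解Spark核心思想与源码分析》 (Understanding Spark in Depth: Core Ideas and Source Code Analysis), 耿嘉安 (Geng Jia'an)
How the Spark UI works: https://www.cnblogs.com/wuyida/p/6300237.html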