Spark Streaming Source Code Walkthrough: Executor Fault Tolerance

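This section focuses on fault tolerance on the Executor side.

The main concern is the safety of the received data. Fault tolerance of the computation itself is inherited from Spark Core and is not covered here.

The natural way to protect data is to keep replicas: if the current copy is damaged, read the other one. The alternative is data replay: if, say, ten seconds of data are lost, read them again from the source, provided the source supports replay.

Replication is naturally delegated to the BlockManager. As in Spark Core, each StorageLevel defines a different replication strategy: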

class StorageLevel private(
    private var _useDisk: Boolean,
    private var _useMemory: Boolean,
    private var _useOffHeap: Boolean,
    private var _deserialized: Boolean,
    private var _replication: Int = 1)
  extends Externalizable {
  // ... (body omitted)
}

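When a receiver gets data, it is stored with MEMORY_AND_DISK_SER_2, which asks for two replicas: the data is kept in memory and spills to disk when memory is insufficient. For example, one replica is stored on executor A and the other on executor B.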
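The replica idea can be illustrated with a small toy model. This is only a sketch, not Spark's BlockManager: `ToyExecutor`, `putReplicated` and `read` are hypothetical names, and each "executor" is just an in-memory map.

```scala
import scala.collection.mutable

// Hypothetical toy model (not Spark code): each "executor" is an in-memory map of blocks.
object ReplicaSketch {
  final class ToyExecutor(val id: String) {
    val blocks = mutable.Map.empty[String, Vector[String]]
    var alive = true
  }

  // Store the block on the first `replication` live executors
  // (a "_2" storage level corresponds to replication = 2).
  def putReplicated(executors: Seq[ToyExecutor], blockId: String,
                    data: Vector[String], replication: Int): Seq[String] = {
    val targets = executors.filter(_.alive).take(replication)
    targets.foreach(_.blocks.update(blockId, data))
    targets.map(_.id)
  }

  // Read from the first live executor that still holds the block.
  def read(executors: Seq[ToyExecutor], blockId: String): Option[Vector[String]] =
    executors.find(e => e.alive && e.blocks.contains(blockId)).map(_.blocks(blockId))
}
```

If executor A is marked dead after the put, `read` falls through to the copy on executor B, which is the behaviour the two-replica storage level is meant to provide.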

ReceiverSupervisorImpl:

private val host = SparkEnv.get.blockManager.blockManagerId.host

private val executorId = SparkEnv.get.blockManager.blockManagerId.executorId

private val receivedBlockHandler: ReceivedBlockHandler = {
  if (WriteAheadLogUtils.enableReceiverLog(env.conf)) {
    if (checkpointDirOption.isEmpty) {
      throw new SparkException(
        "Cannot enable receiver write-ahead log without checkpoint directory set. " +
          "Please use streamingContext.checkpoint() to set the checkpoint directory. " +
          "See documentation for more details.")
    }
    new WriteAheadLogBasedBlockHandler(env.blockManager, receiver.streamId,
      receiver.storageLevel, env.conf, hadoopConf, checkpointDirOption.get)
  } else {
    new BlockManagerBasedBlockHandler(env.blockManager, receiver.storageLevel)
  }
}
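The selection logic above can be restated without any Spark dependency. The following is a minimal sketch under stated assumptions: `Handler`, `BlockManagerBased`, `WalBased` and `choose` are hypothetical names, and the configuration is modelled as a plain `Map` rather than a SparkConf. Only the configuration key is taken from the source.

```scala
// Hypothetical sketch (not Spark code) of how the block handler is chosen.
object HandlerChoiceSketch {
  sealed trait Handler
  case object BlockManagerBased extends Handler
  final case class WalBased(checkpointDir: String) extends Handler

  // Key name as it appears in WriteAheadLogUtils.
  val WalEnableKey = "spark.streaming.receiver.writeAheadLog.enable"

  // Left(message) models the SparkException thrown when no checkpoint directory is set.
  def choose(conf: Map[String, String],
             checkpointDir: Option[String]): Either[String, Handler] = {
    val walEnabled = conf.get(WalEnableKey).contains("true")
    if (!walEnabled) Right(BlockManagerBased)
    else checkpointDir match {
      case Some(dir) => Right(WalBased(dir))
      case None =>
        Left("Cannot enable receiver write-ahead log without checkpoint directory set.")
    }
  }
}
```

The point of the sketch is the ordering of the checks: the WAL flag is consulted first, and the checkpoint directory only becomes mandatory once the flag is on.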

/** A helper class with utility functions related to the WriteAheadLog interface */
private[streaming] object WriteAheadLogUtils extends Logging {
  val RECEIVER_WAL_ENABLE_CONF_KEY = "spark.streaming.receiver.writeAheadLog.enable"
  val RECEIVER_WAL_CLASS_CONF_KEY = "spark.streaming.receiver.writeAheadLog.class"
  val RECEIVER_WAL_ROLLING_INTERVAL_CONF_KEY =
    "spark.streaming.receiver.writeAheadLog.rollingIntervalSecs"
  val RECEIVER_WAL_MAX_FAILURES_CONF_KEY = "spark.streaming.receiver.writeAheadLog.maxFailures"
  val RECEIVER_WAL_CLOSE_AFTER_WRITE_CONF_KEY =
    "spark.streaming.receiver.writeAheadLog.closeFileAfterWrite"

  val DRIVER_WAL_CLASS_CONF_KEY = "spark.streaming.driver.writeAheadLog.class"
  val DRIVER_WAL_ROLLING_INTERVAL_CONF_KEY =
    "spark.streaming.driver.writeAheadLog.rollingIntervalSecs"
  val DRIVER_WAL_MAX_FAILURES_CONF_KEY = "spark.streaming.driver.writeAheadLog.maxFailures"
  val DRIVER_WAL_BATCHING_CONF_KEY = "spark.streaming.driver.writeAheadLog.allowBatching"
  val DRIVER_WAL_BATCHING_TIMEOUT_CONF_KEY = "spark.streaming.driver.writeAheadLog.batchingTimeout"
  val DRIVER_WAL_CLOSE_AFTER_WRITE_CONF_KEY =
    "spark.streaming.driver.writeAheadLog.closeFileAfterWrite"

  val DEFAULT_ROLLING_INTERVAL_SECS = 60
  val DEFAULT_MAX_FAILURES = 3
  // ... (remaining members omitted)
}

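A checkpoint directory is mandatory when the WAL is enabled. So which StorageLevel is passed in when the receiver is constructed?

Starting from socketTextStream in the application code, we find that the storage level is MEMORY_AND_DISK_SER_2.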

/**
 * Create an input stream from TCP source hostname:port. Data is received using
 * a TCP socket and the received bytes are interpreted as objects using the given
 * converter.
 * @param hostname      Hostname to connect to for receiving data
 * @param port          Port to connect to for receiving data
 * @param converter     Function to convert the byte stream to objects
 * @param storageLevel  Storage level to use for storing the received objects
 * @tparam T            Type of the objects received (after converting bytes to objects)
 */
def socketStream[T: ClassTag](
    hostname: String,
    port: Int,
    converter: (InputStream) => Iterator[T],
    storageLevel: StorageLevel
  ): ReceiverInputDStream[T] = {
  new SocketInputDStream[T](this, hostname, port, converter, storageLevel)
}

/** Trait that represents a class that handles the storage of blocks received by receiver */
private[streaming] trait ReceivedBlockHandler {
  // ... (members omitted)
}

// storeBlock as implemented by the BlockManager-based handler:
def storeBlock(blockId: StreamBlockId, block: ReceivedBlock): ReceivedBlockStoreResult = {

  var numRecords = None: Option[Long]

  val putResult: Seq[(BlockId, BlockStatus)] = block match {
    case ArrayBufferBlock(arrayBuffer) =>
      numRecords = Some(arrayBuffer.size.toLong)
      blockManager.putIterator(blockId, arrayBuffer.iterator, storageLevel,
        tellMaster = true)
    case IteratorBlock(iterator) =>
      val countIterator = new CountingIterator(iterator)
      val putResult = blockManager.putIterator(blockId, countIterator, storageLevel,
        tellMaster = true)
      numRecords = countIterator.count
      putResult
    case ByteBufferBlock(byteBuffer) =>
      blockManager.putBytes(blockId, byteBuffer, storageLevel, tellMaster = true)
    case o =>
      throw new SparkException(
        s"Could not store $blockId to block manager, unexpected block type ${o.getClass.getName}")
  }
  if (!putResult.map { _._1 }.contains(blockId)) {
    throw new SparkException(
      s"Could not store $blockId to block manager with storage level $storageLevel")
  }
  BlockManagerBasedStoreResult(blockId, numRecords)
}
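In the IteratorBlock branch above, the record count is only known after the iterator has been drained by the put. A counting wrapper along these lines would behave that way. This is a sketch written for illustration, not the Spark class itself; the method is named `recordCount` here (an assumption) to avoid clashing with `Iterator.count`.

```scala
// Hypothetical sketch of an iterator wrapper that counts the records it hands out.
object CountingSketch {
  final class CountingIterator[T](underlying: Iterator[T]) extends Iterator[T] {
    private var seen = 0L

    override def hasNext: Boolean = underlying.hasNext

    override def next(): T = {
      val value = underlying.next()
      seen += 1
      value
    }

    // The count is only meaningful once every element has been consumed,
    // so it is reported as None until then.
    def recordCount: Option[Long] = if (underlying.hasNext) None else Some(seen)
  }
}
```

Returning an `Option` makes the "not fully consumed yet" state explicit instead of exposing a partial count.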

/**
 * Put the given block according to the given level in one of the block stores, replicating
 * the values if necessary.
 *
 * The effective storage level refers to the level according to which the block will actually be
 * handled. This allows the caller to specify an alternate behavior of doPut while preserving
 * the original level specified by the user.
 */
private def doPut(
    blockId: BlockId,
    data: BlockValues,
    level: StorageLevel,
    tellMaster: Boolean = true,
    effectiveStorageLevel: Option[StorageLevel] = None)
  : Seq[(BlockId, BlockStatus)] = {

  require(blockId != null, "BlockId is null")
  require(level != null && level.isValid, "StorageLevel is null or invalid")
  effectiveStorageLevel.foreach { level =>
    require(level != null && level.isValid, "Effective StorageLevel is null or invalid")
  }

  // Return value
  val updatedBlocks = new ArrayBuffer[(BlockId, BlockStatus)]

  /* Remember the block's storage level so that we can correctly drop it to disk if it needs
   * to be dropped right after it got put into memory. Note, however, that other threads will
   * not be able to get() this block until we call markReady on its BlockInfo. */
  val putBlockInfo = {
    val tinfo = new BlockInfo(level, tellMaster)
    // Do atomically !
    val oldBlockOpt = blockInfo.putIfAbsent(blockId, tinfo)
    if (oldBlockOpt.isDefined) {
      if (oldBlockOpt.get.waitForReady()) {
        logWarning(s"Block $blockId already exists on this machine; not re-adding it")
        return updatedBlocks
      }
      // TODO: So the block info exists - but previous attempt to load it (?) failed.
      // What do we do now ? Retry on it ?
      oldBlockOpt.get
    } else {
      tinfo
    }
  }

  val startTimeMs = System.currentTimeMillis

  /* If we're storing values and we need to replicate the data, we'll want access to the values,
   * but because our put will read the whole iterator, there will be no values left. For the
   * case where the put serializes data, we'll remember the bytes, above; but for the case where
   * it doesn't, such as deserialized storage, let's rely on the put returning an Iterator. */
  var valuesAfterPut: Iterator[Any] = null

  // Ditto for the bytes after the put
  var bytesAfterPut: ByteBuffer = null

  // Size of the block in bytes
  var size = 0L

  // The level we actually use to put the block
  val putLevel = effectiveStorageLevel.getOrElse(level)

  // If we're storing bytes, then initiate the replication before storing them locally.
  // This is faster as data is already serialized and ready to send.
  val replicationFuture = data match {
    case b: ByteBufferValues if putLevel.replication > 1 =>
      // Duplicate doesn't copy the bytes, but just creates a wrapper
      val bufferView = b.buffer.duplicate()
      Future {
        // This is a blocking action and should run in futureExecutionContext which is a cached
        // thread pool
        replicate(blockId, bufferView, putLevel)
      }(futureExecutionContext)
    case _ => null
  }
  // ... (remainder of doPut omitted)
}

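Machine A receives and stores the data while a replica is kept on machine C; if A goes down, reads switch over to C immediately.

HBase also keeps a WAL (write-ahead log): every write is logged first, and if something fails later the data is recovered from the log. In Spark Streaming the log is written under the checkpoint directory; see lines 53-65 of ReceiverSupervisorImpl.
The checkpoint normally lives on HDFS, which keeps three replicas by default. That costs disk space but is safe.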

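With the WAL approach there are many log files, so the handler needs the streamId to tell them apart; the default BlockManager-based handler does not need this id. At line 121 of ReceivedBlockHandler, the WAL-based handler writes each block to the log as well as putting it into the BlockManager. At line 134, look at effectiveStorageLevel: once the WAL is enabled, is there any point in keeping the StorageLevel replication at 2? No. It only wastes space, because the checkpoint already sits on HDFS with three replicas. Line 153 wraps a thread pool so that the stores can run concurrently. Then follow storeBlock at line 166, lines 186 and 196 (the block is serialized and written), line 202 (the write is handed to a thread), and line 217 (the log directory, receivedData). The write at line 40 is an interface method, so read the description of the WriteAheadLog class: it declares write, read, readAll and clean. A WAL is written sequentially and its data cannot be modified afterwards.

Reading therefore works like a cursor or pointer: a record is fetched directly from the position where it was written, which is very efficient. Writes are sequential, reads are random, and records are never modified or deleted in place.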
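The two ideas in this passage, dropping the replication factor when a WAL is present and storing into both places concurrently, can be sketched with plain Scala futures. Everything here is hypothetical and simplified: `ToyLevel`, `effectiveLevel`, `blockStore` and `walStore` are illustrative names, the two stores are in-memory maps, and the global execution context stands in for the dedicated thread pool.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical sketch (not Spark code) of a handler that stores to a block store and a log.
object DualStoreSketch {
  final case class ToyLevel(useDisk: Boolean, useMemory: Boolean,
                            deserialized: Boolean, replication: Int)

  // With a log on replicated storage, an extra in-memory replica adds little,
  // so this sketch keeps a single copy.
  def effectiveLevel(requested: ToyLevel): ToyLevel = requested.copy(replication = 1)

  val blockStore = TrieMap.empty[String, Array[Byte]]
  val walStore = TrieMap.empty[String, Array[Byte]]

  // Start both stores concurrently and return only after both have finished.
  def storeBlock(blockId: String, bytes: Array[Byte]): Unit = {
    val toBlockStore = Future { blockStore.put(blockId, bytes) }
    val toWal = Future { walStore.put(blockId, bytes) }
    Await.result(toBlockStore.zip(toWal), 30.seconds)
    ()
  }
}
```

Waiting on the zipped future means a failure in either store surfaces to the caller, so a block is never reported as stored when only one of the two writes succeeded.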

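For each batch a log file is created. After a write, the writer returns a handle, and later reads use that handle. See the public WriteAheadLogRecordHandle in WriteAheadLog: its subclass is a case class (FileBased...) that locates the data in the WAL by path, offset (index) and length.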
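The handle mechanism (sequential write, random read by path, offset and length) can be shown with a small local-file sketch. It is an illustration under assumptions, not the Spark implementation: `Segment`, `write` and `read` are hypothetical names, records are stored as a 4-byte length followed by the payload, and a local file replaces HDFS.

```scala
import java.io.RandomAccessFile
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical sketch (not Spark code) of a log whose writes return a handle.
object WalHandleSketch {
  // The handle: where a record lives inside a log file.
  final case class Segment(path: String, offset: Long, length: Int)

  // Append one record as [4-byte length][payload] and return its handle.
  def write(path: String, record: String): Segment = {
    val payload = record.getBytes(UTF_8)
    val file = new RandomAccessFile(path, "rw")
    try {
      val offset = file.length() // always append at the current end of the file
      file.seek(offset)
      file.writeInt(payload.length)
      file.write(payload)
      Segment(path, offset, payload.length)
    } finally file.close()
  }

  // Random read: seek straight to the offset stored in the handle.
  def read(segment: Segment): String = {
    val file = new RandomAccessFile(segment.path, "r")
    try {
      file.seek(segment.offset)
      val length = file.readInt()
      require(length == segment.length, "handle does not match the record on disk")
      val payload = new Array[Byte](length)
      file.readFully(payload)
      new String(payload, UTF_8)
    } finally file.close()
  }
}
```

Because the handle carries the offset, a read never scans earlier records, which is why the random read stays cheap even as the file grows.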

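readAll reads the entire content, and clean removes old data based on time. Now look at the comment on the WriteAheadLog subclass FileBasedWriteAheadLog: it manages the WAL files, writes to a new file periodically, and reads the data back on failure, using a writer for writes and a reader for reads. Note that by default this is not the usual Hadoop-style read/write.

Next, the description of def write: it mentions HDFS only loosely. The returned fileSegment is the path, offset and length described above. The method calls getLogWriter and then writes through that writer.

Looking at getLogWriter: it produces many small files, and the writing itself is plain Java file I/O.

Now the read side: start with the hasNext method, then the read method of FileBasedWriteAheadLogRandomReader, which performs the random read, and then hasNext in FileBasedWriteAheadLogReader: without a handle, the only option is to iterate through the records. Understand the idea first, then go through the concrete code.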
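When no handle is available, the log has to be walked record by record. A minimal sketch of such a sequential reader follows. It assumes the same length-prefixed layout as a simple local log file; `append` and `SequentialReader` are hypothetical names and the code is not taken from Spark.

```scala
import java.io.{DataInputStream, DataOutputStream, EOFException, FileInputStream, FileOutputStream}
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical sketch (not Spark code) of iterating over a length-prefixed log file.
object WalReaderSketch {
  // Helper so the example is self-contained: append [4-byte length][payload].
  def append(path: String, record: String): Unit = {
    val out = new DataOutputStream(new FileOutputStream(path, true))
    try {
      val payload = record.getBytes(UTF_8)
      out.writeInt(payload.length)
      out.write(payload)
    } finally out.close()
  }

  final class SequentialReader(path: String) extends Iterator[String] with AutoCloseable {
    private val in = new DataInputStream(new FileInputStream(path))
    private var nextItem: Option[String] = None
    private var closed = false

    // hasNext reads ahead by one record; reaching end of file closes the stream.
    override def hasNext: Boolean = {
      if (closed) false
      else if (nextItem.isDefined) true
      else {
        try {
          val length = in.readInt()
          val payload = new Array[Byte](length)
          in.readFully(payload)
          nextItem = Some(new String(payload, UTF_8))
          true
        } catch {
          case _: EOFException =>
            close()
            false
        }
      }
    }

    override def next(): String = {
      if (!hasNext) throw new NoSuchElementException("end of log")
      val item = nextItem.get
      nextItem = None
      item
    }

    override def close(): Unit = if (!closed) { closed = true; in.close() }
  }
}
```

The read-ahead in `hasNext` is the design choice worth noting: the only way to know whether another record exists is to try reading it, so the record is buffered until `next()` hands it out.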

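Two approaches:

1. BlockManager replication

2. WAL. It costs performance; consider it if you can tolerate a delay of about one minute. Even so, some data may still be lost.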

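Data replay:

Consider Kafka. The receiver-based approach uses ZooKeeper to manage the metadata, which leads to duplicate consumption: data may be consumed before the new offset has been reported to ZooKeeper. In production, use the direct approach instead, which ensures that each record is read exactly once. The DirectKafkaInputDStream class takes care of the offsets itself and manages that metadata. Each time a batch is generated it calls latestLeaderOffsets to find the offset range, which determines the data source of the RDD: for every batch, the range runs from the previous offset to the latest offset, and the data is read according to that range. Next, look at KafkaRDD. At its core it reads data by offset and computes on what it reads; getPartitions builds the partitions from the OffsetRange values. The actual read has to go to the Kafka cluster and is done by a SimpleConsumer. SimpleConsumer lives in Kafka's own package and reads messages from Kafka. So this approach reads from Kafka directly, determines the offset range, assigns the data to the batch, and checkpoints after processing, in effect treating Kafka as the underlying file system. The code is only a few hundred lines and quite simple. The drawback is the time it takes. Still, not every application requires zero data loss; a loss of around 5% is commonly considered acceptable.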
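The per-batch offset bookkeeping described here can be sketched in a few lines. This is not the Kafka integration code: `ToyOffsetRange` and `rangesForBatch` are hypothetical names, the offsets are passed in as plain maps instead of being fetched from the Kafka leaders, and starting unseen partitions at offset 0 is an assumption made only for the example.

```scala
// Hypothetical sketch (not Spark or Kafka code) of deriving a batch's offset ranges.
object OffsetRangeSketch {
  final case class ToyOffsetRange(topic: String, partition: Int,
                                  fromOffset: Long, untilOffset: Long) {
    // Number of records the batch will read from this partition.
    def count: Long = untilOffset - fromOffset
  }

  // previous: offsets where the last batch stopped; latest: current leader offsets.
  def rangesForBatch(topic: String,
                     previous: Map[Int, Long],
                     latest: Map[Int, Long]): Seq[ToyOffsetRange] =
    latest.toSeq.sortBy(_._1).map { case (partition, until) =>
      // Sketch assumption: a partition not seen before starts at offset 0.
      val from = previous.getOrElse(partition, 0L)
      ToyOffsetRange(topic, partition, from, until)
    }
}
```

Since each range is fully determined by two offsets, the same batch can be recomputed after a failure by reading the same range again, which is what makes replay from Kafka possible.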

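From the point of view of job-scheduling fault tolerance:

If one block out of 1000 is lost, the data counts as lost and fetching it counts as failed. Recovery goes through the WAL, but the other 999 blocks are recovered again as well. That granularity is too coarse; the source can be modified so that only the missing block is recovered.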
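The finer-grained recovery suggested here amounts to restoring only the block ids that are actually missing. The following sketch shows the idea on plain maps. It is purely illustrative: `recover` is a hypothetical function and does not correspond to any existing Spark API or to a specific patch.

```scala
// Hypothetical sketch (not Spark code) of recovering only the blocks that are missing.
object BlockRecoverySketch {
  // Returns the full block set plus the ids that actually had to come from the log.
  def recover(expected: Seq[String],
              available: Map[String, String],
              wal: Map[String, String]): (Map[String, String], Seq[String]) = {
    val missing = expected.filterNot(available.contains)
    // Only the missing ids are looked up in the log; intact blocks are left alone.
    val restored = missing.flatMap(id => wal.get(id).map(data => id -> data))
    (available ++ restored, restored.map(_._1))
  }
}
```

With 1000 expected blocks and one missing, only that single id is read back from the log.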

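To summarize, the options are: two replicas in memory, the WAL, and data replay.

The next session covers fault tolerance on the driver side, which is also key to writing applications.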

Spark Release Notes 12

posted @ 2016-05-22 18:08 VV一笑2016
