rbd-mirror source code analysis

1. The rbd mirror flow starts from the rbd-mirror binary.

Before reading this part of the logic, you need a solid understanding of the callback functions and callback classes used in librbd.
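
As a reminder of what such a callback class looks like, here is a minimal sketch. It is a simplified stand-in, not the real Ceph header: the name LambdaContext and the helper fake_async_op are hypothetical, and the only behavior assumed from Ceph is that complete() runs finish() once and then frees the object.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Simplified stand-in for a completion callback class; not the real Ceph header.
class Context {
public:
  virtual ~Context() = default;

  // Runs the callback exactly once, then frees the object.
  void complete(int r) {
    finish(r);
    delete this;
  }

protected:
  virtual void finish(int r) = 0;
};

// Hypothetical helper: wraps a lambda so it can be passed wherever a
// Context* is expected.
class LambdaContext : public Context {
public:
  explicit LambdaContext(std::function<void(int)> fn) : m_fn(std::move(fn)) {
    assert(m_fn);  // an empty callback would be a programming error
  }

protected:
  void finish(int r) override { m_fn(r); }

private:
  std::function<void(int)> m_fn;
};

// Hypothetical async operation: it reports its result through on_finish
// instead of returning it to the caller.
inline void fake_async_op(int result, Context *on_finish) {
  on_finish->complete(result);
}
```

The caller allocates the callback with new and hands it over; it must not touch the pointer afterwards, because complete() deletes it.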

rbd-mirror starts in main(), which calls mirror->init() and then Mirror::run().
run() in turn calls update_pool_replayers().

int r = mirror->init();
mirror->run();

Inside run(), update_pool_replayers(m_local_cluster_watcher->get_pool_peers()) is called.
update_pool_replayers() then calls pool_replayer->init().
PoolReplayer::init() mainly sets up further watchers and starts a PoolReplayer thread that listens on the admin socket. An excerpt:

m_instance_replayer.reset(InstanceReplayer<>::create(
    m_threads, m_service_daemon, m_image_deleter, m_local_rados,
    local_mirror_uuid, m_local_pool_id));
  m_instance_replayer->set_throttle(m_bps_throttle);
  m_instance_replayer->init();
  m_instance_replayer->add_peer(m_peer.uuid, m_remote_io_ctx);

  m_instance_watcher.reset(InstanceWatcher<>::create(
    m_local_io_ctx, m_threads->work_queue, m_instance_replayer.get()));
  r = m_instance_watcher->init();
  if (r < 0) {
    derr << "error initializing instance watcher: " << cpp_strerror(r) << dendl;
    m_callout_id = m_service_daemon->add_or_update_callout(
      m_local_pool_id, m_callout_id, service_daemon::CALLOUT_LEVEL_ERROR,
      "unable to initialize instance messenger object");
    return;
  }

  m_leader_watcher.reset(new LeaderWatcher<>(m_threads, m_local_io_ctx,
                                             &m_leader_listener));
  r = m_leader_watcher->init();
  if (r < 0) {
    derr << "error initializing leader watcher: " << cpp_strerror(r) << dendl;
    m_callout_id = m_service_daemon->add_or_update_callout(
      m_local_pool_id, m_callout_id, service_daemon::CALLOUT_LEVEL_ERROR,
      "unable to initialize leader messenger object");
    return;
  }

Note the three calls here: m_instance_replayer->init(), m_instance_watcher->init() and m_leader_watcher->init().
Of these, m_instance_replayer->init() is implemented as follows:

template <typename I>
int InstanceReplayer<I>::init() {
  C_SaferCond init_ctx;
  init(&init_ctx);
  return init_ctx.wait();
}
template <typename I>
void InstanceReplayer<I>::init(Context *on_finish) {
  dout(20) << dendl;
  Context *ctx = new FunctionContext(
    [this, on_finish] (int r) {
      {
        Mutex::Locker timer_locker(m_threads->timer_lock);
        schedule_image_state_check_task();
      }
      on_finish->complete(0);
    });
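  // The ctx that wraps schedule_image_state_check_task() is queued on the
  // thread pool's work queue here. When a worker thread runs it,
  // schedule_image_state_check_task() arms the timer task under timer_lock.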
  m_threads->work_queue->queue(ctx, 0);
}
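
The key point of the init() pair above is that init(Context *) does no work itself: it only queues a callback, and the work happens later when the queue is serviced. The following is a minimal single-threaded sketch of that deferral. WorkQueue and ReplayerSketch are hypothetical stand-ins, std::function replaces Context, and drain() plays the role of the worker threads; in the real code the synchronous init() blocks in C_SaferCond::wait() while a worker thread runs the queued ctx.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded stand-in for m_threads->work_queue.
class WorkQueue {
public:
  void queue(std::function<void(int)> ctx, int r) {
    m_items.emplace_back(std::move(ctx), r);
  }

  // Runs everything queued so far. A real work queue does this on
  // worker threads instead of in the caller.
  void drain() {
    while (!m_items.empty()) {
      auto item = std::move(m_items.front());
      m_items.pop_front();
      item.first(item.second);
    }
  }

private:
  std::deque<std::pair<std::function<void(int)>, int>> m_items;
};

class ReplayerSketch {
public:
  explicit ReplayerSketch(WorkQueue *wq) : m_work_queue(wq) {
    assert(wq != nullptr);
  }

  // Same shape as init(Context *on_finish): nothing runs here, the
  // callback is only queued.
  void init(std::function<void(int)> on_finish) {
    m_work_queue->queue([this, on_finish](int) {
      // Stands in for schedule_image_state_check_task().
      m_state_check_scheduled = true;
      on_finish(0);
    }, 0);
  }

  bool state_check_scheduled() const { return m_state_check_scheduled; }

private:
  WorkQueue *m_work_queue;
  bool m_state_check_scheduled = false;
};
```

Calling init() and then inspecting state_check_scheduled() returns false until drain() has run, which mirrors why the synchronous wrapper has to wait on its completion context.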

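schedule_image_state_check_task() arms a timer whose callback calls queue_start_image_replayers(). From there the call chain continues into start_image_replayer() and finally reaches image_replayer->start().
Starting from image_replayer->start(), the image replayer goes through the following states: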

/**
  * @verbatim
  *                   (error)
  * <uninitialized> <------------------------------------ FAIL
  *    |                                                   ^
  *    v                                                   *
  * <starting>                                             *
  *    |                                                   *
  *    v                                                   *
  * WAIT_FOR_DELETION                                      *
  *    |                                                   *
  *    v                                           (error) *
  * PREPARE_LOCAL_IMAGE  * * * * * * * * * * * * * * * * * *
  *    |                                                   *
  *    v                                           (error) *
  * PREPARE_REMOTE_IMAGE * * * * * * * * * * * * * * * * * *
  *    |                                                   *
  *    v                                           (error) *
  * BOOTSTRAP_IMAGE  * * * * * * * * * * * * * * * * * * * *
  *    |                                                   *
  *    v                                           (error) *
  * INIT_REMOTE_JOURNALER  * * * * * * * * * * * * * * * * *
  *    |                                                   *
  *    v                                           (error) *
  * START_REPLAY * * * * * * * * * * * * * * * * * * * * * *
  *    |
  *    |  /--------------------------------------------\
  *    |  |                                            |
  *    v  v   (asok flush)                             |
  * REPLAYING -------------> LOCAL_REPLAY_FLUSH        |
  *    |       \                 |                     |
  *    |       |                 v                     |
  *    |       |             FLUSH_COMMIT_POSITION     |
  *    |       |                 |                     |
  *    |       |                 \--------------------/|
  *    |       |                                       |
  *    |       | (entries available)                   |
  *    |       \-----------> REPLAY_READY              |
  *    |                         |                     |
  *    |                         | (skip if not        |
  *    |                         v  needed)        (error)
  *    |                     REPLAY_FLUSH  * * * * * * * * *
  *    |                         |                     |   *
  *    |                         | (skip if not        |   *
  *    |                         v  needed)        (error) *
  *    |                     GET_REMOTE_TAG  * * * * * * * *
  *    |                         |                     |   *
  *    |                         | (skip if not        |   *
  *    |                         v  needed)        (error) *
  *    |                     ALLOCATE_LOCAL_TAG  * * * * * *
  *    |                         |                     |   *
  *    |                         v                 (error) *
  *    |                     PREPROCESS_ENTRY  * * * * * * *
  *    |                         |                     |   *
  *    |                         v                 (error) *
  *    |                     PROCESS_ENTRY * * * * * * * * *
  *    |                         |                     |   *
  *    |                         \---------------------/   *
  *    v                                                   *
  * REPLAY_COMPLETE  < * * * * * * * * * * * * * * * * * * *
  *    |
  *    v
  * JOURNAL_REPLAY_SHUT_DOWN
  *    |
  *    v
  * LOCAL_IMAGE_CLOSE
  *    |
  *    v
  * <stopped>
  *
  * @endverbatim
  */
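
One way to read the startup half of the diagram above: each state runs one operation, success moves on to the next state, and any error jumps to FAIL. The sketch below flattens that into a loop. It is only a reading aid under that assumption: in the real code each step is asynchronous and its completion handler triggers the next step, and Step, StartResult and run_start_sequence are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// One startup step: a state name taken from the diagram plus a hypothetical
// action that returns 0 on success or a negative errno-style code.
struct Step {
  std::string name;
  std::function<int()> action;
};

struct StartResult {
  std::string final_state;           // "REPLAYING" or "FAIL"
  int error = 0;                     // first negative return code, if any
  std::vector<std::string> visited;  // states entered, in order
};

// Runs the steps in order and stops at the first failure, like the
// "(error)" edges that all lead to FAIL in the diagram.
inline StartResult run_start_sequence(const std::vector<Step> &steps) {
  StartResult result;
  for (const auto &step : steps) {
    assert(step.action);  // every step needs an action
    result.visited.push_back(step.name);
    int r = step.action();
    if (r < 0) {
      result.final_state = "FAIL";
      result.error = r;
      return result;
    }
  }
  result.final_state = "REPLAYING";
  return result;
}
```

Feeding it the states PREPARE_LOCAL_IMAGE through START_REPLAY with one action returning a negative value shows which states are entered before the failure and which are never reached.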

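The flow above lives mainly in ImageReplayer. As it shows, once initialization is done the whole flow is driven from start().
Execution proceeds from ImageReplayer<I>::start() down to ImageReplayer<I>::bootstrap(), which sends the bootstrap request by calling request->send():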

void ImageReplayer<I>::bootstrap()
{
    ....
    request->send();
    ....
}

Once sent, the request goes through the following states:

 /**
   * @verbatim
   *
   * <start>
   *    |
   *    v
   * GET_REMOTE_TAG_CLASS * * * * * * * * * * * * * * * * * *
   *    |                                                   * (error)
   *    v                                                   *
   * OPEN_REMOTE_IMAGE  * * * * * * * * * * * * * * * * * * *
   *    |                                                   *
   *    |/--------------------------------------------------*---\
   *    v                                                   *   |
   * IS_PRIMARY * * * * * * * * * * * * * * * * * * * * *   *   |
   *    |                                               *   *   |
   *    | (remote image primary, no local image id)     *   *   |
   *    \----> UPDATE_CLIENT_IMAGE  * * * * * * * * * * *   *   |
   *    |         |                                     *   *   |
   *    |         v                                     *   *   |
   *    \----> CREATE_LOCAL_IMAGE * * * * * * * * * * * *   *   |
   *    |         |                                     *   *   |
   *    |         v                                     *   *   |
   *    | (remote image primary)                        *   *   |
   *    \----> OPEN_LOCAL_IMAGE * * * * * * * * * * * * *   *   |
   *    |         |   .                                 *   *   |
   *    |         |   . (image doesn't exist)           *   *   |
   *    |         |   . . > UNREGISTER_CLIENT * * * * * *   *   |
   *    |         |             |                       *   *   |
   *    |         |             v                       *   *   |
   *    |         |         REGISTER_CLIENT * * * * * * *   *   |
   *    |         |             |                       *   *   |
   *    |         |             \-----------------------*---*---/
   *    |         |                                     *   *
   *    |         v (skip if not needed)                *   *
   *    |      GET_REMOTE_TAGS  * * * * * * *           *   *
   *    |         |                         *           *   *
   *    |         v (skip if not needed)    v           *   *
   *    |      IMAGE_SYNC * * * > CLOSE_LOCAL_IMAGE     *   *
   *    |         |                         |           *   *
   *    |         \-----------------\ /-----/           *   *
   *    |                            |                  *   *
   *    |                            |                  *   *
   *    | (skip if not needed)       |                  *   *
   *    \----> UPDATE_CLIENT_STATE  *|* * * * * * * * * *   *
   *                |                |                  *   *
   *    /-----------/----------------/                  *   *
   *    |                                               *   *
   *    v                                               *   *
   * CLOSE_REMOTE_IMAGE < * * * * * * * * * * * * * * * *   *
   *    |                                                   *
   *    v                                                   *
   * <finish> < * * * * * * * * * * * * * * * * * * * * * * *
   *
   * @endverbatim
   */

The image sync flow is shown below. After going through these steps, all the data of an image has been synchronized.

 /**
   * @verbatim
   *
   * <start>
   *    |
   *    v
   * NOTIFY_SYNC_REQUEST
   *    |
   *    v
   * PRUNE_CATCH_UP_SYNC_POINT
   *    |
   *    v
   * CREATE_SYNC_POINT (skip if already exists and
   *    |               not disconnected)
   *    v
   * COPY_SNAPSHOTS
   *    |
   *    v
   * COPY_IMAGE . . . . . . . . . . . . . .
   *    |                                 .
   *    v                                 .
   * COPY_OBJECT_MAP (skip if object      .
   *    |             map disabled)       .
   *    v                                 .
   * REFRESH_OBJECT_MAP (skip if object   .
   *    |                map disabled)    .
   *    v                                 .
   * PRUNE_SYNC_POINTS                    . (image sync canceled)
   *    |                                 .
   *    v                                 .
   * COPY_METADATA                        .
   *    |                                 .
   *    v                                 .
   * <finish> < . . . . . . . . . . . . . .
   *
   * @endverbatim
   */
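
The "skip if" annotations and the cancel edge in the diagram above are easier to follow when spelled out. The sketch below lists the states that would be visited for a given set of conditions. It only encodes what the diagram says; the function name and its three boolean parameters are hypothetical, and the real request decides these conditions internally.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the states visited by an image sync, following the diagram.
//   sync_point_usable:    a sync point already exists and the client is
//                         not disconnected, so CREATE_SYNC_POINT is skipped
//   object_map_enabled:   controls the two object map states
//   canceled_during_copy: follows the dotted edge from COPY_IMAGE to <finish>
inline std::vector<std::string> image_sync_steps(bool sync_point_usable,
                                                 bool object_map_enabled,
                                                 bool canceled_during_copy) {
  std::vector<std::string> steps = {"NOTIFY_SYNC_REQUEST",
                                    "PRUNE_CATCH_UP_SYNC_POINT"};
  if (!sync_point_usable) {
    steps.push_back("CREATE_SYNC_POINT");
  }
  steps.push_back("COPY_SNAPSHOTS");
  steps.push_back("COPY_IMAGE");
  if (canceled_during_copy) {
    return steps;  // image sync canceled: go straight to <finish>
  }
  if (object_map_enabled) {
    steps.push_back("COPY_OBJECT_MAP");
    steps.push_back("REFRESH_OBJECT_MAP");
  }
  steps.push_back("PRUNE_SYNC_POINTS");
  steps.push_back("COPY_METADATA");
  return steps;
}
```

For example, a first sync of an image with the object map enabled visits every state, while a canceled sync ends right after COPY_IMAGE.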

In my DR environment, pick an rbd-mirror pod, change to the /var/run/ceph directory, and run:
ceph --admin-daemon ./ceph-client.dr.asok rbd mirror start
The command is handled as follows:

void PoolReplayer::start()
{
  dout(20) << "enter" << dendl;

  Mutex::Locker l(m_lock);

  if (m_stopping) {
    return;
  }

  m_instance_replayer->start();
}

That in turn calls:

template <typename I>
void InstanceReplayer<I>::start()
{
  dout(20) << dendl;

  Mutex::Locker locker(m_lock);

  m_manual_stop = false;

  for (auto &kv : m_image_replayers) {
    auto &image_replayer = kv.second;
    image_replayer->start(nullptr, true);
  }
}

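This shows that start() iterates over the m_image_replayers map and starts every image_replayer instance in it. That raises a question: how do the entries in m_image_replayers get created?
Answer: they originate from the call m_instance_replayer->acquire_image(this, global_image_id, on_finish); which is made from InstanceWatcher.cc. We will not go into acquire_image here; it is enough to know that the image_replayer object is created there and stored in m_image_replayers.
Note that m_image_replayers maps each global_image_id to exactly one image_replayer instance.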

auto &image_replayer = kv.second;
The value of each map entry is the image_replayer object, so from this point control moves into the image replayer.
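
The relationship between acquire_image, the map and start() can be sketched as follows. This is a simplified model under the assumption that acquire_image creates the entry the first time a global_image_id is seen; the Sketch-suffixed types are hypothetical, and the real acquire_image takes more arguments and completes asynchronously.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for ImageReplayer<I>.
struct ImageReplayerSketch {
  explicit ImageReplayerSketch(std::string id)
    : global_image_id(std::move(id)) {}
  void start() { started = true; }

  std::string global_image_id;
  bool started = false;
};

class InstanceReplayerSketch {
public:
  // Stands in for acquire_image(): create the replayer the first time a
  // global_image_id is seen, otherwise reuse the existing one.
  ImageReplayerSketch &acquire_image(const std::string &global_image_id) {
    assert(!global_image_id.empty());
    auto it = m_image_replayers.find(global_image_id);
    if (it == m_image_replayers.end()) {
      it = m_image_replayers.emplace(
        global_image_id,
        std::make_unique<ImageReplayerSketch>(global_image_id)).first;
    }
    return *it->second;
  }

  // Same loop shape as InstanceReplayer<I>::start().
  void start() {
    for (auto &kv : m_image_replayers) {
      auto &image_replayer = kv.second;
      image_replayer->start();
    }
  }

  std::size_t size() const { return m_image_replayers.size(); }

private:
  // One entry per global_image_id.
  std::map<std::string, std::unique_ptr<ImageReplayerSketch>> m_image_replayers;
};
```

Acquiring the same global_image_id twice yields the same object, so start() touches each image exactly once.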

Once inside ImageReplayer::start(), the flow is much the same as the one described earlier.

posted @ 2020-02-20 17:06  Linux-inside