Berkeley DB replication mechanism - master/client synchronization

 

repmgr/repmgr_net.c,

__repmgr_send(): broadcasts via send_broadcast, then handles DB_REP_PERMANENT according to the ack policy.

__repmgr_send_broadcast(): for each site, calls send_connection().

 

Master: sending

log/log_put.c, log_put():

Not accepted on a REP_CLIENT.

Sends via __rep_send_message(env, DB_EID_BROADCAST, ... - REP_NEWFILE, REP_LOG

 

txn/txn_chkpt.c, __txn_checkpoint()

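A REP_CLIENT only gets here during recovery; it syncs the mpool and returns immediately.

Before the master syncs the mpool, it sends this to the clients:

__rep_send_message(env, DB_EID_BROADCAST, REP_START_SYNC

Then it waits for chkpt_delay

and writes the ckp log record.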

 

 

 

DB_REP_REREQUEST vs  DB_REP_ANYWHERE

* Gap requests are "new" and can go anywhere, unless
* this is already a re-request.

repmgr_net.c, __repmgr_send() 

    if ((flags & (DB_REP_ANYWHERE | DB_REP_REREQUEST)) == DB_REP_ANYWHERE &&
        (site = __repmgr_find_available_peer(env))
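
The mask test above reads as "ANYWHERE is set and REREQUEST is not". A minimal sketch of that check in isolation follows; the flag values are illustrative assumptions (the real definitions live in db.h), and may_send_to_peer() is a hypothetical helper, not a Berkeley DB function.

```c
#include <stdint.h>

/* Illustrative values only; the real definitions are in db.h. */
#define DB_REP_ANYWHERE  0x1u
#define DB_REP_REREQUEST 0x8u

/*
 * Hypothetical helper: nonzero when the message may be routed to any
 * available peer, i.e. ANYWHERE is set and this is not already a re-request.
 */
static int may_send_to_peer(uint32_t flags)
{
    return (flags & (DB_REP_ANYWHERE | DB_REP_REREQUEST)) == DB_REP_ANYWHERE;
}
```

Masking with both bits and comparing against just one of them tests "set" and "clear" in a single expression; other bits in flags are ignored.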

After sending, for DB_REP_PERMANENT messages, check the ack policy to see how many acks are needed before returning (for durability).

http://docs.oracle.com/cd/E17076_03/html/api_reference/C/repmgrset_ack_policy.html 

The default is DB_REPMGR_ACKS_QUORUM; see repmgr_net.c, __repmgr_send(), where the count is (n - 1) / 2 + 1.
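
To see what (n - 1) / 2 + 1 yields under C integer division, here is a small sketch. quorum_acks() is a hypothetical helper, and what exactly n counts is not stated in these notes, so treat n as whatever count __repmgr_send() uses.

```c
/*
 * Hypothetical helper mirroring the (n - 1) / 2 + 1 expression.
 * Uses integer division and assumes n >= 1.
 */
static unsigned quorum_acks(unsigned n)
{
    return (n - 1) / 2 + 1;
}
```

For n = 1, 2, 3, 4, 5 this gives 1, 1, 2, 2, 3.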

Once that is determined: __repmgr_await_cond(env, got_acks, &perm, rep->ack_timeout, &db_rep->ack_waiters);

  => roughly: while (!got_acks(env, &perm)) {pthread_cond_timedwait(&db_rep->ack_waiters, rep->ack_timeout)}

repmgr_net.c,  got_acks().

 

 

dbinc/rep.h, 

struct __db_rep {}

REPMGR_RUNNABLE *selector, **messengers, **elect_threads;

 

 

WSAEventSelect

 

Client: receiving

repmgr_method.c, __repmgr_start_int() - elect/msg/select threads
repmgr_method.c, __repmgr_start_selector()
repmgr_sel.c, __repmgr_select_thread()
repmgr_windows.c, __repmgr_select_loop()
repmgr_windows.c, handle_completion() -
repmgr_sel.c, __repmgr_read_from_site()
repmgr_sel.c, dispatch_msgin() -

Puts the message on db_rep->input_queue, then __repmgr_signal(&db_rep->msg_avail)

   rep.h, struct __db_rep     -    cond_var_t check_election, gmdb_idle, msg_avail;

 

 


repmgr_method.c, __repmgr_start_int()
repmgr_method.c, __repmgr_start_msg_threads()
repmgr_msg.c, __repmgr_msg_thread()
message_loop()

     while ((ret = __repmgr_queue_get()... 

      __repmgr_queue_get - while ((m = available_work(env)) == NULL), waits on msg_avail
process_message()
rep_record.c, __rep_process_message_int()

For REP_LOG messages, it calls

rep_log.c, __rep_log()

rep_record.c, __rep_apply():

    log.h,  struct log {}  - waiting_lsn,  max_wait_lsn, __db.rep.db, ready_lsn

     waiting_lsn:   It is the first LSN that we are holding without putting in the log, because we received one or more log records out of order.

       ready_lsn:  It is the next LSN we expect to receive. It's normally equal to "lsn", except at the beginning of a log file, at which point it's set to the LSN of the first record of the new file

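If it is exactly the next log rec we need: call __rep_process_rec(); then __rep_remfirst/__rep_getnext go on to process the log recs held in the tmp db; then __rep_loggap_req().

If it comes after the log rec we need: put it in the tmp db, update waiting_lsn, and send __rep_loggap_req().

If it comes before the log rec we need: we have received a duplicate log rec.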
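
The three cases amount to comparing the incoming record's LSN with ready_lsn. A simplified sketch follows; lsn_t, lsn_cmp(), classify() and the enum names are hypothetical, and the only assumption carried over from Berkeley DB is that an LSN is a (file, offset) pair ordered by file first.

```c
#include <stdint.h>

/* Simplified stand-in for DB_LSN: a (file, offset) pair. */
typedef struct { uint32_t file, offset; } lsn_t;

/* Orders by file, then by offset; returns <0, 0 or >0. */
static int lsn_cmp(lsn_t a, lsn_t b)
{
    if (a.file != b.file)
        return a.file < b.file ? -1 : 1;
    if (a.offset != b.offset)
        return a.offset < b.offset ? -1 : 1;
    return 0;
}

enum rec_action {
    REC_APPLY,      /* exactly the record we expect: process it now */
    REC_STASH,      /* ahead of what we expect: hold it, request the gap */
    REC_DUPLICATE   /* behind what we expect: already have it */
};

static enum rec_action classify(lsn_t incoming, lsn_t ready)
{
    int c = lsn_cmp(incoming, ready);

    if (c == 0)
        return REC_APPLY;
    return c > 0 ? REC_STASH : REC_DUPLICATE;
}
```

For example, with ready at (file 2, offset 100), a record at (2, 100) is applied, one at (2, 400) or (3, 28) is stashed, and one at (1, 900) is treated as a duplicate.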

 

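rep_record.c, __rep_process_rec(): newfile is special-cased and returns early. Except for ckp, every other record is first written straight to the log file.

   - DB___txn_prepare:  flush the log directly

   - DB___txn_regop:  __rep_process_txn(): acquire the write locks it needs; collect all the log recs belonging to the txn, sort them, read them,  db_dispatch(DB_TXN_APPLY

   - DB___txn_ckp: first write a rec into rep_db (the bookkeeping db) with nooverwrite; if one already exists, another thread is doing the ckp, so return. Otherwise sync the mpool, write the DB_LOG_CHKPNT log rec, and flush the log.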

 

posted @ 2016-08-17 13:13  brayden