|NO.Z.00025|——————————|BigDataEnd|——|Hadoop&OLAP_ClickHouse.V06|——|ClickHouse.v06|ClickHouse: Replicas and Shards | How ReplicatedMergeTree Works|

1. How ReplicatedMergeTree Works
### --- Data structures

[zk: localhost:2181(CONNECTED) 8] ls /clickhouse/tables/01/replicated_sales_5
[metadata, temp, mutations, log, leader_election, columns, blocks, nonincrement_block_numbers, replicas, quorum, block_numbers]
### --- Notes on the data structures

~~~     # Metadata
~~~     metadata: table metadata — primary key, sampling expression, partition key
~~~     columns: column names and their data types
~~~     replicas: the names of the replicas
~~~     # Flags:
~~~     leader_election: path used to elect the leader replica
~~~     blocks: hash of each inserted block (deduplicates repeated inserts of the same data) plus its partition_id
~~~     max_insert_block_size: 1048576 rows per block by default
~~~     block_numbers: the order of blocks within the same partition
~~~     quorum: the number of replicas required to acknowledge a write
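The deduplication role of /blocks can be sketched in Python. ClickHouse actually derives the block id with a SipHash over the block data; the SHA-256 hash, the helper names, and the in-memory set below are illustrative stand-ins for the znode children:

```python
import hashlib

def block_id(partition_id: str, rows: list) -> str:
    """Derive a deterministic id for an inserted block.

    ClickHouse uses SipHash over the block data; SHA-256 is used
    here only to keep the sketch dependency-free."""
    digest = hashlib.sha256(repr(sorted(rows)).encode()).hexdigest()
    return f"{partition_id}_{digest[:16]}"

seen_blocks = set()  # stands in for the /blocks znode children

def try_insert(partition_id: str, rows: list) -> bool:
    """Return True if the block is new, False if it is a duplicate."""
    bid = block_id(partition_id, rows)
    if bid in seen_blocks:
        return False  # duplicate block, silently dropped
    seen_blocks.add(bid)
    return True

rows = [("A001", 100, "2021-11-02 08:00:00")]
assert try_insert("202111", rows) is True   # first write succeeds
assert try_insert("202111", rows) is False  # identical re-insert is deduplicated
```

This is why repeating the exact same INSERT against a replicated table does not create a second copy of the data: the second block hashes to an id already present under /blocks.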
### --- Operation nodes:

~~~     # log: log-0000000000 and so on — regular operations
~~~     mutations: DELETE and UPDATE operations
~~~     replicas: one child node per replica
~~~     # Entry types:
~~~     LogEntry and MutationEntry
[zk: localhost:2181(CONNECTED) 14] get /clickhouse/tables/01/a1/log/log-0000000000

format version: 4
create_time: 2021-11-04 19:31:51
source replica: hadoop01
block_id: 202111_4775801442814045523_14663512626267065022
get 
202111_0_0_0
~~~     get: the command (an instruction to fetch data)
~~~     Who will pick up this command? --- hadoop02 will pick it up and execute it
~~~     202111_0_0_0: the partition directory — it tells hadoop02 which partition's data to fetch
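The plain-text LogEntry above has a fixed layout that is straightforward to parse. A minimal sketch (the parsing rules are simplified and the function name is ours, not a ClickHouse API):

```python
def parse_log_entry(raw: str) -> dict:
    """Parse the plain-text LogEntry payload into a dict.

    Header lines are `key: value`; the last two bare lines are the
    operation type and the name of the part it produces."""
    lines = [line for line in raw.strip().splitlines() if line.strip()]
    entry = {}
    for line in lines[:-2]:
        key, _, value = line.partition(":")
        entry[key.strip()] = value.strip()
    entry["type"] = lines[-2].strip()          # e.g. "get"
    entry["new_part_name"] = lines[-1].strip() # e.g. "202111_0_0_0"
    return entry

raw = """format version: 4
create_time: 2021-11-04 19:31:51
source replica: hadoop01
block_id: 202111_4775801442814045523_14663512626267065022
get
202111_0_0_0"""

entry = parse_log_entry(raw)
assert entry["type"] == "get"
assert entry["new_part_name"] == "202111_0_0_0"
assert entry["source replica"] == "hadoop01"
```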
2. Core Workflow of Replica Coordination
### --- INSERT: create a replica instance on the hadoop01 machine:

~~~     # Create table a1 in ClickHouse on hadoop01
[root@hadoop01 ~]#  clickhouse-client -m
hadoop01 :) create table a1(
            id String,
            price Float64,
            create_time DateTime
            )ENGINE=ReplicatedMergeTree('/clickhouse/tables/01/a1','hadoop01')
            PARTITION BY toYYYYMM(create_time)
            ORDER BY id;
~~~ Output:
CREATE TABLE a1
(
    `id` String,
    `price` Float64,
    `create_time` DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/01/a1', 'hadoop01')
PARTITION BY toYYYYMM(create_time)
ORDER BY id

Ok.
~~~     # What happens on creation:

~~~     All required ZooKeeper nodes are initialized under zk_path
~~~     The replica registers itself as hadoop01 under the replicas node
~~~     A watch task is started on the /log node
~~~     The replica takes part in leader election to pick the leader replica.
~~~     Election works by inserting a child node under leader_election/; the first replica to insert successfully becomes the leader
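The first-insert-wins election can be sketched with ZooKeeper-style sequential nodes. The counter and dict below are illustrative stand-ins for real sequential znodes under leader_election/:

```python
import itertools

# Sequential-znode counter stands in for ZooKeeper's sequence numbers.
_seq = itertools.count()
election = {}  # znode name -> replica name, children of leader_election/

def join_election(replica: str) -> str:
    """Create a sequential child under leader_election/ and return its name."""
    name = f"leader_election-{next(_seq):010d}"
    election[name] = replica
    return name

def current_leader() -> str:
    """The replica that created the lowest-numbered child is the leader."""
    return election[min(election)]

join_election("hadoop01")  # first replica to register
join_election("hadoop02")  # joins later
assert current_leader() == "hadoop01"
```

Because sequential znodes are ordered by creation, the "first successful insert" and the "lowest sequence number" are the same replica, which is exactly why hadoop01 ends up as the leader below.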
### --- Create the second replica instance:

~~~     # Create table a1 in ClickHouse on hadoop02
[root@hadoop02 ~]# clickhouse-client -m
hadoop02 :) create table a1(
            id String,
            price Float64,
            create_time DateTime
            )ENGINE=ReplicatedMergeTree('/clickhouse/tables/01/a1','hadoop02')
            PARTITION BY toYYYYMM(create_time)
            ORDER BY id;
~~~ Output:
CREATE TABLE a1
(
    `id` String,
    `price` Float64,
    `create_time` DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/01/a1', 'hadoop02')
PARTITION BY toYYYYMM(create_time)
ORDER BY id

Ok.
~~~     # What happens on creation:

~~~     The replica takes part in leader election; the hadoop01 replica becomes the leader.
### --- Insert data into the first replica instance:

~~~     # Insert a row into table a1 on hadoop01
hadoop01 :) insert into table a1 values('A001',100,'2021-11-02 08:00:00');
~~~ Output:
INSERT INTO a1 VALUES

Ok.
### --- Check the inserted data on both hadoop01 and hadoop02

~~~     # Query a1 on hadoop01
hadoop01 :) select * from a1;
┌─id───┬─price─┬─────────create_time─┐
│ A001 │   100 │ 2021-11-02 08:00:00 │
└──────┴───────┴─────────────────────┘
~~~     # Query a1 on hadoop02
hadoop02 :) select * from a1;
┌─id───┬─price─┬─────────create_time─┐
│ A001 │   100 │ 2021-11-02 08:00:00 │
└──────┴───────┴─────────────────────┘
### --- After the INSERT completes, the partition directory is written locally, then the block_id of that partition is written under /blocks

~~~     # Check the data in ZooKeeper: the three-node ZooKeeper ensemble replicates this data, so it is visible from any node
[zk: localhost:2181(CONNECTED) 0] ls /clickhouse/tables/01/a1/blocks
[202111_4775801442814045523_14663512626267065022]
3. Configuration Parameters
### --- The insert_quorum parameter
~~~     If insert_quorum is set
~~~     and insert_quorum >= 2, hadoop01 additionally monitors how many replicas have completed the write;
~~~     the INSERT only completes once the number of replicas that have written the data is >= insert_quorum.
~~~     Next, the hadoop01 replica pushes the operation log entry [log-0000000000] to the log node

[zk: localhost:2181(CONNECTED) 1] ls /clickhouse/tables/01/a1/log   
[log-0000000000]
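The quorum check described above amounts to a simple counting condition. A minimal sketch (the function and variable names are ours, not ClickHouse internals):

```python
def quorum_reached(written_replicas: set, insert_quorum: int) -> bool:
    """An INSERT is acknowledged only once at least insert_quorum
    replicas have durably written the part."""
    return len(written_replicas) >= insert_quorum

written = {"hadoop01"}                 # local write finished first
assert not quorum_reached(written, 2)  # still waiting on a second replica
written.add("hadoop02")                # hadoop02 has fetched the part
assert quorum_reached(written, 2)      # now the INSERT returns Ok.
```

With insert_quorum unset (or 1), the local write alone is enough and the INSERT returns immediately; replication to hadoop02 then happens asynchronously through the log.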
### --- The content of the operation log entry:

[zk: localhost:2181(CONNECTED) 2] get /clickhouse/tables/01/a1/log/log-0000000000
format version: 4
create_time: 2021-11-04 19:56:29
source replica: hadoop01
block_id: 202111_4775801442814045523_14663512626267065022
get
202111_0_0_0

cZxid = 0x1100000036
ctime = Thu Nov 04 19:56:29 CST 2021
mZxid = 0x1100000036
mtime = Thu Nov 04 19:56:29 CST 2021
pZxid = 0x1100000036
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 151
numChildren = 0
### --- LogEntry:

~~~     source replica: the replica that issued this log entry, matching replica_name in
~~~     ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/table_name', '{replica_name}')
~~~     get: the operation type
~~~     get: download a part from a remote replica
~~~     merge: merge parts
~~~     mutate: a MUTATION operation
~~~     block_id: the blockId of the current partition, matching a child node under the /blocks path
~~~     202111_0_0_0: the name of the current partition directory
~~~     From the log content we can see that the operation type is get (download) and the part to download is 202111_0_0_0;
~~~     every other replica executes the log entries in the same order.
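Executing log entries "in the same order" on every replica comes down to dispatching on the entry type. A sketch of that dispatch (the handler bodies are placeholders, not the real fetch/merge/mutate logic):

```python
def execute_entry(entry: dict, actions: dict) -> str:
    """Dispatch a LogEntry to the handler registered for its type."""
    try:
        handler = actions[entry["type"]]
    except KeyError:
        raise ValueError(f"unknown log entry type: {entry['type']}")
    return handler(entry)

# Placeholder handlers for the three entry types named above.
actions = {
    "get":    lambda e: f"fetch part {e['part']} from another replica",
    "merge":  lambda e: f"merge source parts into {e['part']}",
    "mutate": lambda e: f"apply mutation producing {e['part']}",
}

result = execute_entry({"type": "get", "part": "202111_0_0_0"}, actions)
assert result == "fetch part 202111_0_0_0 from another replica"
```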
### --- Next, the second replica instance pulls the log entry:

~~~     hadoop02 keeps watching the /log node for changes;
~~~     once hadoop01 has pushed /log/log-0000000000, hadoop02 triggers its log-pull task and advances its log_pointer,
[zk: localhost:2181(CONNECTED) 4] get /clickhouse/tables/01/a1/replicas/hadoop02/log_pointer
1
cZxid = 0x1100000027
ctime = Thu Nov 04 19:56:13 CST 2021
mZxid = 0x1100000037
mtime = Thu Nov 04 19:56:29 CST 2021
pZxid = 0x1100000027
cversion = 0
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 1
numChildren = 0
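The log_pointer semantics can be sketched as an index into the shared log: it points at the next entry to process, which is why the znode reads 1 once log-0000000000 has been pulled (the function name here is ours):

```python
def pull_new_entries(log: list, log_pointer: int):
    """Return the entries this replica has not yet seen and the
    advanced pointer; log_pointer is the index of the next entry."""
    new_entries = log[log_pointer:]
    return new_entries, log_pointer + len(new_entries)

log = ["log-0000000000"]          # pushed by hadoop01
entries, pointer = pull_new_entries(log, 0)
assert entries == ["log-0000000000"]
assert pointer == 1               # matches the znode value shown above

entries, pointer = pull_new_entries(log, pointer)
assert entries == []              # nothing new until the next operation
```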
### --- After pulling the LogEntry, the replica does not execute it immediately; it converts it into a task object and places it in its queue

[zk: localhost:2181(CONNECTED) 5] ls /clickhouse/tables/01/a1/replicas/hadoop02/queue
[queue-0000000000]

[zk: localhost:2181(CONNECTED) 6] get /clickhouse/tables/01/a1/replicas/hadoop02/queue/queue-0000000000
format version: 4
create_time: 2021-11-04 19:56:29
source replica: hadoop01
block_id: 202111_4775801442814045523_14663512626267065022
get
202111_0_0_0

cZxid = 0x1100000037
ctime = Thu Nov 04 19:56:29 CST 2021
mZxid = 0x1100000037
mtime = Thu Nov 04 19:56:29 CST 2021
pZxid = 0x1100000037
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 151
numChildren = 0
### --- The second replica instance requests a download from another replica.

~~~     When hadoop02 sees that the type is get, ReplicatedMergeTree knows that a remote replica has already written the data part successfully,
~~~     and it downloads the data indicated from its log_pointer position onward.
~~~     The DataPartsExchange service on hadoop01 receives the call; once it understands the request,
~~~     it responds according to the parameters and sends its local part 202111_0_0_0 to hadoop02 via the DataPartsExchange service
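Choosing which replica to fetch the part from can be sketched as below. The preference for the replica with the shortest queue only loosely mirrors ClickHouse's source selection, and all names here are illustrative:

```python
def choose_fetch_source(replicas: dict, part: str, self_name: str) -> str:
    """Pick a replica to download `part` from: any replica other than
    ourselves that already has the part; prefer the least-loaded one
    (here approximated by the shortest replication queue)."""
    candidates = [
        (len(info["queue"]), name)
        for name, info in replicas.items()
        if name != self_name and part in info["parts"]
    ]
    if not candidates:
        raise LookupError(f"no replica currently holds {part}")
    return min(candidates)[1]

replicas = {
    "hadoop01": {"parts": {"202111_0_0_0"}, "queue": []},
    "hadoop02": {"parts": set(), "queue": ["queue-0000000000"]},
}
src = choose_fetch_source(replicas, "202111_0_0_0", self_name="hadoop02")
assert src == "hadoop01"   # hadoop02 fetches the part from hadoop01
```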
