ClickHouse Node Expansion: Adding a Node to a Cluster

 


 

Background

Recently the company's ClickHouse data volume has been growing fast; the several-terabyte disks are nearly full, so it is time to expand capacity again.

There are three options for expansion:

        1. Enlarge the disk that currently holds the ClickHouse data. Since it is a local server disk and the / filesystem cannot be extended, this option is ruled out.

        2. Multi-path storage policy: add a new disk (mount point) to gain capacity. This only helps new tables, though; the storage_policy of existing tables cannot be changed to the new disk policy (data of old tables can still be moved onto the new disk partition by partition; see the sketch after this list).

        3. Add a node: gain capacity by spreading new data onto the new node through ClickHouse sharding.

Given the current situation, the third option is tested in detail below.
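
For reference on option 2: once a new disk has been added to the storage configuration, data of existing tables can be drained onto it partition by partition. A minimal sketch, assuming the new disk is named new_disk and using a placeholder table and partition ID (these names are illustrative, not from the original test):

-- Check which disk each active part of an old table currently sits on:
SELECT table, partition, name, disk_name
FROM system.parts
WHERE database = 'default' AND table = 'some_old_table' AND active;

-- Move one partition of the old table onto the newly added disk:
ALTER TABLE default.some_old_table MOVE PARTITION ID '202301' TO DISK 'new_disk';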

 

Basic test background / steps

During the expansion, reads and writes through Distributed tables are affected, but local tables are not.

        1. The starting point is a 2-node ClickHouse cluster.

        2. Add one more ClickHouse server whose basic configuration and settings match the existing nodes (pay special attention to the /etc/hosts entries on the existing nodes so that the new host resolves).

        3. Update the cluster configuration: add a new shard for the new node (adjust replicas accordingly if replicas are used) and give the new shard a higher weight than the existing shards. Mind the values of the sub-tags under each node's macros tag (a quick check is shown after this list).

        4. On the new node, create the required Distributed and local tables (i.e. the Distributed and local tables involved in the cluster configuration being changed).
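
For step 3, a quick way to confirm the macros on every node is to query system.macros; the {shard}/{replica} values must be unique per node and consistent with the ZooKeeper paths used by the replicated tables. A minimal check, run on each node:

-- Every node must report its own {shard} and {replica} substitution values:
SELECT * FROM system.macros;

-- And every node must see the same cluster layout:
SELECT cluster, shard_num, shard_weight, replica_num, host_name
FROM system.clusters
WHERE cluster = 'zkm';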

 

Environment setup

The cluster zkm (currently two nodes) is configured as follows:

## Key part of the configuration
        <zkm>
            <shard>
                <internal_replication>true</internal_replication>
                <weight>1</weight>
                <!-- shard 1, replica 1 -->
                <replica>
                    <host>dev-app76</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>default</password>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <weight>1</weight>
                <!-- shard 2, replica 1 -->
                <replica>
                    <host>dev-app77</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>default</password>
                </replica>
            </shard>
        </zkm>

 

CK01 :) select * from system.clusters where cluster='zkm';

SELECT *
FROM system.clusters
WHERE cluster = 'zkm'

Query id: 779eecd3-d4e5-474f-93e9-1d350e9e6fb7

┌─cluster─┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─┬─host_address──┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
│ zkm     │         1 │            1 │           1 │ dev-app76 │ 192.168.1.171 │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
│ zkm     │         2 │            1 │           1 │ dev-app77 │ 192.168.1.172 │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
└─────────┴───────────┴──────────────┴─────────────┴───────────┴───────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘

2 rows in set. Elapsed: 0.003 sec. 

 

 

Create the test tables on nodes 1 and 2

CK02 :) CREATE TABLE default.zkm on cluster zkm
        (
            `id` String DEFAULT 'NULL' COMMENT '用户编号',
            `repo` String DEFAULT 'NULL' COMMENT '仓库编号'
        )
        ENGINE = Distributed('zkm', 'default', 'zkm_local', rand());

Query id: fb7ad2c2-16b4-4c23-9cfe-b774ba9ba0dc

┌─host──────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ dev-app76 │ 9000 │      0 │       │                   1 │                0 │
│ dev-app77 │ 9000 │      0 │       │                   0 │                0 │
└───────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘

2 rows in set. Elapsed: 0.118 sec. 
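
The Distributed engine arguments above are the cluster name ('zkm'), the database and local table that the shards store ('default', 'zkm_local'), and the sharding key. With rand() as the key, rows are scattered randomly across the shards in proportion to the shard weights. If rows with the same id should always land on the same shard, a deterministic hash key could be used instead; a minimal sketch (the table name zkm_byid and the key choice are illustrative, not part of the original test):

-- Alternative Distributed table sharded deterministically by id:
CREATE TABLE default.zkm_byid ON CLUSTER zkm
(
    `id` String DEFAULT 'NULL',
    `repo` String DEFAULT 'NULL'
)
ENGINE = Distributed('zkm', 'default', 'zkm_local', cityHash64(id));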

CK02 :) CREATE TABLE default.zkm_local on cluster zkm
        (
            `id` String DEFAULT 'NULL' COMMENT '用户编号',
            `repo` String DEFAULT 'NULL' COMMENT '仓库编号'
        )
        ENGINE = ReplicatedMergeTree('/clickhouse/tables/zkm-{shard}/default/zkm_local', '{replica}')
        ORDER BY id
        SETTINGS index_granularity = 8192;

Query id: 844c5f32-91d4-4935-9a41-d424e5c70ae2

┌─host──────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ dev-app76 │ 9000 │      0 │       │                   1 │                0 │
│ dev-app77 │ 9000 │      0 │       │                   0 │                0 │
└───────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘

2 rows in set. Elapsed: 0.135 sec. 
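
Since zkm_local is a ReplicatedMergeTree, each node registers itself in ZooKeeper under a path built from the {shard} and {replica} macros. A quick sanity check after creating the table (a minimal sketch, run on any of the nodes):

-- Confirm the replica registered in ZooKeeper and is not read-only:
SELECT database, table, zookeeper_path, replica_name, is_readonly
FROM system.replicas
WHERE database = 'default' AND table = 'zkm_local';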

 

Insert data (details omitted):

insert into zkm values('a','b');
insert into zkm select * from zkm settings insert_deduplicate=0; -- run this one repeatedly; insert_deduplicate=0 keeps ReplicatedMergeTree from deduplicating the identical insert blocks
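
The gv$zkm object used in the queries below is a helper of the author's, not a built-in table; it exposes part-level statistics from every node of the cluster. A rough stand-in, assuming it simply wraps system.parts cluster-wide (the definition below is an assumption, not the author's actual DDL):

-- Hypothetical equivalent of gv$zkm: system.parts from every node,
-- tagged with the host each part lives on.
CREATE VIEW default.`gv$zkm` AS
SELECT hostName() AS host, *
FROM clusterAllReplicas('zkm', system.parts);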

 

The data distribution is as follows:

CK01 :) SELECT
            host AS `主机名`,
            sum(rows) AS `总行数`,
            formatReadableSize(sum(data_uncompressed_bytes)) AS `原始大小`,
            formatReadableSize(sum(data_compressed_bytes)) AS `压缩大小`,
            round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS `压缩率`
        FROM `gv$zkm`
        WHERE (database = 'default') AND (table = 'zkm_local') AND (active = 1) AND (engine = 'ReplicatedMergeTree')
        GROUP BY host
        ORDER BY host ASC;

Query id: 938115af-473a-491f-997a-753deb3c5be9

┌─主机名────┬─总行数─┬─原始大小─┬─压缩大小──┬─压缩率─┐
│ dev-app76 │ 524181 │ 2.00 MiB │ 12.62 KiB │      1 │
│ dev-app77 │ 524395 │ 2.00 MiB │ 12.63 KiB │      1 │
└───────────┴────────┴──────────┴───────────┴────────┘

2 rows in set. Elapsed: 0.016 sec. 
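
The two shards hold 524,181 and 524,395 rows, i.e. the 1,048,576 test rows in total (2^20: the single seed row doubled 20 times) split almost exactly 50/50, as expected with two shards of equal weight.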

 

 

Add node 3

Install the ClickHouse server software, configure the OS, set the relevant parameters, and so on.
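
Before touching the cluster configuration, it is worth confirming that the new server is running and reachable from the old nodes under its hostname. A minimal check (the credentials are the default/default user from the cluster configuration above):

-- On dev-app78: confirm the server is up and reports the expected hostname.
SELECT hostName(), version(), uptime();

-- From an old node: confirm dev-app78 resolves and answers on the native port.
SELECT hostName() FROM remote('dev-app78:9000', system.one, 'default', 'default');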

 

Modify the cluster configuration

Add the new shard definition on nodes 1 and 2, and add the whole cluster definition on the new node 3.

Mind the values of the sub-tags under each node's macros tag.

New shard definition:

            <shard>
                <internal_replication>true</internal_replication>
                <weight>98</weight>
                <!-- shard 3, replica 1 -->
                <replica>
                    <host>dev-app78</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>default</password>
                </replica>
            </shard>

The new shard's weight is 98 while the two existing shards each have weight 1, so roughly 98% of newly written data is routed to the new node.

 

The new cluster information:

CK01 :) select * from system.clusters where cluster='zkm';

SELECT *
FROM system.clusters
WHERE cluster = 'zkm'

Query id: b7683054-eea3-4404-b4c8-d022aa201a64

┌─cluster─┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─┬─host_address──┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
│ zkm     │         1 │            1 │           1 │ dev-app76 │ 192.168.1.171 │ 9000 │        1 │ default │                  │            0 │               0 │                       0 │
│ zkm     │         2 │            1 │           1 │ dev-app77 │ 192.168.1.172 │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
│ zkm     │         3 │           99 │           1 │ dev-app78 │ 192.168.1.173 │ 9000 │        0 │ default │                  │            0 │               0 │                       0 │
└─────────┴───────────┴──────────────┴─────────────┴───────────┴───────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘

3 rows in set. Elapsed: 0.007 sec. 

 

 

 

Create the tables on node 3

dev-app78 :) CREATE TABLE default.zkm
             (
                 `id` String DEFAULT 'NULL' COMMENT '用户编号',
                 `repo` String DEFAULT 'NULL' COMMENT '仓库编号'
             )
             ENGINE = Distributed('zkm', 'default', 'zkm_local', rand());

Query id: 4d1d55a1-0737-432d-aa37-75cf8a677d77

Ok.

0 rows in set. Elapsed: 0.006 sec. 

dev-app78 :) CREATE TABLE default.zkm_local
             (
                 `id` String DEFAULT 'NULL' COMMENT '用户编号',
                 `repo` String DEFAULT 'NULL' COMMENT '仓库编号'
             )
             ENGINE = ReplicatedMergeTree('/clickhouse/tables/zkm-{shard}/default/zkm_local', '{replica}')
             ORDER BY id
             SETTINGS index_granularity = 8192;

Query id: 3dc94a9e-080a-4d9a-8988-12337a669ca6

Ok.

0 rows in set. Elapsed: 0.060 sec. 
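
At this point the new node can already read the existing data through the Distributed table, because reads fan out to every shard of the cluster, while its own local table is still empty. A quick sanity check on dev-app78:

-- Should return the ~1.05 million rows already stored on shards 1 and 2:
SELECT count() FROM default.zkm;

-- Should return 0 until data is written to this shard:
SELECT count() FROM default.zkm_local;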

 

 

After inserting more data, the distribution looks like this:

CK01 :) insert into zkm select * from zkm settings insert_deduplicate=0;

Query id: 439f23c8-19bb-4819-8738-9d9f08733a79

Ok.

0 rows in set. Elapsed: 0.267 sec. Processed 1.05 million rows, 20.97 MB (3.93 million rows/s., 78.61 MB/s.)

CK01 :) SELECT
            host AS `主机名`,
            sum(rows) AS `总行数`,
            formatReadableSize(sum(data_uncompressed_bytes)) AS `原始大小`,
            formatReadableSize(sum(data_compressed_bytes)) AS `压缩大小`,
            round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS `压缩率`
        FROM `gv$zkm`
        WHERE (database = 'default') AND (table = 'zkm_local') AND (active = 1) AND (engine = 'ReplicatedMergeTree')
        GROUP BY host
        ORDER BY host ASC;

Query id: 7752274a-2285-44fa-8292-4bd718ae53f9

┌─主机名────┬──总行数─┬─原始大小─┬─压缩大小──┬─压缩率─┐
│ dev-app76 │  534511 │ 2.04 MiB │ 12.85 KiB │      1 │
│ dev-app77 │  534742 │ 2.04 MiB │ 12.86 KiB │      1 │
│ dev-app78 │ 1027899 │ 3.92 MiB │ 18.00 KiB │      0 │
└───────────┴─────────┴──────────┴───────────┴────────┘

3 rows in set. Elapsed: 0.025 sec. 
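
Compared with the previous snapshot, shard 1 gained 534,511 - 524,181 = 10,330 rows and shard 2 gained 534,742 - 524,395 = 10,347 rows, while the new shard 3 received 1,027,899 of the 1,048,576 re-inserted rows, i.e. about 98%. The shard weight steers new writes to the new node as intended.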

 

 

That concludes the test.

 
