ClickHouse 的 Parts 和 Partitions
2024-10-16 11:07 abce 阅读(234) 评论(0) 编辑 收藏 举报在 ClickHouse 中,磁盘上存储表的部分数据的物理文件称为 "part"。"partition "是使用分区键创建的表数据的逻辑划分。分区可用于提高大型表的性能和可管理性,允许用户更有效地存储、查询和操作数据子集。
Parts
如前所述,"Parts"是磁盘上的物理文件。默认情况下,所有数据相关文件都位于"/var/lib/clickhouse"目录下。ClickHouse 中的每个合并树表都有唯一的存储目录路径来存储 Parts。你可以从 "system.parts "表中获取parts的实际位置、parts名称、分区信息(如果有的话)以及其他一些有价值的信息。
以下是从 system.parts 表查询结果的示例。
SELECT substr(table, 1, 22), partition AS prt, name, part_type, path FROM system.parts WHERE database = 'mytest' ORDER BY table ASC, partition ASC, name ASC Query id: e2f55694-fa55-48f3-99f7-74bca59aca48 ┌─substring(table, 1, 22)─┬─prt──────┬─name─────────────────┬─part_type─┬─path──────────────────────────────────────────────────────────────────────────────────────┐ │ bucket_daily_stats_v2 │ tuple() │ all_1_1389_6_695 │ Wide │ /var/lib/clickhouse/store/7b0/7b0e6f30-114d-472d-8ff4-f4c9a4cf7cc1/all_1_1389_6_695/ │ │ corr_pair_value │ tuple() │ all_208611_209409_4 │ Wide │ /var/lib/clickhouse/store/5d7/5d732078-0f2e-4d04-9848-d216a1839b32/all_208611_209409_4/ │ │ corr_pair_value │ tuple() │ all_209410_209634_3 │ Wide │ /var/lib/clickhouse/store/5d7/5d732078-0f2e-4d04-9848-d216a1839b32/all_209410_209634_3/ │ │ corr_pair_value │ tuple() │ all_209635_210331_4 │ Wide │ /var/lib/clickhouse/store/5d7/5d732078-0f2e-4d04-9848-d216a1839b32/all_209635_210331_4/ │ │ mytest_of_ti │ 202408 │ 202408_4241_5001_5 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202408_4241_5001_5/ │ │ mytest_of_ti │ 202408 │ 202408_5002_5008_1 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202408_5002_5008_1/ │ │ mytest_of_ti │ 202408 │ 202408_5009_5045_2 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202408_5009_5045_2/ │ │ mytest_of_ti │ 202408 │ 202408_5046_5046_0 │ Compact │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202408_5046_5046_0/ │ │ mytest_of_ti │ 202409 │ 202409_5013_5624_5 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202409_5013_5624_5/ │ │ mytest_of_ti │ 202409 │ 202409_5625_5661_2 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202409_5625_5661_2/ │ │ mytest_of_ti │ 202409 │ 202409_5662_5699_2 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202409_5662_5699_2/ │ │ mytest_of_ti │ 202409 │ 202409_5700_5708_1 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202409_5700_5708_1/ │ │ mytest_of_ti │ 202409 │ 202409_5710_5793_4 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202409_5710_5793_4/ │ │ mytest_of_ti │ 202409 │ 202409_5794_5794_0 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202409_5794_5794_0/ │ │ mytest_of_ti │ 202409 │ 202409_5795_5795_0 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202409_5795_5795_0/ │ │ mytest_of_ti │ 202410 │ 202410_5703_5810_7 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202410_5703_5810_7/ │ │ mytest_of_ti │ 202410 │ 202410_5811_5849_2 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202410_5811_5849_2/ │ │ mytest_of_ti │ 202410 │ 202410_5850_5882_2 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202410_5850_5882_2/ │ │ mytest_of_ti │ 202410 │ 202410_5883_5888_1 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202410_5883_5888_1/ │ │ mytest_of_ti │ 202410 │ 202410_5889_5894_1 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202410_5889_5894_1/ │ │ mytest_of_ti │ 202410 │ 202410_5895_5901_1 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202410_5895_5901_1/ │ │ mytest_of_ti │ 202410 │ 202410_5902_5902_0 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202410_5902_5902_0/ │ │ mytest_of_ti │ 202410 │ 202410_5903_5903_0 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202410_5903_5903_0/ │ │ mytest_of_ti │ 202410 │ 202410_5904_5904_0 │ Wide │ /var/lib/clickhouse/store/32f/32ff94bb-43af-47c3-99b1-5c85e912483c/202410_5904_5904_0/ │ └─────────────────────────┴──────────┴──────────────────────┴───────────┴───────────────────────────────────────────────────────────────────────────────────────────┘
在 "part_type" 列中,"Wide "表示每一列都存储在文件系统中的一个单独文件中,而 "Compact "表示所有列都存储在文件系统的一个文件中。此外,"partition "列中的 "tuple() "表示该表未分区。
也可以在目录“/var/lib/clickhouse/data/<DBNAME>/<TABLENAME>”中查看表的 parts,会发现该目录下存放都是符号链接,通过链接可以查看表包含的parts。比如,进入表 mytest_of_ti 所在目录查看:
# ls -la ...... drwxr-x--- 2 clickhouse clickhouse 4096 Oct 10 15:10 202410_5811_5849_2 drwxr-x--- 2 clickhouse clickhouse 4096 Oct 11 15:10 202410_5850_5882_2 drwxr-x--- 2 clickhouse clickhouse 4096 Oct 11 15:10 202410_5883_5888_1 drwxr-x--- 2 clickhouse clickhouse 4096 Oct 11 15:10 202410_5889_5894_1 drwxr-x--- 2 clickhouse clickhouse 4096 Oct 13 15:10 202410_5895_5901_1 drwxr-x--- 2 clickhouse clickhouse 4096 Oct 13 15:10 202410_5902_5902_0 drwxr-x--- 2 clickhouse clickhouse 4096 Oct 13 15:10 202410_5903_5903_0 drwxr-x--- 2 clickhouse clickhouse 4096 Oct 13 15:10 202410_5904_5904_0 drwxr-x--- 2 clickhouse clickhouse 10 Nov 11 2023 detached -rw-r----- 1 clickhouse clickhouse 1 Nov 11 2023 format_version.txt -rw-r----- 1 clickhouse clickhouse 123 Dec 11 2023 mutation_252.txt -rw-r----- 1 clickhouse clickhouse 96 Dec 11 2023 mutation_259.txt -rw-r----- 1 clickhouse clickhouse 230 Dec 11 2023 mutation_264.txt
可以统计出表mytest_of_ti一共有个part:
# ls -la |grep 20 |wc -l 133
以列出的202410_5850_5882_2 该part为例,其中202410表示分区id;5850表示part中最小的块号;5882表示最大的块号;2表示块的级别(chunk level)。如果没有做分区,part 的名称以all_开头。
也可以从系统表 system.parts 中查看该part 的信息:
SELECT name, partition_id, min_block_number, max_block_number, level, data_version FROM system.parts WHERE (database = 'factor_data') AND (table = 'factor_value_of_trading') AND (name = '202410_5850_5882_2') Query id: e6678b2e-31e2-4a9d-9c9b-16ff8bf9ecf0 ┌─name───────────────┬─partition_id─┬─min_block_number─┬─max_block_number─┬─level─┬─data_version─┐ │ 202410_5850_5882_2 │ 202410 │ 5850 │ 5882 │ 2 │ 5850 │ └────────────────────┴──────────────┴──────────────────┴──────────────────┴───────┴──────────────┘ 1 row in set. Elapsed: 0.006 sec.
Partitions
与 parts 一样,也可以从 “system.parts ”表中访问合并树表的分区信息。不过,分区列现在不是用 “tuple() ”来表示了。要创建分区表,首先需要在创建表时使用 “PARTITION BY expr ”子句。例如,“PARTITION BY toYYYMMDD(start_time) ”子句创建了一个与 “start_time ”列相关的每日分区。在下面的示例中,可以看到Partitions名和parts名是不同的。分区类似于逻辑上的划分,但 parts是作为物理文件定位的。一个分区可以包含一个或多个 parts。
SELECT partition, name, active FROM system.parts WHERE (table = 'mytest_of_ti') AND (database = 'test') Query id: 9c8c15bc-df95-4713-b001-39447010a1ab ┌─partition─┬─name─────────────────┬─active─┐ │ 201411 │ 201411_1_1_1_264 │ 1 │ │ 201411 │ 201411_2_2_3_264 │ 1 │ │ 201412 │ 201412_3_3_3_264 │ 1 │ │ 201412 │ 201412_4_4_1_264 │ 1 │ │ 201412 │ 201412_5_5_1_264 │ 1 │ │ 202001 │ 202001_6_6_3_264 │ 1 │ │ 202002 │ 202002_7_7_3_264 │ 1 │ │ 202003 │ 202003_8_8_3_264 │ 1 │ │ 202004 │ 202004_9_9_1_264 │ 1 │ │ 202004 │ 202004_10_10_3_264 │ 1 │ ... │ 202410 │ 202410_5934_5934_0 │ 0 │ │ 202410 │ 202410_5935_5935_0 │ 0 │ │ 202410 │ 202410_5936_5936_0 │ 0 │ │ 202410 │ 202410_5937_5937_0 │ 1 │ │ 202410 │ 202410_5938_5938_0 │ 1 │ └───────────┴──────────────────────┴────────┘ 167 rows in set. Elapsed: 0.005 sec.
通常,分区是用来提升查询性能。便于用户灵活地管理数据子集。可以直接查询分区、删除分区等。
可以通过指定 where 子句或者使用隐藏列"_partition_id" 来查看指定的分区。当然最好使用官方推荐的方式,在where子句中添加分区条件比较好。 有些特殊情况下,需要使用隐藏列"_partition_id"。
现在,让我们看看分区表的查询示例。假设我们的表(test.mytest_of_ti)对 “created_at ”列进行了分区。我们可以使用分区键列和隐藏的“_partition_id ”列访问特定分区。此外,我们还可以使用“_partition_id ”查询前 10 个分区。
SELECT count() FROM recoDB.mytest_of_ti WHERE toDate(created_at) = '2023-01-21' ┌─count()─┐ │ 9731 │ └─────────┘ ##################################### SELECT count() FROM test.mytest_of_ti WHERE _partition_id = '20230121' ┌─count()─┐ │ 9731 │ └─────────┘
SELECT _partition_id, count() FROM test.mytest_of_ti GROUP BY _partition_id ORDER BY 2 DESC LIMIT 10
卸载或附加Parts/Partitions
detach 操作可以将指定的 parts/partitions 移动到 detached 目录。在重新附加之前,用户无法访问这些数据。缺省情况下,detached 目录位于"/var/lib/clickhouse/data/<DATABASE_NAME>/<TABLE_NAME>"中。
而 attach 操作可以将被卸载的parts/partitions 附加进来。语法如下所示:
#DETACH PART/PARTITION ALTER TABLE <DATABASE_NAME>.<TABLE_NAME> [ON CLUSTER <CLUSTER_NAME>] DETACH PARTITION|PART <PARTITION_EXPRESSION> #ATTACH PART/PARTITION ALTER TABLE <DATABASE_NAME>.<TABLE_NAME> [ON CLUSTER <CLUSTER_NAME>] ATTACH PARTITION|PART <PARTITION_EXPRESSION>
这里的 PARTITION_EXPRESSION 是分区名、分区id或表达式自身。
这里,分区表达式应该是分区名称、分区 id 或表达式本身。
假设我们要从 “test.mytest_of_ti ”表中卸载任何指定日期的分区。
首先,找到指定日期的分区和parts名称。如图所示,该日期位于分区 “20231013 ”中,该分区有两个part。
SELECT partition, name, partition_id FROM system.parts WHERE (database = 'test') AND (table = 'mytest_of_ti') AND (partition_id IN ( SELECT DISTINCT _partition_id FROM test.mytest_of_ti WHERE toDate(created_at) = '2023-10-13' )) ┌─partition─┬─name─────────────┬─partition_id─┐ │ 20231013 │ 20231013_62_62_0 │ 20231013 │ │ 20231013 │ 20231013_78_78_0 │ 20231013 │ └───────────┴──────────────────┴──────────────┘
开始卸载分区"20231013",然后再附加进来。
-- Count related date before detached SELECT count() FROM test.mytest_of_ti WHERE toDate(created_at) = '2023-10-13' ┌─count()─┐ │ 78536 │ └─────────┘ -- detach partition ALTER TABLE test.mytest_of_ti DETACH PARTITION 20231013 Ok. -- Count related date after detached SELECT count() FROM test.mytest_of_ti WHERE toDate(created_at) = '2023-10-13' ┌─count()─┐ │ 0 │ └─────────┘
然后再重新附加进来:
--attach partition ALTER TABLE test.mytest_of_ti ATTACH PARTITION 20231013 Ok. --Count related date after attach partition SELECT count() FROM test.mytest_of_ti WHERE toDate(created_at) = '2023-10-13' ┌─count()─┐ │ 78536 │ └─────────┘
同样对单个part做如上的操作。
--find parts for given date SELECT partition, name, partition_id FROM system.parts WHERE (database = 'test') AND (table = 'mytest_of_ti') AND (partition_id IN ( SELECT DISTINCT _partition_id FROM test.mytest_of_ti WHERE toDate(created_at) = '2023-10-13' )) ┌─partition─┬─name───────────────┬─partition_id─┐ │ 20231013 │ 20231013_142_142_0 │ 20231013 │ │ 20231013 │ 20231013_143_143_0 │ 20231013 │ └───────────┴────────────────────┴──────────────┘ -- Count before detach part operation SELECT count() FROM test.mytest_of_ti WHERE toDate(created_at) = '2023-10-13' ┌─count()─┐ │ 78536 │ └─────────┘ -- detach any of the parts ALTER TABLE test.mytest_of_ti DETACH PART '20231013_142_142_0' Ok. -- Count after detach part operation SELECT count() FROM test.mytest_of_ti WHERE toDate(created_at) = '2023-10-13' ┌─count()─┐ │ 78406 │ └─────────┘ -- Attach related part ALTER TABLE test.mytest_of_ti ATTACH PART '20231013_142_142_0' -- Count after attach the part SELECT count() FROM test.mytest_of_ti WHERE toDate(created_at) = '2023-10-13' ┌─count()─┐ │ 78536 │ └─────────┘
也可以将源表的分区附加到目标表上,如下例所示,附加后数据不会从源表或目标表被删除:
ALTER TABLE <DESTINATION_TABLE> [ON CLUSTER <CLUSTER_NAME>] ATTACH PARTITION <PARTITION EXPRESSION> FROM <SOURCE_TABLE>
不过,该操作需要目标表和源表满足以下条件:
·结构相同
·分区键、排序键和主键相同
·存储策略相同
删除Parts/Partitions
删除的parts/Partitions 在系统表 system.parts 中被标记为不活跃的,且删除操作完成后,对应的parts/Partitions 会在 "/var/lib/clickhouse/data/<database_name>/<table_name>/" 中保留十分钟。
-- Drop from table itself ALTER TABLE <DATABASE_NAME>.<TABLE_NAME> [ON CLUSTER <CLUSTER_NAME>] DROP PARTITION|PART 'PART/PARTITON EXPRESSION' -- Remove specified part/partition from detached folder ALTER TABLE <DATABASE_NAME>.<TABLE_NAME> [ON CLUSTER <CLUSTER_NAME>] DROP DETACHED PARTITION|PART 'PART/PARTITON EXPRESSION'
移动Parts/Partitions
可以将partitions移动到 Clickhouse 中的另一个表。在这种情况下,源表和目标表都具有相同的结构、分区键、按键排序、主键、存储策略和引擎系列。
对于合并树引擎表,移动的另一个选项是将parts或partitions移动到另一个磁盘或卷上。
-- Usage: ALTER TABLE <DATABASE_NAME.SOURCE_TABLE> [ON CLUSTER <CLUSTER_NAME>] MOVE PARTITION <PARTITION EXPRESSION> TO TABLE <DATABASE_NAME.DESTINATION_TABLE> -- 1. Create empty mytest_of_ti_new table same as the mytest_of_ti -- 2. Move 20231013 partition from mytest_of_ti to mytest_of_ti_new ALTER TABLE test.mytest_of_ti MOVE PARTITION '20231013' TO TABLE test.mytest_of_ti_new Ok. --3. Take partition count for 20231013 in source table SELECT count() FROM test.mytest_of_ti WHERE _partition_id = '20231013' ┌─count()─┐ │ 0 │ └─────────┘ --4 Take partition count for 20231013 in destination table SELECT count() FROM test.mytest_of_ti_new WHERE _partition_id = '20231013' Query id: 49319c73-c84f-4fbf-838d-ab787971eaad ┌─count()─┐ │ 78536 │ └─────────┘
移动 parts/partitions 到其它的磁盘或卷,需要设置一个存储策略,并使用创建的策略创建一个新表。这里不再讲存储策略。
-- Our table's(mytest_of_ti_with_storage_policy) policy is "vo1_to_vo2" -- and this volume contain 2 disks(vo1 and vo2) SELECT * FROM system.storage_policies WHERE policy_name = ( SELECT storage_policy FROM system.tables WHERE (database = 'test') AND (name = 'mytest_of_ti_with_storage_policy') ┌─policy_name─┬─volume_name─┬─volume_priority─┬─disks───┬─volume_type─┬─max_data_part_size─┬─move_factor─┬─prefer_not_to_merge─┐ │ vo1_to_vo2 │ vo1_volume │ 1 │ ['vo1'] │ JBOD │ 0 │ 0.1 │ 0 │ │ vo1_to_vo2 │ vo2_volume │ 2 │ ['vo2'] │ JBOD │ 0 │ 0.1 │ 0 │ └─────────────┴─────────────┴─────────────────┴─────────┴─────────────┴────────────────────┴─────────────┴─────────────────────┘ -- The 20231013 partition is stored under vo1 disk. SELECT partition, name, path FROM system.parts WHERE (database = 'test') AND (table = 'mytest_of_ti_with_storage_policy') AND (partition = '20231013') ┌─partition─┬─name───────────────┬─path────────────────────────────────────────────────────────────────────┐ │ 20231013 │ 20231013_107_107_0 │ /vo1/store/346/3465ekb6-231d-493d-94bc-518234d79b45/20231013_107_107_0/ │ └───────────┴────────────────────┴─────────────────────────────────────────────────────────────────────────┘ -- Let's move it to the vo2 disk ALTER TABLE test.mytest_of_ti_with_storage_policy MOVE PARTITION '20231013' TO DISK 'vo2' -Check the new volume SELECT partition, name, path FROM system.parts WHERE (database = 'test') AND (table = 'mytest_of_ti_with_storage_policy') AND (partition = '20231013') ┌─partition─┬─name───────────────┬─path────────────────────────────────────────────────────────────────────┐ │ 20231013 │ 20231013_107_107_0 │ /vo2/store/346/3465ekb6-231d-493d-94bc-518234d79b45/20231013_107_107_0/ │ └───────────┴────────────────────┴─────────────────────────────────────────────────────────────────────────┘ -- You can move parts and also you can move volume instead of disk