How to redistribute data after adding nodes to a Kafka cluster

Steps for adding a node

Copy the server.properties configuration file from another node, then modify the following parameters:

broker.id
log.dirs
zookeeper.connect
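
As a rough illustration of this step, the copy-and-edit can be scripted. The sketch below is only an assumption about how one might do it: the function name make_broker_config, the paths, and the values are hypothetical. It rewrites broker.id and log.dirs and copies every other line unchanged, including zookeeper.connect, on the assumption that the new broker joins the same ZooKeeper ensemble; if that is not the case, edit that line as well.

```shell
# Sketch: derive a new broker's server.properties from an existing one.
# make_broker_config is a hypothetical helper, not part of Kafka.
make_broker_config() {
    local src="$1" dst="$2" broker_id="$3" log_dirs="$4"
    # Rewrite broker.id and log.dirs; all other lines are copied as-is.
    # Assumes the log.dirs value contains no "|" or "&" characters.
    sed -e "s|^broker\.id=.*|broker.id=${broker_id}|" \
        -e "s|^log\.dirs=.*|log.dirs=${log_dirs}|" \
        "$src" > "$dst"
}
```

For example, `make_broker_config server.properties server-new.properties 6 /data/kafka-logs` would write a copy whose broker.id is 6. Review the result before starting the broker.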

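How data migration works

  1. Only newly created topics are placed on the new node automatically. To put existing data on the new node as well, the partitions of existing topics must be migrated to it.
  2. The migration is started manually, but after that it runs fully automatically. Kafka adds the new node as a follower of each partition being migrated and lets it replicate all existing data in that partition. Once the new node has fully replicated the partition and joined the in-sync replica set, one of the existing replicas deletes its copy of the partition data.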

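The data migration tool

The partition reassignment tool moves partitions between brokers. An ideal partition assignment keeps the data load and partition sizes even across all brokers. The tool cannot inspect the data distribution in a Kafka cluster on its own and shuffle partitions to even out the load, so the administrator has to work out which topics or partitions should be moved.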
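
Since the tool does not analyse the distribution itself, a small helper can at least show how many partition replicas each broker holds in a given assignment. The following is a minimal sketch, not part of Kafka: replicas_per_broker is a hypothetical name, and it assumes the JSON layout printed by the tool, with a numeric "replicas" array per partition.

```shell
# Sketch: count partition replicas per broker in a reassignment-style JSON file.
# Only standard text tools are used; no JSON parser is assumed to be installed.
replicas_per_broker() {
    # 1) keep only the "replicas":[...] fragments
    # 2) split them into individual broker ids
    # 3) count how often each id appears
    grep -o '"replicas":[ ]*\[[0-9, ]*\]' "$1" \
        | grep -o '[0-9][0-9]*' \
        | sort -n | uniq -c \
        | awk '{print "broker " $2 ": " $1}'
}
```

Running it on a saved assignment file prints one line per broker in the form `broker <id>: <count>`, which makes a skewed assignment easy to spot.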

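The partition reassignment tool can run in 3 modes:

  • --generate: given a list of topics and a list of brokers, the tool generates a candidate plan that reassigns all partitions of the specified topics across the listed brokers. This mode is only a convenient way to produce a reassignment plan; it does not move any data.
  • --execute: the tool starts the partition reassignment according to a plan supplied by the user with the --reassignment-json-file option. The plan can be written by hand by the administrator or produced with --generate.
  • --verify: the tool checks the reassignment status of every partition listed in the last --execute. The status can be successfully completed, failed, or in progress.

Example: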

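The cluster has 5 brokers with broker_id 1, 2, 3, 4, 5. As the output below shows, the topic's partitions currently sit only on brokers 1 and 2, and the plan spreads them across all five brokers.

Topic: test has 24 partitions with a replication factor of 1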

Create the configuration file listing the topics to migrate

topics-to-move.json

{
    "topics": [
        {"topic": "test"}
    ],
    "version": 1
}
Generate the reassignment plan
[root@k8s-node50 ~]#  kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --topics-to-move-json-file topics-to-move.json --broker-list "1,2,3,4,5" --generate
Current partition replica assignment
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":1,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":2,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":3,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":4,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":5,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":6,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":7,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":8,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":9,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":10,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":11,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":12,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":13,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":14,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":15,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":16,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":17,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":18,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":19,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":20,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":21,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":22,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":23,"replicas":[2],"log_dirs":["any"]}]}

Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":1,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":2,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":3,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":4,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":5,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":6,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":7,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":8,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":9,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":10,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":11,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":12,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":13,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":14,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":15,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":16,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":17,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":18,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":19,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":20,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":21,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":22,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":23,"replicas":[5],"log_dirs":["any"]}]}

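The output contains both the current partition assignment and the assignment proposed by Kafka. In the proposed assignment, Kafka has already spread the partitions across the listed brokers as evenly as it can.

First back up the content under Current partition replica assignment so that the original assignment can be restored if a rollback is needed.

Then save the content under Proposed partition reassignment configuration to a new file named reassignment.json. The file name is arbitrary, but the content must be valid JSON.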
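
The backup and the plan file can also be extracted with a script instead of copying by hand. The sketch below assumes the --generate output has been saved to a file and that each JSON document is the first non-empty line after its heading, as in the output shown above. The function and file names are hypothetical.

```shell
# Sketch: split saved --generate output into a rollback file and a plan file.

# Print the first non-empty line that follows a line starting with the heading.
json_after_heading() {
    awk -v heading="$1" '
        index($0, heading) == 1 { found = 1; next }
        found && NF             { print; exit }
    ' "$2"
}

# $1: saved --generate output, $2: rollback file, $3: plan file
split_generate_output() {
    json_after_heading "Current partition replica assignment" "$1" > "$2"
    json_after_heading "Proposed partition reassignment configuration" "$1" > "$3"
}
```

A possible use is `split_generate_output generate.out rollback.json reassignment.json`. To roll back later, pass rollback.json to --execute through the --reassignment-json-file option.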

Execute the data migration
[root@k8s-node50 ~]#  kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --execute
Current partition replica assignment

{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":1,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":2,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":3,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":4,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":5,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":6,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":7,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":8,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":9,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":10,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":11,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":12,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":13,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":14,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":15,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":16,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":17,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":18,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":19,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":20,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":21,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":22,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":23,"replicas":[2],"log_dirs":["any"]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started partition reassignments for test-0,test-1,test-2,test-3,test-4,test-5,test-6,test-7,test-8,test-9,test-10,test-11,test-12,test-13,test-14,test-15,test-16,test-17,test-18,test-19,test-20,test-21,test-22,test-23

Check the status of the partition reassignment
[root@k8s-node50 ~]#  kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --verify
Status of partition reassignment:
Reassignment of partition test-0 is complete.
Reassignment of partition test-1 is complete.
Reassignment of partition test-2 is complete.
Reassignment of partition test-3 is complete.
Reassignment of partition test-4 is complete.
Reassignment of partition test-5 is complete.
Reassignment of partition test-6 is complete.
Reassignment of partition test-7 is complete.
Reassignment of partition test-8 is complete.
Reassignment of partition test-9 is complete.
Reassignment of partition test-10 is complete.
Reassignment of partition test-11 is complete.
Reassignment of partition test-12 is complete.
Reassignment of partition test-13 is complete.
Reassignment of partition test-14 is complete.
Reassignment of partition test-15 is complete.
Reassignment of partition test-16 is complete.
Reassignment of partition test-17 is complete.
Reassignment of partition test-18 is complete.
Reassignment of partition test-19 is complete.
Reassignment of partition test-20 is complete.
Reassignment of partition test-21 is complete.
Reassignment of partition test-22 is complete.
Reassignment of partition test-23 is complete.

Clearing broker-level throttles on brokers 5,1,2,3,4
Clearing topic-level throttles on topic test
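
For large topics the reassignment takes a while, so --verify usually has to be run more than once. The sketch below polls it until every partition line says "is complete", which is the wording in the output above. The names KAFKA_REASSIGN, BOOTSTRAP, count_unfinished and wait_for_reassignment are hypothetical. Treating every other status line as unfinished is an assumption, so a failed reassignment would keep the loop running until it is interrupted.

```shell
# Sketch: poll --verify until no partition is reported as unfinished.
KAFKA_REASSIGN="${KAFKA_REASSIGN:-kafka-reassign-partitions.sh}"
BOOTSTRAP="${BOOTSTRAP:-10.19.29.50:9092}"

# Count "Reassignment of partition ..." lines on stdin that lack "is complete".
count_unfinished() {
    awk '/^Reassignment of partition / && !/is complete/ { n++ } END { print n + 0 }'
}

# $1: reassignment JSON file, $2: seconds between checks (default 10)
wait_for_reassignment() {
    local plan="$1" interval="${2:-10}" output pending
    while :; do
        # Stop with an error if the tool itself fails (e.g. broker unreachable).
        output=$("$KAFKA_REASSIGN" --bootstrap-server "$BOOTSTRAP" \
            --reassignment-json-file "$plan" --verify) || return 1
        pending=$(printf '%s\n' "$output" | count_unfinished)
        [ "$pending" -eq 0 ] && return 0
        echo "$pending partition(s) not complete yet, checking again in ${interval}s"
        sleep "$interval"
    done
}
```

Calling `wait_for_reassignment reassignment.json 30` after --execute would check every 30 seconds and return once all listed partitions are complete.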

Inspect the topic and pay attention to Isr
[root@k8s-node50 ~]#  kafka-topics.sh --bootstrap-server 10.19.29.50:9092 --describe --topic test
Topic: test	TopicId: iVbESSwBRlW-0yL-Lkyq6A	PartitionCount: 24	ReplicationFactor: 1	Configs: cleanup.policy=delete,flush.ms=50,segment.bytes=1073741824
	Topic: test	Partition: 0	Leader: 2	Replicas: 2	Isr: 2
	Topic: test	Partition: 1	Leader: 3	Replicas: 3	Isr: 3
	Topic: test	Partition: 2	Leader: 4	Replicas: 4	Isr: 4
	Topic: test	Partition: 3	Leader: 5	Replicas: 5	Isr: 5
	Topic: test	Partition: 4	Leader: 1	Replicas: 1	Isr: 1
	Topic: test	Partition: 5	Leader: 2	Replicas: 2	Isr: 2
	Topic: test	Partition: 6	Leader: 3	Replicas: 3	Isr: 3
	Topic: test	Partition: 7	Leader: 4	Replicas: 4	Isr: 4
	Topic: test	Partition: 8	Leader: 5	Replicas: 5	Isr: 5
	Topic: test	Partition: 9	Leader: 1	Replicas: 1	Isr: 1
	Topic: test	Partition: 10	Leader: 2	Replicas: 2	Isr: 2
	Topic: test	Partition: 11	Leader: 3	Replicas: 3	Isr: 3
	Topic: test	Partition: 12	Leader: 4	Replicas: 4	Isr: 4
	Topic: test	Partition: 13	Leader: 5	Replicas: 5	Isr: 5
	Topic: test	Partition: 14	Leader: 1	Replicas: 1	Isr: 1
	Topic: test	Partition: 15	Leader: 2	Replicas: 2	Isr: 2
	Topic: test	Partition: 16	Leader: 3	Replicas: 3	Isr: 3
	Topic: test	Partition: 17	Leader: 4	Replicas: 4	Isr: 4
	Topic: test	Partition: 18	Leader: 5	Replicas: 5	Isr: 5
	Topic: test	Partition: 19	Leader: 1	Replicas: 1	Isr: 1
	Topic: test	Partition: 20	Leader: 2	Replicas: 2	Isr: 2
	Topic: test	Partition: 21	Leader: 3	Replicas: 3	Isr: 3
	Topic: test	Partition: 22	Leader: 4	Replicas: 4	Isr: 4
	Topic: test	Partition: 23	Leader: 5	Replicas: 5	Isr: 5
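
With many partitions it is easy to miss a row whose Isr is shorter than its Replicas list. As a minimal sketch, the hypothetical helper below reads --describe output on standard input and prints only those rows. It assumes the field labels shown above (Partition:, Replicas:, Isr:) and that Isr: is followed directly by the broker list.

```shell
# Sketch: list partitions whose ISR has fewer brokers than the replica list.
under_replicated() {
    awk '
        /Partition: / {
            part = ""; replicas = ""; isr = ""
            # Each label is followed by its value as the next field.
            for (i = 1; i < NF; i++) {
                if ($i == "Partition:") part = $(i + 1)
                if ($i == "Replicas:")  replicas = $(i + 1)
                if ($i == "Isr:")       isr = $(i + 1)
            }
            if (split(isr, a, ",") < split(replicas, b, ","))
                print "partition " part ": replicas=" replicas " isr=" isr
        }
    '
}
```

It could be used as `kafka-topics.sh --bootstrap-server 10.19.29.50:9092 --describe --topic test | under_replicated`. kafka-topics.sh also offers the --under-replicated-partitions option together with --describe, which filters on the broker side.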

Batch script for multiple topics

[root@k8s-node50 ~]#  cat for_topic.sh 
list='
alg_ocrservice
alg_ocrservice_result
alg_politic_dbtopic
alg_politicservice
alg_politicservice_result
alg_porn
alg_pornservice
alg_pornservice_result
alg_terror
alg_terrorservice
alg_terrorservice_result
alg_vfp_dbtopic
alg_vfpservice
alg_vfpservice_result
alg_video_transcode
alg_videoservice
copyright_count
copyright_vfp_operation_result
handle_into_db
handle_into_db_media
handle_report
handle_report_all
hcy_all_record
hot_spot_count_key
keyword_tactic_count
media_download
media_download_result
office_helper
office_helper_result
politic_operation_copy_result
politic_operation_result
rd031_alg_keyword_tactic
rd031_alg_replace_word
rich_media_count
risk_control_log
syncapi-log
uiall_all
uiall_audit
uiall_report
uploader_frequency_count_key
vfp_operation_result
video_asr_result
video_capture_result
video_ocr_result
video_transcode_result
'

for topic in $list;do
cat <<EOF> topics-to-move.json
{
    "topics": [
        {"topic": "$topic"}
    ],
    "version": 1
}
EOF

echo "Topic: $topic - generating the reassignment plan"
kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --topics-to-move-json-file topics-to-move.json --broker-list "1,2,3,4,5" --generate | awk '/Proposed partition reassignment configuration/,/}/' | awk 'NR>1{print}' > reassignment.json

echo "Topic: $topic - executing the data migration"
kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --execute

# Right after --execute, some partitions may still be reported as in progress.
echo "Topic: $topic - checking the status of the partition reassignment"
kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --verify

echo "Describing topic: $topic"
kafka-topics.sh --bootstrap-server 10.19.29.50:9092 --describe --topic "$topic"
done
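
The loop above writes one topics-to-move.json per topic. The "topics" array can hold several entries, so an alternative is to plan all topics with a single --generate call. The sketch below only builds that JSON document; topics_to_move_json is a hypothetical helper name, and topic names are assumed to contain no characters that need JSON escaping.

```shell
# Sketch: print a topics-to-move JSON document for all topics given as arguments.
topics_to_move_json() {
    local topic first=1
    printf '{\n    "topics": [\n'
    for topic in "$@"; do
        # Separate entries with a comma, but not before the first one.
        if [ "$first" -eq 0 ]; then
            printf ',\n'
        fi
        printf '        {"topic": "%s"}' "$topic"
        first=0
    done
    printf '\n    ],\n    "version": 1\n}\n'
}
```

With the list from the script, `topics_to_move_json $list > topics-to-move.json` would produce one file covering every topic. Moving many topics in one plan creates more replication traffic at once; the --throttle option of kafka-reassign-partitions.sh can limit it.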

This article is reposted from https://cloud.tencent.com/developer/article/1964425. If it infringes any rights, please get in touch to have it removed.

posted @ 2024-12-24 16:43  broadviews