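How to Reassign Data After Adding Nodes to a Kafka Cluster
Steps for adding a node
Copy the server.properties configuration file from an existing node and change the following parameters: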
broker.id
log.dirs
zookeeper.connect
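The edit described above can be sketched in shell. This is only an illustration: the broker id 6, the log directory, and the ZooKeeper host names are assumed values, and the small file written at the start stands in for the configuration copied from another node.

```shell
# Sketch: derive a new broker's server.properties from a copied one.
# All concrete values below are illustrative assumptions.
workdir=$(mktemp -d)
src="$workdir/server.properties.orig"   # stand-in for the file copied from another node
dst="$workdir/server.properties"        # configuration for the new node

cat > "$src" <<'EOF'
broker.id=1
log.dirs=/data/kafka-logs
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
num.partitions=3
EOF

new_broker_id=6                          # must be unique within the cluster
new_log_dirs=/data/kafka-logs            # data directory on the new host
zk_connect=zk1:2181,zk2:2181,zk3:2181    # same ZooKeeper ensemble as the existing brokers

# Rewrite only the three parameters; every other line is copied unchanged.
sed -e "s|^broker\.id=.*|broker.id=${new_broker_id}|" \
    -e "s|^log\.dirs=.*|log.dirs=${new_log_dirs}|" \
    -e "s|^zookeeper\.connect=.*|zookeeper.connect=${zk_connect}|" \
    "$src" > "$dst"

cat "$dst"
```

Of the three parameters, broker.id is the one that must differ from every existing node; zookeeper.connect normally stays identical so that the new broker joins the same cluster.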
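How data migration works
- Only newly created topics place partitions on the new node. To move existing data onto it as well, the partitions of existing topics must be migrated to the new node.
- The migration is started manually but then runs fully automatically. Kafka adds the new node as a follower of each partition being migrated and lets it replicate that partition's existing data in full. Once the new node has fully replicated the partition and joined the in-sync replicas (ISR), one of the existing replicas deletes its copy of the partition's data.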
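The data migration tool
The partition reassignment tool can be used to move partitions between brokers. An ideal partition assignment ensures an even data load and partition size across all brokers. The tool cannot automatically analyze the data distribution in a Kafka cluster and move partitions around to even out the load, so the administrator must work out which topics or partitions should be moved.
The partition reassignment tool can run in 3 modes:
- --generate: given a list of topics and a list of brokers, the tool generates a candidate reassignment that moves all partitions of the specified topics onto the listed brokers. This mode is simply a convenient way to produce a reassignment plan from a list of topics and target brokers.
- --execute: the tool starts the reassignment according to a user-supplied plan (passed with the --reassignment-json-file option). The plan can be hand-crafted by the administrator or produced by the --generate option.
- --verify: the tool checks the status of the reassignment for all partitions listed in the last --execute. The status can be successfully completed, failed, or in progress.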
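As a rough sketch of how the three modes are invoked, the helper below only prints the command line for each mode instead of running it. The function name reassign is hypothetical, the file names are placeholders, and the bootstrap address is the one used in this article's example.

```shell
# Sketch: the three modes behind one dry-run helper. It prints the command
# it would run; drop the leading "echo" to call the real tool.
BOOTSTRAP="10.19.29.50:9092"   # address used in this article's example

reassign() {
  local mode=$1
  case "$mode" in
    generate)
      # $2 = topics JSON file, $3 = comma-separated target broker ids
      echo kafka-reassign-partitions.sh --bootstrap-server "$BOOTSTRAP" \
        --topics-to-move-json-file "$2" --broker-list "$3" --generate ;;
    execute|verify)
      # $2 = reassignment plan JSON file
      echo kafka-reassign-partitions.sh --bootstrap-server "$BOOTSTRAP" \
        --reassignment-json-file "$2" "--$mode" ;;
    *)
      echo "usage: reassign generate|execute|verify ..." >&2
      return 1 ;;
  esac
}

reassign generate topics-to-move.json "1,2,3,4,5,6"
reassign execute  reassignment.json
reassign verify   reassignment.json
```

Note that --generate takes a topics file and a broker list, while --execute and --verify both take the same plan file.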
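Example:
The cluster has 5 existing nodes with broker ids 1, 2, 3, 4, 5; the new node has broker id 6. Note that the commands below pass --broker-list "1,2,3,4,5", so the generated plan only spreads partitions over brokers 1 to 5; to place replicas on broker 6 as well, include 6 in --broker-list.
Topic test has 24 partitions and a replication factor of 1, as the --describe output below shows.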
Create the configuration file listing the topics to move
topics-to-move.json
{
"topics": [
{"topic": "test"}
],
"version": 1
}
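The same file format accepts several topics in the "topics" list. A minimal sketch for generating it from a list of names follows; make_topics_json is a hypothetical helper, and it assumes topic names contain only characters that need no JSON escaping (Kafka restricts them to letters, digits, '.', '_' and '-').

```shell
# Sketch: build a topics-to-move file for any number of topics.
# make_topics_json is a hypothetical helper; topic names are its arguments.
make_topics_json() {
  local sep="" t
  printf '{\n  "topics": [\n'
  for t in "$@"; do
    printf '%s    {"topic": "%s"}' "$sep" "$t"
    sep=$',\n'                 # separator placed before every entry but the first
  done
  printf '\n  ],\n  "version": 1\n}\n'
}

out="${TMPDIR:-/tmp}/topics-to-move.json"
make_topics_json test > "$out"
cat "$out"
```

Calling it with more arguments, for example make_topics_json test other_topic, puts all of them into one plan request.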
Generate the reassignment plan
[root@k8s-node50 ~]# kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --topics-to-move-json-file topics-to-move.json --broker-list "1,2,3,4,5" --generate
Current partition replica assignment
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":1,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":2,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":3,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":4,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":5,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":6,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":7,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":8,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":9,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":10,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":11,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":12,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":13,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":14,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":15,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":16,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":17,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":18,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":19,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":20,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":21,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":22,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":23,"replicas":[2],"log_dirs":["any"]}]}
Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":1,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":2,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":3,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":4,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":5,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":6,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":7,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":8,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":9,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":10,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":11,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":12,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":13,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":14,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":15,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":16,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":17,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":18,"replicas":[5],"log_dirs":["any"]},{"topic":"test","partition":19,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":20,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":21,"replicas":[3],"log_dirs":["any"]},{"topic":"test","partition":22,"replicas":[4],"log_dirs":["any"]},{"topic":"test","partition":23,"replicas":[5],"log_dirs":["any"]}]}
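The output contains both the current partition assignment and the assignment Kafka proposes. The proposed assignment spreads the partitions across the listed brokers as evenly as possible.
First back up the content under Current partition replica assignment so that you can roll back to the original assignment.
Then save the content under Proposed partition reassignment configuration as reassignment.json (the file name is arbitrary, but the content must be valid JSON).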
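Saving the two JSON documents can be scripted. The sketch below works on a shortened stand-in for the --generate output and assumes, as in the output shown above, that each JSON document is printed on a single line after its header. The helper name json_after and the file names are illustrative.

```shell
# Sketch: split saved --generate output into a rollback file and a plan file.
# The sample below is a shortened stand-in for real tool output.
workdir=$(mktemp -d)
cat > "$workdir/generate.out" <<'EOF'
Current partition replica assignment
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[1],"log_dirs":["any"]}]}

Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[2],"log_dirs":["any"]}]}
EOF

# Print the first non-empty line that follows the line containing the header.
json_after() {
  awk -v header="$1" 'found && NF { print; exit } index($0, header) { found = 1 }' "$2"
}

json_after "Current partition replica assignment" "$workdir/generate.out" > "$workdir/rollback.json"
json_after "Proposed partition reassignment configuration" "$workdir/generate.out" > "$workdir/reassignment.json"

cat "$workdir/reassignment.json"
```

Keeping rollback.json next to reassignment.json means the original assignment can later be restored by passing it to --execute.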
Execute the data migration
[root@k8s-node50 ~]# kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --execute
Current partition replica assignment
{"version":1,"partitions":[{"topic":"test","partition":0,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":1,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":2,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":3,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":4,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":5,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":6,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":7,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":8,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":9,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":10,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":11,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":12,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":13,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":14,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":15,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":16,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":17,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":18,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":19,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":20,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":21,"replicas":[2],"log_dirs":["any"]},{"topic":"test","partition":22,"replicas":[1],"log_dirs":["any"]},{"topic":"test","partition":23,"replicas":[2],"log_dirs":["any"]}]}
Save this to use as the --reassignment-json-file option during rollback
Successfully started partition reassignments for test-0,test-1,test-2,test-3,test-4,test-5,test-6,test-7,test-8,test-9,test-10,test-11,test-12,test-13,test-14,test-15,test-16,test-17,test-18,test-19,test-20,test-21,test-22,test-23
Check the status of the reassigned partitions
[root@k8s-node50 ~]# kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --verify
Status of partition reassignment:
Reassignment of partition test-0 is complete.
Reassignment of partition test-1 is complete.
Reassignment of partition test-2 is complete.
Reassignment of partition test-3 is complete.
Reassignment of partition test-4 is complete.
Reassignment of partition test-5 is complete.
Reassignment of partition test-6 is complete.
Reassignment of partition test-7 is complete.
Reassignment of partition test-8 is complete.
Reassignment of partition test-9 is complete.
Reassignment of partition test-10 is complete.
Reassignment of partition test-11 is complete.
Reassignment of partition test-12 is complete.
Reassignment of partition test-13 is complete.
Reassignment of partition test-14 is complete.
Reassignment of partition test-15 is complete.
Reassignment of partition test-16 is complete.
Reassignment of partition test-17 is complete.
Reassignment of partition test-18 is complete.
Reassignment of partition test-19 is complete.
Reassignment of partition test-20 is complete.
Reassignment of partition test-21 is complete.
Reassignment of partition test-22 is complete.
Reassignment of partition test-23 is complete.
Clearing broker-level throttles on brokers 5,1,2,3,4
Clearing topic-level throttles on topic test
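For large partitions the first --verify run may not yet report every partition as complete. A minimal sketch for checking saved --verify output is shown below; reassignment_done is a hypothetical helper that treats any "Reassignment of partition ..." line not ending in "is complete." as unfinished, so it does not depend on the exact wording of other status messages.

```shell
# Sketch: decide from --verify output (read on stdin) whether every
# partition has finished. reassignment_done is a hypothetical helper.
reassignment_done() {
  awk '
    /^Reassignment of partition / {
      total++
      if ($0 ~ /is complete\.?$/) ok++
    }
    # Succeed only if at least one status line was seen and all are complete.
    END { exit !(total > 0 && ok == total) }
  '
}

# Usage with two lines copied from the output above.
if printf '%s\n' \
    "Status of partition reassignment:" \
    "Reassignment of partition test-0 is complete." \
    "Reassignment of partition test-1 is complete." | reassignment_done
then
  echo "all partitions reassigned"
else
  echo "reassignment not finished"
fi

# Polling idea (not run here):
#   until kafka-reassign-partitions.sh ... --verify | reassignment_done; do sleep 10; done
```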
Describe the topic, paying attention to Isr
[root@k8s-node50 ~]# kafka-topics.sh --bootstrap-server 10.19.29.50:9092 --describe --topic test
Topic: test TopicId: iVbESSwBRlW-0yL-Lkyq6A PartitionCount: 24 ReplicationFactor: 1 Configs: cleanup.policy=delete,flush.ms=50,segment.bytes=1073741824
Topic: test Partition: 0 Leader: 2 Replicas: 2 Isr: 2
Topic: test Partition: 1 Leader: 3 Replicas: 3 Isr: 3
Topic: test Partition: 2 Leader: 4 Replicas: 4 Isr: 4
Topic: test Partition: 3 Leader: 5 Replicas: 5 Isr: 5
Topic: test Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: test Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: test Partition: 6 Leader: 3 Replicas: 3 Isr: 3
Topic: test Partition: 7 Leader: 4 Replicas: 4 Isr: 4
Topic: test Partition: 8 Leader: 5 Replicas: 5 Isr: 5
Topic: test Partition: 9 Leader: 1 Replicas: 1 Isr: 1
Topic: test Partition: 10 Leader: 2 Replicas: 2 Isr: 2
Topic: test Partition: 11 Leader: 3 Replicas: 3 Isr: 3
Topic: test Partition: 12 Leader: 4 Replicas: 4 Isr: 4
Topic: test Partition: 13 Leader: 5 Replicas: 5 Isr: 5
Topic: test Partition: 14 Leader: 1 Replicas: 1 Isr: 1
Topic: test Partition: 15 Leader: 2 Replicas: 2 Isr: 2
Topic: test Partition: 16 Leader: 3 Replicas: 3 Isr: 3
Topic: test Partition: 17 Leader: 4 Replicas: 4 Isr: 4
Topic: test Partition: 18 Leader: 5 Replicas: 5 Isr: 5
Topic: test Partition: 19 Leader: 1 Replicas: 1 Isr: 1
Topic: test Partition: 20 Leader: 2 Replicas: 2 Isr: 2
Topic: test Partition: 21 Leader: 3 Replicas: 3 Isr: 3
Topic: test Partition: 22 Leader: 4 Replicas: 4 Isr: 4
Topic: test Partition: 23 Leader: 5 Replicas: 5 Isr: 5
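What to look for in the Isr column is that it lists as many brokers as the Replicas column. A small sketch that flags partitions where this is not the case follows; under_replicated is a hypothetical helper reading --describe output on stdin, and the second sample line is invented to show a mismatch.

```shell
# Sketch: list partitions whose Isr is smaller than their replica list.
# under_replicated is a hypothetical helper reading --describe output on stdin.
under_replicated() {
  awk '
    {
      part = ""; replicas = ""; isr = ""
      # Locate values by their labels so extra columns do not matter.
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
        if ($i == "Isr:")       isr = $(i + 1)
      }
      # The topic summary line has no "Partition:" label and is skipped.
      if (part != "" && split(isr, a, ",") < split(replicas, b, ","))
        print "partition " part ": replicas=" replicas " isr=" isr
    }
  '
}

# First line is from the output above; the second is an invented mismatch.
printf '%s\n' \
  "Topic: test Partition: 0 Leader: 2 Replicas: 2 Isr: 2" \
  "Topic: test Partition: 1 Leader: 3 Replicas: 3,4 Isr: 3" | under_replicated
```

With the real tool the input would come from kafka-topics.sh --describe piped into under_replicated; an empty result means every replica is in sync.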
Batch script for multiple topics
[root@k8s-node50 ~]# cat for_topic.sh
list='
alg_ocrservice
alg_ocrservice_result
alg_politic_dbtopic
alg_politicservice
alg_politicservice_result
alg_porn
alg_pornservice
alg_pornservice_result
alg_terror
alg_terrorservice
alg_terrorservice_result
alg_vfp_dbtopic
alg_vfpservice
alg_vfpservice_result
alg_video_transcode
alg_videoservice
copyright_count
copyright_vfp_operation_result
handle_into_db
handle_into_db_media
handle_report
handle_report_all
hcy_all_record
hot_spot_count_key
keyword_tactic_count
media_download
media_download_result
office_helper
office_helper_result
politic_operation_copy_result
politic_operation_result
rd031_alg_keyword_tactic
rd031_alg_replace_word
rich_media_count
risk_control_log
syncapi-log
uiall_all
uiall_audit
uiall_report
uploader_frequency_count_key
vfp_operation_result
video_asr_result
video_capture_result
video_ocr_result
video_transcode_result
'
for topic in $list; do
  # Write the topics file for this topic (EOF is unquoted so $topic is expanded)
  cat <<EOF > topics-to-move.json
{
"topics": [
{"topic": "$topic"}
],
"version": 1
}
EOF
  echo "Topic: $topic generating reassignment plan"
  # Keep only the JSON line that follows the "Proposed ..." header
  kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --topics-to-move-json-file topics-to-move.json --broker-list "1,2,3,4,5" --generate | awk '/Proposed partition reassignment configuration/,/}/' | awk 'NR>1{print}' > reassignment.json
  echo "Topic: $topic executing data migration"
  kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --execute
  echo "Topic: $topic checking reassignment status"
  # Large partitions may still be in progress at this point; rerun --verify later if so
  kafka-reassign-partitions.sh --bootstrap-server 10.19.29.50:9092 --reassignment-json-file reassignment.json --verify
  echo "Describing topic: $topic"
  kafka-topics.sh --bootstrap-server 10.19.29.50:9092 --describe --topic "$topic"
done
This article is reposted from https://cloud.tencent.com/developer/article/1964425. If it infringes any rights, please contact us to have it removed.