1. Introduction to Kafka Clusters
1.1 Basic Components of a Kafka Cluster
- ZooKeeper: manages the Kafka cluster's metadata and coordinates the nodes.
- Broker: a Kafka server node, responsible for storing and serving messages.
- Producer: a message producer that publishes messages to the Kafka cluster.
- Consumer: a message consumer that reads messages from the Kafka cluster.
- Topic: a named message category; producers send messages to a specific Topic, and consumers read from it.
- Partition: a subdivision of a Topic, used to increase parallelism and throughput.
- Replica: a copy of a Partition, used to improve data reliability and fault tolerance (see the sketch after this list).
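As a quick illustration of Topics, Partitions, and Replicas, the standard Kafka CLI can create a topic with both set explicitly. A minimal sketch, assuming a broker is reachable at localhost:9092 (adjust the bootstrap address for your environment):
# create a topic with 3 partitions, each kept on 2 brokers
kafka-topics.sh --create --topic demo \
  --partitions 3 --replication-factor 2 \
  --bootstrap-server localhost:9092
# show the resulting partition-to-broker assignment
kafka-topics.sh --describe --topic demo --bootstrap-server localhost:9092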
1.2 How a Kafka Cluster Works
- Producers send messages:
  - A producer sends messages to a specified Topic.
  - The messages are written to one or more Partitions of that Topic.
- Brokers store messages:
  - Each Partition is stored and managed by one Broker.
  - The Broker persists messages to disk and maintains each message's offset.
- Consumers read messages:
  - A consumer reads messages from specified Topics and Partitions.
  - The consumer tracks its read position by maintaining an Offset (see the sketch after this list).
- Replication:
  - Each Partition has multiple Replicas; one is the Leader and the others are Followers.
  - The Leader handles read and write requests, while Followers replicate data from the Leader.
  - If the Leader fails, a new Leader is elected from the Followers, keeping the service highly available.
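The offset bookkeeping described above can be observed with the stock consumer-group tool. A minimal sketch, assuming a consumer group named my-group has consumed from a broker at localhost:9092 (both names are placeholders):
# show, per partition, the group's committed offset, the log-end offset, and the lag
kafka-consumer-groups.sh --describe --group my-group \
  --bootstrap-server localhost:9092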
1.3 Advantages of a Kafka Cluster
- High throughput: partitioning and replication let Kafka process messages at very high volume.
- Low latency: Kafka's design makes both producing and consuming fast, which suits real-time data processing.
- Scalability: cluster capacity and performance can be extended simply by adding Broker nodes.
- High reliability: the replication mechanism ensures data reliability and fault tolerance.
- Durable storage: messages are persisted to disk, preventing data loss.
1.4 Typical Kafka Cluster Architecture
+----------------+        +----------------+        +----------------+
|                |        |                |        |                |
|    Producer    +------->|    Broker 1    +------->|    Consumer    |
|                |        |                |        |                |
+----------------+        +--------+-------+        +----------------+
                                   |
                                   |
                                   |
                                   |
                           +-------+-------+
                           |               |
                           |   Broker 2    |
                           |               |
                           +-------+-------+
                                   |
                                   |
                                   |
                           +-------+-------+
                           |               |
                           |   Broker 3    |
                           |               |
                           +-------+-------+
- The Producer sends messages to a Broker.
- Brokers store the messages and replicate them to guarantee data reliability.
- The Consumer reads messages from a Broker.
1.5 Typical Use Cases
- Real-time data pipelines: moving data from one system to another.
- Log aggregation: collecting and storing log data.
- Event-driven architecture: building systems based on events.
- Stream processing: processing and analyzing data in real time.
Summary
Kafka combines partitioning, replication, and durable on-disk storage to provide high-throughput, fault-tolerant messaging; the rest of this post walks through deploying such a cluster on Kubernetes.
2. Deployment Environment

| IP          | Node    | OS         | k8s version | ZooKeeper version | Kafka version | Docker version |
|-------------|---------|------------|-------------|-------------------|---------------|----------------|
| 172.16.4.85 | master1 | CentOS 7.8 | 1.23.17     | -                 | -             | 20.10.9        |
| 172.16.4.86 | node1   | CentOS 7.8 | 1.23.17     | 3.8.0             | 3.9.0         | 20.10.9        |
| 172.16.4.87 | node2   | CentOS 7.8 | 1.23.17     | 3.8.0             | 3.9.0         | 20.10.9        |
| 172.16.4.89 | node3   | CentOS 7.8 | 1.23.17     | 3.8.0             | 3.9.0         | 20.10.9        |
3. ZooKeeper Cluster Deployment
3.1 Deployment guide
https://www.cnblogs.com/Leonardo-li/p/18720249
3.2 ZooKeeper cluster address
zk-service.zk.svc.cluster.local:32181, or the short form zk-service.zk:32181
4. Kafka Deployment
4.1 NFS deployment
- Install NFS on CentOS 7
yum install -y nfs-utils
- Create the NFS shared directories
mkdir -p /nfs_share/k8s/kafka/pv{1..3}
- Edit the NFS exports file
[root@localhost kafka]# cat /etc/exports
/nfs_share/k8s/kafka/pv1 *(rw,sync,no_subtree_check,no_root_squash)
/nfs_share/k8s/kafka/pv2 *(rw,sync,no_subtree_check,no_root_squash)
/nfs_share/k8s/kafka/pv3 *(rw,sync,no_subtree_check,no_root_squash)
- Start the NFS service
# start the NFS service
systemctl start nfs-server
# enable the NFS service at boot
systemctl enable nfs-server
- Reload the exports and list them
[root@localhost kafka]# exportfs -r
[root@localhost kafka]# exportfs -v
/nfs_share/k8s/kafka/pv1
<world>(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
/nfs_share/k8s/kafka/pv2
<world>(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
/nfs_share/k8s/kafka/pv3
<world>(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
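- Verify the exports from a worker node (optional). The worker nodes also need nfs-utils installed so the kubelet can mount NFS volumes. A minimal check, assuming the NFS server is 172.16.4.60 as used in the PV manifests below:
# list the exports published by the NFS server
showmount -e 172.16.4.60
# trial-mount one export and confirm it is writable
mount -t nfs 172.16.4.60:/nfs_share/k8s/kafka/pv1 /mnt
touch /mnt/write-test && rm /mnt/write-test
umount /mnt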
4.2 Create the namespace
kubectl create ns kafka
4.3 PV deployment
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kafka-nfs-pv-1
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: kafka-nfs-storage
  nfs:
    path: /nfs_share/k8s/kafka/pv1
    server: 172.16.4.60
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kafka-nfs-pv-2
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: kafka-nfs-storage
  nfs:
    path: /nfs_share/k8s/kafka/pv2
    server: 172.16.4.60
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kafka-nfs-pv-3
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: kafka-nfs-storage
  nfs:
    path: /nfs_share/k8s/kafka/pv3
    server: 172.16.4.60
kubectl apply -f ka-pv.yaml
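Before creating the PVCs, it can be worth confirming that the three PVs registered correctly (they should show STATUS "Available" until the PVCs bind them):
kubectl get pv | grep kafka-nfs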
4.4 PVC deployment
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kafka-data-kafka-0
  namespace: kafka
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: kafka-nfs-storage
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kafka-data-kafka-1
  namespace: kafka
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: kafka-nfs-storage
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kafka-data-kafka-2
  namespace: kafka
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: kafka-nfs-storage
kubectl apply -f ka-pvc.yaml
4.5 Service + StatefulSet deployment
- Public images: busybox:latest and registry.cn-shenzhen.aliyuncs.com/library-base/bitnami_kafka:3.9.0
- Private-registry copies: 172.16.4.17:8090/ltzx/busybox:latest and 172.16.4.17:8090/ltzx/registry.cn-shenzhen.aliyuncs.com/library-base/bitnami_kafka:3.9.0 (mirroring sketch below)
- The default port 9092 is changed to 39092
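If you need to mirror the public images into the private registry yourself, a minimal sketch (assuming you can docker login to 172.16.4.17:8090):
# pull the public images
docker pull busybox:latest
docker pull registry.cn-shenzhen.aliyuncs.com/library-base/bitnami_kafka:3.9.0
# retag for the private registry and push
docker tag busybox:latest 172.16.4.17:8090/ltzx/busybox:latest
docker push 172.16.4.17:8090/ltzx/busybox:latest
docker tag registry.cn-shenzhen.aliyuncs.com/library-base/bitnami_kafka:3.9.0 172.16.4.17:8090/ltzx/registry.cn-shenzhen.aliyuncs.com/library-base/bitnami_kafka:3.9.0
docker push 172.16.4.17:8090/ltzx/registry.cn-shenzhen.aliyuncs.com/library-base/bitnami_kafka:3.9.0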
apiVersion: v1
kind: Service
metadata:
  name: kafka
  namespace: kafka
  labels:
    app: kafka
spec:
  ports:
    - port: 39092
      name: kafka-port
  clusterIP: None
  selector:
    app: kafka
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
  namespace: kafka
spec:
  selector:
    matchLabels:
      app: kafka
  serviceName: kafka
  replicas: 3
  template:
    metadata:
      labels:
        app: kafka
    spec:
      initContainers:
        - name: configure-broker-id-and-listeners
          image: 172.16.4.17:8090/ltzx/busybox:latest
          command: ["/bin/sh", "-c"]
          args:
            - |
              HOSTNAME=$(hostname)
              BROKER_ID=$(echo $HOSTNAME | awk -F'-' '{print $NF}')
              echo $BROKER_ID > /tmp/broker_id
              ADVERTISED_LISTENERS="PLAINTEXT_INTERNAL://$HOSTNAME.kafka:39092"
              echo "export KAFKA_ADVERTISED_LISTENERS=$ADVERTISED_LISTENERS" > /tmp/kafka_env.sh
          volumeMounts:
            - name: broker-id-volume
              mountPath: /tmp
      containers:
        - name: kafka
          image: 172.16.4.17:8090/ltzx/registry.cn-shenzhen.aliyuncs.com/library-base/bitnami_kafka:3.9.0
          ports:
            - containerPort: 39092
          env:
            - name: ALLOW_PLAINTEXT_LISTENER
              value: "yes"
            - name: KAFKA_ZOOKEEPER_CONNECT
              value: zk-service.zk:32181
            - name: KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP
              value: "PLAINTEXT_INTERNAL:PLAINTEXT"
            - name: KAFKA_LISTENERS
              value: "PLAINTEXT_INTERNAL://:39092"
            - name: KAFKA_BROKER_ID_COMMAND
              value: "cat /tmp/broker_id"
            - name: KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR
              value: "3"
            - name: KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR
              value: "3"
            - name: KAFKA_TRANSACTION_STATE_LOG_MIN_ISR
              value: "2"
            - name: KAFKA_LOG_MESSAGE_FORMAT_VERSION
              value: "3.9"
            - name: KAFKA_INTER_BROKER_LISTENER_NAME
              value: "PLAINTEXT_INTERNAL"
          command: ["/bin/sh", "-c"]
          args:
            - |
              . /tmp/kafka_env.sh
              /opt/bitnami/scripts/kafka/entrypoint.sh /opt/bitnami/scripts/kafka/run.sh
          volumeMounts:
            - name: kafka-data
              mountPath: /bitnami/kafka/data
            - name: broker-id-volume
              mountPath: /tmp
      volumes:
        - name: broker-id-volume
          emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        name: kafka-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
        storageClassName: kafka-nfs-storage
kubectl apply -f ka-ss.yaml
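StatefulSet pods start one at a time, so the rollout can be watched until all three brokers are ready (a quick check):
kubectl rollout status sts/kafka -n kafka
kubectl get pods -n kafka -w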
- Explanation of the StatefulSet manifest (a broker-id sketch follows this list):
(1) Metadata: names the StatefulSet kafka and places it in the kafka namespace.
(2) Replica count: replicas: 3 creates three brokers, kafka-0 through kafka-2.
(3) Pod template: labels the pods app: kafka so the selector and the headless Service can match them.
(4) Init container: derives the broker id from the pod's ordinal (the suffix of its hostname) and writes the per-pod advertised listener to /tmp/kafka_env.sh.
(5) Main container: runs the Bitnami Kafka 3.9.0 image, listening on port 39092.
(6) Environment variables: configure the ZooKeeper connection, listeners, replication factors, and the inter-broker listener name.
(7) Container command: sources /tmp/kafka_env.sh before the Bitnami entrypoint so each pod gets its own KAFKA_ADVERTISED_LISTENERS.
(8) Volume mounts: /bitnami/kafka/data for message data, /tmp for the files shared with the init container.
(9) Volumes: an emptyDir shared between the init container and the main container.
(10) volumeClaimTemplates: one 10Gi kafka-nfs-storage PVC per pod, which binds to the pre-created PVs.
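The broker-id derivation in the init container can be tested standalone; this sketch reproduces the same awk logic outside the cluster:
# each StatefulSet pod hostname ends in its ordinal: kafka-0, kafka-1, kafka-2
for h in kafka-0 kafka-1 kafka-2; do
  echo "$h -> broker.id=$(echo $h | awk -F'-' '{print $NF}')"
done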
4.6 Deployment results
[root@master1 ka-n9]# kubectl get pv |grep kafka
kafka-nfs-pv-1 10Gi RWO Retain Bound kafka/kafka-data-kafka-0 kafka-nfs-storage 6h7m
kafka-nfs-pv-2 10Gi RWO Retain Bound kafka/kafka-data-kafka-1 kafka-nfs-storage 6h7m
kafka-nfs-pv-3 10Gi RWO Retain Bound kafka/kafka-data-kafka-2 kafka-nfs-storage 6h7m
[root@master1 ka-n9]# kubectl get pvc -n kafka
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
kafka-data-kafka-0 Bound kafka-nfs-pv-1 10Gi RWO kafka-nfs-storage 5h51m
kafka-data-kafka-1 Bound kafka-nfs-pv-2 10Gi RWO kafka-nfs-storage 5h51m
kafka-data-kafka-2 Bound kafka-nfs-pv-3 10Gi RWO kafka-nfs-storage 5h51m
[root@master1 ka-n9]# kubectl get svc -n kafka
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kafka ClusterIP None <none> 39092/TCP 128m
[root@master1 ka-n9]# kubectl get sts -n kafka
NAME READY AGE
kafka 3/3 128m
[root@master1 ka-n9]# kubectl get pods -n kafka
NAME READY STATUS RESTARTS AGE
kafka-0 1/1 Running 0 127m
kafka-1 1/1 Running 0 127m
kafka-2 1/1 Running 0 127m
[root@master1 ka-n9]# kubectl get pods -n kafka -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kafka-0 1/1 Running 0 127m 10.244.135.35 node3 <none> <none>
kafka-1 1/1 Running 0 127m 10.244.104.31 node2 <none> <none>
kafka-2 1/1 Running 0 127m 10.244.166.156 node1 <none> <none>
4.7 Cluster status verification
4.7.1 Verify Kafka's registration via the ZooKeeper cluster
- Inside a ZooKeeper pod, /brokers/ids lists [0, 1, 2], one id per Kafka node, confirming that all three brokers registered with ZooKeeper successfully.
[root@master1 ka-n9]# kubectl exec -it zk-test-0 -n zk -- /opt/bitnami/zookeeper/bin/zkCli.sh -server localhost:32181
[zk: localhost:32181(CONNECTED) 0] ls /brokers/ids
[0, 1, 2]
[zk: localhost:32181(CONNECTED) 1] get /brokers/ids/0
{"listener_security_protocol_map":{"PLAINTEXT_INTERNAL":"PLAINTEXT"},"endpoints":["PLAINTEXT_INTERNAL://kafka-0.kafka:39092"],"jmx_port":-1,"features":{},"host":"kafka-0.kafka","timestamp":"1740553754147","port":39092,"version":5}
[zk: localhost:32181(CONNECTED) 2] get /brokers/ids/1
{"listener_security_protocol_map":{"PLAINTEXT_INTERNAL":"PLAINTEXT"},"endpoints":["PLAINTEXT_INTERNAL://kafka-1.kafka:39092"],"jmx_port":-1,"features":{},"host":"kafka-1.kafka","timestamp":"1740553756558","port":39092,"version":5}
[zk: localhost:32181(CONNECTED) 3] get /brokers/ids/2
{"listener_security_protocol_map":{"PLAINTEXT_INTERNAL":"PLAINTEXT"},"endpoints":["PLAINTEXT_INTERNAL://kafka-1.kafka:39092"],"jmx_port":-1,"features":{},"host":"kafka-2.kafka","timestamp":"1740553756590","port":39092,"version":5}
4.7.2 Verify the Kafka cluster by producing and consuming messages
- Produce messages
[root@master1 ka-n9]# kubectl exec -it kafka-0 -n kafka -- /opt/bitnami/kafka/bin/kafka-console-producer.sh --bootstrap-server kafka.kafka.svc.cluster.local:39092 --topic test-topic
Defaulted container "kafka" out of: kafka, configure-broker-id-and-listeners (init)
>Hello, Kafka!
This is a test message.
[2025-02-26 08:08:39,963] WARN [Producer clientId=console-producer] The metadata response from the cluster reported a recoverable issue with correlation id 7 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
(The LEADER_NOT_AVAILABLE warning is transient: it appears while the auto-created topic elects its first partition leader, and the messages are still delivered.)
- Consume messages
[root@master1 ka-n9]# kubectl exec -it kafka-0 -n kafka -- /opt/bitnami/kafka/bin/kafka-console-consumer.sh --bootstrap-server kafka.kafka.svc.cluster.local:39092 --topic test-topic --from-beginning
Defaulted container "kafka" out of: kafka, configure-broker-id-and-listeners (init)
Hello, Kafka!
This is a test message.
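To confirm the topic's partitions are spread across the brokers with healthy in-sync replicas, describe it (leader and replica assignments will differ per cluster):
kubectl exec -it kafka-0 -n kafka -- /opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server kafka.kafka.svc.cluster.local:39092 --describe --topic test-topic
# each partition line shows its Leader, Replicas, and Isr broker ids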
4.8 Problems encountered with Kafka
4.8.1 KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP undefined
Problem: the log reports /opt/bitnami/scripts/libkafka.sh: line 377: KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: unbound variable, meaning this environment variable was never set.
Fix: add the KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP environment variable to the StatefulSet manifest and set it to PLAINTEXT_INTERNAL:PLAINTEXT.
4.8.2 KAFKA_ADVERTISED_LISTENERS misconfigured
Problem: the error java.lang.IllegalArgumentException: requirement failed: Each listener must have a different name shows that every listener needs a distinct name; the earlier configuration reused the same protocol without giving each listener its own name.
Fix: give each listener a distinct name, set KAFKA_ADVERTISED_LISTENERS to PLAINTEXT_INTERNAL://$(POD_NAME).kafka:39092, and perform the variable substitution in the initContainer.
4.8.3 source command not found
Problem: the log shows /bin/sh: 1: source: not found, because /bin/sh is typically dash, which does not support the source builtin.
Fix: replace source /tmp/kafka_env.sh with . /tmp/kafka_env.sh.
4.8.4 inter.broker.listener.name not accepted
Problem: the error java.lang.IllegalArgumentException: requirement failed: inter.broker.listener.name must be a listener name defined in advertised.listeners means this setting must match one of the listener names in advertised.listeners.
Fix: add the KAFKA_INTER_BROKER_LISTENER_NAME environment variable to the StatefulSet manifest and set it to PLAINTEXT_INTERNAL.
4.8.5 Permission denied on the data directory
Problem: the log shows java.io.FileNotFoundException: /bitnami/kafka/data/meta.properties.tmp (Permission denied), meaning the container's user lacks permission on the mounted path.
Fix: grant full permissions on the NFS export directories: chmod 777 /nfs_share/k8s/kafka/pv{1..3}
4.9 Kafka cluster addresses
4.9.1 In-cluster address
- When a business application runs in the same Kubernetes cluster as Kafka, it can connect directly through the Kafka Service's DNS name:
kafka.kafka.svc.cluster.local:39092
- Python example
from kafka import KafkaProducer  # pip install kafka-python

# in-cluster Kafka bootstrap address (the headless Service DNS name)
bootstrap_servers = 'kafka.kafka.svc.cluster.local:39092'
producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
message = b'Hello from service connection!'
producer.send('test-topic', message)
producer.flush()  # send() is asynchronous; flush before closing
producer.close()
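A quick in-cluster connectivity check is to run the Kafka CLI from a throwaway pod (a sketch, assuming the private Kafka image is pullable from the nodes):
kubectl run kafka-client --rm -it -n kafka --image=172.16.4.17:8090/ltzx/registry.cn-shenzhen.aliyuncs.com/library-base/bitnami_kafka:3.9.0 --command -- /opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server kafka.kafka.svc.cluster.local:39092 --list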
4.9.2 External cluster address
- External clients reach the brokers through a node IP plus a nodePort. Two caveats: the Service below must use a name other than kafka (which the headless Service already uses), and external clients will still receive the in-cluster advertised addresses (kafka-N.kafka:39092) in metadata responses, so a production setup typically adds a dedicated EXTERNAL listener advertised with an externally resolvable address.
apiVersion: v1
kind: Service
metadata:
  name: kafka-external   # must differ from the existing headless Service "kafka"
  namespace: kafka
  labels:
    app: kafka
spec:
  type: NodePort
  ports:
    - port: 39092
      targetPort: 39092
      nodePort: 30092    # pick a free port in the NodePort range (default 30000-32767)
  selector:
    app: kafka
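A minimal external smoke test, assuming the Kafka CLI tools are installed on a machine outside the cluster and node 172.16.4.86 is reachable:
kafka-topics.sh --bootstrap-server 172.16.4.86:30092 --list
# note: producing/consuming from outside still requires the advertised broker names
# (kafka-N.kafka:39092) to be resolvable, or a separate EXTERNAL listener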
And with that, the Kafka cluster deployment is complete!