Spark on K8S - Client Mode
Create the spark service account and grant it permissions
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: spark-role
rules:
- apiGroups: [""]
  resources: ["*"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-role-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: spark
  namespace: default
roleRef:
  kind: Role
  name: spark-role
  apiGroup: rbac.authorization.k8s.io
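To check that the binding took effect, kubectl auth can-i can impersonate the service account and ask the API server about individual permissions. The sketch below is a minimal example and not part of the original setup; the function names sa_identity and check_spark_rbac and the list of verbs are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the original post) for checking the RBAC setup.

# Build the user name that Kubernetes assigns to a service account.
sa_identity() {
  local namespace="$1" account="$2"
  printf 'system:serviceaccount:%s:%s' "$namespace" "$account"
}

# Ask the API server whether the service account may manage pods,
# which the driver needs in order to launch executors.
check_spark_rbac() {
  local namespace="$1" account="$2" verb
  for verb in create get list watch delete; do
    printf '%s pods: ' "$verb"
    kubectl auth can-i "$verb" pods \
      --namespace "$namespace" \
      --as "$(sa_identity "$namespace" "$account")"
  done
}
```

After the manifest has been applied, calling check_spark_rbac default spark should print yes for each verb if the Role and RoleBinding above are in place.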
Configure the Spark container. Spark applications are submitted from inside this container in client mode, so this container also acts as the driver.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark-client
      component: spark-client
  template:
    metadata:
      labels:
        app: spark-client
        component: spark-client
    spec:
      containers:
      - name: spark-client
        image: spark-py:2.4.6
        workingDir: /opt/spark
        command: ["/bin/bash", "-c", "while true;do echo hello;sleep 6000;done"]
      serviceAccountName: spark
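Once the Deployment is applied, the pod has to be running before anything can be submitted from it. The sketch below waits for the rollout and then opens a shell in the container. The function name open_spark_client_shell and the 120-second timeout are assumptions, while the Deployment name comes from the manifest above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post).

# Wait until the Deployment is rolled out, then open an interactive
# shell in one of its pods.
open_spark_client_shell() {
  local namespace="${1:-default}" deployment="${2:-spark-client}"
  kubectl rollout status "deployment/${deployment}" \
    --namespace "$namespace" --timeout=120s || return 1
  kubectl exec -it "deployment/${deployment}" \
    --namespace "$namespace" -- /bin/bash
}
```

Called without arguments, the function targets spark-client in the default namespace. The manifest above does not set a namespace, so pass a different one as the first argument if the Deployment was created elsewhere.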
Configure a Service so that the Spark executors can connect to the Spark driver. Any port can be used, as long as it matches spark.driver.port.
apiVersion: v1
kind: Service
metadata:
  namespace: default
  name: spark-client-service
spec:
  selector:
    app: spark-client
  ports:
  - protocol: TCP
    port: 7321
    targetPort: 7321
  clusterIP: None
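Because clusterIP: None makes this a headless Service, its DNS name is expected to resolve directly to the IP of the pod selected by app: spark-client. One way to confirm that, sketched below with hypothetical function names, is to compare the Service's endpoint addresses with the pod IP.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the original post).

# IPs currently registered as endpoints of the given Service.
service_endpoint_ips() {
  kubectl get endpoints "$1" --namespace "${2:-default}" \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# IPs of the pods carrying the given label.
pod_ips() {
  kubectl get pods --selector "$1" --namespace "${2:-default}" \
    -o jsonpath='{.items[*].status.podIP}'
}

# Succeed if the Service endpoints match the driver pod IP.
# With replicas: 1 there is a single IP on each side, so a plain
# string comparison is enough for this sketch.
service_points_at_driver() {
  local endpoints pods
  endpoints="$(service_endpoint_ips spark-client-service)"
  pods="$(pod_ips app=spark-client)"
  [ -n "$endpoints" ] && [ "$endpoints" = "$pods" ]
}
```

If service_points_at_driver returns a non-zero status, the selector and the pod labels are the first things to compare.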
Log in to the Spark container and submit the Spark application in client mode, setting spark.driver.host and spark.driver.port.
bin/spark-submit \
--master k8s://https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT_HTTPS} \
--deploy-mode client \
--name spark-test \
--conf spark.executor.instances=3 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image=spark-py:2.4.6 \
--conf spark.driver.host=spark-client-service \
--conf spark.driver.port=7321 \
/opt/spark/examples/src/main/python/wordcount.py \
/opt/spark/examples/src/main/python/wordcount.py
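The master URL in the command above is built from KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT_HTTPS, which Kubernetes injects into the pod, so it only resolves correctly when the command runs inside the cluster. A minimal sketch of a guard that fails early when those variables are missing follows; the function name k8s_master_url is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post).

# Print the Spark master URL for the in-cluster API server, or fail
# with a message if the pod environment variables are absent.
k8s_master_url() {
  if [ -z "${KUBERNETES_SERVICE_HOST:-}" ] || [ -z "${KUBERNETES_SERVICE_PORT_HTTPS:-}" ]; then
    echo "KUBERNETES_SERVICE_HOST/PORT_HTTPS not set; run this inside a pod" >&2
    return 1
  fi
  printf 'k8s://https://%s:%s' "$KUBERNETES_SERVICE_HOST" "$KUBERNETES_SERVICE_PORT_HTTPS"
}
```

Inside the spark-client container, --master "$(k8s_master_url)" could then stand in for the inline expression in the command above.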