[Cloud Native] Azkaban on k8s: Explanation and Hands-On Walkthrough

I. Overview

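Big data platforms support many development languages, and developers come from very different backgrounds, so many different types of programs (tasks) end up running on the platform, for example MapReduce, Hive, Pig, Spark, Java, Shell, and Python. Azkaban is a batch workflow scheduler, open-sourced by LinkedIn, that runs such jobs in a defined order and tracks the dependencies between them.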

Official documentation:
https://azkaban.readthedocs.io/en/latest/
https://azkaban.github.io/azkaban/docs/latest/
GitHub: https://github.com/azkaban/azkaban
See also my earlier article: Big Data Hadoop: the Azkaban Task Scheduler (Azkaban Environment Deployment)

II. Deployment

Official documentation: https://azkaban.readthedocs.io/en/latest/getStarted.html

1) Download azkaban

git clone https://github.com/azkaban/azkaban.git
# Build the Azkaban installation packages
cd azkaban; ./gradlew build installDist

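2) Initialize the azkaban tables

MySQL here is also deployed on k8s. If you are unfamiliar with that setup, see my article: [Cloud Native] MySQL on k8s Environment Deployment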

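# [Tip] Most companies forbid connecting with "mysql -u root -p123456": the password ends up in the shell history, which is a security risk. Don't get flagged by your company's security audit!
# Get the root password
MYSQL_ROOT_PASSWORD=$(kubectl get secret --namespace mysql mysql -o jsonpath="{.data.mysql-root-password}" | base64 -d)

# Log in to the pod
kubectl exec -it mysql-primary-0 -n mysql -- bash
# Connect to MySQL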
mysql -u root -p$MYSQL_ROOT_PASSWORD

CREATE DATABASE azkaban;
CREATE USER 'azkaban'@'%' IDENTIFIED BY 'azkaban';
GRANT SELECT,INSERT,UPDATE,DELETE ON azkaban.* to 'azkaban'@'%' WITH GRANT OPTION;
flush privileges;

# Copy the SQL file from the host into the pod
kubectl cp azkaban-db/create-all-sql-3.91.0-386-ge35281d.sql mysql/mysql-primary-0:/tmp/
# Log in to the pod
kubectl exec -it mysql-primary-0 -n mysql -- bash
mysql -u root -p$MYSQL_ROOT_PASSWORD
use azkaban;
# The SQL file name varies by version; look for create-all-sql-*.sql
source /tmp/create-all-sql-3.91.0-386-ge35281d.sql
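
As a complement to the security warning above, one way to keep the password out of the command line and the shell history is a temporary client option file passed with `--defaults-extra-file`. The following is only a minimal sketch, not the method used in this article; the helper name `write_mysql_cnf` is hypothetical, and it assumes the password contains no double quotes.

```shell
#!/bin/bash
# Hypothetical helper: write a private MySQL client option file and print its path.
write_mysql_cnf() {
  local user="$1" password="$2" cnf
  cnf="$(mktemp)"      # temporary file for the credentials
  chmod 600 "$cnf"     # readable by the current user only
  printf '[client]\nuser=%s\npassword="%s"\n' "$user" "$password" > "$cnf"
  echo "$cnf"
}

# Usage (inside the MySQL pod; --defaults-extra-file must be the first option):
#   cnf="$(write_mysql_cnf root "$MYSQL_ROOT_PASSWORD")"
#   mysql --defaults-extra-file="$cnf" -e 'SHOW DATABASES;'
#   rm -f "$cnf"
```

The option file is deleted right after use, so the password never appears in the process list or in history.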

[Tip] It is better to initialize the SQL from a script when the service starts. The SQL file (init.sql) is as follows:

/*
-- Enable this block if the script will be run more than once, but note that it is destructive
drop database azkaban;
delete from mysql.db where user="azkaban";
delete from mysql.user where user="azkaban";
flush privileges;
*/
CREATE DATABASE IF NOT EXISTS azkaban;
CREATE USER IF NOT EXISTS 'azkaban'@'%' IDENTIFIED BY 'azkaban';
GRANT SELECT,INSERT,UPDATE,DELETE ON azkaban.* to 'azkaban'@'%' WITH GRANT OPTION;
flush privileges;
use azkaban
source /opt/apache/azkaban/azkaban-db/create-all-sql-3.91.0-386-ge35281d.sql

Run a test

# Log in to the pod
kubectl exec -it mysql-primary-0 -n mysql -- bash
# Initialize the SQL
mysql -u root -p"$MYSQL_ROOT_PASSWORD" -h mysql-primary.mysql </opt/apache/azkaban/azkaban-db/init.sql

3) Build the image

docker-entrypoint.sh

#!/bin/bash

### init sql: run init.sql only when the target database does not exist yet
if ! mysql -u"${MYSQL_USER_NAME}" -p"${MYSQL_USER_PASSWORD}" -h"${MYSQL_HOST}" -e "show databases;" | grep -qx "${MYSQL_DB}"; then
  mysql -u"${MYSQL_USER_NAME}" -p"${MYSQL_USER_PASSWORD}" -h"${MYSQL_HOST}" < "${AZKABAN_HOME}/azkaban-db/init.sql"
fi

funStartExec(){
  ### start azkaban exec
  echo "start azkaban exec..."

  {
     funActivateExec
  }&

  cd ${AZKABAN_HOME}/azkaban-exec-server/;
  ./bin/internal/internal-start-executor.sh 2>&1 |tee -a executorServerLog__`date +%F+%T`.out
}

funStartWeb(){
  ### start azkaban web
  echo "start azkaban web..."
  cd ${AZKABAN_HOME}/azkaban-web-server/;
  ./bin/internal/internal-start-web.sh 2>&1 |tee -a webServerLog_`date +%F+%T`.out
}

funActivateExec(){
  until netstat -ntlp|grep -q :12321; do echo waiting for azkaban-exec; sleep 1; done
  curl -G "`hostname`:12321/executor?action=activate" && echo
}

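if [ "$1" = "exec" ];then
   funStartExec
elif [ "$1" = "web" ];then
   funStartWeb
elif [ "$1" = "all" ];then
   # funStartExec blocks in the foreground, so run it in the background,
   # then wait for the executor port before starting the web server
   funStartExec &
   until netstat -ntlp|grep -q :12321; do sleep 1; done
   funStartWeb
else
   echo "please input args [exec|web|all]"
   exit 1
fi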

[Tip] The web server must be started from inside the azkaban-web-server directory, so cd there before running the start script.
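
The `until ... sleep 1` loop in the entrypoint waits forever if the executor port never opens. A bounded variant could look like the sketch below; `wait_for` and `port_open` are hypothetical names introduced here for illustration, not part of Azkaban or of the original script.

```shell
#!/bin/bash
# Hypothetical helper: retry a command once per second until it succeeds
# or the given number of attempts is used up.
wait_for() {
  local attempts="$1"; shift
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # sleep between attempts, but not after the last one
    [ "$i" -lt "$attempts" ] && sleep 1
  done
  return 1
}

# Usage in the entrypoint (gives up after roughly 60 seconds):
#   port_open() { netstat -ntl | grep -q ':12321'; }
#   wait_for 60 port_open || { echo "azkaban-exec did not start"; exit 1; }
```

Failing fast lets the container exit and be restarted by k8s instead of hanging silently.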

deleteExec.sh

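#!/bin/bash
# Remove this executor's row from Azkaban's executors table when the pod stops.
# Assumes the executor registered itself under its StatefulSet pod DNS name.
EXEC_HOST="$(hostname -s).azkaban-exec.azkaban.svc.cluster.local"
mysql -u"${MYSQL_USER_NAME}" -p"${MYSQL_USER_PASSWORD}" -h"${MYSQL_HOST}" "${MYSQL_DB}" -e "DELETE FROM executors WHERE host='${EXEC_HOST}'"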
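
The cleanup script only works if the host name it builds matches the host Azkaban recorded for the executor. The sketch below shows how the DNS name of a StatefulSet pod is composed and prints the corresponding DELETE statement as a dry run. The helper names are hypothetical, `cluster.local` is an assumed cluster domain, and the `executors` table name is an assumption to verify against your create-all-sql file.

```shell
#!/bin/bash
# Hypothetical helper: build the DNS name a StatefulSet pod gets through its
# governing service: <pod>.<service>.<namespace>.svc.<cluster-domain>
pod_fqdn() {
  local pod="$1" service="$2" namespace="$3" domain="${4:-cluster.local}"
  printf '%s.%s.%s.svc.%s\n' "$pod" "$service" "$namespace" "$domain"
}

# Print the cleanup statement without executing it, so it can be reviewed first.
delete_executor_sql() {
  printf "DELETE FROM executors WHERE host='%s';\n" "$1"
}

delete_executor_sql "$(pod_fqdn azkaban-exec-0 azkaban-exec azkaban)"
```

Comparing the printed host with the rows in the database before wiring the script into preStop avoids deleting nothing, or the wrong row.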

Dockerfile

FROM myharbor.com/bigdata/centos:7.9.2009
RUN rm -f /etc/localtime && ln -sv /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo "Asia/Shanghai" > /etc/timezone
ENV LANG zh_CN.UTF-8

### install tools
RUN yum install -y vim tar wget curl less telnet net-tools lsof mysql

RUN mkdir -p /opt/apache

### JDK
ADD jdk-8u212-linux-x64.tar.gz  /opt/apache/
ENV JAVA_HOME /opt/apache/jdk1.8.0_212
ENV PATH $JAVA_HOME/bin:$PATH

### Azkaban
RUN mkdir /opt/apache/azkaban
ENV AZKABAN_HOME /opt/apache/azkaban
ADD azkaban-exec-server.tar.gz $AZKABAN_HOME
ADD azkaban-web-server.tar.gz $AZKABAN_HOME
ADD azkaban-db.tar.gz $AZKABAN_HOME
COPY init.sql $AZKABAN_HOME/azkaban-db/

COPY docker-entrypoint.sh deleteExec.sh /opt/apache/
RUN chmod +x /opt/apache/docker-entrypoint.sh /opt/apache/deleteExec.sh

RUN groupadd --system --gid=9999 admin && useradd --system --home-dir /opt/home --uid=9999 --gid=admin admin

RUN chown -R admin:admin /opt/apache

# Set the working directory
WORKDIR $AZKABAN_HOME

# Entrypoint script: not executed at image build time, only when a container starts
ENTRYPOINT ["/opt/apache/docker-entrypoint.sh"]

Build the image

docker build -t myharbor.com/bigdata/azkaban:4.0 . --no-cache

# Push the image
docker push myharbor.com/bigdata/azkaban:4.0

# Remove the image
docker rmi myharbor.com/bigdata/azkaban:4.0
crictl rmi myharbor.com/bigdata/azkaban:4.0

4) Write the YAML manifests

1. ConfigMap

azkaban-exec-cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: azkaban-exec
  name: azkaban-exec-cm
data:
  azkaban.properties: |-
    # Azkaban Personalization Settings
    azkaban.name=Test
    azkaban.label=My Local Azkaban
    azkaban.color=#FF3601
    azkaban.default.servlet.path=/index
    web.resource.dir=web/
    default.timezone.id=Asia/Shanghai
    # Azkaban UserManager class
    user.manager.class=azkaban.user.XmlUserManager
    user.manager.xml.file=conf/azkaban-users.xml
    # Loader for projects
    executor.global.properties=conf/global.properties
    azkaban.project.dir=projects
    # Velocity dev mode
    velocity.dev.mode=false
    # Azkaban Jetty server properties.
    jetty.use.ssl=false
    jetty.maxThreads=25
    jetty.port=8081
    # Where the Azkaban web server is located
    azkaban.webserver.url=http://azkaban-web.azkaban:8081
    # mail settings
    mail.sender=
    mail.host=
    # User facing web server configurations used to construct the user facing server URLs. They are useful when there is a reverse proxy between Azkaban web servers and users.
    # enduser -> myazkabanhost:443 -> proxy -> localhost:8081
    # when this parameters set then these parameters are used to generate email links.
    # if these parameters are not set then jetty.hostname, and jetty.port(if ssl configured jetty.ssl.port) are used.
    # azkaban.webserver.external_hostname=myazkabanhost.com
    # azkaban.webserver.external_ssl_port=443
    # azkaban.webserver.external_port=8081
    job.failure.email=
    job.success.email=
    lockdown.create.projects=false
    cache.directory=cache
    # JMX stats
    jetty.connector.stats=true
    executor.connector.stats=true
    # Azkaban plugin settings
    azkaban.jobtype.plugin.dir=plugins/jobtypes
    # Azkaban mysql settings by default. Users should configure their own username and password.
    database.type=mysql
    mysql.port=3306
    mysql.host=mysql-primary.mysql
    mysql.database=azkaban
    mysql.user=azkaban
    mysql.password=azkaban
    mysql.numconnections=100
    # Azkaban Executor settings
    executor.maxThreads=50
    executor.flow.threads=30
    azkaban.executor.runtimeProps.override.eager=false
    executor.port=12321

azkaban-web-cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: azkaban-web
  name: azkaban-web-cm
data:
  azkaban.properties: |-
    # Azkaban Personalization Settings
    azkaban.name=Test
    azkaban.label=My Local Azkaban
    azkaban.color=#FF3601
    azkaban.default.servlet.path=/index
    web.resource.dir=web/
    default.timezone.id=Asia/Shanghai
    # Azkaban UserManager class
    user.manager.class=azkaban.user.XmlUserManager
    user.manager.xml.file=conf/azkaban-users.xml
    # Loader for projects
    executor.global.properties=conf/global.properties
    azkaban.project.dir=projects
    # Velocity dev mode
    velocity.dev.mode=false
    # Azkaban Jetty server properties.
    jetty.use.ssl=false
    jetty.maxThreads=25
    jetty.port=8081
    # Azkaban Executor settings
    # mail settings
    mail.sender=
    mail.host=
    # User facing web server configurations used to construct the user facing server URLs. They are useful when there is a reverse proxy between Azkaban web servers and users.
    # enduser -> myazkabanhost:443 -> proxy -> localhost:8081
    # when this parameters set then these parameters are used to generate email links.
    # if these parameters are not set then jetty.hostname, and jetty.port(if ssl configured jetty.ssl.port) are used.
    # azkaban.webserver.external_hostname=myazkabanhost.com
    # azkaban.webserver.external_ssl_port=443
    # azkaban.webserver.external_port=8081
    job.failure.email=
    job.success.email=
    lockdown.create.projects=false
    cache.directory=cache
    # JMX stats
    jetty.connector.stats=true
    executor.connector.stats=true
    # Azkaban mysql settings by default. Users should configure their own username and password.
    database.type=mysql
    mysql.port=3306
    mysql.host=mysql-primary.mysql
    mysql.database=azkaban
    mysql.user=azkaban
    mysql.password=azkaban
    mysql.numconnections=100
    #Multiple Executor
    azkaban.use.multiple.executors=true
    azkaban.executorselector.filters=StaticRemainingFlowSize,CpuStatus
    azkaban.executorselector.comparator.NumberOfAssignedFlowComparator=1
    azkaban.executorselector.comparator.Memory=1
    azkaban.executorselector.comparator.LastDispatched=1
    azkaban.executorselector.comparator.CpuUsage=1
  azkaban-users.xml: |-
    <azkaban-users>
      <user groups="azkaban" password="azkaban" roles="admin" username="azkaban"/>
      <user password="metrics" roles="metrics" username="metrics"/>

      <role name="admin" permissions="ADMIN"/>
      <role name="metrics" permissions="METRICS"/>
    </azkaban-users>

2. Secret

secret.yaml

apiVersion: v1
data:
  # echo -n 'WyfORdvwVm' | base64
  mysql-root-password: V3lmT1JkdndWbQ==
kind: Secret
metadata:
  labels:
    app.kubernetes.io/name: azkaban
  name: azkaban-secret
type: Opaque
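
The `data` field of a Secret holds base64-encoded values, as the comment in the manifest hints. The round trip can be sketched as follows; the function names are hypothetical and `base64 -d` assumes GNU coreutils.

```shell
#!/bin/bash
# Encode a value the way the Secret's data field expects (no trailing newline).
encode_secret() { printf '%s' "$1" | base64; }
# Decode it again, for example to inspect an existing Secret.
decode_secret() { printf '%s' "$1" | base64 -d; }

encode_secret 'WyfORdvwVm'            # prints V3lmT1JkdndWbQ==
decode_secret 'V3lmT1JkdndWbQ=='; echo  # prints WyfORdvwVm

# Alternative that avoids hand-encoding (prints the manifest without applying it):
#   kubectl create secret generic azkaban-secret -n azkaban \
#     --from-literal=mysql-root-password='WyfORdvwVm' --dry-run=client -o yaml
```

Using `printf '%s'` (or `echo -n`) matters: a trailing newline would become part of the password.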

3. Service

  • azkaban-exec-svc.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: exec
    app.kubernetes.io/name: azkaban
  name: azkaban-exec
spec:
  ports:
  - name: azkaban-exec
    port: 12321
    protocol: TCP
  selector:
    app.kubernetes.io/component: exec
    app.kubernetes.io/name: azkaban
  type: ClusterIP
  • azkaban-web-svc.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: web
    app.kubernetes.io/name: azkaban
  name: azkaban-web
spec:
  ports:
  - name: azkaban-web-http
    nodePort: 30081
    port: 8081
    protocol: TCP
  selector:
    app.kubernetes.io/component: web
    app.kubernetes.io/name: azkaban
  type: NodePort

4. Controllers

  • azkaban-exec-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: azkaban-exec
spec:
  serviceName: azkaban-exec
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/component: exec
      app.kubernetes.io/name: azkaban
  template:
    metadata:
      labels:
        app.kubernetes.io/component: exec
        app.kubernetes.io/name: azkaban
    spec:
      containers:
        - name: azkaban-exec
          image: myharbor.com/bigdata/azkaban:4.0
          #command: ["/opt/apache/docker-entrypoint.sh"]
          args: ["exec"]
          imagePullPolicy: IfNotPresent
          ports:
            - name: azkaban-exec
              containerPort: 12321
              protocol: TCP
          env:
            - name: MYSQL_HOST
              value: mysql-primary.mysql
            - name: MYSQL_USER_NAME
              value: root
            - name: MYSQL_DB
              value: azkaban
            - name: MYSQL_USER_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: azkaban-secret
                  key: mysql-root-password
          volumeMounts:
            - name: azkaban-exec-volume
              mountPath: /opt/apache/azkaban/azkaban-exec-server/conf/azkaban.properties
              subPath: azkaban.properties
          readinessProbe:
            initialDelaySeconds: 10
            periodSeconds: 5
            tcpSocket:
              port: azkaban-exec
          livenessProbe:
            initialDelaySeconds: 10
            periodSeconds: 5
            tcpSocket:
              port: azkaban-exec
          lifecycle:
            preStop: # remove this executor's record from MySQL
              exec:
                command: ["/opt/apache/deleteExec.sh"]
          securityContext:
            runAsUser: 9999
            privileged: true
      volumes:
        - name: azkaban-exec-volume
          configMap:
            name: azkaban-exec-cm
  • azkaban-web-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: azkaban-web
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/component: web
      app.kubernetes.io/name: azkaban
  template:
    metadata:
      labels:
        app.kubernetes.io/component: web
        app.kubernetes.io/name: azkaban
    spec:
      initContainers:
        - name: waiting-exec
          image: myharbor.com/bigdata/azkaban:4.0
          command: ['sh', '-c', "until (echo 'q')|telnet -e 'q' azkaban-exec.azkaban 12321 >/dev/null 2>&1; do echo waiting for exec; sleep 1; done"]
      containers:
        - name: azkaban-web
          image: myharbor.com/bigdata/azkaban:4.0
          #command: ["/opt/apache/docker-entrypoint.sh"]
          args: ["web"]
          imagePullPolicy: IfNotPresent
          ports:
            - name: azkaban-web
              containerPort: 8081
              protocol: TCP
          env:
            - name: MYSQL_HOST
              value: mysql-primary.mysql
            - name: MYSQL_USER_NAME
              value: root
            - name: MYSQL_DB
              value: azkaban
            - name: MYSQL_USER_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: azkaban-secret
                  key: mysql-root-password
          volumeMounts:
            - name: azkaban-web-volume
              mountPath: /opt/apache/azkaban/azkaban-web-server/conf/azkaban.properties
              subPath: azkaban.properties
            - name: azkaban-users-volume
              mountPath: /opt/apache/azkaban/azkaban-web-server/conf/azkaban-users.xml
              subPath: azkaban-users.xml
          readinessProbe:
            initialDelaySeconds: 10
            periodSeconds: 5
            tcpSocket:
              port: azkaban-web
          livenessProbe:
            initialDelaySeconds: 10
            periodSeconds: 5
            tcpSocket:
              port: azkaban-web
          securityContext:
            runAsUser: 9999
            privileged: true
      volumes:
        - name: azkaban-web-volume
          configMap:
            name: azkaban-web-cm
        - name: azkaban-users-volume
          configMap:
            name: azkaban-web-cm

5) Deploy

kubectl create ns azkaban

kubectl apply -f azkaban-exec-cm.yaml -n azkaban
kubectl apply -f azkaban-web-cm.yaml -n azkaban
kubectl apply -f secret.yaml -n azkaban

kubectl apply -f azkaban-exec-svc.yaml -n azkaban
kubectl apply -f azkaban-web-svc.yaml -n azkaban

kubectl apply -f azkaban-exec-statefulset.yaml -n azkaban
kubectl apply -f azkaban-web-deployment.yaml -n azkaban

Check the result

kubectl get pods,svc -n azkaban -owide
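
Instead of polling `kubectl get pods` by hand, the rollout can be awaited. This is a small sketch under the resource names used in the manifests above; the wrapper `wait_for_azkaban` and its default timeout are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: block until both Azkaban workloads report a finished rollout.
wait_for_azkaban() {
  local ns="${1:-azkaban}" timeout="${2:-300s}"
  kubectl rollout status statefulset/azkaban-exec -n "$ns" --timeout="$timeout" &&
  kubectl rollout status deployment/azkaban-web -n "$ns" --timeout="$timeout"
}

# Usage:
#   wait_for_azkaban azkaban 300s && echo "Azkaban is ready"
```

The executor StatefulSet is checked first because the web Deployment's init container waits for it.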

Web UI: http://192.168.182.110:30081/
Username/password: azkaban/azkaban (as defined in azkaban-users.xml above)

6) Test and verify

Official documentation: https://azkaban.readthedocs.io/en/latest/createFlows.html
[Example]
1. Create a helloworld.project file with the following content:

cat >helloworld.project<<EOF
azkaban-flow-version: 2.0
EOF

2. Create a helloworld.flow file with the following content:

cat > helloworld.flow <<EOF
nodes:
  - name: jobA
    type: command
    config:
      command: echo "Hello World"
EOF

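3. Compress the two files above into a single zip archive. Only zip is currently supported, and the file name must use English characters.
4. Create a new Project in the web UI.
5. Upload the zip package.
6. Run the flow.
This is only a very simple example to verify that azkaban works. In production, azkaban is typically used to schedule Spark, Flink, and similar jobs; my machine resources are limited, so I cannot bring up all of those services here. If you have questions, feel free to leave me a comment.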
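
The file creation and packaging steps above can be scripted. The sketch below writes the two Flow 2.0 files into a directory; the helper name `write_hello_flow` is hypothetical, and the commented upload calls assume Azkaban's AJAX API, a project named `helloworld` that already exists, and a `SESSION_ID` variable holding the `session.id` returned by the login call.

```shell
#!/bin/bash
# Hypothetical helper: write the two Flow 2.0 files from the steps above into a directory.
write_hello_flow() {
  local dir="$1"
  mkdir -p "$dir"
  printf 'azkaban-flow-version: 2.0\n' > "$dir/helloworld.project"
  cat > "$dir/helloworld.flow" <<'EOF'
nodes:
  - name: jobA
    type: command
    config:
      command: echo "Hello World"
EOF
}

# Usage:
#   write_hello_flow ./helloworld
#   (cd ./helloworld && zip ../helloworld.zip helloworld.project helloworld.flow)
#
# Optional upload through the AJAX API instead of the web UI (assumptions noted above):
#   curl -X POST --data 'action=login&username=azkaban&password=azkaban' http://192.168.182.110:30081
#   curl -X POST --form "session.id=$SESSION_ID" --form 'ajax=upload' \
#        --form 'file=@helloworld.zip;type=application/zip' \
#        --form 'project=helloworld' http://192.168.182.110:30081/manager
```

The two files are zipped from inside the directory so that they sit at the top level of the archive.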

7) Uninstall

kubectl delete ns azkaban --force

8) Deploy with helm

There is no ready-made chart, so start by generating a scaffold chart.

helm create azkaban

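I won't paste the YAML file contents here; the git repository address is given at the bottom. If you have questions, feel free to leave me a comment. Now install it directly: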

helm install azkaban ./azkaban -n azkaban --create-namespace
kubectl get pods,svc -n azkaban -owide

NOTES

NAME: azkaban
LAST DEPLOYED: Fri Oct  7 15:21:22 2022
NAMESPACE: azkaban
STATUS: deployed
REVISION: 1
NOTES:
1. Get the application URL by running these commands:
  export NODE_PORT=$(kubectl get --namespace azkaban -o jsonpath="{.spec.ports[0].nodePort}" services azkaban-web)
  export NODE_IP=$(kubectl get nodes --namespace azkaban -o jsonpath="{.items[0].status.addresses[0].address}")
  echo http://$NODE_IP:$NODE_PORT

Web UI: http://192.168.182.110:31081/index

9) Uninstall with helm

helm uninstall azkaban -n azkaban
kubectl delete ns azkaban --force

Git repository: https://gitee.com/hadoop-bigdata/azkaban-on-k8s
Prebuilt azkaban deployment packages (Baidu Cloud download):

Link: https://pan.baidu.com/s/1TqmMSCT1--z_LcqBlAancA?pwd=9y17
Extraction code: 9y17

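That's all for this walkthrough of Azkaban on k8s. If you have questions, feel free to leave me a comment. I will keep publishing [Cloud Native + Big Data] tutorials, so stay tuned.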

posted @ 2022-10-07 15:46  大数据老司机