Setting up a Flink 1.10 cluster on YARN (Hadoop 3.1.3)
Prerequisites: ZooKeeper and Hadoop must be installed first.
ZooKeeper installation
Hadoop installation
I. System configuration
Three virtual machines:
hadoop1: 4 GB RAM, 2 cores, 80 GB disk
hadoop2: 1 GB RAM, 1 core, 8 GB disk
hadoop3: 1 GB RAM, 1 core, 8 GB disk
II. Basic concepts
1. Flink Client
2. JobManager
3. TaskManager
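III. Flink cluster setup (on YARN)
The main task in this setup is configuring JobManager high availability.
1. Download and unpack
(1) Download the package flink-1.10.0-bin-scala_2.12.tgz to the hadoop1 node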
curl -O http://apache.mirrors.hoobly.com/flink/flink-1.10.0/flink-1.10.0-bin-scala_2.12.tgz
(2) Copy it to the hadoop2 and hadoop3 nodes
scp flink-1.10.0-bin-scala_2.12.tgz root@hadoop2:`pwd`
scp flink-1.10.0-bin-scala_2.12.tgz root@hadoop3:`pwd`
(3) Unpack the archive
tar zxvf flink-1.10.0-bin-scala_2.12.tgz
Installation is complete.
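The copy and unpack steps above repeat for every worker node, so they can be generated in a loop. The following is a minimal sketch, assuming passwordless SSH for root and that the archive sits in the current directory; the function name print_distribute_cmds is hypothetical. It only prints the commands so they can be reviewed before anything runs.

```shell
#!/usr/bin/env bash
# Sketch: print the copy-and-unpack commands for each worker node.
# Assumption: passwordless SSH for root, archive in the current directory.

ARCHIVE="flink-1.10.0-bin-scala_2.12.tgz"

# print_distribute_cmds DIR HOST...
# Emits one scp command and one remote tar command per host (hypothetical helper).
print_distribute_cmds() {
  local dir="$1"
  shift
  local host
  for host in "$@"; do
    echo "scp ${ARCHIVE} root@${host}:${dir}"
    echo "ssh root@${host} tar zxf ${dir}/${ARCHIVE} -C ${dir}"
  done
}

# Dry run: this prints four commands and executes nothing.
print_distribute_cmds "$(pwd)" hadoop2 hadoop3
```

Once the printed commands look right, the output can be piped to `sh` to run them.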
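2. JobManager high-availability configuration
(1) yarn-site.xml
File path: /usr/local/softwareinstall/hadoop-3.1.3/etc/hadoop/yarn-site.xml (the YARN cluster configuration file under the Hadoop installation directory)
Add the following configuration: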
<property>
<name>yarn.resourcemanager.am.max-attempts</name>
<value>4</value>
<description>
The maximum number of application master execution attempts.
</description>
</property>
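After editing yarn-site.xml it can help to confirm that the value was actually saved. The sketch below reads a property back with awk. It is a line-based text match, not a real XML parser, and it assumes the file is formatted like the snippet above, with the name and value on their own lines or on the same line; the function name get_xml_prop is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read one property value back from a Hadoop-style XML config file.
# Assumption: <name> and <value> elements are each written on a single line.

# get_xml_prop FILE NAME
# Prints the first <value> found at or after the line holding <name>NAME</name>.
get_xml_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character opening tag and the 8-character closing tag.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Example call (left commented out so the sketch has no side effects):
# get_xml_prop /usr/local/softwareinstall/hadoop-3.1.3/etc/hadoop/yarn-site.xml \
#   yarn.resourcemanager.am.max-attempts
```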
(2) Flink configuration file flink-conf.yaml
File path: /usr/local/softwareinstall/flink-1.10.0/conf/flink-conf.yaml
High-availability settings
high-availability: zookeeper
high-availability.zookeeper.quorum: hadoop1:2181,hadoop2:2181,hadoop3:2181
high-availability.storageDir: hdfs:///flink/recovery
high-availability.zookeeper.path.root: /flink
yarn.application-attempts: 10
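One detail worth checking here: the Flink documentation describes yarn.resourcemanager.am.max-attempts as an upper bound on application restarts. With the values shown in this post (10 in flink-conf.yaml, 4 in yarn-site.xml), the effective limit would therefore be 4. The sketch below reads a key from flink-conf.yaml and computes the smaller of the two numbers. The function names are hypothetical, and the parser only handles flat "key: value" lines like the ones above.

```shell
#!/usr/bin/env bash
# Sketch: compare Flink's attempt setting with the YARN cap.
# Assumption: flink-conf.yaml holds flat "key: value" lines.

# get_flink_conf FILE KEY
# Prints the trimmed value of the first line that starts with "KEY:".
get_flink_conf() {
  awk -v key="$2" '
    index($0, key ":") == 1 {
      val = substr($0, length(key) + 2)
      sub(/^[ \t]+/, "", val)
      sub(/[ \t]+$/, "", val)
      print val
      exit
    }
  ' "$1"
}

# effective_attempts FLINK_ATTEMPTS YARN_MAX
# Prints the smaller of the two integers.
effective_attempts() {
  if [ "$1" -lt "$2" ]; then
    echo "$1"
  else
    echo "$2"
  fi
}

# With the values used in this post:
effective_attempts 10 4
```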
JobManager and TaskManager settings
(3) Add the Flink environment variables to /etc/profile
vim /etc/profile
# Flink environment variables
export FLINK_HOME=/usr/local/softwareinstall/flink-1.10.0
export PATH=$PATH:$FLINK_HOME/bin
# Flink on YARN settings
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_CLASSPATH=`hadoop classpath`
Note: if Hadoop is part of a CDH installation, set HADOOP_CONF_DIR=/etc/hadoop/conf
source /etc/profile  # apply the environment variables
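A quick check after sourcing the profile can catch a variable that was mistyped or left empty, for example when HADOOP_HOME is not set and HADOOP_CONF_DIR expands to a bare /etc/hadoop. This is a minimal sketch; the function name missing_vars is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report which of the required environment variables are unset or empty.

# missing_vars NAME...
# Prints the name of every listed variable that is unset or empty, one per line.
missing_vars() {
  local name value
  for name in "$@"; do
    # Indirect lookup of the variable called $name; empty if it is unset.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Prints nothing when all three variables from /etc/profile are set.
missing_vars FLINK_HOME HADOOP_CONF_DIR HADOOP_CLASSPATH
```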
3. Requesting resources from the YARN cluster and submitting a job
3.1 Requesting resources
(1) Request resources
bin/yarn-session.sh -nm wordCount -n 2
(2) View the allocated resources
From the command line:
yarn application --list
In the YARN cluster web UI
In the yarn.webapp.ui2 interface
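The application id is needed later to submit jobs to the session, so it can be convenient to pull it out of the listing by session name instead of copying it by hand. The sketch below filters the listing text with awk. It assumes the id is the first whitespace-separated column, the name is the second, and the name contains no whitespace; the function name app_id_by_name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: find a YARN application id by application name.
# Assumption: column 1 is the id, column 2 is the name, and the name has no spaces.

# app_id_by_name NAME
# Reads the application listing on stdin and prints the id of the first match.
app_id_by_name() {
  awk -v name="$1" '
    $1 ~ /^application_[0-9]+_[0-9]+$/ && $2 == name { print $1; exit }
  '
}

# Example call (commented out; -list is the form used in the Hadoop documentation):
# yarn application -list 2>/dev/null | app_id_by_name wordCount
```

The printed id can then be passed to `flink run -yid`.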
3.2 Submitting the WordCount job
(1) Submit the job to the YARN session whose application id is application_1589204773282_0003
cd /usr/local/softwareinstall/flink-1.10.0
./bin/flink run -yid application_1589204773282_0003 ./examples/batch/WordCount.jar
(2) Watch the job's execution details in the Flink job UI
References:
(1) YARN Setup
https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/yarn_setup.html
(2) JobManager High Availability (HA)
https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/jobmanager_high_availability.html