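Flink on YARN mode
Flink supports Standalone mode and can run without any other framework. That approach reduces coupling with third-party resource schedulers and keeps Flink highly independent. However, Flink is a big-data compute framework and resource scheduling is not its strength, so most of the time a dedicated framework such as YARN or K8s handles resource scheduling. Here we use YARN mode to demonstrate how Flink integrates with a third-party resource framework. Overall, deploying on YARN works as follows: the client submits the Flink application to YARN's ResourceManager; the ResourceManager requests containers from the NodeManagers; Flink deploys JobManager and TaskManager instances in those containers, which starts the cluster; Flink then allocates TaskManager resources dynamically, based on the number of slots required by the jobs running on the JobManager.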
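To make the last point concrete, here is a rough sketch of how the number of TaskManagers follows from the slots a job needs. It is a simplified model, not Flink's actual allocation logic: the function name `task_managers_needed` is made up for illustration, and it assumes every TaskManager offers the same number of slots (`taskmanager.numberOfTaskSlots`).

```python
def task_managers_needed(required_slots: int, slots_per_task_manager: int) -> int:
    """Estimate how many TaskManager containers are needed (simplified model).

    Assumes each TaskManager offers the same number of slots and that a
    new TaskManager is requested only when the existing slots run out.
    """
    if required_slots < 0:
        raise ValueError("required_slots must not be negative")
    if slots_per_task_manager <= 0:
        raise ValueError("slots_per_task_manager must be positive")
    # Ceiling division: a partially used TaskManager still costs one container.
    return (required_slots + slots_per_task_manager - 1) // slots_per_task_manager


if __name__ == "__main__":
    # A job that needs 10 slots, with 8 slots per TaskManager.
    print(task_managers_needed(10, 8))
```

With 8 slots per TaskManager, a job that needs up to 8 slots fits in one container, and the ninth slot triggers a second one.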
1. Environment preparation
Check the environment variable configuration
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_CLASSPATH=`hadoop classpath`
export HADOOP_CONF_DIR=/opt/module/hadoop-2.7.2/etc/hadoop
Start Hadoop
[hui@hadoop103 flink-local]$ jps.sh
------------------- hui@hadoop103 --------------
6403 JobHistoryServer
6677 Jps
6328 NodeManager
6073 DataNode
5934 NameNode
------------------- hui@hadoop104 --------------
5075 DataNode
5751 Jps
5210 ResourceManager
5324 NodeManager
------------------- hui@hadoop105 --------------
3956 SecondaryNameNode
3878 DataNode
4457 Jps
4075 NodeManager
Modify the configuration in flink-conf.yaml
jobmanager.memory.process.size: 1600m
taskmanager.memory.process.size: 1728m
taskmanager.numberOfTaskSlots: 8
parallelism.default: 1
2. Session mode deployment
YARN session mode requires first requesting a YARN session, which starts the Flink cluster.
1. Start Hadoop
[hui@hadoop103 flink-local]$ super.sh start
2. Run the command that requests a YARN session:
[hui@hadoop103 flink-local]$ bin/yarn-session.sh -nm yransession -d
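Parameter description
-d: detached mode. Use this if you do not want the Flink YARN client to keep running in the foreground; the YARN session keeps running in the background even after the current terminal window is closed.
-jm (--jobManagerMemory): memory for the JobManager, in MB by default.
-nm (--name): the application name shown in the YARN UI.
-qu (--queue): the YARN queue name.
-tm (--taskManagerMemory): memory for each TaskManager.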
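As a small illustration of how these options combine, the sketch below assembles a `yarn-session.sh` command line from optional settings. The helper `build_yarn_session_command` and its parameter names are hypothetical; only the flags listed above are used, and the script path assumes you run it from the Flink home directory as in this walkthrough.

```python
from typing import List, Optional


def build_yarn_session_command(
    name: Optional[str] = None,
    job_manager_memory_mb: Optional[int] = None,
    task_manager_memory_mb: Optional[int] = None,
    queue: Optional[str] = None,
    detached: bool = False,
) -> List[str]:
    """Assemble the argument list for bin/yarn-session.sh (illustrative helper)."""
    command = ["bin/yarn-session.sh"]
    if name is not None:
        command += ["-nm", name]  # name shown in the YARN UI
    if job_manager_memory_mb is not None:
        command += ["-jm", str(job_manager_memory_mb)]  # JobManager memory, MB
    if task_manager_memory_mb is not None:
        command += ["-tm", str(task_manager_memory_mb)]  # memory per TaskManager
    if queue is not None:
        command += ["-qu", queue]  # YARN queue name
    if detached:
        command.append("-d")  # keep the session running in the background
    return command


if __name__ == "__main__":
    # Reproduces the command used in this section.
    print(" ".join(build_yarn_session_command(name="yransession", detached=True)))
```

The returned list can be passed to `subprocess.run` on a machine where Flink and the Hadoop environment variables are set up.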
After startup, the output shows a web address
2022-06-11 09:21:23,670 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Submitting application master application_1654904280347_0002
2022-06-11 09:21:23,708 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl [] - Submitted application application_1654904280347_0002
2022-06-11 09:21:23,709 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Waiting for the cluster to be allocated
2022-06-11 09:21:23,711 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Deploying cluster, current state ACCEPTED
2022-06-11 09:21:37,398 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - YARN application has been deployed successfully.
2022-06-11 09:21:37,399 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Found Web Interface hadoop103:38675 of application 'application_1654904280347_0002'.
JobManager Web Interface: http://hadoop103:38675
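If you start sessions from a script, the web address can be pulled out of this output automatically. The sketch below assumes the output contains a line of the form `JobManager Web Interface: http://host:port`, as in the log above; the function name `find_web_interface` is made up for this example.

```python
import re
from typing import Optional

# Matches the final line printed by yarn-session.sh in the log above.
WEB_UI_PATTERN = re.compile(r"JobManager Web Interface:\s*(https?://\S+)")


def find_web_interface(log_text: str) -> Optional[str]:
    """Return the JobManager web address found in the log text, or None."""
    match = WEB_UI_PATTERN.search(log_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "2022-06-11 09:21:37,398 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - "
        "YARN application has been deployed successfully.\n"
        "JobManager Web Interface: http://hadoop103:38675\n"
    )
    print(find_web_interface(sample))
```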
Submit the job
[hui@hadoop103 flink-local]$ bin/flink run -c org.wdh01.wc.StreamWordCountNoArgs flink01-1.0-SNAPSHOT.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/flink-local/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hbase/lib/phoenix-4.14.0-HBase-1.3-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2022-06-11 09:26:19,249 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Found Yarn properties file under /tmp/.yarn-properties-hui.
2022-06-11 09:26:19,249 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Found Yarn properties file under /tmp/.yarn-properties-hui.
2022-06-11 09:26:20,171 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/opt/module/flink-local/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2022-06-11 09:26:20,372 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at hadoop104/192.168.124.132:8032
2022-06-11 09:26:20,626 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2022-06-11 09:26:20,759 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Found Web Interface hadoop103:38675 of application 'application_1654904280347_0002'.
Job has been submitted with JobID f2f0abcb070b90db0c2c50eb3c1145d5
The job has now been submitted successfully.
Check the YARN web UI
http://hadoop104:8088/cluster/apps/RUNNING
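The same information can also be read programmatically. The sketch below assumes the ResourceManager exposes its REST API on the same address as the web UI (`hadoop104:8088` here) and that the `/ws/v1/cluster/apps` endpoint returns JSON shaped like `{"apps": {"app": [...]}}`; check this against your Hadoop version. All function names are illustrative, and the network call is only defined, not executed.

```python
import json
from typing import List
from urllib.request import urlopen


def running_apps_url(rm_address: str) -> str:
    """Build the REST URL that lists RUNNING applications (assumed endpoint)."""
    return f"http://{rm_address}/ws/v1/cluster/apps?states=RUNNING"


def app_names(payload: dict) -> List[str]:
    """Extract application names from a decoded response body."""
    # "apps" may be null when nothing is running, so fall back to an empty dict.
    apps = payload.get("apps") or {}
    return [app["name"] for app in apps.get("app", [])]


def fetch_running_app_names(rm_address: str) -> List[str]:
    """Query the ResourceManager and return the names of running applications."""
    with urlopen(running_apps_url(rm_address), timeout=10) as response:
        return app_names(json.load(response))


if __name__ == "__main__":
    # Hand-written sample payload, not real cluster output.
    print(app_names({"apps": {"app": [{"name": "yransession"}]}}))
```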
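3. Per-job mode deployment
In a YARN environment an external platform handles resource scheduling, so we can also submit a single job directly to YARN, which starts a Flink cluster dedicated to that job.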
[hui@hadoop103 flink-local]$ bin/flink run -d -t yarn-per-job -c org.wdh01.wc.StreamWordCountNoArgs flink01-1.0-SNAPSHOT.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/flink-local/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hbase/lib/phoenix-4.14.0-HBase-1.3-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2022-06-11 09:43:58,346 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Found Yarn properties file under /tmp/.yarn-properties-hui.
2022-06-11 09:43:58,346 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Found Yarn properties file under /tmp/.yarn-properties-hui.
2022-06-11 09:43:59,205 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/opt/module/flink-local/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2022-06-11 09:43:59,382 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at hadoop104/192.168.124.132:8032
2022-06-11 09:43:59,622 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2022-06-11 09:43:59,832 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The configured JobManager memory is 1600 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 448 MB may not be used by Flink.
2022-06-11 09:43:59,834 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The configured TaskManager memory is 1728 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 320 MB may not be used by Flink.
2022-06-11 09:43:59,836 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Cluster specification: ClusterSpecification{masterMemoryMB=1600, taskManagerMemoryMB=1728, slotsPerTaskManager=1}
2022-06-11 09:44:06,142 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Submitting application master application_1654904280347_0003
2022-06-11 09:44:06,194 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl [] - Submitted application application_1654904280347_0003
2022-06-11 09:44:06,195 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Waiting for the cluster to be allocated
2022-06-11 09:44:06,198 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Deploying cluster, current state ACCEPTED
2022-06-11 09:44:19,827 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - YARN application has been deployed successfully.
2022-06-11 09:44:19,832 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The Flink YARN session cluster has been started in detached mode. In order to stop Flink gracefully, use the following command:
$ echo "stop" | ./bin/yarn-session.sh -id application_1654904280347_0003
If this should not be possible, then you can also kill Flink via YARN's web interface or via:
$ yarn application -kill application_1654904280347_0003
Note that killing Flink might not clean up all job artifacts and temporary files.
2022-06-11 09:44:19,849 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Found Web Interface hadoop104:50764 of application 'application_1654904280347_0003'.
Job has been submitted with JobID 62f46059f39e1704d25a616f2379b9fb
The log output shows the JobID and the web address.
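The log also explains why the containers are larger than the configured memory: YARN rounds each request up to a multiple of its minimum allocation. The arithmetic can be sketched as follows. This mirrors only what the log message describes; the function name `yarn_allocated_mb` is made up, and the exact rounding on a real cluster depends on the scheduler and its configuration.

```python
def yarn_allocated_mb(requested_mb: int, min_allocation_mb: int = 1024) -> int:
    """Round a memory request up to a multiple of YARN's minimum allocation.

    min_allocation_mb corresponds to 'yarn.scheduler.minimum-allocation-mb'
    (1024 MB in the log above).
    """
    if requested_mb <= 0 or min_allocation_mb <= 0:
        raise ValueError("memory sizes must be positive")
    # Ceiling division: number of minimum-allocation units needed.
    units = -(-requested_mb // min_allocation_mb)
    return units * min_allocation_mb


if __name__ == "__main__":
    for label, requested in (("JobManager", 1600), ("TaskManager", 1728)):
        allocated = yarn_allocated_mb(requested)
        print(f"{label}: requested {requested} MB, allocated {allocated} MB, "
              f"extra {allocated - requested} MB")
```

For the values in this walkthrough, 1600 MB and 1728 MB both round up to 2048 MB, leaving the 448 MB and 320 MB of extra memory that the log reports.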
4. Application mode deployment
[hui@hadoop103 flink-local]$ bin/flink run-application -t yarn-application -c org.wdh01.wc.StreamWordCountNoArgs flink01-1.0-SNAPSHOT.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/flink-local/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hbase/lib/phoenix-4.14.0-HBase-1.3-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2022-06-11 09:55:05,822 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Found Yarn properties file under /tmp/.yarn-properties-hui.
2022-06-11 09:55:05,822 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Found Yarn properties file under /tmp/.yarn-properties-hui.
2022-06-11 09:55:06,172 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/opt/module/flink-local/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2022-06-11 09:55:06,302 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at hadoop104/192.168.124.132:8032
2022-06-11 09:55:06,619 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2022-06-11 09:55:06,889 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The configured JobManager memory is 1600 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 448 MB may not be used by Flink.
2022-06-11 09:55:06,890 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The configured TaskManager memory is 1728 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 320 MB may not be used by Flink.
2022-06-11 09:55:06,890 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Cluster specification: ClusterSpecification{masterMemoryMB=1600, taskManagerMemoryMB=1728, slotsPerTaskManager=1}
2022-06-11 09:55:13,834 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Submitting application master application_1654904280347_0004
2022-06-11 09:55:13,906 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl [] - Submitted application application_1654904280347_0004
2022-06-11 09:55:13,906 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Waiting for the cluster to be allocated
2022-06-11 09:55:13,912 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Deploying cluster, current state ACCEPTED
2022-06-11 09:55:25,372 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - YARN application has been deployed successfully.
2022-06-11 09:55:25,373 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Found Web Interface hadoop104:52235 of application 'application_1654904280347_0004'.
http://hadoop104:8088/cluster/apps/RUNNING
http://hadoop104:8088/proxy/application_1654904280347_0004/#/overview
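The three submission commands used in sections 2 to 4 differ only in the `bin/flink` sub-command and target. As a compact summary, the sketch below builds each variant; the helper `build_flink_submit_command` and the mode labels `"session"`, `"per-job"` and `"application"` are names chosen for this example, while the flags are exactly those used in the commands above.

```python
from typing import List


def build_flink_submit_command(mode: str, main_class: str, jar_path: str) -> List[str]:
    """Return the bin/flink argument list for one of the three YARN modes."""
    if mode == "session":
        # Submits to the already running YARN session.
        prefix = ["run"]
    elif mode == "per-job":
        # Starts a dedicated cluster for this job, in detached mode.
        prefix = ["run", "-d", "-t", "yarn-per-job"]
    elif mode == "application":
        # Starts a dedicated cluster for this application.
        prefix = ["run-application", "-t", "yarn-application"]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return ["bin/flink", *prefix, "-c", main_class, jar_path]


if __name__ == "__main__":
    for mode in ("session", "per-job", "application"):
        command = build_flink_submit_command(
            mode, "org.wdh01.wc.StreamWordCountNoArgs", "flink01-1.0-SNAPSHOT.jar"
        )
        print(" ".join(command))
```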
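5. High availability in YARN mode
Add the following to yarn-site.xml to set the upper limit on JobManager restart attempts. Distribute the file to all nodes, then restart YARN.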
<property>
  <name>yarn.resourcemanager.am.max-attempts</name>
  <value>4</value>
  <description>The maximum number of application master execution attempts.</description>
</property>
Add the following to flink-conf.yaml:
yarn.application-attempts: 3
high-availability: zookeeper
high-availability.storageDir: hdfs://hadoop103:9820/flink/yarn/ha
high-availability.zookeeper.quorum: hadoop103:2181,hadoop104:2181,hadoop105:2181
high-availability.zookeeper.path.root: /flink-yarn
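The two attempt settings interact: the value in yarn-site.xml is the upper limit, so the Flink-side value only takes full effect when it does not exceed it. A minimal sketch of that relationship, assuming YARN simply caps the Flink setting (the function name `effective_application_attempts` is made up for illustration):

```python
def effective_application_attempts(flink_attempts: int, yarn_max_attempts: int) -> int:
    """Return the attempts that can actually happen under YARN's upper limit.

    flink_attempts    -> 'yarn.application-attempts' in the Flink configuration
    yarn_max_attempts -> 'yarn.resourcemanager.am.max-attempts' in yarn-site.xml
    """
    if flink_attempts < 1 or yarn_max_attempts < 1:
        raise ValueError("attempt counts must be at least 1")
    # Assumption: YARN's cluster-wide limit caps whatever Flink asks for.
    return min(flink_attempts, yarn_max_attempts)


if __name__ == "__main__":
    # Values from this section: Flink asks for 3, YARN allows up to 4.
    print(effective_application_attempts(3, 4))
```

With the values above, Flink's setting of 3 is below YARN's limit of 4, so it applies unchanged.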
After starting yarn-session, kill the JobManager and check whether it recovers.