Hands-On: Building Flink 1.10.1 from Source with Hadoop 3.0.0
For the full series of in-depth write-ups, follow the WeChat public account 大数据从业者:
https://mp.weixin.qq.com/s/saCIS5XCfTZisXlEeSHsuA
Pull the flink-1.10.1 source code directly from GitHub:
git clone -b release-1.10.1 https://github.com/apache/flink.git
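flink-1.10.1 contains 158 bug fixes and minor improvements over 1.10.0.
The Flink community strongly recommends that all users upgrade to 1.10.1.
Without further ado, let's start building flink-1.10.1 from source.
The build environment needs Maven and Java installed beforehand (the Flink 1.10 build documentation calls for Maven 3 and at least Java 8).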
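Before starting, it can help to confirm that the required tools are actually on the PATH. The sketch below is only an illustration: the helper name missing_tools is made up for this article, and git is included simply because the source is cloned with it.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every listed tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Check the tools this walkthrough relies on and report the result.
missing=$(missing_tools java mvn git)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "java, mvn and git are all available"
fi
```

The check only looks at whether the commands exist; it does not verify their versions.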
The simplest build command is:
mvn clean install -DskipTests
To speed up the build, you can also skip the tests, QA plugins, and documentation generation.
The command is:
mvn clean install -DskipTests -Dfast
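Both commands above ignore the case where Flink is used together with Hadoop.
To integrate with Hadoop, Flink must be given the Hadoop classes as a dependency.
There are two ways to do this:
Option 1: Add the Hadoop classpath to Flink. This takes little effort and is convenient, but Hadoop pulls in many dependencies, and some of those jars may conflict with Flink's own classes.
Option 2: Package the Hadoop classes into a jar and put it in the flink/lib folder.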
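Option 1 is usually done through the HADOOP_CLASSPATH environment variable, filled from the output of `hadoop classpath`. A minimal sketch, assuming a `hadoop` command is installed on the machine that starts Flink; the function name export_hadoop_classpath is made up for illustration.

```shell
#!/bin/sh
# Sketch of option 1: expose the Hadoop classes to Flink via HADOOP_CLASSPATH.
# Returns 1 without changing anything when no `hadoop` command is available.
export_hadoop_classpath() {
    if ! command -v hadoop >/dev/null 2>&1; then
        echo "hadoop command not found; HADOOP_CLASSPATH left unchanged" >&2
        return 1
    fi
    HADOOP_CLASSPATH=$(hadoop classpath)
    export HADOOP_CLASSPATH
}

# Print the resulting classpath only if it could be determined.
if export_hadoop_classpath; then
    echo "HADOOP_CLASSPATH=$HADOOP_CLASSPATH"
fi
```

The variable has to be set in the shell that launches the Flink scripts, so in practice the function would be called from that shell or from a profile script.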
The Flink project publishes pre-bundled jars for several Hadoop versions:
Pre-bundled Hadoop 2.4.1: https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.4.1-10.0/flink-shaded-hadoop-2-uber-2.4.1-10.0.jar
Pre-bundled Hadoop 2.6.5: https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.6.5-10.0/flink-shaded-hadoop-2-uber-2.6.5-10.0.jar
Pre-bundled Hadoop 2.7.5: https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.7.5-10.0/flink-shaded-hadoop-2-uber-2.7.5-10.0.jar
Pre-bundled Hadoop 2.8.3: https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.8.3-10.0/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
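The four links above follow one naming pattern, so the URL can be derived from the Hadoop version and the flink-shaded version. The sketch below only rebuilds the URLs already listed; the function name prebundled_hadoop_url is hypothetical, and the download itself is left commented out.

```shell
#!/bin/sh
# Hypothetical helper: build the Maven Central URL of a pre-bundled Hadoop jar.
#   $1 = Hadoop version (e.g. 2.8.3), $2 = flink-shaded version (e.g. 10.0)
prebundled_hadoop_url() {
    base="https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber"
    echo "$base/$1-$2/flink-shaded-hadoop-2-uber-$1-$2.jar"
}

url=$(prebundled_hadoop_url 2.8.3 10.0)
echo "$url"
# To download it (run from inside flink/lib; commented out to avoid side effects):
# curl -fLO "$url"
```

The pattern is only known to hold for the versions listed above; other combinations may simply not exist in the repository.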
If the Hadoop version you need is not listed here, you have to build the jar yourself.
That requires a separate project, flink-shaded:
git clone -b release-9.0 https://github.com/apache/flink-shaded.git
Note: be sure to use 9.0 here, because flink-1.10.1 depends on flink-shaded 9.0.
The version can be checked or changed in flink-release-1.10.1\flink-release-1.10.1\pom.xml:
<flink.shaded.version>9.0</flink.shaded.version>
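Rather than opening pom.xml by hand, the property can be read with a one-line sed expression. This is a rough sketch, not a real XML parser: it assumes the property sits on a single line as shown above, and the function name flink_shaded_version is made up. The example runs against a throwaway file so that it works without a Flink checkout.

```shell
#!/bin/sh
# Hypothetical helper: print the value of <flink.shaded.version> from a pom.xml.
# Assumes the opening and closing tags are on the same line.
flink_shaded_version() {
    sed -n 's|.*<flink\.shaded\.version>\(.*\)</flink\.shaded\.version>.*|\1|p' "$1" | head -n 1
}

# Demonstrate on a temporary file that mimics the relevant part of flink's pom.xml.
tmp_pom=$(mktemp)
printf '%s\n' '<properties>' '  <flink.shaded.version>9.0</flink.shaded.version>' '</properties>' > "$tmp_pom"
flink_shaded_version "$tmp_pom"
rm -f "$tmp_pom"
```

Against a real checkout the call would be flink_shaded_version pom.xml, run from the root of the flink source tree.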
Build and package flink-shaded with:
mvn clean install -Dhadoop.version=3.0.0
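Note: to build against a vendor-specific Hadoop distribution, add -Pvendor-repos. This assumes the vendor's repository has already been added to your Maven configuration; see https://maven.apache.org/guides/mini/guide-multiple-repositories.html
For example: mvn clean install -Pvendor-repos -Dhadoop.version=3.0.0-cdh6.2.0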
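The two flink-shaded commands above differ only in the vendor profile, so a small wrapper can pick the right one. This is a sketch under one loud assumption: a vendor build is recognised by a hyphenated suffix in the version string, such as -cdh6.2.0. That rule and the function name shaded_build_cmd are inventions for illustration, not anything defined by flink-shaded. The function only prints the command; it does not run Maven.

```shell
#!/bin/sh
# Hypothetical helper: print the flink-shaded build command for a Hadoop version.
# Assumption: a vendor build is recognised by a hyphenated suffix such as -cdh6.2.0.
shaded_build_cmd() {
    case "$1" in
        *-*) echo "mvn clean install -Pvendor-repos -Dhadoop.version=$1" ;;
        *)   echo "mvn clean install -Dhadoop.version=$1" ;;
    esac
}

shaded_build_cmd 3.0.0
shaded_build_cmd 3.0.0-cdh6.2.0
```

Printing instead of executing makes it easy to review the command before running it in the flink-shaded checkout.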
After these steps, copy the resulting jar, flink-shaded-hadoop-2-uber-3.0.0-9.0.jar, into the flink/lib folder.
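The copy step can be scripted so that a missing jar is reported instead of silently skipped. A minimal sketch: both function names are hypothetical, the paths in the example call are placeholders, and the location under flink-shaded-hadoop-2-uber/target follows the install log shown in this article.

```shell
#!/bin/sh
# Hypothetical helpers; names and argument order are assumptions for illustration.
shaded_uber_jar_name() {
    # $1 = Hadoop version, $2 = flink-shaded version
    echo "flink-shaded-hadoop-2-uber-$1-$2.jar"
}

install_shaded_jar() {
    # $1 = flink-shaded checkout, $2 = Flink directory containing lib/
    # $3 = Hadoop version, $4 = flink-shaded version
    jar="$1/flink-shaded-hadoop-2-uber/target/$(shaded_uber_jar_name "$3" "$4")"
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    cp "$jar" "$2/lib/"
}

shaded_uber_jar_name 3.0.0 9.0
# Example call (paths are placeholders):
# install_shaded_jar /path/to/flink-shaded /path/to/flink 3.0.0 9.0
```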
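If flink is built with -Dhadoop.version=3.0.0 before flink-shaded has been installed into the local Maven repository, the build fails because flink-shaded-hadoop-2 3.0.0-9.0 cannot be resolved:
[INFO] ------------------------------------------------------------------------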
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:56 min
[INFO] Finished at: 2020-06-01T18:28:42+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project flink-hadoop-fs: Could not resolve dependencies for project org.apache.flink:flink-hadoop-fs:jar:1.10.1: Could not find artifact org.apache.flink:flink-shaded-hadoop-2:jar:3.0.0-9.0 in nexus (http://maven.aliyun.com/nexus/content/groups/public/) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :flink-hadoop-fs
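The failure above comes down to one artifact missing from the local Maven repository, which can be checked before starting the long flink build. The sketch below relies on the standard Maven repository layout (group path, artifact id, version); the function name shaded_hadoop_installed is made up, and the default repository location ~/.m2/repository is an assumption that does not hold if settings.xml points elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: succeed if flink-shaded-hadoop-2 is in a local Maven repository.
#   $1 = repository root (e.g. ~/.m2/repository), $2 = artifact version (e.g. 3.0.0-9.0)
shaded_hadoop_installed() {
    [ -f "$1/org/apache/flink/flink-shaded-hadoop-2/$2/flink-shaded-hadoop-2-$2.jar" ]
}

# Assumes the default local repository location.
if shaded_hadoop_installed "$HOME/.m2/repository" 3.0.0-9.0; then
    echo "flink-shaded-hadoop-2 3.0.0-9.0 is installed locally"
else
    echo "flink-shaded-hadoop-2 3.0.0-9.0 is missing; build flink-shaded first"
fi
```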
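A successful flink-shaded build installs its jars into the local Maven repository, and the log ends like this:
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ flink-shaded-hadoop-2-uber ---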
[INFO] Installing /root/Desktop/sourceCodes/flink-shaded-release-9.0/flink-shaded-hadoop-2-uber/target/flink-shaded-hadoop-2-uber-3.0.0-9.0.jar to /root/.m2/repository/org/apache/flink/flink-shaded-hadoop-2-uber/3.0.0-9.0/flink-shaded-hadoop-2-uber-3.0.0-9.0.jar
[INFO] Installing /root/Desktop/sourceCodes/flink-shaded-release-9.0/flink-shaded-hadoop-2-uber/target/dependency-reduced-pom.xml to /root/.m2/repository/org/apache/flink/flink-shaded-hadoop-2-uber/3.0.0-9.0/flink-shaded-hadoop-2-uber-3.0.0-9.0.pom
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] flink-shaded 9.0 ................................... SUCCESS [ 3.026 s]
[INFO] flink-shaded-force-shading ......................... SUCCESS [ 0.903 s]
[INFO] flink-shaded-asm-7 7.1-9.0 ......................... SUCCESS [ 1.076 s]
[INFO] flink-shaded-guava-18 18.0-9.0 ..................... SUCCESS [ 1.928 s]
[INFO] flink-shaded-netty-4 4.1.39.Final-9.0 .............. SUCCESS [ 4.613 s]
[INFO] flink-shaded-netty-tcnative-dynamic 2.0.25.Final-9.0 SUCCESS [ 0.915 s]
[INFO] flink-shaded-jackson-parent 2.10.1-9.0 ............. SUCCESS [ 0.108 s]
[INFO] flink-shaded-jackson-2 2.10.1-9.0 .................. SUCCESS [ 1.929 s]
[INFO] flink-shaded-jackson-module-jsonSchema-2 2.10.1-9.0 SUCCESS [ 1.529 s]
[INFO] flink-shaded-hadoop-2 3.0.0-9.0 .................... SUCCESS [ 28.785 s]
[INFO] flink-shaded-hadoop-2-uber 3.0.0-9.0 ............... SUCCESS [ 57.217 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:42 min
[INFO] Finished at: 2020-06-01T18:37:42+08:00
[INFO] ------------------------------------------------------------------------
With flink-shaded installed, build flink itself against Hadoop 3.0.0:
mvn clean install -DskipTests -Dfast -Dhadoop.version=3.0.0
The tail of the build log:
[INFO] another-dummy-fs ................................... SUCCESS [ 0.336 s]
[INFO] flink-tpch-test .................................... SUCCESS [ 4.647 s]
[INFO] flink-streaming-kinesis-test ....................... SUCCESS [ 20.212 s]
[INFO] flink-elasticsearch7-test .......................... SUCCESS [ 6.283 s]
[INFO] flink-end-to-end-tests-common-kafka ................ SUCCESS [ 3.538 s]
[INFO] flink-tpcds-test ................................... SUCCESS [ 0.910 s]
[INFO] flink-statebackend-heap-spillable .................. SUCCESS [ 0.594 s]
[INFO] flink-contrib ...................................... SUCCESS [ 0.063 s]
[INFO] flink-connector-wikiedits .......................... SUCCESS [ 18.538 s]
[INFO] flink-yarn-tests ................................... SUCCESS [01:11 min]
[INFO] flink-fs-tests ..................................... SUCCESS [ 1.067 s]
[INFO] flink-docs ......................................... SUCCESS [ 5.033 s]
[INFO] flink-ml-parent .................................... SUCCESS [ 0.056 s]
[INFO] flink-ml-api ....................................... SUCCESS [ 0.412 s]
[INFO] flink-ml-lib ....................................... SUCCESS [01:05 min]
[INFO] flink-walkthroughs ................................. SUCCESS [ 0.066 s]
[INFO] flink-walkthrough-common ........................... SUCCESS [ 1.145 s]
[INFO] flink-walkthrough-table-java ....................... SUCCESS [ 0.205 s]
[INFO] flink-walkthrough-table-scala ...................... SUCCESS [ 0.254 s]
[INFO] flink-walkthrough-datastream-java .................. SUCCESS [ 0.273 s]
[INFO] flink-walkthrough-datastream-scala 1.10.1 .......... SUCCESS [ 0.165 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 55:02 min
[INFO] Finished at: 2020-06-02T16:13:57+08:00
[INFO] ------------------------------------------------------------------------