【Flink Series 19】Resolving NoSuchMethodError from a Hadoop Dependency Conflict in Flink Jobs
Problem
Submitting a Flink job fails immediately with:
```
java.lang.NoSuchMethodError: org.apache.hadoop.tracing.TraceUtils.wrapHadoopConf(Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/htrace/core/HTraceConfiguration;
    at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:689)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:673)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:155)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.<init>(ChRootedFileSystem.java:103)
    at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.getTargetFileSystem(ViewFileSystem.java:173)
    at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.getTargetFileSystem(ViewFileSystem.java:167)
    at org.apache.hadoop.fs.viewfs.InodeTree.createLink(InodeTree.java:261)
    at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:333)
    at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:167)
    at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:167)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:172)
    at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:770)
    at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:593)
    at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:458)
    at org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer.run(ApplicationClusterDeployer.java:67)
    at org.apache.flink.client.cli.CliFrontend.runApplication(CliFrontend.java:213)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1057)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
```
Suspect library
- hbase-shaded-client-1.4.3.jar
Analysis
The relevant source of TraceUtils in hadoop-common 2.7.x:
```java
package org.apache.hadoop.tracing;

import java.util.Collections;
import java.util.List;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.tracing.SpanReceiverInfo.ConfigurationPair;
import org.apache.htrace.HTraceConfiguration; // htrace 3.x: no ".core" in the package

/**
 * This class provides utility functions for tracing.
 */
@InterfaceAudience.Private
public class TraceUtils {
  private static List<ConfigurationPair> EMPTY = Collections.emptyList();

  public static HTraceConfiguration wrapHadoopConf(final String prefix,
      final Configuration conf) {
    return wrapHadoopConf(prefix, conf, EMPTY);
  }
  // three-argument overload elided
}
```
Compare this with the descriptor in the error above. At first glance the method seems to exist; the real problem is not a missing method but the package of the return type:
NoSuchMethodError descriptor | Actual method in 2.7.x TraceUtils |
---|---|
`wrapHadoopConf(Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/htrace/core/HTraceConfiguration;` | `public static HTraceConfiguration wrapHadoopConf(final String prefix, final Configuration conf)` |
Expected return type: `org.apache.htrace.core.HTraceConfiguration` | Actual return type: `org.apache.htrace.HTraceConfiguration` |
Hadoop distribution differences:
- Apache Hadoop Common 2.6.x does not ship the TraceUtils class; it first appears in 2.7.x.
- hadoop-common 2.6.0-cdh5.12.1 (CDH) does ship this class.
TraceUtils in 2.6.0-cdh5.12.1 imports the htrace 4.x class:

`import org.apache.htrace.core.HTraceConfiguration;`

TraceUtils in 2.7.x imports the htrace 3.x class:

`import org.apache.htrace.HTraceConfiguration;`
So the failure mode is: FsTracer from hadoop-common 2.6.0-cdh5.12.1 was compiled against the CDH TraceUtils, whose wrapHadoopConf returns org.apache.htrace.core.HTraceConfiguration, but at runtime the classloader picks up the 2.7.x TraceUtils from another jar on the classpath (here, hbase-shaded-client), whose wrapHadoopConf returns org.apache.htrace.HTraceConfiguration. The JVM resolves a call by the full method descriptor, return type included, so linkage fails with NoSuchMethodError even though a method with the same name and parameter types is present.
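Conflicts like this are easier to catch at build time than from a stack trace. One option is the maven-enforcer-plugin with the banDuplicateClasses rule from extra-enforcer-rules, which fails the build when two jars supply the same class (e.g. two copies of TraceUtils). A minimal sketch; both version numbers are illustrative:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <version>3.0.0</version> <!-- illustrative -->
  <dependencies>
    <dependency>
      <!-- provides the banDuplicateClasses rule -->
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>extra-enforcer-rules</artifactId>
      <version>1.5.1</version> <!-- illustrative -->
    </dependency>
  </dependencies>
  <executions>
    <execution>
      <id>ban-duplicate-classes</id>
      <goals><goal>enforce</goal></goals>
      <configuration>
        <rules>
          <banDuplicateClasses>
            <!-- report every duplicated class, not just the first -->
            <findAllDuplicates>true</findAllDuplicates>
          </banDuplicateClasses>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```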
Solutions
Category 1: start from the jars and exclude dependencies by hand
- Option 1: remove the 2.7.x hadoop-common jar, or any library that shades it.
- Option 2: add the 2.7.x hadoop-hdfs, which will pull in the matching 2.7.x hadoop-common (or a library that shades it), so both sides agree on the version.
- Option 3: do not use hbase-shaded-client; it bundles Hadoop dependencies inside and conflicts easily with the cluster. Switch to hbase-client instead (see the pom sketch after this list).
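For option 3, a minimal pom.xml sketch. The HBase version is illustrative, and which transitive Hadoop artifacts need excluding depends on your HBase version (mvn dependency:tree will list them):

```xml
<!-- Swap hbase-shaded-client for hbase-client and keep the cluster's
     Hadoop jars authoritative by excluding the transitive copies. -->
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>1.4.3</version> <!-- illustrative; match your HBase -->
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```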
Category 2: precisely exclude the bytecode with the packaging tool
- Use the maven-shade-plugin's filters to strip all of the hadoop and htrace classes bundled in hbase-shaded-client, keeping only the HBase classes and resources:
```xml
<filters>
  <filter>
    <artifact>org.apache.hbase:hbase-shaded-client</artifact>
    <includes>
      <include>META-INF/**</include>
      <include>org/apache/hadoop/hbase/**</include>
      <include>hbase-default.xml</include>
    </includes>
  </filter>
  <filter>
    <!-- Do not copy the signatures in the META-INF folder.
         Otherwise, this might cause SecurityExceptions when using the JAR. -->
    <artifact>*:*</artifact>
    <excludes>
      <exclude>META-INF/*.SF</exclude>
      <exclude>META-INF/*.DSA</exclude>
      <exclude>META-INF/*.RSA</exclude>
    </excludes>
  </filter>
</filters>
```
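For context, this <filters> element sits inside the maven-shade-plugin configuration. A minimal sketch of the surrounding plugin block (version illustrative):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.2.4</version> <!-- illustrative -->
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!-- the <filters> block shown above goes here -->
      </configuration>
    </execution>
  </executions>
</plugin>
```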
Q&A
Why go to so much trouble to fix this?
- Because users running a flink-shaded-hadoop-2-uber-xxxx-xxx jar will hit the HDFS-9276 bug whenever the bundled hadoop-hdfs is older than 2.8.2.
- And Flink stopped updating the flink-shaded-hadoop-2-uber jar as of 1.11+, so the bundled Hadoop can never move past that point (see the sketch below).
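With the uber jar gone, the replacement recommended by the Flink docs is to export HADOOP_CLASSPATH=$(hadoop classpath) on the submitting machine and compile against Hadoop in provided scope, so no Hadoop classes land in the user jar at all. A minimal sketch; the artifact and version property are placeholders to be matched to your cluster's Hadoop:

```xml
<!-- Compile against the cluster's Hadoop without bundling it.
     At runtime the classes come from HADOOP_CLASSPATH instead. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>${hadoop.version}</version> <!-- match your cluster -->
  <scope>provided</scope>
</dependency>
```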