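How to use the Protobuf 3 package in Spark and Flink applications

If you use the Protobuf 3 package in a Spark or Flink application, submitting the job may fail with an exception like the following, because Spark uses the 2.5 version of the package by default: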

com.google.protobuf.CodedInputStream.readStringRequireUtf8()Ljava/lang/String;

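For Spark, you can work around this by setting SPARK_CLASSPATH or by specifying

--conf spark.executor.extraClassPath

to put the Protobuf 3 JAR on the executor classpath. While debugging a Flink program today, I found that there is another solution: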

https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html

If the uber JAR is reused as a dependency of some other project, directly including classes from the artifact's dependencies in the uber JAR can cause class loading conflicts due to duplicate classes on the class path. To address this issue, one can relocate the classes which get included in the shaded artifact in order to create a private copy of their bytecode:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.1.0</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <relocation>
                <pattern>com.google.protobuf</pattern>
                <shadedPattern>shaded.com.google.protobuf</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
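
To check whether the relocation took effect, it can help to print which JAR a given class is actually loaded from. The following is only a minimal sketch: the class name ClassOrigin and the helper originOf are hypothetical names made up for this illustration, not part of Spark, Flink, or Protobuf. Class names are read from the command line rather than hard-coded, on the assumption that the shade plugin may also rewrite matching string constants inside shaded classes.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not from the original post): reports where a class was loaded from.
class ClassOrigin {

    /** Returns the location (usually a JAR URL) the given class was loaded from. */
    static String originOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Core JDK classes such as java.lang.String have no code source.
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Looks the class up by name, so this file compiles without protobuf on the classpath. */
    static String originOf(String className) {
        try {
            return originOf(Class.forName(className));
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Example arguments: com.google.protobuf.CodedInputStream
        //                    shaded.com.google.protobuf.CodedInputStream
        for (String name : args) {
            System.out.println(name + " -> " + originOf(name));
        }
    }
}
```

Running it with the uber JAR on the classpath and both the original and the relocated class name as arguments should show whether the relocated copy is present. Calling originOf from inside the job itself would show what the cluster classpath resolves to.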

 

posted @ 2017-08-29 15:15  静若清池