Building Spark 1.5.2 without Hive using IDEA + Maven

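Before building, it is worth reading the official documentation: http://spark.apache.org/docs/1.5.0/building-spark.html

Reference: http://mmicky.blog.163.com/blog/static/1502901542014312101657612/

After repeated testing and discussion with others, I have confirmed that Spark 1.5 has compatibility problems with Hive 1.2.1. If you need Hive, build Spark 1.4.1 instead; the steps are identical.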

======================= Build steps =======================

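Notes:

   This post builds from within IDEA, but I still recommend running the build on the command line; packages built that way do not have problems;

   Scala 2.10 is recommended, since Spark's support for 2.11 is currently poor. If you do want 2.11, follow the official documentation and run dev/change-version-to-2.11.sh first;

   Only Maven 3.3.3 or later is currently supported;

Command:    mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Dscala-2.10 -DskipTests clean package -e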
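The Maven requirement in the notes above can be checked before starting a long build. The following is a minimal sketch rather than part of the original procedure. It assumes mvn is on the PATH, that sort -V (GNU version sort) is available, and that the first line of `mvn -version` has the form "Apache Maven X.Y.Z ...". The helper name version_ge is made up for illustration.

```shell
#!/bin/sh
# Minimum Maven version stated in the notes above.
REQUIRED_MVN="3.3.3"

# version_ge A B: succeeds when version A >= version B.
# Relies on `sort -V` (version sort) to order the two strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v mvn >/dev/null 2>&1; then
  # Assumption: the first line looks like "Apache Maven 3.3.3 (...)".
  current="$(mvn -version 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')"
  if version_ge "$current" "$REQUIRED_MVN"; then
    echo "Maven $current is new enough"
  else
    echo "Maven '$current' is older than $REQUIRED_MVN" >&2
  fi
else
  echo "mvn not found on PATH" >&2
fi
```

Comparing with a version sort rather than a plain string comparison matters for versions such as 3.10.0, which would otherwise sort before 3.3.3.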

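【Step 1】Import the Spark source project into IDEA; during the import, set the Maven version that IDEA should use by default

【Step 2】Select the profiles you need

【Step 3】Edit pom.xml in the source root directory and set the correct versions

Note: you can also edit the modules list so that only the modules you want are built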

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    <akka.group>com.typesafe.akka</akka.group>
    <akka.version>2.3.11</akka.version>
    <java.version>1.8</java.version>
    <maven.version>3.3.3</maven.version>
    <sbt.project.name>spark</sbt.project.name>
    <mesos.version>0.21.1</mesos.version>
    <mesos.classifier>shaded-protobuf</mesos.classifier>
    <slf4j.version>1.7.10</slf4j.version>
    <log4j.version>1.2.17</log4j.version>
    <hadoop.version>2.6.0</hadoop.version>
    <protobuf.version>2.5.0</protobuf.version>
    <yarn.version>${hadoop.version}</yarn.version>
    <hbase.version>0.98.7-hadoop2</hbase.version>
    <hbase.artifact>hbase</hbase.artifact>
    <flume.version>1.6.0</flume.version>
    <zookeeper.version>3.4.5</zookeeper.version>
    <curator.version>2.4.0</curator.version>
    <hive.group>org.spark-project.hive</hive.group>
    
    <hive.version>1.2.1.spark</hive.version>

【Step 4】Open a terminal, change to the source directory, and run:

mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Dscala-2.10 -DskipTests clean package -e

【Step 5】Wait for the build to finish
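The official build documentation linked at the top also recommends giving Maven more memory through MAVEN_OPTS before running the command in step 4. The wrapper below is a sketch under a few assumptions: SPARK_SRC and RUN_BUILD are hypothetical variable names introduced here, and the script only prints the command unless RUN_BUILD=1 is set.

```shell
#!/bin/sh
# Memory settings recommended by the Spark 1.5 build documentation.
# On Java 8 the MaxPermSize option is ignored (the JVM prints a warning).
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"

# Same arguments as the command in step 4.
BUILD_ARGS="-Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Dscala-2.10 -DskipTests clean package -e"

# Hypothetical knobs for this sketch: where the sources live and
# whether to actually start the build.
SPARK_SRC="${SPARK_SRC:-$PWD}"

if [ "${RUN_BUILD:-0}" = "1" ]; then
  # BUILD_ARGS is left unquoted on purpose so it splits into separate arguments.
  (cd "$SPARK_SRC" && mvn $BUILD_ARGS)
else
  echo "dry run: cd $SPARK_SRC && mvn $BUILD_ARGS"
fi
```

Running it as `RUN_BUILD=1 SPARK_SRC=/path/to/spark sh build.sh` (the script name is arbitrary) starts the real build; without RUN_BUILD it is a harmless way to double-check the flags.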

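======================= Supplement: other ways to build =======================

Note: the build succeeded for me this way, but the resulting package was missing some things, so use your own judgment

【1】Open the Maven tool window in IDEA

【2】Once it is open, adjust the profiles to suit your needs

【3】Click the third button from the end to skip tests; click the last button to change Maven settings

Note: adding "http://maven.oschina.net/content/groups/public/", a mirror hosted in China, should speed up downloads; thanks to oschina

【4】Select Parent Pom to build all packages (I recommend running clean first and then package)

After selecting package, click the fifth button

【5】You can also select a single module and build only that one

【6】You can also click the run button at the top right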
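Building a single module, as in item 【5】, also has a command-line equivalent: the official build documentation uses Maven's -pl option for this, with spark-streaming_2.10 as its example module. The sketch below wraps that idea in a hypothetical helper, build_module. Adding -am (also build the modules it depends on) and -DskipTests is a choice made for this sketch, and RUN_BUILD is again a made-up switch that keeps the script in dry-run mode by default.

```shell
#!/bin/sh
# build_module ARTIFACT_ID: build one Spark module plus the modules it depends on.
# Prints the command unless RUN_BUILD=1 is set in the environment.
build_module() {
  module="$1"
  cmd="mvn -pl :$module -am -DskipTests clean package"
  if [ "${RUN_BUILD:-0}" = "1" ]; then
    $cmd   # unquoted on purpose: split into separate arguments
  else
    echo "$cmd"
  fi
}

# Module name taken from the example in the official build documentation.
build_module spark-streaming_2.10
```

For a result consistent with the full build, the same -P and -D flags used in the main command would presumably need to be added to cmd as well.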

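======================= Open issues =======================

1. Try the other build methods when I have time

2. How to assemble the built packages into one complete distribution still needs investigation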
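For item 2, the official build documentation linked at the top describes a make-distribution.sh script in the source root that builds Spark and lays the result out as a runnable distribution. It has not been tested against the setup in this post, so the following is only a sketch: the --name value custom-spark is the one used in the official example, the profile flags are copied from the build command above, and RUN_BUILD is a hypothetical switch so that nothing runs by accident.

```shell
#!/bin/sh
# Distribution command modelled on the example in the Spark 1.5 build docs,
# with the profiles used earlier in this post. Extra flags are passed on to Maven.
DIST_CMD="./make-distribution.sh --name custom-spark --tgz -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0"

# Only run when explicitly requested and when the script is present
# in the current directory (the Spark source root).
if [ "${RUN_BUILD:-0}" = "1" ] && [ -x ./make-distribution.sh ]; then
  $DIST_CMD   # unquoted on purpose: split into separate arguments
else
  echo "dry run: $DIST_CMD"
fi
```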

======================= Build error with -Dscala-2.11 =======================

Symptom:

[ERROR] missing or invalid dependency detected while loading class file 'WebUI.class'.
Could not access term eclipse in package org,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'WebUI.class' was compiled against an incompatible version of org.
[ERROR] missing or invalid dependency detected while loading class file 'WebUI.class'.
Could not access term jetty in value org.eclipse,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'WebUI.class' was compiled against an incompatible version of org.eclipse.

Root cause:

Spark's JDBC component does not support Scala 2.11. For details see http://spark.apache.org/docs/1.5.0/building-spark.html , which states: "Spark does not yet support its JDBC component for Scala 2.11."

Solution:

Build with Scala 2.10, or simply remove "-Dscala-2.11" from the command
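Before passing -Dscala-2.10 or -Dscala-2.11, it can help to confirm which versions the source tree is actually set up for by reading them from the root pom.xml. A minimal sketch: pom_property is a hypothetical helper, the sed pattern only handles properties written on a single line, and the presence of a scala.binary.version property in the pom is an assumption, since the excerpt in step 3 does not show it.

```shell
#!/bin/sh
# pom_property FILE NAME: print the value of the first <NAME>...</NAME>
# element found on a single line of FILE. This is plain text matching,
# not real XML parsing.
pom_property() {
  sed -n "s:.*<$2>\(.*\)</$2>.*:\1:p" "$1" | head -n 1
}

# Run from the Spark source root; does nothing if pom.xml is absent.
if [ -f pom.xml ]; then
  echo "hadoop.version       = $(pom_property pom.xml hadoop.version)"
  echo "scala.binary.version = $(pom_property pom.xml scala.binary.version)"
fi
```

If the second line prints nothing, the assumption about the property name does not hold for that source tree.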

posted @ 2015-11-12 15:06 yifan888
