Running the Spark 2.1.1 Example Code


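Run Eclipse with administrator privileges.

This walkthrough uses JavaSparkHiveExample as the example.

Package: org.apache.spark.examples.sql

Setting up the code environment

Figure 1: Create a new Maven project named spark2.1.1example

Change the JDK version: uncheck Enable project specific settings.

Change the JDK library to 1.8: select JRE System Library [J2SE-1.5], click Remove, then click Add Library / JRE System Library.

After the change:

Replace the contents of pom.xml with: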

 

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>spark2.1.1example</groupId>
  <artifactId>spark2.1.1example</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <properties>
    <!-- Matches the JDK 1.8 library configured in Eclipse -->
    <java.version>1.8</java.version>
  </properties>
  <dependencies>

    <!-- Scala-suffixed artifacts use _2.11 throughout, the Scala version of the prebuilt Spark 2.1.1 distribution -->

    <!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka_2.11 -->
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.11</artifactId>
      <version>0.10.2.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-flume_2.11 -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-flume_2.11</artifactId>
      <version>2.1.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka_2.11 -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.11</artifactId>
      <version>1.6.3</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10_2.11 -->
    <!--<dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
      <version>2.1.1</version>
    </dependency> -->

    <!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
    <dependency>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
      <version>5.1.35</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper -->
    <dependency>
      <groupId>org.apache.zookeeper</groupId>
      <artifactId>zookeeper</artifactId>
      <version>3.4.8</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql-kafka-0-10_2.11 -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
      <version>2.1.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flume/flume-ng-embedded-agent -->
    <dependency>
      <groupId>org.apache.flume</groupId>
      <artifactId>flume-ng-embedded-agent</artifactId>
      <version>1.6.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.derby/derby -->
    <dependency>
      <groupId>org.apache.derby</groupId>
      <artifactId>derby</artifactId>
      <version>10.13.1.1</version>
    </dependency>

  </dependencies>
  <build>
    <sourceDirectory>src</sourceDirectory>
    <plugins>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.1</version>
        <configuration>
          <!-- Compile for the Java version declared in <properties> -->
          <source>${java.version}</source>
          <target>${java.version}</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

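Download spark-2.1.1-bin-hadoop2.7.tgz

http://spark.apache.org/downloads.html

Unpack spark-2.1.1-bin-hadoop2.7.tgz.

Figure 2: Create a new User Library

Click to select spark2.1.1jars, then click Add External JARs.

Open the directory you just unpacked.

Figure 3: Add all files under jars

Figure 4: Add all files under examples/jars

Figure 5: Libraries now contains three entries: the JDK, Maven, and spark2.1.1jars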

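Download winutils

https://github.com/steveloughran/winutils

Only the hadoop-2.7.1 part is needed.

Figure 6: After unpacking

Figure 7: Right-click and choose Run As / Java Application

Ignore the error. This step only creates the run configuration; once the run configuration is modified in the next step, the error goes away.

Figure 8: Right-click and choose Run As / Run Configurations

Figure 9: Switch to the Environment tab

Figure 10: Create a new HADOOP_HOME variable pointing to yourdir\winutils-master\hadoop-2.7.1

Figure 11: Select Replace native environment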
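
As a quick sanity check before rerunning the example, the HADOOP_HOME setting can be verified with a few lines of plain Java. This is a minimal sketch and not part of the Spark examples: the class name HadoopHomeCheck and its methods are hypothetical, and it assumes that the winutils download keeps winutils.exe in a bin subdirectory of hadoop-2.7.1, which is where Hadoop looks for it on Windows.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: checks the HADOOP_HOME value seen by the launched JVM. */
final class HadoopHomeCheck {

    /** Builds the expected location of winutils.exe: <hadoopHome>/bin/winutils.exe. */
    static Path winutilsPath(String hadoopHome) {
        return Paths.get(hadoopHome, "bin", "winutils.exe");
    }

    /** True only if hadoopHome is non-empty and bin/winutils.exe exists beneath it. */
    static boolean isUsable(String hadoopHome) {
        return hadoopHome != null
                && !hadoopHome.isEmpty()
                && Files.isRegularFile(winutilsPath(hadoopHome));
    }

    public static void main(String[] args) {
        // Reads the variable exactly as the run configuration passes it to the JVM.
        String hadoopHome = System.getenv("HADOOP_HOME");
        if (isUsable(hadoopHome)) {
            System.out.println("HADOOP_HOME looks usable: " + winutilsPath(hadoopHome));
        } else {
            System.out.println("HADOOP_HOME is unset or has no bin/winutils.exe: " + hadoopHome);
        }
    }
}
```

Running this class with the same run configuration as the example shows whether the variable set on the Environment tab actually reaches the JVM.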

Under the project, create the nested directory

examples/src/main/resources

Figure 12: Copy the files from this directory into the newly created directory
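
The directory creation and copy can also be scripted instead of done by hand. The following is a rough sketch under stated assumptions: the class ExampleResources and its methods are hypothetical, the source directory is passed in as an argument (presumably the examples/src/main/resources folder of the unpacked Spark distribution, though the figure is not reproduced here), and only regular files directly inside it are copied.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: prepares examples/src/main/resources inside the project. */
final class ExampleResources {

    /** Creates <projectRoot>/examples/src/main/resources, including missing parents. */
    static Path createResourceDir(Path projectRoot) {
        Path target = projectRoot.resolve(Paths.get("examples", "src", "main", "resources"));
        try {
            return Files.createDirectories(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Copies the regular files directly inside sourceDir into targetDir; returns the count. */
    static int copyFiles(Path sourceDir, Path targetDir) {
        try (Stream<Path> entries = Files.list(sourceDir)) {
            List<Path> files = entries.filter(Files::isRegularFile).collect(Collectors.toList());
            for (Path file : files) {
                // Overwrite so the helper can be rerun safely.
                Files.copy(file, targetDir.resolve(file.getFileName().toString()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
            return files.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: ExampleResources <projectRoot> <sourceResourcesDir>");
            return;
        }
        Path target = createResourceDir(Paths.get(args[0]));
        int copied = copyFiles(Paths.get(args[1]), target);
        System.out.println("Copied " + copied + " file(s) to " + target);
    }
}
```

Keeping the relative path examples/src/main/resources unchanged matters because the example resolves its input files relative to the working directory, which in Eclipse defaults to the project root.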

Figure 13: To run inside Eclipse, the lines marked //HERE were modified

Figure 14: View the run output

Posted @ 2017-06-06 10:23 by 阿梁的新博客