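Developing Spark 2.x Programs in IDEA
We choose the online installation.
This is Scala for Windows; just double-click the installer to install it.
Once it is installed, you can verify the installation.
This is my local JDK 1.8 installer; just double-click it to install.
After the installation finishes, you can verify it.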
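Besides running `scala -version` and `java -version` in a command prompt, both installations can be checked from a tiny Scala program. This is only a minimal sketch; the object name `VersionCheck` is made up for illustration.

```scala
object VersionCheck {
  // Version string of the Scala library that is on the classpath.
  def scalaVersion: String = scala.util.Properties.versionString

  // Version of the JVM that is running this program.
  def javaVersion: String = System.getProperty("java.version")

  def main(args: Array[String]): Unit = {
    println(s"Scala: $scalaVersion")
    println(s"Java:  $javaVersion")
  }
}
```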
https://archive.apache.org/dist/maven/maven-3/3.3.9/binaries/
Extract the archive.
My local machine runs Windows 10.
With the environment variables configured, we can verify the setup.
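A common way to verify is to run `mvn -v` in a newly opened command prompt. The sketch below performs a similar check programmatically by searching the directories on PATH for the Maven launcher. It assumes the launcher is named `mvn.cmd` (Windows) or `mvn`; `PathCheck` and `findOnPath` are hypothetical names, not part of any tool.

```scala
import java.io.File

object PathCheck {
  // Return the first existing file named one of `names` in the directories of `pathValue`.
  def findOnPath(names: Seq[String], pathValue: String): Option[File] = {
    val dirs = pathValue.split(File.pathSeparator).toSeq.filter(_.nonEmpty)
    val candidates = for (dir <- dirs; name <- names) yield new File(dir, name)
    candidates.find(_.isFile)
  }

  def main(args: Array[String]): Unit = {
    // System.getenv(name) is case-insensitive on Windows, so "PATH" also matches "Path".
    val path = Option(System.getenv("PATH")).getOrElse("")
    findOnPath(Seq("mvn.cmd", "mvn"), path) match {
      case Some(f) => println(s"Maven launcher found: ${f.getAbsolutePath}")
      case None    => println("Maven launcher not found on PATH; check the environment variables")
    }
  }
}
```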
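Modify this file.
This is the default.
Change it to this.
Add the local Maven installation to the IDE configuration.
Next, wait for the required JAR dependencies to be downloaded automatically.
Scala has now been added.
Next, we create the directories.
Create a package under the scala directory.
Create a Scala class in this package.
Enter the following code.
Configure Maven's pom.xml file.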
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.spark</groupId>
  <artifactId>sparkStu</artifactId>
  <packaging>war</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>sparkStu Maven Webapp</name>
  <url>http://maven.apache.org</url>
  <properties>
    <hadoop.version>2.6.0</hadoop.version>
    <scala.binary.version>2.11</scala.binary.version>
    <spark.version>2.2.0</spark.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <!--
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.2.0</version>
    </dependency>
    -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka-0-10_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql-kafka-0-10_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
  <build>
    <finalName>sparkStu</finalName>
  </build>
</project>
Add this content to Test.scala.
We write a simple program.
package com.spark.test

import org.apache.spark.sql.SparkSession

object Test {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("HdfsTest")
      .getOrCreate()

    val filePart = "E://Mycode/datas/stu.txt"
    val rdd = spark.sparkContext.textFile(filePart)
    val lines = rdd.flatMap(x => x.split(" ")).collect().toList
    println(lines)
  }
}
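To see what the `flatMap` step produces without starting Spark, the same transformation can be tried on an ordinary Scala collection. This is a sketch only: the sample lines are invented (the contents of stu.txt are not shown), and `SplitDemo` and `splitWords` are illustrative names.

```scala
object SplitDemo {
  // Same logic as rdd.flatMap(x => x.split(" ")): split each line on spaces
  // and flatten the resulting arrays into a single list of tokens.
  def splitWords(lines: Seq[String]): List[String] =
    lines.flatMap(line => line.split(" ")).toList

  def main(args: Array[String]): Unit = {
    val sample = Seq("tom 18 male", "lucy 20 female") // made-up sample data
    println(splitWords(sample)) // prints List(tom, 18, male, lucy, 20, female)
  }
}
```

On an RDD, `collect()` additionally brings all tokens back to the driver, which is fine for a small test file.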
Run it.
It fails with an error.
E:\software\jdk1.8\bin\java "-javaagent:E:\software\IDEA\IntelliJ IDEA 2017.2.6\lib\idea_rt.jar=59010:E:\software\IDEA\IntelliJ IDEA 2017.2.6\bin" -Dfile.encoding=UTF-8 -classpath E:\software\jdk1.8\jre\lib\charsets.jar;E:\software\jdk1.8\jre\lib\deploy.jar;E:\software\jdk1.8\jre\lib\ext\access-bridge-64.jar;E:\software\jdk1.8\jre\lib\ext\cldrdata.jar;E:\software\jdk1.8\jre\lib\ext\dnsns.jar;E:\software\jdk1.8\jre\lib\ext\jaccess.jar;E:\software\jdk1.8\jre\lib\ext\jfxrt.jar;E:\software\jdk1.8\jre\lib\ext\localedata.jar;E:\software\jdk1.8\jre\lib\ext\nashorn.jar;E:\software\jdk1.8\jre\lib\ext\sunec.jar;E:\software\jdk1.8\jre\lib\ext\sunjce_provider.jar;E:\software\jdk1.8\jre\lib\ext\sunmscapi.jar;E:\software\jdk1.8\jre\lib\ext\sunpkcs11.jar;E:\software\jdk1.8\jre\lib\ext\zipfs.jar;E:\software\jdk1.8\jre\lib\javaws.jar;E:\software\jdk1.8\jre\lib\jce.jar;E:\software\jdk1.8\jre\lib\jfr.jar;E:\software\jdk1.8\jre\lib\jfxswt.jar;E:\software\jdk1.8\jre\lib\jsse.jar;E:\software\jdk1.8\jre\lib\management-agent.jar;E:\software\jdk1.8\jre\lib\plugin.jar;E:\software\jdk1.8\jre\lib\resources.jar;E:\software\jdk1.8\jre\lib\rt.jar;E:\Mycode\SparkStu\target\classes;E:\software\Scala\lib\scala-actors-2.11.0.jar;E:\software\Scala\lib\scala-actors-migration_2.11-1.1.0.jar;E:\software\Scala\lib\scala-library.jar;E:\software\Scala\lib\scala-parser-combinators_2.11-1.0.4.jar;E:\software\Scala\lib\scala-reflect.jar;E:\software\Scala\lib\scala-swing_2.11-1.0.2.jar;E:\software\Scala\lib\scala-xml_2.11-1.0.4.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-core_2.11\2.2.0\spark-core_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\avro\avro\1.7.7\avro-1.7.7.jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-core-asl\1.9.13\jackson-core-asl-1.9.13.jar;E:\software\maven3.3.9\repository\com\thoughtworks\paranamer\paranamer\2.3\paranamer-2.3.jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-compress\1.4.1\commons-compress-1.4.1.jar;E:\softwar
e\maven3.3.9\repository\org\tukaani\xz\1.0\xz-1.0.jar;E:\software\maven3.3.9\repository\org\apache\avro\avro-mapred\1.7.7\avro-mapred-1.7.7-hadoop2.jar;E:\software\maven3.3.9\repository\org\apache\avro\avro-ipc\1.7.7\avro-ipc-1.7.7.jar;E:\software\maven3.3.9\repository\org\apache\avro\avro-ipc\1.7.7\avro-ipc-1.7.7-tests.jar;E:\software\maven3.3.9\repository\com\twitter\chill_2.11\0.8.0\chill_2.11-0.8.0.jar;E:\software\maven3.3.9\repository\com\esotericsoftware\kryo-shaded\3.0.3\kryo-shaded-3.0.3.jar;E:\software\maven3.3.9\repository\com\esotericsoftware\minlog\1.3.0\minlog-1.3.0.jar;E:\software\maven3.3.9\repository\org\objenesis\objenesis\2.1\objenesis-2.1.jar;E:\software\maven3.3.9\repository\com\twitter\chill-java\0.8.0\chill-java-0.8.0.jar;E:\software\maven3.3.9\repository\org\apache\xbean\xbean-asm5-shaded\4.4\xbean-asm5-shaded-4.4.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-launcher_2.11\2.2.0\spark-launcher_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-network-common_2.11\2.2.0\spark-network-common_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\fusesource\leveldbjni\leveldbjni-all\1.8\leveldbjni-all-1.8.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-annotations\2.6.5\jackson-annotations-2.6.5.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-network-shuffle_2.11\2.2.0\spark-network-shuffle_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-unsafe_2.11\2.2.0\spark-unsafe_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\net\java\dev\jets3t\jets3t\0.9.3\jets3t-0.9.3.jar;E:\software\maven3.3.9\repository\org\apache\httpcomponents\httpcore\4.3.3\httpcore-4.3.3.jar;E:\software\maven3.3.9\repository\javax\activation\activation\1.1.1\activation-1.1.1.jar;E:\software\maven3.3.9\repository\mx4j\mx4j\3.0.2\mx4j-3.0.2.jar;E:\software\maven3.3.9\repository\javax\mail\mail\1.4.7\mail-1.4.7.jar;E:\software\maven3.3.9\repository\org\bouncycastle\bcprov-jdk15on\1.51\bcpr
ov-jdk15on-1.51.jar;E:\software\maven3.3.9\repository\com\jamesmurty\utils\java-xmlbuilder\1.0\java-xmlbuilder-1.0.jar;E:\software\maven3.3.9\repository\net\iharder\base64\2.3.8\base64-2.3.8.jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-recipes\2.6.0\curator-recipes-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-framework\2.6.0\curator-framework-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\zookeeper\zookeeper\3.4.6\zookeeper-3.4.6.jar;E:\software\maven3.3.9\repository\com\google\guava\guava\16.0.1\guava-16.0.1.jar;E:\software\maven3.3.9\repository\javax\servlet\javax.servlet-api\3.1.0\javax.servlet-api-3.1.0.jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-lang3\3.5\commons-lang3-3.5.jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-math3\3.4.1\commons-math3-3.4.1.jar;E:\software\maven3.3.9\repository\com\google\code\findbugs\jsr305\1.3.9\jsr305-1.3.9.jar;E:\software\maven3.3.9\repository\org\slf4j\slf4j-api\1.7.16\slf4j-api-1.7.16.jar;E:\software\maven3.3.9\repository\org\slf4j\jul-to-slf4j\1.7.16\jul-to-slf4j-1.7.16.jar;E:\software\maven3.3.9\repository\org\slf4j\jcl-over-slf4j\1.7.16\jcl-over-slf4j-1.7.16.jar;E:\software\maven3.3.9\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar;E:\software\maven3.3.9\repository\org\slf4j\slf4j-log4j12\1.7.16\slf4j-log4j12-1.7.16.jar;E:\software\maven3.3.9\repository\com\ning\compress-lzf\1.0.3\compress-lzf-1.0.3.jar;E:\software\maven3.3.9\repository\org\xerial\snappy\snappy-java\1.1.2.6\snappy-java-1.1.2.6.jar;E:\software\maven3.3.9\repository\net\jpountz\lz4\lz4\1.3.0\lz4-1.3.0.jar;E:\software\maven3.3.9\repository\org\roaringbitmap\RoaringBitmap\0.5.11\RoaringBitmap-0.5.11.jar;E:\software\maven3.3.9\repository\commons-net\commons-net\2.2\commons-net-2.2.jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-library\2.11.8\scala-library-2.11.8.jar;E:\software\maven3.3.9\repository\org\json4s\json4s-jackson_2.11\3.2.11\json4s-jackson_2.11-3.2.
11.jar;E:\software\maven3.3.9\repository\org\json4s\json4s-core_2.11\3.2.11\json4s-core_2.11-3.2.11.jar;E:\software\maven3.3.9\repository\org\json4s\json4s-ast_2.11\3.2.11\json4s-ast_2.11-3.2.11.jar;E:\software\maven3.3.9\repository\org\scala-lang\scalap\2.11.0\scalap-2.11.0.jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-compiler\2.11.0\scala-compiler-2.11.0.jar;E:\software\maven3.3.9\repository\org\scala-lang\modules\scala-xml_2.11\1.0.1\scala-xml_2.11-1.0.1.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-client\2.22.2\jersey-client-2.22.2.jar;E:\software\maven3.3.9\repository\javax\ws\rs\javax.ws.rs-api\2.0.1\javax.ws.rs-api-2.0.1.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-api\2.4.0-b34\hk2-api-2.4.0-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-utils\2.4.0-b34\hk2-utils-2.4.0-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\external\aopalliance-repackaged\2.4.0-b34\aopalliance-repackaged-2.4.0-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\external\javax.inject\2.4.0-b34\javax.inject-2.4.0-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-locator\2.4.0-b34\hk2-locator-2.4.0-b34.jar;E:\software\maven3.3.9\repository\org\javassist\javassist\3.18.1-GA\javassist-3.18.1-GA.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-common\2.22.2\jersey-common-2.22.2.jar;E:\software\maven3.3.9\repository\javax\annotation\javax.annotation-api\1.2\javax.annotation-api-1.2.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\bundles\repackaged\jersey-guava\2.22.2\jersey-guava-2.22.2.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\osgi-resource-locator\1.0.1\osgi-resource-locator-1.0.1.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-server\2.22.2\jersey-server-2.22.2.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\media\jersey-media-jaxb\2.22.2\jersey-media-jaxb-2.22.2.jar;E:\software\maven3.3.9\repository\ja
vax\validation\validation-api\1.1.0.Final\validation-api-1.1.0.Final.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\containers\jersey-container-servlet\2.22.2\jersey-container-servlet-2.22.2.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\containers\jersey-container-servlet-core\2.22.2\jersey-container-servlet-core-2.22.2.jar;E:\software\maven3.3.9\repository\io\netty\netty-all\4.0.43.Final\netty-all-4.0.43.Final.jar;E:\software\maven3.3.9\repository\io\netty\netty\3.9.9.Final\netty-3.9.9.Final.jar;E:\software\maven3.3.9\repository\com\clearspring\analytics\stream\2.7.0\stream-2.7.0.jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-core\3.1.2\metrics-core-3.1.2.jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-jvm\3.1.2\metrics-jvm-3.1.2.jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-json\3.1.2\metrics-json-3.1.2.jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-graphite\3.1.2\metrics-graphite-3.1.2.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-databind\2.6.5\jackson-databind-2.6.5.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-core\2.6.5\jackson-core-2.6.5.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\module\jackson-module-scala_2.11\2.6.5\jackson-module-scala_2.11-2.6.5.jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-reflect\2.11.7\scala-reflect-2.11.7.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\module\jackson-module-paranamer\2.6.5\jackson-module-paranamer-2.6.5.jar;E:\software\maven3.3.9\repository\org\apache\ivy\ivy\2.4.0\ivy-2.4.0.jar;E:\software\maven3.3.9\repository\oro\oro\2.0.8\oro-2.0.8.jar;E:\software\maven3.3.9\repository\net\razorvine\pyrolite\4.13\pyrolite-4.13.jar;E:\software\maven3.3.9\repository\net\sf\py4j\py4j\0.10.4\py4j-0.10.4.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-tags_2.11\2.2.0\spark-tags_2.11-2.2.0.jar;E:\software\maven3.3.9\repos
itory\org\apache\commons\commons-crypto\1.0.0\commons-crypto-1.0.0.jar;E:\software\maven3.3.9\repository\org\spark-project\spark\unused\1.0.0\unused-1.0.0.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sql_2.11\2.2.0\spark-sql_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\com\univocity\univocity-parsers\2.2.1\univocity-parsers-2.2.1.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sketch_2.11\2.2.0\spark-sketch_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-catalyst_2.11\2.2.0\spark-catalyst_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\codehaus\janino\janino\3.0.0\janino-3.0.0.jar;E:\software\maven3.3.9\repository\org\codehaus\janino\commons-compiler\3.0.0\commons-compiler-3.0.0.jar;E:\software\maven3.3.9\repository\org\antlr\antlr4-runtime\4.5.3\antlr4-runtime-4.5.3.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-column\1.8.2\parquet-column-1.8.2.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-common\1.8.2\parquet-common-1.8.2.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-encoding\1.8.2\parquet-encoding-1.8.2.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-hadoop\1.8.2\parquet-hadoop-1.8.2.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-format\2.3.1\parquet-format-2.3.1.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-jackson\1.8.2\parquet-jackson-1.8.2.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-streaming_2.11\2.2.0\spark-streaming_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-hive_2.11\2.2.0\spark-hive_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\com\twitter\parquet-hadoop-bundle\1.6.0\parquet-hadoop-bundle-1.6.0.jar;E:\software\maven3.3.9\repository\org\spark-project\hive\hive-exec\1.2.1.spark2\hive-exec-1.2.1.spark2.jar;E:\software\maven3.3.9\repository\commons-io\commons-io\2.4\commons-io-2.4.jar;E:\software\maven3.3.9\repository\commons-lang\commons-l
ang\2.6\commons-lang-2.6.jar;E:\software\maven3.3.9\repository\javolution\javolution\5.5.1\javolution-5.5.1.jar;E:\software\maven3.3.9\repository\log4j\apache-log4j-extras\1.2.17\apache-log4j-extras-1.2.17.jar;E:\software\maven3.3.9\repository\org\antlr\antlr-runtime\3.4\antlr-runtime-3.4.jar;E:\software\maven3.3.9\repository\org\antlr\stringtemplate\3.2.1\stringtemplate-3.2.1.jar;E:\software\maven3.3.9\repository\antlr\antlr\2.7.7\antlr-2.7.7.jar;E:\software\maven3.3.9\repository\org\antlr\ST4\4.0.4\ST4-4.0.4.jar;E:\software\maven3.3.9\repository\com\googlecode\javaewah\JavaEWAH\0.3.2\JavaEWAH-0.3.2.jar;E:\software\maven3.3.9\repository\org\iq80\snappy\snappy\0.2\snappy-0.2.jar;E:\software\maven3.3.9\repository\stax\stax-api\1.0.1\stax-api-1.0.1.jar;E:\software\maven3.3.9\repository\net\sf\opencsv\opencsv\2.3\opencsv-2.3.jar;E:\software\maven3.3.9\repository\org\spark-project\hive\hive-metastore\1.2.1.spark2\hive-metastore-1.2.1.spark2.jar;E:\software\maven3.3.9\repository\com\jolbox\bonecp\0.8.0.RELEASE\bonecp-0.8.0.RELEASE.jar;E:\software\maven3.3.9\repository\commons-cli\commons-cli\1.2\commons-cli-1.2.jar;E:\software\maven3.3.9\repository\commons-logging\commons-logging\1.1.3\commons-logging-1.1.3.jar;E:\software\maven3.3.9\repository\org\apache\derby\derby\10.10.2.0\derby-10.10.2.0.jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-api-jdo\3.2.6\datanucleus-api-jdo-3.2.6.jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-rdbms\3.2.9\datanucleus-rdbms-3.2.9.jar;E:\software\maven3.3.9\repository\commons-pool\commons-pool\1.5.4\commons-pool-1.5.4.jar;E:\software\maven3.3.9\repository\commons-dbcp\commons-dbcp\1.4\commons-dbcp-1.4.jar;E:\software\maven3.3.9\repository\javax\jdo\jdo-api\3.0.1\jdo-api-3.0.1.jar;E:\software\maven3.3.9\repository\javax\transaction\jta\1.1\jta-1.1.jar;E:\software\maven3.3.9\repository\commons-httpclient\commons-httpclient\3.1\commons-httpclient-3.1.jar;E:\software\maven3.3.9\repository\org\apache\calcite\c
alcite-avatica\1.2.0-incubating\calcite-avatica-1.2.0-incubating.jar;E:\software\maven3.3.9\repository\org\apache\calcite\calcite-core\1.2.0-incubating\calcite-core-1.2.0-incubating.jar;E:\software\maven3.3.9\repository\org\apache\calcite\calcite-linq4j\1.2.0-incubating\calcite-linq4j-1.2.0-incubating.jar;E:\software\maven3.3.9\repository\net\hydromatic\eigenbase-properties\1.1.5\eigenbase-properties-1.1.5.jar;E:\software\maven3.3.9\repository\org\apache\httpcomponents\httpclient\4.5.2\httpclient-4.5.2.jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-mapper-asl\1.9.13\jackson-mapper-asl-1.9.13.jar;E:\software\maven3.3.9\repository\commons-codec\commons-codec\1.10\commons-codec-1.10.jar;E:\software\maven3.3.9\repository\joda-time\joda-time\2.9.3\joda-time-2.9.3.jar;E:\software\maven3.3.9\repository\org\jodd\jodd-core\3.5.2\jodd-core-3.5.2.jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-core\3.2.10\datanucleus-core-3.2.10.jar;E:\software\maven3.3.9\repository\org\apache\thrift\libthrift\0.9.3\libthrift-0.9.3.jar;E:\software\maven3.3.9\repository\org\apache\thrift\libfb303\0.9.3\libfb303-0.9.3.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-streaming-kafka-0-10_2.11\2.2.0\spark-streaming-kafka-0-10_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\kafka\kafka_2.11\0.10.0.1\kafka_2.11-0.10.0.1.jar;E:\software\maven3.3.9\repository\com\101tec\zkclient\0.8\zkclient-0.8.jar;E:\software\maven3.3.9\repository\com\yammer\metrics\metrics-core\2.2.0\metrics-core-2.2.0.jar;E:\software\maven3.3.9\repository\org\scala-lang\modules\scala-parser-combinators_2.11\1.0.4\scala-parser-combinators_2.11-1.0.4.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sql-kafka-0-10_2.11\2.2.0\spark-sql-kafka-0-10_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\kafka\kafka-clients\0.10.0.1\kafka-clients-0.10.0.1.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-client\2.6.0\hadoop-client-2.6.0.jar;E:\soft
ware\maven3.3.9\repository\org\apache\hadoop\hadoop-common\2.6.0\hadoop-common-2.6.0.jar;E:\software\maven3.3.9\repository\xmlenc\xmlenc\0.52\xmlenc-0.52.jar;E:\software\maven3.3.9\repository\commons-collections\commons-collections\3.2.1\commons-collections-3.2.1.jar;E:\software\maven3.3.9\repository\commons-configuration\commons-configuration\1.6\commons-configuration-1.6.jar;E:\software\maven3.3.9\repository\commons-digester\commons-digester\1.8\commons-digester-1.8.jar;E:\software\maven3.3.9\repository\commons-beanutils\commons-beanutils\1.7.0\commons-beanutils-1.7.0.jar;E:\software\maven3.3.9\repository\commons-beanutils\commons-beanutils-core\1.8.0\commons-beanutils-core-1.8.0.jar;E:\software\maven3.3.9\repository\com\google\protobuf\protobuf-java\2.5.0\protobuf-java-2.5.0.jar;E:\software\maven3.3.9\repository\com\google\code\gson\gson\2.2.4\gson-2.2.4.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-auth\2.6.0\hadoop-auth-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\directory\server\apacheds-kerberos-codec\2.0.0-M15\apacheds-kerberos-codec-2.0.0-M15.jar;E:\software\maven3.3.9\repository\org\apache\directory\server\apacheds-i18n\2.0.0-M15\apacheds-i18n-2.0.0-M15.jar;E:\software\maven3.3.9\repository\org\apache\directory\api\api-asn1-api\1.0.0-M20\api-asn1-api-1.0.0-M20.jar;E:\software\maven3.3.9\repository\org\apache\directory\api\api-util\1.0.0-M20\api-util-1.0.0-M20.jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-client\2.6.0\curator-client-2.6.0.jar;E:\software\maven3.3.9\repository\org\htrace\htrace-core\3.0.4\htrace-core-3.0.4.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-hdfs\2.6.0\hadoop-hdfs-2.6.0.jar;E:\software\maven3.3.9\repository\org\mortbay\jetty\jetty-util\6.1.26\jetty-util-6.1.26.jar;E:\software\maven3.3.9\repository\xerces\xercesImpl\2.9.1\xercesImpl-2.9.1.jar;E:\software\maven3.3.9\repository\xml-apis\xml-apis\1.3.04\xml-apis-1.3.04.jar;E:\software\maven3.3.9\repository\org\apache\hadoo
p\hadoop-mapreduce-client-app\2.6.0\hadoop-mapreduce-client-app-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-common\2.6.0\hadoop-mapreduce-client-common-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-client\2.6.0\hadoop-yarn-client-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-server-common\2.6.0\hadoop-yarn-server-common-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-shuffle\2.6.0\hadoop-mapreduce-client-shuffle-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-api\2.6.0\hadoop-yarn-api-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-core\2.6.0\hadoop-mapreduce-client-core-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-common\2.6.0\hadoop-yarn-common-2.6.0.jar;E:\software\maven3.3.9\repository\javax\xml\bind\jaxb-api\2.2.2\jaxb-api-2.2.2.jar;E:\software\maven3.3.9\repository\javax\xml\stream\stax-api\1.0-2\stax-api-1.0-2.jar;E:\software\maven3.3.9\repository\javax\servlet\servlet-api\2.5\servlet-api-2.5.jar;E:\software\maven3.3.9\repository\com\sun\jersey\jersey-core\1.9\jersey-core-1.9.jar;E:\software\maven3.3.9\repository\com\sun\jersey\jersey-client\1.9\jersey-client-1.9.jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-jaxrs\1.9.13\jackson-jaxrs-1.9.13.jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-xc\1.9.13\jackson-xc-1.9.13.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-jobclient\2.6.0\hadoop-mapreduce-client-jobclient-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-annotations\2.6.0\hadoop-annotations-2.6.0.jar com.spark.test.Test
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/03/14 17:01:07 INFO SparkContext: Running Spark version 2.2.0
18/03/14 17:01:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/03/14 17:01:08 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:104)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:86)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:66)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:280)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:271)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:248)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:763)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:748)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:621)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2430)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2430)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2430)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:295)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
    at com.spark.test.Test$.main(Test.scala:11)
    at com.spark.test.Test.main(Test.scala)
18/03/14 17:01:08 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:376)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
    at com.spark.test.Test$.main(Test.scala:11)
    at com.spark.test.Test.main(Test.scala)
18/03/14 17:01:08 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:376)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
    at com.spark.test.Test$.main(Test.scala:11)
    at com.spark.test.Test.main(Test.scala)
Process finished with exit code 1
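The log contains two separate problems. The last exception, `A master URL must be set in your configuration`, is independent of the winutils message: the program never tells Spark where to run. When launching directly from IDEA, one common remedy is to set a local master on the builder. The following is a sketch under that assumption, not necessarily the fix this walkthrough goes on to apply; `local[*]`, the object name `TestLocal`, and the helper `localSession` are choices made here for illustration.

```scala
package com.spark.test

import org.apache.spark.sql.SparkSession

object TestLocal {
  // Build a session that runs inside this JVM, using all available cores.
  def localSession(): SparkSession =
    SparkSession
      .builder
      .appName("HdfsTest")
      .master("local[*]")
      .getOrCreate()

  def main(args: Array[String]): Unit = {
    val spark = localSession()

    val filePath = "E://Mycode/datas/stu.txt" // same file as in Test.scala
    val words = spark.sparkContext.textFile(filePath).flatMap(_.split(" ")).collect().toList
    println(words)

    spark.stop()
  }
}
```

An alternative that leaves the source unchanged is to pass `-Dspark.master=local[*]` as a VM option in the run configuration, since Spark picks up `spark.*` system properties when it builds its configuration.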
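The winutils.exe error occurs because Hadoop is not configured on my local machine, so let's set it up now.
This is my local hadoop/bin directory.
Next, configure the environment variables on the local Windows 10 machine.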
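The `null` in `null\bin\winutils.exe` is the unset Hadoop home directory: Hadoop's `Shell` class looks at the `hadoop.home.dir` system property first and then at the `HADOOP_HOME` environment variable. Which variable is set here is visible only in the original screenshot, so the sketch below assumes `HADOOP_HOME` points at the directory that contains `bin\winutils.exe`; `WinutilsCheck` and its methods are illustrative names.

```scala
import java.io.File

object WinutilsCheck {
  // Resolve the Hadoop home: system property first, then environment variable.
  def hadoopHome(prop: Option[String], env: Option[String]): Option[String] =
    prop.orElse(env).filter(_.trim.nonEmpty)

  // Location where winutils.exe is expected for a given Hadoop home.
  def winutilsPath(home: String): File =
    new File(new File(home, "bin"), "winutils.exe")

  def main(args: Array[String]): Unit = {
    val home = hadoopHome(sys.props.get("hadoop.home.dir"), Option(System.getenv("HADOOP_HOME")))
    home match {
      case None =>
        println("Neither hadoop.home.dir nor HADOOP_HOME is set")
      case Some(h) =>
        val exe = winutilsPath(h)
        println(s"${exe.getPath} exists: ${exe.isFile}")
    }
  }
}
```

Environment variables are read when a process starts, so a running IDE will not see a newly added value.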
Restart IDEA and run the program again.
A different error is reported, but we can be sure that the previous error has been resolved.
E:\software\jdk1.8\bin\java "-javaagent:E:\software\IDEA\IntelliJ IDEA 2017.2.6\lib\idea_rt.jar=60011:E:\software\IDEA\IntelliJ IDEA 2017.2.6\bin" -Dfile.encoding=UTF-8 -classpath E:\software\jdk1.8\jre\lib\charsets.jar;E:\software\jdk1.8\jre\lib\deploy.jar;E:\software\jdk1.8\jre\lib\ext\access-bridge-64.jar;E:\software\jdk1.8\jre\lib\ext\cldrdata.jar;E:\software\jdk1.8\jre\lib\ext\dnsns.jar;E:\software\jdk1.8\jre\lib\ext\jaccess.jar;E:\software\jdk1.8\jre\lib\ext\jfxrt.jar;E:\software\jdk1.8\jre\lib\ext\localedata.jar;E:\software\jdk1.8\jre\lib\ext\nashorn.jar;E:\software\jdk1.8\jre\lib\ext\sunec.jar;E:\software\jdk1.8\jre\lib\ext\sunjce_provider.jar;E:\software\jdk1.8\jre\lib\ext\sunmscapi.jar;E:\software\jdk1.8\jre\lib\ext\sunpkcs11.jar;E:\software\jdk1.8\jre\lib\ext\zipfs.jar;E:\software\jdk1.8\jre\lib\javaws.jar;E:\software\jdk1.8\jre\lib\jce.jar;E:\software\jdk1.8\jre\lib\jfr.jar;E:\software\jdk1.8\jre\lib\jfxswt.jar;E:\software\jdk1.8\jre\lib\jsse.jar;E:\software\jdk1.8\jre\lib\management-agent.jar;E:\software\jdk1.8\jre\lib\plugin.jar;E:\software\jdk1.8\jre\lib\resources.jar;E:\software\jdk1.8\jre\lib\rt.jar;E:\Mycode\SparkStu\target\classes;E:\software\Scala\lib\scala-actors-2.11.0.jar;E:\software\Scala\lib\scala-actors-migration_2.11-1.1.0.jar;E:\software\Scala\lib\scala-library.jar;E:\software\Scala\lib\scala-parser-combinators_2.11-1.0.4.jar;E:\software\Scala\lib\scala-reflect.jar;E:\software\Scala\lib\scala-swing_2.11-1.0.2.jar;E:\software\Scala\lib\scala-xml_2.11-1.0.4.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-core_2.11\2.2.0\spark-core_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\avro\avro\1.7.7\avro-1.7.7.jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-core-asl\1.9.13\jackson-core-asl-1.9.13.jar;E:\software\maven3.3.9\repository\com\thoughtworks\paranamer\paranamer\2.3\paranamer-2.3.jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-compress\1.4.1\commons-compress-1.4.1.jar;E:\softwar
e\maven3.3.9\repository\org\tukaani\xz\1.0\xz-1.0.jar;E:\software\maven3.3.9\repository\org\apache\avro\avro-mapred\1.7.7\avro-mapred-1.7.7-hadoop2.jar;E:\software\maven3.3.9\repository\org\apache\avro\avro-ipc\1.7.7\avro-ipc-1.7.7.jar;E:\software\maven3.3.9\repository\org\apache\avro\avro-ipc\1.7.7\avro-ipc-1.7.7-tests.jar;E:\software\maven3.3.9\repository\com\twitter\chill_2.11\0.8.0\chill_2.11-0.8.0.jar;E:\software\maven3.3.9\repository\com\esotericsoftware\kryo-shaded\3.0.3\kryo-shaded-3.0.3.jar;E:\software\maven3.3.9\repository\com\esotericsoftware\minlog\1.3.0\minlog-1.3.0.jar;E:\software\maven3.3.9\repository\org\objenesis\objenesis\2.1\objenesis-2.1.jar;E:\software\maven3.3.9\repository\com\twitter\chill-java\0.8.0\chill-java-0.8.0.jar;E:\software\maven3.3.9\repository\org\apache\xbean\xbean-asm5-shaded\4.4\xbean-asm5-shaded-4.4.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-launcher_2.11\2.2.0\spark-launcher_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-network-common_2.11\2.2.0\spark-network-common_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\fusesource\leveldbjni\leveldbjni-all\1.8\leveldbjni-all-1.8.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-annotations\2.6.5\jackson-annotations-2.6.5.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-network-shuffle_2.11\2.2.0\spark-network-shuffle_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-unsafe_2.11\2.2.0\spark-unsafe_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\net\java\dev\jets3t\jets3t\0.9.3\jets3t-0.9.3.jar;E:\software\maven3.3.9\repository\org\apache\httpcomponents\httpcore\4.3.3\httpcore-4.3.3.jar;E:\software\maven3.3.9\repository\javax\activation\activation\1.1.1\activation-1.1.1.jar;E:\software\maven3.3.9\repository\mx4j\mx4j\3.0.2\mx4j-3.0.2.jar;E:\software\maven3.3.9\repository\javax\mail\mail\1.4.7\mail-1.4.7.jar;E:\software\maven3.3.9\repository\org\bouncycastle\bcprov-jdk15on\1.51\bcpr
ov-jdk15on-1.51.jar;E:\software\maven3.3.9\repository\com\jamesmurty\utils\java-xmlbuilder\1.0\java-xmlbuilder-1.0.jar;E:\software\maven3.3.9\repository\net\iharder\base64\2.3.8\base64-2.3.8.jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-recipes\2.6.0\curator-recipes-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-framework\2.6.0\curator-framework-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\zookeeper\zookeeper\3.4.6\zookeeper-3.4.6.jar;E:\software\maven3.3.9\repository\com\google\guava\guava\16.0.1\guava-16.0.1.jar;E:\software\maven3.3.9\repository\javax\servlet\javax.servlet-api\3.1.0\javax.servlet-api-3.1.0.jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-lang3\3.5\commons-lang3-3.5.jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-math3\3.4.1\commons-math3-3.4.1.jar;E:\software\maven3.3.9\repository\com\google\code\findbugs\jsr305\1.3.9\jsr305-1.3.9.jar;E:\software\maven3.3.9\repository\org\slf4j\slf4j-api\1.7.16\slf4j-api-1.7.16.jar;E:\software\maven3.3.9\repository\org\slf4j\jul-to-slf4j\1.7.16\jul-to-slf4j-1.7.16.jar;E:\software\maven3.3.9\repository\org\slf4j\jcl-over-slf4j\1.7.16\jcl-over-slf4j-1.7.16.jar;E:\software\maven3.3.9\repository\log4j\log4j\1.2.17\log4j-1.2.17.jar;E:\software\maven3.3.9\repository\org\slf4j\slf4j-log4j12\1.7.16\slf4j-log4j12-1.7.16.jar;E:\software\maven3.3.9\repository\com\ning\compress-lzf\1.0.3\compress-lzf-1.0.3.jar;E:\software\maven3.3.9\repository\org\xerial\snappy\snappy-java\1.1.2.6\snappy-java-1.1.2.6.jar;E:\software\maven3.3.9\repository\net\jpountz\lz4\lz4\1.3.0\lz4-1.3.0.jar;E:\software\maven3.3.9\repository\org\roaringbitmap\RoaringBitmap\0.5.11\RoaringBitmap-0.5.11.jar;E:\software\maven3.3.9\repository\commons-net\commons-net\2.2\commons-net-2.2.jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-library\2.11.8\scala-library-2.11.8.jar;E:\software\maven3.3.9\repository\org\json4s\json4s-jackson_2.11\3.2.11\json4s-jackson_2.11-3.2.
11.jar;E:\software\maven3.3.9\repository\org\json4s\json4s-core_2.11\3.2.11\json4s-core_2.11-3.2.11.jar;E:\software\maven3.3.9\repository\org\json4s\json4s-ast_2.11\3.2.11\json4s-ast_2.11-3.2.11.jar;E:\software\maven3.3.9\repository\org\scala-lang\scalap\2.11.0\scalap-2.11.0.jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-compiler\2.11.0\scala-compiler-2.11.0.jar;E:\software\maven3.3.9\repository\org\scala-lang\modules\scala-xml_2.11\1.0.1\scala-xml_2.11-1.0.1.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-client\2.22.2\jersey-client-2.22.2.jar;E:\software\maven3.3.9\repository\javax\ws\rs\javax.ws.rs-api\2.0.1\javax.ws.rs-api-2.0.1.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-api\2.4.0-b34\hk2-api-2.4.0-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-utils\2.4.0-b34\hk2-utils-2.4.0-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\external\aopalliance-repackaged\2.4.0-b34\aopalliance-repackaged-2.4.0-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\external\javax.inject\2.4.0-b34\javax.inject-2.4.0-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-locator\2.4.0-b34\hk2-locator-2.4.0-b34.jar;E:\software\maven3.3.9\repository\org\javassist\javassist\3.18.1-GA\javassist-3.18.1-GA.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-common\2.22.2\jersey-common-2.22.2.jar;E:\software\maven3.3.9\repository\javax\annotation\javax.annotation-api\1.2\javax.annotation-api-1.2.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\bundles\repackaged\jersey-guava\2.22.2\jersey-guava-2.22.2.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\osgi-resource-locator\1.0.1\osgi-resource-locator-1.0.1.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-server\2.22.2\jersey-server-2.22.2.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\media\jersey-media-jaxb\2.22.2\jersey-media-jaxb-2.22.2.jar;E:\software\maven3.3.9\repository\ja
vax\validation\validation-api\1.1.0.Final\validation-api-1.1.0.Final.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\containers\jersey-container-servlet\2.22.2\jersey-container-servlet-2.22.2.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\containers\jersey-container-servlet-core\2.22.2\jersey-container-servlet-core-2.22.2.jar;E:\software\maven3.3.9\repository\io\netty\netty-all\4.0.43.Final\netty-all-4.0.43.Final.jar;E:\software\maven3.3.9\repository\io\netty\netty\3.9.9.Final\netty-3.9.9.Final.jar;E:\software\maven3.3.9\repository\com\clearspring\analytics\stream\2.7.0\stream-2.7.0.jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-core\3.1.2\metrics-core-3.1.2.jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-jvm\3.1.2\metrics-jvm-3.1.2.jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-json\3.1.2\metrics-json-3.1.2.jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-graphite\3.1.2\metrics-graphite-3.1.2.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-databind\2.6.5\jackson-databind-2.6.5.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-core\2.6.5\jackson-core-2.6.5.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\module\jackson-module-scala_2.11\2.6.5\jackson-module-scala_2.11-2.6.5.jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-reflect\2.11.7\scala-reflect-2.11.7.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\module\jackson-module-paranamer\2.6.5\jackson-module-paranamer-2.6.5.jar;E:\software\maven3.3.9\repository\org\apache\ivy\ivy\2.4.0\ivy-2.4.0.jar;E:\software\maven3.3.9\repository\oro\oro\2.0.8\oro-2.0.8.jar;E:\software\maven3.3.9\repository\net\razorvine\pyrolite\4.13\pyrolite-4.13.jar;E:\software\maven3.3.9\repository\net\sf\py4j\py4j\0.10.4\py4j-0.10.4.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-tags_2.11\2.2.0\spark-tags_2.11-2.2.0.jar;E:\software\maven3.3.9\repos
itory\org\apache\commons\commons-crypto\1.0.0\commons-crypto-1.0.0.jar;E:\software\maven3.3.9\repository\org\spark-project\spark\unused\1.0.0\unused-1.0.0.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sql_2.11\2.2.0\spark-sql_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\com\univocity\univocity-parsers\2.2.1\univocity-parsers-2.2.1.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sketch_2.11\2.2.0\spark-sketch_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-catalyst_2.11\2.2.0\spark-catalyst_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\codehaus\janino\janino\3.0.0\janino-3.0.0.jar;E:\software\maven3.3.9\repository\org\codehaus\janino\commons-compiler\3.0.0\commons-compiler-3.0.0.jar;E:\software\maven3.3.9\repository\org\antlr\antlr4-runtime\4.5.3\antlr4-runtime-4.5.3.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-column\1.8.2\parquet-column-1.8.2.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-common\1.8.2\parquet-common-1.8.2.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-encoding\1.8.2\parquet-encoding-1.8.2.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-hadoop\1.8.2\parquet-hadoop-1.8.2.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-format\2.3.1\parquet-format-2.3.1.jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-jackson\1.8.2\parquet-jackson-1.8.2.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-streaming_2.11\2.2.0\spark-streaming_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-hive_2.11\2.2.0\spark-hive_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\com\twitter\parquet-hadoop-bundle\1.6.0\parquet-hadoop-bundle-1.6.0.jar;E:\software\maven3.3.9\repository\org\spark-project\hive\hive-exec\1.2.1.spark2\hive-exec-1.2.1.spark2.jar;E:\software\maven3.3.9\repository\commons-io\commons-io\2.4\commons-io-2.4.jar;E:\software\maven3.3.9\repository\commons-lang\commons-l
ang\2.6\commons-lang-2.6.jar;E:\software\maven3.3.9\repository\javolution\javolution\5.5.1\javolution-5.5.1.jar;E:\software\maven3.3.9\repository\log4j\apache-log4j-extras\1.2.17\apache-log4j-extras-1.2.17.jar;E:\software\maven3.3.9\repository\org\antlr\antlr-runtime\3.4\antlr-runtime-3.4.jar;E:\software\maven3.3.9\repository\org\antlr\stringtemplate\3.2.1\stringtemplate-3.2.1.jar;E:\software\maven3.3.9\repository\antlr\antlr\2.7.7\antlr-2.7.7.jar;E:\software\maven3.3.9\repository\org\antlr\ST4\4.0.4\ST4-4.0.4.jar;E:\software\maven3.3.9\repository\com\googlecode\javaewah\JavaEWAH\0.3.2\JavaEWAH-0.3.2.jar;E:\software\maven3.3.9\repository\org\iq80\snappy\snappy\0.2\snappy-0.2.jar;E:\software\maven3.3.9\repository\stax\stax-api\1.0.1\stax-api-1.0.1.jar;E:\software\maven3.3.9\repository\net\sf\opencsv\opencsv\2.3\opencsv-2.3.jar;E:\software\maven3.3.9\repository\org\spark-project\hive\hive-metastore\1.2.1.spark2\hive-metastore-1.2.1.spark2.jar;E:\software\maven3.3.9\repository\com\jolbox\bonecp\0.8.0.RELEASE\bonecp-0.8.0.RELEASE.jar;E:\software\maven3.3.9\repository\commons-cli\commons-cli\1.2\commons-cli-1.2.jar;E:\software\maven3.3.9\repository\commons-logging\commons-logging\1.1.3\commons-logging-1.1.3.jar;E:\software\maven3.3.9\repository\org\apache\derby\derby\10.10.2.0\derby-10.10.2.0.jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-api-jdo\3.2.6\datanucleus-api-jdo-3.2.6.jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-rdbms\3.2.9\datanucleus-rdbms-3.2.9.jar;E:\software\maven3.3.9\repository\commons-pool\commons-pool\1.5.4\commons-pool-1.5.4.jar;E:\software\maven3.3.9\repository\commons-dbcp\commons-dbcp\1.4\commons-dbcp-1.4.jar;E:\software\maven3.3.9\repository\javax\jdo\jdo-api\3.0.1\jdo-api-3.0.1.jar;E:\software\maven3.3.9\repository\javax\transaction\jta\1.1\jta-1.1.jar;E:\software\maven3.3.9\repository\commons-httpclient\commons-httpclient\3.1\commons-httpclient-3.1.jar;E:\software\maven3.3.9\repository\org\apache\calcite\c
alcite-avatica\1.2.0-incubating\calcite-avatica-1.2.0-incubating.jar;E:\software\maven3.3.9\repository\org\apache\calcite\calcite-core\1.2.0-incubating\calcite-core-1.2.0-incubating.jar;E:\software\maven3.3.9\repository\org\apache\calcite\calcite-linq4j\1.2.0-incubating\calcite-linq4j-1.2.0-incubating.jar;E:\software\maven3.3.9\repository\net\hydromatic\eigenbase-properties\1.1.5\eigenbase-properties-1.1.5.jar;E:\software\maven3.3.9\repository\org\apache\httpcomponents\httpclient\4.5.2\httpclient-4.5.2.jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-mapper-asl\1.9.13\jackson-mapper-asl-1.9.13.jar;E:\software\maven3.3.9\repository\commons-codec\commons-codec\1.10\commons-codec-1.10.jar;E:\software\maven3.3.9\repository\joda-time\joda-time\2.9.3\joda-time-2.9.3.jar;E:\software\maven3.3.9\repository\org\jodd\jodd-core\3.5.2\jodd-core-3.5.2.jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-core\3.2.10\datanucleus-core-3.2.10.jar;E:\software\maven3.3.9\repository\org\apache\thrift\libthrift\0.9.3\libthrift-0.9.3.jar;E:\software\maven3.3.9\repository\org\apache\thrift\libfb303\0.9.3\libfb303-0.9.3.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-streaming-kafka-0-10_2.11\2.2.0\spark-streaming-kafka-0-10_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\kafka\kafka_2.11\0.10.0.1\kafka_2.11-0.10.0.1.jar;E:\software\maven3.3.9\repository\com\101tec\zkclient\0.8\zkclient-0.8.jar;E:\software\maven3.3.9\repository\com\yammer\metrics\metrics-core\2.2.0\metrics-core-2.2.0.jar;E:\software\maven3.3.9\repository\org\scala-lang\modules\scala-parser-combinators_2.11\1.0.4\scala-parser-combinators_2.11-1.0.4.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sql-kafka-0-10_2.11\2.2.0\spark-sql-kafka-0-10_2.11-2.2.0.jar;E:\software\maven3.3.9\repository\org\apache\kafka\kafka-clients\0.10.0.1\kafka-clients-0.10.0.1.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-client\2.6.0\hadoop-client-2.6.0.jar;E:\soft
ware\maven3.3.9\repository\org\apache\hadoop\hadoop-common\2.6.0\hadoop-common-2.6.0.jar;E:\software\maven3.3.9\repository\xmlenc\xmlenc\0.52\xmlenc-0.52.jar;E:\software\maven3.3.9\repository\commons-collections\commons-collections\3.2.1\commons-collections-3.2.1.jar;E:\software\maven3.3.9\repository\commons-configuration\commons-configuration\1.6\commons-configuration-1.6.jar;E:\software\maven3.3.9\repository\commons-digester\commons-digester\1.8\commons-digester-1.8.jar;E:\software\maven3.3.9\repository\commons-beanutils\commons-beanutils\1.7.0\commons-beanutils-1.7.0.jar;E:\software\maven3.3.9\repository\commons-beanutils\commons-beanutils-core\1.8.0\commons-beanutils-core-1.8.0.jar;E:\software\maven3.3.9\repository\com\google\protobuf\protobuf-java\2.5.0\protobuf-java-2.5.0.jar;E:\software\maven3.3.9\repository\com\google\code\gson\gson\2.2.4\gson-2.2.4.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-auth\2.6.0\hadoop-auth-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\directory\server\apacheds-kerberos-codec\2.0.0-M15\apacheds-kerberos-codec-2.0.0-M15.jar;E:\software\maven3.3.9\repository\org\apache\directory\server\apacheds-i18n\2.0.0-M15\apacheds-i18n-2.0.0-M15.jar;E:\software\maven3.3.9\repository\org\apache\directory\api\api-asn1-api\1.0.0-M20\api-asn1-api-1.0.0-M20.jar;E:\software\maven3.3.9\repository\org\apache\directory\api\api-util\1.0.0-M20\api-util-1.0.0-M20.jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-client\2.6.0\curator-client-2.6.0.jar;E:\software\maven3.3.9\repository\org\htrace\htrace-core\3.0.4\htrace-core-3.0.4.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-hdfs\2.6.0\hadoop-hdfs-2.6.0.jar;E:\software\maven3.3.9\repository\org\mortbay\jetty\jetty-util\6.1.26\jetty-util-6.1.26.jar;E:\software\maven3.3.9\repository\xerces\xercesImpl\2.9.1\xercesImpl-2.9.1.jar;E:\software\maven3.3.9\repository\xml-apis\xml-apis\1.3.04\xml-apis-1.3.04.jar;E:\software\maven3.3.9\repository\org\apache\hadoo
p\hadoop-mapreduce-client-app\2.6.0\hadoop-mapreduce-client-app-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-common\2.6.0\hadoop-mapreduce-client-common-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-client\2.6.0\hadoop-yarn-client-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-server-common\2.6.0\hadoop-yarn-server-common-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-shuffle\2.6.0\hadoop-mapreduce-client-shuffle-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-api\2.6.0\hadoop-yarn-api-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-core\2.6.0\hadoop-mapreduce-client-core-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-common\2.6.0\hadoop-yarn-common-2.6.0.jar;E:\software\maven3.3.9\repository\javax\xml\bind\jaxb-api\2.2.2\jaxb-api-2.2.2.jar;E:\software\maven3.3.9\repository\javax\xml\stream\stax-api\1.0-2\stax-api-1.0-2.jar;E:\software\maven3.3.9\repository\javax\servlet\servlet-api\2.5\servlet-api-2.5.jar;E:\software\maven3.3.9\repository\com\sun\jersey\jersey-core\1.9\jersey-core-1.9.jar;E:\software\maven3.3.9\repository\com\sun\jersey\jersey-client\1.9\jersey-client-1.9.jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-jaxrs\1.9.13\jackson-jaxrs-1.9.13.jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-xc\1.9.13\jackson-xc-1.9.13.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-jobclient\2.6.0\hadoop-mapreduce-client-jobclient-2.6.0.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-annotations\2.6.0\hadoop-annotations-2.6.0.jar com.spark.test.Test Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 18/03/14 17:34:56 INFO SparkContext: Running Spark version 2.2.0 18/03/14 17:34:57 ERROR SparkContext: Error initializing SparkContext. 
org.apache.spark.SparkException: A master URL must be set in your configuration
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:376)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
    at com.spark.test.Test$.main(Test.scala:11)
    at com.spark.test.Test.main(Test.scala)
18/03/14 17:34:57 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:376)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
    at com.spark.test.Test$.main(Test.scala:11)
    at com.spark.test.Test.main(Test.scala)
Process finished with exit code 1
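This error means that you have to tell Spark where the program runs, that is, set a master URL.
Add this line to the program to state that we are running locally for now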
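The line itself is not reproduced in the text, so the following is only a minimal sketch of what such a change usually looks like, assuming the session is created through `SparkSession.builder` (as the stack trace suggests) and that the master is set to `"local"`. The object name `LocalSession` and the helper `build` are hypothetical; `HdfsTest` is the application name that appears in the log output.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper object; the tutorial's own class is com.spark.test.Test.
object LocalSession {
  // Builds a session that runs Spark inside the current JVM.
  // "local" uses a single worker thread; "local[*]" would use all cores.
  def build(): SparkSession =
    SparkSession
      .builder
      .appName("HdfsTest")   // application name as seen in the log output
      .master("local")       // assumption: the line added in this step
      .getOrCreate()

  def main(args: Array[String]): Unit = {
    val spark = build()
    // Print the master URL the context was started with
    println(spark.sparkContext.master)
    spark.stop()
  }
}
```

Hard-coding the master is convenient inside the IDE. When a job is submitted to a cluster, the master is normally passed to `spark-submit` with `--master` instead.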
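Run it again, and you can see that the problem is gone
Let's keep modifying the program and add this line
Run it once more and look at the result
Add up the counts of identical words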
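As a rough illustration of this accumulation step with the RDD API, here is a minimal sketch. It assumes the input is a small in-memory list of lines rather than the tutorial's actual data source, and the names `RddWordCount` and `countWords` are made up for the example.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object, not the tutorial's original code.
object RddWordCount {
  // Splits each line on spaces, pairs every word with 1,
  // then sums the 1s per word with reduceByKey.
  def countWords(spark: SparkSession, lines: Seq[String]): Map[String, Int] =
    spark.sparkContext
      .parallelize(lines)                 // assumption: in-memory input
      .flatMap(line => line.split(" "))
      .filter(word => word.nonEmpty)      // skip empty tokens from repeated spaces
      .map(word => (word, 1))
      .reduceByKey(_ + _)                 // accumulate counts of identical words
      .collect()
      .toMap

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("RddWordCount").master("local").getOrCreate()
    countWords(spark, Seq("hello spark", "hello scala")).foreach(println)
    spark.stop()
  }
}
```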
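Let's look at the result
So far we have used the RDD approach; next we use the Dataset approach
A Dataset can be roughly thought of as a table in a database
The result of our run
Split the lines into words on spaces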
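The Dataset version of the same word count could look roughly like the sketch below. It again assumes in-memory input, and `DatasetWordCount` and `countWords` are hypothetical names; the tutorial's own code is not shown in the text.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object, not the tutorial's original code.
object DatasetWordCount {
  def countWords(spark: SparkSession, lines: Seq[String]): Map[String, Long] = {
    import spark.implicits._              // encoders for String and tuples

    val words = lines.toDS()              // Dataset[String], one line per row
      .flatMap(line => line.split(" "))   // split the lines into words on spaces
      .filter(word => word.nonEmpty)

    words
      .groupByKey(identity)               // group identical words together
      .count()                            // Dataset[(String, Long)]
      .collect()
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("DatasetWordCount").master("local").getOrCreate()
    countWords(spark, Seq("hello spark", "hello scala")).foreach(println)
    spark.stop()
  }
}
```

Calling `show()` on the grouped Dataset instead of `collect()` prints it as a table, which fits the "table in a database" picture above.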
The result of the run
E:\software\jdk1.8\bin\java "-javaagent:E:\software\IDEA\IntelliJ IDEA 2017.2.6\lib\idea_rt.jar=62232:E:\software\IDEA\IntelliJ IDEA 2017.2.6\bin" -Dfile.encoding=UTF-8 -classpath [JDK, Scala, and Maven repository jars; same list as in the first run above, omitted here] com.spark.test.Test
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/03/14 20:41:05 INFO SparkContext: Running Spark version 2.2.0
18/03/14 20:41:06 INFO SparkContext: Submitted application: HdfsTest
18/03/14 20:41:06 INFO SecurityManager: Changing view acls to: Brave
18/03/14 20:41:06 INFO SecurityManager: Changing modify acls to: Brave
18/03/14 20:41:06 INFO SecurityManager: Changing view acls groups to:
18/03/14 20:41:06 INFO SecurityManager: Changing modify acls groups to:
18/03/14 20:41:06 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Brave); groups with view permissions: Set(); users with modify permissions: Set(Brave); groups with modify permissions: Set()
18/03/14 20:41:07 INFO Utils: Successfully started service 'sparkDriver' on port 62269.
18/03/14 20:41:07 INFO SparkEnv: Registering MapOutputTracker
18/03/14 20:41:07 INFO SparkEnv: Registering BlockManagerMaster
18/03/14 20:41:07 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/03/14 20:41:07 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/03/14 20:41:07 INFO DiskBlockManager: Created local directory at C:\Users\Brave\AppData\Local\Temp\blockmgr-2ad95228-3532-4a24-b6b6-b09973c4a4ff
18/03/14 20:41:07 INFO MemoryStore: MemoryStore started with capacity 1998.3 MB
18/03/14 20:41:07 INFO SparkEnv: Registering OutputCommitCoordinator
18/03/14 20:41:07 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/03/14 20:41:07 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.56.1:4040
18/03/14 20:41:07 INFO Executor: Starting executor ID driver on host localhost
18/03/14 20:41:07 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 62282.
18/03/14 20:41:07 INFO NettyBlockTransferService: Server created on 192.168.56.1:62282
18/03/14 20:41:07 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/03/14 20:41:07 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.56.1, 62282, None)
18/03/14 20:41:07 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.56.1:62282 with 1998.3 MB RAM, BlockManagerId(driver, 192.168.56.1, 62282, None)
18/03/14 20:41:07 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.56.1, 62282, None)
18/03/14 20:41:07 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.56.1, 62282, None)
18/03/14 20:41:08 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/E:/Mycode/SparkStu/spark-warehouse/').
18/03/14 20:41:08 INFO SharedState: Warehouse path is 'file:/E:/Mycode/SparkStu/spark-warehouse/'.
18/03/14 20:41:09 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
18/03/14 20:41:11 INFO FileSourceStrategy: Pruning directories with:
18/03/14 20:41:11 INFO FileSourceStrategy: Post-Scan Filters:
18/03/14 20:41:11 INFO FileSourceStrategy: Output Data Schema: struct<value: string>
18/03/14 20:41:11 INFO FileSourceScanExec: Pushed Filters:
18/03/14 20:41:12 INFO CodeGenerator: Code generated in 321.911944 ms
18/03/14 20:41:12 INFO CodeGenerator: Code generated in 9.798824 ms
18/03/14 20:41:12 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 213.6 KB, free 1998.1 MB)
18/03/14 20:41:12 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 20.2 KB, free 1998.1 MB)
18/03/14 20:41:12 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.56.1:62282 (size: 20.2 KB, free: 1998.3 MB)
18/03/14 20:41:12 INFO SparkContext: Created broadcast 0 from show at Test.scala:17
18/03/14 20:41:13 INFO FileSourceScanExec: Planning scan with bin packing, max size: 4194417 bytes, open cost is considered as scanning 4194304 bytes.
18/03/14 20:41:13 INFO SparkContext: Starting job: show at Test.scala:17
18/03/14 20:41:13 INFO DAGScheduler: Got job 0 (show at Test.scala:17) with 1 output partitions
18/03/14 20:41:13 INFO DAGScheduler: Final stage: ResultStage 0 (show at Test.scala:17)
18/03/14 20:41:13 INFO DAGScheduler: Parents of final stage: List()
18/03/14 20:41:13 INFO DAGScheduler: Missing parents: List()
18/03/14 20:41:13 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[5] at show at Test.scala:17), which has no missing parents
18/03/14 20:41:13 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 13.0 KB, free 1998.1 MB)
18/03/14 20:41:13 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 6.1 KB, free 1998.1 MB)
18/03/14 20:41:13 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.56.1:62282 (size: 6.1 KB, free: 1998.3 MB)
18/03/14 20:41:13 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
18/03/14 20:41:13 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[5] at show at Test.scala:17) (first 15 tasks are for partitions Vector(0))
18/03/14 20:41:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
18/03/14 20:41:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 5268 bytes)
18/03/14 20:41:13 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
18/03/14 20:41:13 INFO CodeGenerator: Code generated in 13.617205 ms
18/03/14 20:41:13 INFO FileScanRDD: Reading File path: file:///E:/Mycode/datas/stu.txt, range: 0-113, partition values: [empty row]
18/03/14 20:41:13 INFO CodeGenerator: Code generated in 11.971125 ms
18/03/14 20:41:13 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1745 bytes result sent to driver
18/03/14 20:41:13 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 258 ms on localhost (executor driver) (1/1)
18/03/14 20:41:13 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18/03/14 20:41:13 INFO DAGScheduler: ResultStage 0 (show at Test.scala:17) finished in 0.284 s
18/03/14 20:41:13 INFO DAGScheduler: Job 0 finished: show at Test.scala:17, took 0.483521 s
18/03/14 20:41:13 INFO CodeGenerator: Code generated in 23.334109 ms
+------+
| value|
+------+
|hadoop|
|hadoop|
|  java|
|  java|
| spark|
| spark|
|  hive|
| hbase|
| sqoop|
| sqoop|
| mysql|
| redit|
| flume|
| flume|
|  join|
|   hue|
| scala|
|python|
+------+
18/03/14 20:41:13 INFO SparkContext: Invoking stop() from shutdown hook
18/03/14 20:41:13 INFO SparkUI: Stopped Spark web UI at http://192.168.56.1:4040
18/03/14 20:41:13 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/03/14 20:41:13 INFO MemoryStore: MemoryStore cleared
18/03/14 20:41:13 INFO BlockManager: BlockManager stopped
18/03/14 20:41:14 INFO BlockManagerMaster: BlockManagerMaster stopped
18/03/14 20:41:14 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/03/14 20:41:14 INFO SparkContext: Successfully stopped SparkContext
18/03/14 20:41:14 INFO ShutdownHookManager: Shutdown hook called
18/03/14 20:41:14 INFO ShutdownHookManager: Deleting directory C:\Users\Brave\AppData\Local\Temp\spark-2c920b38-6a7f-4914-a2ef-9ee345492414
Process finished with exit code 0
Add this field
Result of the run
Group and count
Result of the run
Here is the final code
package com.spark.test

import org.apache.spark.sql.SparkSession

object Test {

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .master("local")
      .appName("HdfsTest")
      .getOrCreate()

    val filePart = "E://Mycode/datas/stu.txt"

    // RDD version of the same word count, kept for reference:
    // val rdd = spark.sparkContext.textFile(filePart)
    // val lines = rdd.flatMap(x => x.split(" ")).map(x => (x, 1)).reduceByKey((a, b) => (a + b)).collect().toList
    // println(lines)

    // Brings in the encoders that flatMap and map need on a Dataset.
    import spark.implicits._

    // Split each line into words, pair every word with 1, then count the rows per word.
    spark.read.textFile(filePart)
      .flatMap(x => x.split(" "))
      .map(x => (x, 1))
      .groupBy("_1")
      .count()
      .show()
  }
}
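The grouping step can be checked without starting Spark. The following is a minimal sketch on plain Scala collections that mirrors the flatMap, map, groupBy and count chain; the object name `WordCountSketch`, the helper `countWords` and the sample lines are made up for illustration and are not part of the original project.

```scala
object WordCountSketch {

  // Split every line on spaces, drop empty tokens, then count how often each word occurs.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input; the real program reads stu.txt instead.
    val sample = Seq("hadoop", "hadoop java", "java spark", "spark")
    countWords(sample).toSeq.sortBy(_._1).foreach { case (word, n) => println(s"$word\t$n") }
  }
}
```

A local check like this is handy for confirming the expected counts on a few lines before running the Spark job on the full file.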
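Now let's package the program
We change the code slightly so that the input path is read from the first command-line argument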
package com.spark.test

import org.apache.spark.sql.SparkSession

object Test {

  def main(args: Array[String]): Unit = {
    // Note: a master set in code takes precedence over the --master flag of spark-submit.
    val spark = SparkSession
      .builder
      .master("local")
      .appName("HdfsTest")
      .getOrCreate()

    // The input path now comes from the first command-line argument.
    val filePart = args(0)
    // val filePart = "E://Mycode/datas/stu.txt"

    // RDD version of the same word count, kept for reference:
    // val rdd = spark.sparkContext.textFile(filePart)
    // val lines = rdd.flatMap(x => x.split(" ")).map(x => (x, 1)).reduceByKey((a, b) => (a + b)).collect().toList
    // println(lines)

    // Brings in the encoders that flatMap and map need on a Dataset.
    import spark.implicits._

    // Split each line into words, pair every word with 1, then count the rows per word.
    spark.read.textFile(filePart)
      .flatMap(x => x.split(" "))
      .map(x => (x, 1))
      .groupBy("_1")
      .count()
      .show()
  }
}
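Reading `args(0)` directly throws an `ArrayIndexOutOfBoundsException` when the jar is started without a path. The following is a small sketch of one way to guard against that; `ArgsSketch`, `inputPath` and the usage message are hypothetical names chosen for this example, not something the original program contains.

```scala
object ArgsSketch {

  // Right(path) when exactly one non-blank argument is given, otherwise Left(usage message).
  def inputPath(args: Array[String]): Either[String, String] =
    args match {
      case Array(path) if path.trim.nonEmpty => Right(path.trim)
      case _                                 => Left("Usage: Test <input path>")
    }

  def main(args: Array[String]): Unit =
    inputPath(args) match {
      case Right(path) => println(s"Reading from $path")
      case Left(usage) =>
        // Print the usage hint and exit with a non-zero code instead of failing with a stack trace.
        System.err.println(usage)
        sys.exit(1)
    }
}
```

In the real program the `Right` branch would hand the path to `spark.read.textFile`.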
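Remove all of these
Only these two remain
The packaging is done
Upload this jar to our cluster
This is our data file
We upload the data file to HDFS; start HDFS first
Remember to start ZooKeeper as well, otherwise things will go wrong
Now create a directory on HDFS
Upload the local file
Let's run it on the cluster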
bin/spark-submit --master local[2] /opt/jars/sparkStu.jar hdfs://bigdata-pro01.kfk.com:9000/user/datas/stu.txt
You can see the result of the run
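One detail about the run above: the program sets `.master("local")` in code, and a master set in code takes precedence over the `--master local[2]` flag passed to spark-submit. The following is a hedged sketch of one common way to keep a local default for IDEA while letting spark-submit decide on the cluster. It assumes the Spark dependencies from the pom.xml are on the classpath; `MasterFallbackSketch` and `buildConf` are illustrative names only.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object MasterFallbackSketch {

  // Use "local" only when no master was supplied from outside,
  // for example through spark-submit --master, which sets the spark.master property.
  def buildConf(appName: String): SparkConf =
    new SparkConf()
      .setAppName(appName)
      .setIfMissing("spark.master", "local")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.config(buildConf("HdfsTest")).getOrCreate()
    // Shows which master is actually in effect for this run.
    println(s"master = ${spark.sparkContext.master}")
    spark.stop()
  }
}
```

Started from IDEA this falls back to `local`; started through spark-submit, the value given on the command line should be the one in effect.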