Linux单机安转Spark
1. 创建目录
> mkdir /opt/spark
> cd /opt/spark
2. 解压缩、创建软连接
> tar zxvf spark-2.3.0-bin-hadoop2.7.tgz
> link -s spark-2.3.0-bin-hadoop2.7 spark
4. 编辑 /etc/profile
> vi /etc/profile
输入下面内容
export SPARK_HOME=/opt/spark/spark
export PATH=$PATH:$SPARK_HOME/bin
> source /etc/profile
5. 进入配置文件夹
> cd /opt/spark/spark/conf
6. 配置spark-env.sh
> cp spark-env.sh.template spark-env.sh
spark-env.sh 中输入以下内容
export SCALA_HOME=/opt/scala/scala
export JAVA_HOME=/opt/java/jdk
export SPARK_HOME=/opt/spark/spark
export SPARK_MASTER_IP=hserver1
export SPARK_EXECUTOR_MEMORY=1G
注意:上面的路径应该根据自己的路径配置
7. 配置slaves
> cp slaves.template slaves
slaves 中输入以下内容
localhost
8. 运行spark示例
> cd /opt/spark/spark
> ./bin/run-example SparkPi 10
会显示下面信息
[aston@localhost spark]$ ./bin/run-example SparkPi 10
2018-06-04 22:37:25 WARN Utils:66 - Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 192.168.199.150 instead (on interface wlp8s0b1)
2018-06-04 22:37:25 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-06-04 22:37:25 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-06-04 22:37:25 INFO SparkContext:54 - Running Spark version 2.3.0
2018-06-04 22:37:25 INFO SparkContext:54 - Submitted application: Spark Pi
2018-06-04 22:37:26 INFO SecurityManager:54 - Changing view acls to: aston
2018-06-04 22:37:26 INFO SecurityManager:54 - Changing modify acls to: aston
2018-06-04 22:37:26 INFO SecurityManager:54 - Changing view acls groups to:
2018-06-04 22:37:26 INFO SecurityManager:54 - Changing modify acls groups to:
2018-06-04 22:37:26 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(aston); groups with view permissions: Set(); users with modify permissions: Set(aston); groups with modify permissions: Set()
2018-06-04 22:37:26 INFO Utils:54 - Successfully started service 'sparkDriver' on port 34729.
2018-06-04 22:37:26 INFO SparkEnv:54 - Registering MapOutputTracker
2018-06-04 22:37:26 INFO SparkEnv:54 - Registering BlockManagerMaster
2018-06-04 22:37:26 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-06-04 22:37:26 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-06-04 22:37:26 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-4d51d515-85db-4a8c-bb45-219fd96be3c6
2018-06-04 22:37:26 INFO MemoryStore:54 - MemoryStore started with capacity 366.3 MB
2018-06-04 22:37:26 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2018-06-04 22:37:26 INFO log:192 - Logging initialized @2296ms
2018-06-04 22:37:26 INFO Server:346 - jetty-9.3.z-SNAPSHOT
2018-06-04 22:37:26 INFO Server:414 - Started @2382ms
2018-06-04 22:37:26 INFO AbstractConnector:278 - Started ServerConnector@779dfe55{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-06-04 22:37:26 INFO Utils:54 - Successfully started service 'SparkUI' on port 4040.
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f212d84{/jobs,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@27ead29e{/jobs/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4c060c8f{/jobs/job,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@383f3558{/jobs/job/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@49b07ee3{/stages,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@352e612e{/stages/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@65f00478{/stages/stage,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@28486680{/stages/stage/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4d7e7435{/stages/pool,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4a1e3ac1{/stages/pool/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e78fcf5{/storage,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@56febdc{/storage/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3b8ee898{/storage/rdd,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7d151a{/storage/rdd/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@294bdeb4{/environment,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5300f14a{/environment/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1f86099a{/executors,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@77bb0ab5{/executors/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@f2c488{/executors/threadDump,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54acff7d{/executors/threadDump/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7bc9e6ab{/static,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@37d00a23{/,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@433e536f{/api,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@988246e{/jobs/job/kill,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@62515a47{/stages/stage/kill,null,AVAILABLE,@Spark}
2018-06-04 22:37:26 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://192.168.199.150:4040
2018-06-04 22:37:26 INFO SparkContext:54 - Added JAR file:///opt/spark/spark/examples/jars/spark-examples_2.11-2.3.0.jar at spark://192.168.199.150:34729/jars/spark-examples_2.11-2.3.0.jar with timestamp 1528123046779
2018-06-04 22:37:26 INFO SparkContext:54 - Added JAR file:///opt/spark/spark/examples/jars/scopt_2.11-3.7.0.jar at spark://192.168.199.150:34729/jars/scopt_2.11-3.7.0.jar with timestamp 1528123046780
2018-06-04 22:37:26 INFO Executor:54 - Starting executor ID driver on host localhost
2018-06-04 22:37:26 INFO Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 45436.
2018-06-04 22:37:26 INFO NettyBlockTransferService:54 - Server created on 192.168.199.150:45436
2018-06-04 22:37:26 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2018-06-04 22:37:26 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, 192.168.199.150, 45436, None)
2018-06-04 22:37:26 INFO BlockManagerMasterEndpoint:54 - Registering block manager 192.168.199.150:45436 with 366.3 MB RAM, BlockManagerId(driver, 192.168.199.150, 45436, None)
2018-06-04 22:37:26 INFO BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, 192.168.199.150, 45436, None)
2018-06-04 22:37:26 INFO BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, 192.168.199.150, 45436, None)
2018-06-04 22:37:27 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@65bcf7c2{/metrics/json,null,AVAILABLE,@Spark}
2018-06-04 22:37:27 INFO SparkContext:54 - Starting job: reduce at SparkPi.scala:38
2018-06-04 22:37:27 INFO DAGScheduler:54 - Got job 0 (reduce at SparkPi.scala:38) with 10 output partitions
2018-06-04 22:37:27 INFO DAGScheduler:54 - Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
2018-06-04 22:37:27 INFO DAGScheduler:54 - Parents of final stage: List()
2018-06-04 22:37:27 INFO DAGScheduler:54 - Missing parents: List()
2018-06-04 22:37:27 INFO DAGScheduler:54 - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
2018-06-04 22:37:27 INFO MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 366.3 MB)
2018-06-04 22:37:28 INFO MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 1181.0 B, free 366.3 MB)
2018-06-04 22:37:28 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 192.168.199.150:45436 (size: 1181.0 B, free: 366.3 MB)
2018-06-04 22:37:28 INFO SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1039
2018-06-04 22:37:28 INFO DAGScheduler:54 - Submitting 10 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
2018-06-04 22:37:28 INFO TaskSchedulerImpl:54 - Adding task set 0.0 with 10 tasks
2018-06-04 22:37:28 INFO TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO Executor:54 - Running task 2.0 in stage 0.0 (TID 2)
2018-06-04 22:37:28 INFO Executor:54 - Running task 1.0 in stage 0.0 (TID 1)
2018-06-04 22:37:28 INFO Executor:54 - Running task 3.0 in stage 0.0 (TID 3)
2018-06-04 22:37:28 INFO Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
2018-06-04 22:37:28 INFO Executor:54 - Fetching spark://192.168.199.150:34729/jars/scopt_2.11-3.7.0.jar with timestamp 1528123046780
2018-06-04 22:37:28 INFO TransportClientFactory:267 - Successfully created connection to /192.168.199.150:34729 after 34 ms (0 ms spent in bootstraps)
2018-06-04 22:37:28 INFO Utils:54 - Fetching spark://192.168.199.150:34729/jars/scopt_2.11-3.7.0.jar to /tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/fetchFileTemp8606784681518533462.tmp
2018-06-04 22:37:28 INFO Executor:54 - Adding file:/tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/scopt_2.11-3.7.0.jar to class loader
2018-06-04 22:37:28 INFO Executor:54 - Fetching spark://192.168.199.150:34729/jars/spark-examples_2.11-2.3.0.jar with timestamp 1528123046779
2018-06-04 22:37:28 INFO Utils:54 - Fetching spark://192.168.199.150:34729/jars/spark-examples_2.11-2.3.0.jar to /tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/fetchFileTemp8435156876449095794.tmp
2018-06-04 22:37:28 INFO Executor:54 - Adding file:/tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e/userFiles-36ae13de-60e8-42fd-958d-66c3c3832d4a/spark-examples_2.11-2.3.0.jar to class loader
2018-06-04 22:37:28 INFO Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO Executor:54 - Finished task 1.0 in stage 0.0 (TID 1). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO TaskSetManager:54 - Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO Executor:54 - Finished task 2.0 in stage 0.0 (TID 2). 867 bytes result sent to driver
2018-06-04 22:37:28 INFO Executor:54 - Running task 4.0 in stage 0.0 (TID 4)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Starting task 5.0 in stage 0.0 (TID 5, localhost, executor driver, partition 5, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO Executor:54 - Running task 5.0 in stage 0.0 (TID 5)
2018-06-04 22:37:28 INFO Executor:54 - Finished task 3.0 in stage 0.0 (TID 3). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO TaskSetManager:54 - Starting task 6.0 in stage 0.0 (TID 6, localhost, executor driver, partition 6, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Starting task 7.0 in stage 0.0 (TID 7, localhost, executor driver, partition 7, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO Executor:54 - Running task 7.0 in stage 0.0 (TID 7)
2018-06-04 22:37:28 INFO Executor:54 - Running task 6.0 in stage 0.0 (TID 6)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 362 ms on localhost (executor driver) (1/10)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Finished task 3.0 in stage 0.0 (TID 3) in 385 ms on localhost (executor driver) (2/10)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 418 ms on localhost (executor driver) (3/10)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Finished task 2.0 in stage 0.0 (TID 2) in 388 ms on localhost (executor driver) (4/10)
2018-06-04 22:37:28 INFO Executor:54 - Finished task 5.0 in stage 0.0 (TID 5). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO TaskSetManager:54 - Starting task 8.0 in stage 0.0 (TID 8, localhost, executor driver, partition 8, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Finished task 5.0 in stage 0.0 (TID 5) in 79 ms on localhost (executor driver) (5/10)
2018-06-04 22:37:28 INFO Executor:54 - Running task 8.0 in stage 0.0 (TID 8)
2018-06-04 22:37:28 INFO Executor:54 - Finished task 4.0 in stage 0.0 (TID 4). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO TaskSetManager:54 - Starting task 9.0 in stage 0.0 (TID 9, localhost, executor driver, partition 9, PROCESS_LOCAL, 7853 bytes)
2018-06-04 22:37:28 INFO Executor:54 - Running task 9.0 in stage 0.0 (TID 9)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Finished task 4.0 in stage 0.0 (TID 4) in 99 ms on localhost (executor driver) (6/10)
2018-06-04 22:37:28 INFO Executor:54 - Finished task 7.0 in stage 0.0 (TID 7). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO TaskSetManager:54 - Finished task 7.0 in stage 0.0 (TID 7) in 98 ms on localhost (executor driver) (7/10)
2018-06-04 22:37:28 INFO Executor:54 - Finished task 6.0 in stage 0.0 (TID 6). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO TaskSetManager:54 - Finished task 6.0 in stage 0.0 (TID 6) in 107 ms on localhost (executor driver) (8/10)
2018-06-04 22:37:28 INFO Executor:54 - Finished task 9.0 in stage 0.0 (TID 9). 824 bytes result sent to driver
2018-06-04 22:37:28 INFO Executor:54 - Finished task 8.0 in stage 0.0 (TID 8). 867 bytes result sent to driver
2018-06-04 22:37:28 INFO TaskSetManager:54 - Finished task 9.0 in stage 0.0 (TID 9) in 39 ms on localhost (executor driver) (9/10)
2018-06-04 22:37:28 INFO TaskSetManager:54 - Finished task 8.0 in stage 0.0 (TID 8) in 57 ms on localhost (executor driver) (10/10)
2018-06-04 22:37:28 INFO TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool
2018-06-04 22:37:28 INFO DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:38) finished in 0.800 s
2018-06-04 22:37:28 INFO DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 0.945853 s
Pi is roughly 3.14023914023914
2018-06-04 22:37:28 INFO AbstractConnector:318 - Stopped Spark@779dfe55{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-06-04 22:37:28 INFO SparkUI:54 - Stopped Spark web UI at http://192.168.199.150:4040
2018-06-04 22:37:28 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2018-06-04 22:37:28 INFO MemoryStore:54 - MemoryStore cleared
2018-06-04 22:37:28 INFO BlockManager:54 - BlockManager stopped
2018-06-04 22:37:28 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2018-06-04 22:37:28 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2018-06-04 22:37:28 INFO SparkContext:54 - Successfully stopped SparkContext
2018-06-04 22:37:28 INFO ShutdownHookManager:54 - Shutdown hook called
2018-06-04 22:37:28 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-a840c54e-7db9-4dfc-a446-1fa10a8d2c3e
2018-06-04 22:37:28 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-16300765-9872-4542-91ed-1a7a0f8285d9
9. 运行spark shell
> cd /opt/spark/spark
> ./bin/spark-shell