Daily Notes: Spark
Today I'm learning how to set up and use Spark.
1. Download Spark
Download the spark-3.4.0-bin-without-hadoop.tgz file. Baidu Netdisk link: https://pan.baidu.com/s/181shkgg-i0WEytQMqeeqxA (extraction code: 9ekc)
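If the Netdisk link is inconvenient, the same archive should also be obtainable from the Apache release archive. The URL below follows Apache's standard naming scheme for Spark releases, so verify it resolves before relying on it:
wget https://archive.apache.org/dist/spark/spark-3.4.0/spark-3.4.0-bin-without-hadoop.tgz -P /export/server/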
2. Install Hadoop and the Java JDK (both covered in earlier posts; assumed to be installed already)
3. Install Spark
sudo tar -zxf /export/server/spark-3.4.0-bin-without-hadoop.tgz -C /export/server/
cd /export/server/
sudo mv ./spark-3.4.0-bin-without-hadoop/ ./spark
sudo chown -R hadoop:hadoop ./spark
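Before moving on, it doesn't hurt to confirm that the extraction and ownership change worked; a quick check (assuming the hadoop user from the chown step above):
ls -ld /export/server/spark        # owner should be hadoop:hadoop
ls /export/server/spark/bin        # should list spark-shell, spark-submit, run-example, ...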
Next, modify Spark's configuration file spark-env.sh:
cd /export/server/spark
cp ./conf/spark-env.sh.template ./conf/spark-env.sh
Edit the spark-env.sh file (vim ./conf/spark-env.sh) and add the following line at the top:
export SPARK_DIST_CLASSPATH=$(/export/server/hadoop/bin/hadoop classpath)
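This line matters because the -without-hadoop build ships no Hadoop jars of its own; SPARK_DIST_CLASSPATH tells Spark where to find Hadoop's classes at runtime. You can preview what the setting will expand to by running the same command by hand (this assumes Hadoop lives at /export/server/hadoop, as in the earlier post):
/export/server/hadoop/bin/hadoop classpath
# prints a long colon-separated list of Hadoop jar and conf paths;
# if this command fails, fix the Hadoop installation before continuing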
4. Verify that Spark is installed correctly.
cd /export/server/spark
bin/run-example SparkPi 2>&1 | grep "Pi is"
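If everything is wired up, the grep should surface a single line of the form "Pi is roughly 3.14...", with the exact digits varying between runs because SparkPi estimates pi by random sampling. As an extra smoke test, you can also pipe a one-liner through spark-shell; this is just a sketch, and the 2>/dev/null merely hides the startup logging:
echo 'println(sc.parallelize(1 to 100).reduce(_ + _))' | bin/spark-shell 2>/dev/null
# should print 5050, the sum of 1..100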