|NO.Z.00022|——————————|^^ Revision ^^|——|Hadoop & PB-Scale Offline Data Warehouse.v01|——|Griffin.v01|Griffin Deployment & Spark & Livy & ES|
1. Installation Overview
### --- Dependencies: the focus here is on Griffin itself, so the dependent components are not covered in depth; all components are installed in single-node mode.
~~~ JDK (1.8 or later)
~~~ MySQL (5.6 or later)
~~~ Hadoop (2.6.0 or later)
~~~ Hive (2.x)
~~~ Maven
~~~ Spark (2.2.1)
~~~ Livy (livy-0.5.0-incubating)
~~~ ElasticSearch (5.0 or later)
### --- Roles of the dependent components
~~~ Spark: computes batch and real-time metrics
~~~ Livy: exposes a RESTful API through which the service submits jobs to Apache Spark (see the example after this list)
~~~ ElasticSearch: stores metric data
~~~ MySQL: stores service metadata
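~~~ # For illustration only: a minimal sketch of what a REST call to Livy looks like once it is running
~~~ # (assumes Livy's default port 8998 on this host; in practice Griffin issues these calls itself).
[root@hadoop02 ~]# curl -X POST -H "Content-Type: application/json" \
    -d '{"kind": "spark"}' http://127.0.0.1:8998/sessions    # create an interactive Spark session
[root@hadoop02 ~]# curl http://127.0.0.1:8998/sessions        # list sessions and their state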
2. Spark Installation
### --- Download and extract the Spark distribution
[root@hadoop02 ~]# ll /opt/yanqi/software/spark-2.2.1-bin-hadoop2.7.tgz
-rw-r--r-- 1 root root 200934340 Aug 26 2020 /opt/yanqi/software/spark-2.2.1-bin-hadoop2.7.tgz
[root@hadoop02 ~]# cd /opt/yanqi/software/
[root@hadoop02 software]# tar -zxvf spark-2.2.1-bin-hadoop2.7.tgz -C ../servers/
[root@hadoop02 ~]# cd /opt/yanqi/servers/
[root@hadoop02 servers]# mv spark-2.2.1-bin-hadoop2.7/ spark-2.2.1
### --- Set the $SPARK_HOME environment variable
[root@hadoop02 ~]# vim /etc/profile
##SPARK_HOME
export SPARK_HOME=/opt/yanqi/servers/spark-2.2.1/
export PATH=$PATH:$SPARK_HOME/bin
~~~ # Make the environment variables take effect
[root@hadoop02 ~]# source /etc/profile
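~~~ # Quick check (illustrative): confirm the variable is visible in the current shell
[root@hadoop02 ~]# echo $SPARK_HOME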
### --- Edit the configuration file $SPARK_HOME/conf/spark-defaults.conf
~~~ # Prepare the configuration files
[root@hadoop02 ~]# cd $SPARK_HOME/conf/
[root@hadoop02 conf]# cp spark-defaults.conf.template spark-defaults.conf
[root@hadoop02 conf]# cp spark-env.sh.template spark-env.sh
[root@hadoop02 ~]# vim $SPARK_HOME/conf/spark-defaults.conf
~~~ Configuration parameters (can be pasted in directly)
spark.master yarn
spark.eventLog.enabled true
spark.eventLog.dir hdfs://hadoop01:9000/spark/logs
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.yarn.jars hdfs://hadoop01:9000/spark/spark_2.2.1_jars/*
### --- Copy the MySQL driver
[root@hadoop02 ~]# cp $HIVE_HOME/lib/mysql-connector-java-5.1.46.jar \
$SPARK_HOME/jars/
[root@hadoop02 ~]# ls $SPARK_HOME/jars/mysql-connector-java-5.1.46.jar
/opt/yanqi/servers/spark-2.2.1//jars/mysql-connector-java-5.1.46.jar
### --- Upload the Spark jars to hdfs://hadoop01:9000/spark/spark_2.2.1_jars/
[root@hadoop02 ~]# hdfs dfs -mkdir -p /spark/logs
[root@hadoop02 ~]# hdfs dfs -mkdir -p /spark/spark_2.2.1_jars/
[root@hadoop02 ~]# hdfs dfs -put /opt/yanqi/servers/spark-2.2.1/jars/*.jar /spark/spark_2.2.1_jars/
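~~~ # Optional sanity check (illustrative): confirm the jars landed in HDFS so that spark.yarn.jars resolves
[root@hadoop02 ~]# hdfs dfs -count /spark/spark_2.2.1_jars/
[root@hadoop02 ~]# hdfs dfs -ls /spark/spark_2.2.1_jars/ | tail -5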
### --- Edit the configuration file spark-env.sh
[root@hadoop02 ~]# vim $SPARK_HOME/conf/spark-env.sh
~~~ Parameters to add
export JAVA_HOME=/opt/yanqi/servers/jdk1.8.0_231/
export HADOOP_HOME=/opt/yanqi/servers/hadoop-2.9.2/
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
### --- Add configuration to yarn-site.xml
~~~ yarn.nodemanager.vmem-check-enabled: whether to enforce virtual-memory limits on containers.
~~~ Apply the change on all nodes, then restart the YARN service.
~~~ Without this setting, starting spark-shell fails with: Yarn application has already ended! It might have been killed or unable to launch application master.
[root@hadoop02 ~]# vim $HADOOP_HOME/etc/hadoop/yarn-site.xml
<!-- Spark-related setting: disable the virtual-memory check -->
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
~~~ # Restart the YARN service
[root@hadoop01 ~]# stop-yarn.sh
[root@hadoop01 ~]# start-yarn.sh
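~~~ # Quick check (illustrative): the YARN daemons should be back up after the restart
[root@hadoop01 ~]# jps | grep -E 'ResourceManager|NodeManager'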
### --- Test Spark with spark-shell
~~~ # Prepare a test file
[root@hadoop02 ~]# hdfs dfs -ls /wcinput/wc.txt # /wcinput/wc.txt is a file on HDFS
-rw-r--r-- 5 root supergroup 109 2020-06-20 13:58 /wcinput/wc.txt
~~~ # Start spark-shell
[root@hadoop02 ~]# spark-shell
~~~ Output
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.1
      /_/
~~~ # Check the running processes
[root@hadoop02 ~]# jps
9422 SparkSubmit
~~~ # Read the file and run a word count on it
scala> val lines = sc.textFile("/wcinput/wc.txt")
scala> lines.flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_).collect()
~~~ Output
res1: Array[(String, Int)] = Array((#在文件中输入如下内容,1), (yanqi,3), (mapreduce,3), (yarn,2), (hadoop,2), (hdfs,1))
3. Livy Installation
### --- Download and extract the distribution
[root@hadoop02 ~]# ls /opt/yanqi/software/livy-0.5.0-incubating-bin.zip
/opt/yanqi/software/livy-0.5.0-incubating-bin.zip
[root@hadoop02 ~]# cd /opt/yanqi/software/
[root@hadoop02 software]# unzip livy-0.5.0-incubating-bin.zip
[root@hadoop02 software]# mv livy-0.5.0-incubating-bin ../servers/livy-0.5.0
### --- Set the $LIVY_HOME environment variable
[root@hadoop02 ~]# vim /etc/profile
## LIVY_HOME
export LIVY_HOME=/opt/yanqi/servers/livy-0.5.0
export PATH=$PATH:$LIVY_HOME/bin
[root@hadoop02 ~]# source /etc/profile
### --- Edit the configuration file conf/livy.conf
~~~ # Prepare the configuration files
[root@hadoop02 ~]# cd $LIVY_HOME/conf
[root@hadoop02 conf]# cp livy-env.sh.template livy-env.sh
[root@hadoop02 conf]# cp livy.conf.template livy.conf
[root@hadoop02 ~]# vim $LIVY_HOME/conf/livy.conf
~~~ Configuration parameters
livy.server.host = 127.0.0.1
livy.spark.master = yarn
livy.spark.deployMode = cluster
livy.repl.enable-hive-context = true
### --- Edit the configuration file conf/livy-env.sh
[root@hadoop02 ~]# vim $LIVY_HOME/conf/livy-env.sh
~~~ Parameters to add
export SPARK_HOME=/opt/yanqi/servers/spark-2.2.1
export HADOOP_HOME=/opt/yanqi/servers/hadoop-2.9.2
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
### --- Start the service
[root@hadoop02 ~]# cd $LIVY_HOME/
[root@hadoop02 livy-0.5.0]# mkdir logs
[root@hadoop02 livy-0.5.0]# nohup bin/livy-server &
~~~ # OR
[root@hadoop02 ~]# nohup $LIVY_HOME/bin/livy-server &
~~~ # Check the running services
[root@hadoop02 ~]# jps
7897 SparkSubmit
8139 LivyServer
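~~~ # Optional sanity check (illustrative, assuming Livy's default port 8998): an empty JSON session list means the server is up
[root@hadoop02 ~]# curl http://127.0.0.1:8998/sessions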
4. Elasticsearch Installation
### --- Download and extract the ES distribution
[root@hadoop02 ~]# ls /opt/yanqi/software/elasticsearch-5.6.0.tar.gz
/opt/yanqi/software/elasticsearch-5.6.0.tar.gz
[root@hadoop02 software]# tar -zxvf elasticsearch-5.6.0.tar.gz -C ../servers/
### --- Create the elasticsearch group and elasticsearch user
~~~ # ES must not be started as root, so create a dedicated user to run the ES service
~~~ Create the group
[root@hadoop02 ~]# groupadd elasticsearch
~~~ Create the user
[root@hadoop02 ~]# useradd elasticsearch -g elasticsearch
~~~ Change the owner of the installation directory
[root@hadoop02 ~]# cd /opt/yanqi/servers/
[root@hadoop02 servers]# chown -R elasticsearch:elasticsearch elasticsearch-5.6.0/
### --- Edit the Linux system file /etc/security/limits.conf
[root@hadoop02 ~]# vim /etc/security/limits.conf
~~~ Parameters to add
elasticsearch hard nofile 1000000
elasticsearch soft nofile 1000000
* soft nproc 4096
* hard nproc 4096
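~~~ # Optional check (illustrative): the new limits take effect for fresh login sessions of the elasticsearch user
[root@hadoop02 ~]# su - elasticsearch -c 'ulimit -n'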
### --- Edit the system file /etc/sysctl.conf
~~~ # Modify the configuration parameters
[root@hadoop02 ~]# vim /etc/sysctl.conf
~~~ Append at the end of the file:
vm.max_map_count=262144
~~~ # Run the following command for the change to take effect
[root@hadoop02 ~]# sysctl -p
vm.max_map_count = 262144
### --- Edit the ES configuration file
~~~ # Modify the ES configuration parameters
[root@hadoop02 ~]# vim /opt/yanqi/servers/elasticsearch-5.6.0/config/elasticsearch.yml
~~~ Configuration parameter
network.host: 0.0.0.0
~~~ # The JVM heap defaults to 2g; reduce it to 1g
~~~ # Edit the JVM memory parameters
[root@hadoop02 ~]# vim /opt/yanqi/servers/elasticsearch-5.6.0/config/jvm.options
~~~ Memory parameters
-Xms1g
-Xmx1g
### --- Start the ES service; once it is up, it can be reached at http://hadoop02:9200/ in a browser
~~~ # From the ES installation directory, run the command (-d starts it as a daemon)
[root@hadoop02 ~]# su elasticsearch
[elasticsearch@hadoop02 ~]$ cd /opt/yanqi/servers/elasticsearch-5.6.0/
[elasticsearch@hadoop02 elasticsearch-5.6.0]$ bin/elasticsearch -d
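~~~ # Optional check (illustrative): the same endpoint can be queried from the shell; a JSON document with the cluster name and version confirms ES is up
[elasticsearch@hadoop02 elasticsearch-5.6.0]$ curl http://hadoop02:9200/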

### --- Create the griffin index in ES
~~~ # hadoop02 is the node running the ES service
[elasticsearch@hadoop02 ~]$ curl -XPUT http://hadoop02:9200/griffin -d '
{
    "aliases": {},
    "mappings": {
        "accuracy": {
            "properties": {
                "name": {
                    "fields": {
                        "keyword": {
                            "ignore_above": 256,
                            "type": "keyword"
                        }
                    },
                    "type": "text"
                },
                "tmst": {
                    "type": "date"
                }
            }
        }
    },
    "settings": {
        "index": {
            "number_of_replicas": "2",
            "number_of_shards": "5"
        }
    }
}
'
~~~ Output
{"acknowledged":true,"shards_acknowledged":true,"index":"griffin"}