[Spark] Configuring the Spark SQL and Hive connection
1. Configure the hive database in MySQL
- Create the hive database and refresh the root user's privileges
create database hive;
grant all on *.* to root@'%' identified by '111111';
flush privileges;
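To confirm the grant took effect, it can help to connect from another node in the cluster. A minimal check, assuming the MySQL server runs on a host named master (the hostname is an assumption; slave1 and slave2 appear later in this post, so adjust to your own topology):

# Sketch: verify that root@'%' can log in to MySQL remotely with the new password
mysql -h master -u root -p111111 -e "show databases;"
# The output should include the newly created hive database.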
- Modify bin/hive under the Hive installation directory
vim /usr/local/src/apache-hive-1.2.2-bin/bin/hive
Before:
# add Spark assembly jar to the classpath
if [[ -n "$SPARK_HOME" ]]
then
  sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
  CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
fi

After:
# add Spark assembly jar to the classpath
if [[ -n "$SPARK_HOME" ]]
then
  sparkAssemblyPath=`ls ${SPARK_HOME}/jars/*.jar`
  CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
fi
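Spark 2.x no longer ships a single spark-assembly jar under lib/, which is why the script is pointed at jars/*.jar instead. A quick sanity check that the new glob actually matches something (the SPARK_HOME value below is an assumption based on the paths used later in this post):

# Sketch: confirm the glob used by the patched bin/hive script resolves to real jars
export SPARK_HOME=/usr/local/src/spark-2.0.2-bin-hadoop2.6
ls ${SPARK_HOME}/jars/*.jar | head -n 3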
2. The Hadoop directory ships an older jline; replace it
cd /usr/local/src
cp apache-hive-1.2.2-bin/lib/jline-2.12.jar hadoop-2.6.1/share/hadoop/yarn/lib/
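After copying jline-2.12.jar in, it is worth checking whether an older jline is still sitting on the YARN classpath and, if so, moving it out of the way. The jline-0.9.94.jar name below is an assumption about the old version bundled with Hadoop 2.6:

# Sketch: list the jline jars currently on the YARN classpath
ls /usr/local/src/hadoop-2.6.1/share/hadoop/yarn/lib/ | grep -i jline
# If an older jar such as jline-0.9.94.jar shows up, move it aside so that
# only jline-2.12.jar is picked up, e.g.:
# mv /usr/local/src/hadoop-2.6.1/share/hadoop/yarn/lib/jline-0.9.94.jar /tmp/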
3. Run spark-shell
Run ./spark-shell from the Spark directory. Once the shell is up:
scala> import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.HiveContext

scala> val priors = spark.sql("select * from badou.orders")
The following errors are reported:
18/11/01 02:50:13 ERROR metastore.RetryingHMSHandler: AlreadyExistsException(message:Database default already exists)
18/11/01 02:49:50 WARN component.AbstractLifeCycle: FAILED org.spark_project.jetty.server.Server@7b1e5e55: java.net.BindException: Address already in use
java.net.BindException: Address already in use
Solution
Without hive-site.xml in Spark's conf directory, spark-shell falls back to its own embedded metastore and cannot see the databases and tables defined in Hive; it also needs the MySQL JDBC driver on its classpath to reach the MySQL-backed metastore. Copy both to every node. (The BindException is only a warning that the Spark web UI port is already taken; Spark retries on the next port.)
Step 1: copy hive-site.xml into spark/conf
cp /usr/local/src/apache-hive-1.2.2-bin/conf/hive-site.xml /usr/local/src/spark-2.0.2-bin-hadoop2.6/conf/
scp /usr/local/src/apache-hive-1.2.2-bin/conf/hive-site.xml root@slave1:/usr/local/src/spark-2.0.2-bin-hadoop2.6/conf/
scp /usr/local/src/apache-hive-1.2.2-bin/conf/hive-site.xml root@slave2:/usr/local/src/spark-2.0.2-bin-hadoop2.6/conf/
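A quick way to confirm the copied hive-site.xml really points Spark at the MySQL metastore is to grep its JDBC connection settings. javax.jdo.option.ConnectionURL and friends are standard Hive properties; the exact values depend on your own hive-site.xml:

# Sketch: check that the copied config contains the MySQL metastore connection settings
grep -A 1 "javax.jdo.option.Connection" /usr/local/src/spark-2.0.2-bin-hadoop2.6/conf/hive-site.xml
# Expect ConnectionURL (a jdbc:mysql://... URL), ConnectionDriverName,
# ConnectionUserName and ConnectionPassword entries matching the MySQL setup above.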
Step 2: copy the MySQL driver into spark/jars
cp /usr/local/src/apache-hive-1.2.2-bin/lib/mysql-connector-java-5.1.47-bin.jar /usr/local/src/spark-2.0.2-bin-hadoop2.6/jars/
scp /usr/local/src/apache-hive-1.2.2-bin/lib/mysql-connector-java-5.1.47-bin.jar root@slave1:/usr/local/src/spark-2.0.2-bin-hadoop2.6/jars/
scp /usr/local/src/apache-hive-1.2.2-bin/lib/mysql-connector-java-5.1.47-bin.jar root@slave2:/usr/local/src/spark-2.0.2-bin-hadoop2.6/jars/
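With both files in place on every node, the query from step 3 should now run. A minimal end-to-end check (piping a statement into spark-shell is just a convenient non-interactive trick; the .show(5) call is illustrative):

# Sketch: confirm the MySQL driver jar is visible to Spark on this node
ls /usr/local/src/spark-2.0.2-bin-hadoop2.6/jars/ | grep mysql-connector
# Re-run the Hive query; spark-shell executes statements read from stdin
cd /usr/local/src/spark-2.0.2-bin-hadoop2.6
echo 'spark.sql("select * from badou.orders").show(5)' | ./bin/spark-shell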