[mac] Hadoop, Hive, HBase, and Spark installation notes

All components are from CDH 5.8.0.

Hadoop
After every machine restart, starting Hadoop leaves http://localhost:50070 unreachable; jps shows the NameNode is not running. The log at ~/opt/cdh5/hadoop-2.6.0-cdh5.8.0/logs/hadoop-fanhuan-namenode-fanhuandeMacBook-Pro.local.log contains:
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: 
Directory /private/tmp/hadoop-fanhuan/dfs/name is in an inconsistent state: 
storage directory does not exist or is not accessible.

Add the following to hdfs-site.xml, pointing the NameNode metadata at a directory outside /tmp (the path below is only an example):
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/Users/fanhuan/opt/cdh5/dfs/name</value>
</property>

When Hadoop is configured in pseudo-distributed mode, it writes its state under /tmp, so that data is lost after every system reboot and hadoop namenode -format has to be run again. To avoid this, add a property named dfs.name.dir (dfs.namenode.name.dir in Hadoop 2.x) to hdfs-site.xml and point it at any directory outside /tmp; the metadata will then survive reboots.

A warning appears on startup:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform...
using builtin-java classes where applicable

You need to build the Hadoop native libraries (or download a prebuilt copy) and put them under $HADOOP_HOME/lib/native.
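To confirm the libraries are actually picked up afterwards, Hadoop's built-in checknative command reports which native components (zlib, snappy, and so on) were loaded:

hadoop checknative -a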

Hive
Starting hive --service hwi fails with:
ls:/Users/fanhuan/opt/cdh5/hive-1.1.0-cdh5.8.0/lib/hive-hwi-*.war: 
No such file or directory

Download the Hive source, go into the hwi directory, and build the war:
jar cfM hive-hwi-1.1.1.war -C web .

Copy the generated war into $HIVE_HOME/lib.
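A minimal sketch of the whole sequence, assuming the Hive source tree has been unpacked under ~/opt/cdh5 (the source path is hypothetical):

cd ~/opt/cdh5/hive-1.1.0-cdh5.8.0-src/hwi
jar cfM hive-hwi-1.1.1.war -C web .
cp hive-hwi-1.1.1.war $HIVE_HOME/lib/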
Then edit hive-site.xml:
<property>
  <name>hive.hwi.war.file</name>
  <value>lib/hive-hwi-1.1.1.war</value>
</property>

After that, http://localhost:9999/hwi is accessible.

It may then fail with:
Unable to find a javac compiler;
com.sun.tools.javac.Main is not on the classpath.
Perhaps JAVA_HOME does not point to the JDK.
It is currently set to "/Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home/jre"

Running the following command fixes it:
ln -s $JAVA_HOME/lib/tools.jar $HIVE_HOME/lib/
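Alternatively, as the error message itself suggests, point JAVA_HOME at the JDK home rather than its jre subdirectory:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home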

HBase
hbase-site.xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
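The hdfs://localhost:9000 prefix of hbase.rootdir must match fs.defaultFS in Hadoop's core-site.xml, assumed here to be:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>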

Before running hbase shell, edit hbase-env.sh so HBase manages its own bundled ZooKeeper:
export HBASE_MANAGES_ZK=true
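With that set, start-hbase.sh brings up ZooKeeper automatically (standard HBase commands, assuming $HBASE_HOME points at the install):

$HBASE_HOME/bin/start-hbase.sh
hbase shell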

Spark
pyspark fails with:
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/hadoop/fs/FSDataInputStream
at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:117)

Edit conf/spark-env.sh and add the Hadoop classpath:
export SPARK_DIST_CLASSPATH=$(hadoop classpath)

Then another error appears:
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoClassDefFoundError: com/fasterxml/jackson/databind/Module

py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NoClassDefFoundError: com/fasterxml/jackson/core/Versioned

Checking the output of hadoop classpath confirms the cause:
fanhuan@bogon:~$ hadoop classpath
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/etc/hadoop:
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/share/hadoop/common/lib/*:
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/share/hadoop/common/*:
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/share/hadoop/hdfs:
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/share/hadoop/hdfs/lib/*:
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/share/hadoop/hdfs/*:
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/share/hadoop/yarn/lib/*:
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/share/hadoop/yarn/*:
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/share/hadoop/mapreduce/lib/*:
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/share/hadoop/mapreduce/*:
/Users/fanhuan/opt/cdh5/hadoop-2.6.0-cdh5.8.0/contrib/capacity-scheduler/*.jar

The missing classes live in jackson-core-2.2.3.jar and jackson-databind-2.2.3.jar; they are absent from the hadoop classpath but present under share/hadoop/tools/lib. Adding that directory resolves it:
export SPARK_DIST_CLASSPATH=$HADOOP_HOME/share/hadoop/tools/lib/*:$(hadoop classpath)
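After restarting pyspark it should come up cleanly; a quick smoke test (any trivial job will do):

pyspark
>>> sc.parallelize(range(10)).sum()
45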

posted @ 2016-12-30 21:07  vanuan