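Deploying a pseudo-distributed environment with Ubuntu 12.04 + Hadoop 2.2.0 + ZooKeeper 3.4.5 + HBase 0.96.2 + Hive 0.13.1
Contents:
I. What are Hadoop 2.2.0, ZooKeeper 3.4.5, HBase 0.96.2, and Hive 0.13.1?
II. Where to download the software
III. Installation
1. Install the JDK
2. Clone three machines with Parallels
3. Install ZooKeeper 3.4.5
4. Install Hadoop 2.2.0
5. Start ZooKeeper
6. Start the JournalNode cluster
7. HBase 0.96.2-hadoop2 (dual-HMaster configuration: m1 is the primary HMaster, m2 the backup HMaster)
8. Install MySQL 5.5.x on m1 (Ubuntu 12.04)
9. Install Hive 0.13.1
10. Hive to HBase (importing Hive table data into HBase)
11. HBase to Hive (importing HBase table data into Hive)
IV. Common problems
V. References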
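I. What are Hadoop 2.2.0, ZooKeeper 3.4.5, HBase 0.96.2, and Hive 0.13.1?
For an introduction to Hadoop 2.2.0 and its features, see: http://blog.yidooo.net/archives/hadoop-2-2-0-new-features.html
For an introduction to ZooKeeper, see: http://baike.baidu.com/view/3061646.htm
For an introduction to HBase, see: http://baike.baidu.com/view/1993870.htm
For an introduction to Hive 0.13 and its features, see: http://www.csdn.net/article/2014-04-22/2819438-Cloud-Hive
I have put a bundle of all four packages here: http://pan.baidu.com/s/1i35PlI1
I assume anyone reading this article already has some background, so I will not go into more detail here.
BTW: I run four Ubuntu virtual machines under Parallels 9 on Mac OS X 10.9: two masters (m1 and m2) and two slaves (s1 and s2).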
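II. Where to download the software
JDK 1.7.0_65: installed with apt-get.
Hadoop 2.2.0 is built from the source package here, because I use 64-bit Ubuntu and the official Hadoop release is usable only on 32-bit. Running it on 64-bit reports "util.NativeCodeLoader - Unable to load native-hadoop library for your platform..", so it has to be recompiled on a 64-bit system. I will write a separate article on how to compile 64-bit Hadoop.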
III. Installation
1. Install the JDK (the current hostname is m1)
1) Run the following command:
    root@m1:/home/hadoop# sudo apt-get install oracle-java7-installer
2) Configure the Java environment variables:
    root@m1:/home/hadoop# sudo vi /etc/environment
Append the Java bin path to the end of PATH on the first line:
    PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/lib/jvm/java-7-oracle/bin"
Add the following three lines after PATH:
    CLASSPATH="/usr/lib/jvm/java-7-oracle/lib"
    JAVA_HOME="/usr/lib/jvm/java-7-oracle"
    JRE_HOME="/usr/lib/jvm/java-7-oracle/jre"
Tell the system to use the Sun (Oracle) JDK rather than OpenJDK:
    root@m1:/home/hadoop# sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/java-7-oracle/bin/java 300
    root@m1:/home/hadoop# sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/java-7-oracle/bin/javac 300
    root@m1:/home/hadoop# sudo update-alternatives --config java
Several alternatives are listed; choose 2, as in the screenshot, then run java -version to confirm that the newly installed version is in use.
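One way to confirm that the settings above took effect is to check that JAVA_HOME points at a directory containing an executable bin/java. This is only a minimal sketch: check_java_home is a hypothetical helper name, and the fallback path is the one used in this article.

```shell
# check_java_home DIR: succeed if DIR looks like a JDK/JRE home,
# i.e. DIR/bin/java exists and is executable.
# (Hypothetical helper, not part of any JDK tooling.)
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Report on JAVA_HOME, falling back to the path configured above.
if check_java_home "${JAVA_HOME:-/usr/lib/jvm/java-7-oracle}"; then
    echo "JAVA_HOME looks valid"
else
    echo "JAVA_HOME does not contain bin/java"
fi
```

Since /etc/environment is read at login, a fresh login session is probably needed before JAVA_HOME is visible to the shell.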
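2. Clone three machines with Parallels
1) In the Parallels hardware network settings, choose the option shown in the screenshot; after that, ping www.163.com should succeed.
2) In the Parallels menu at the top left, choose File > Clone and create three clones named m2, s1, and s2 (stop the virtual machine before cloning).
Run sudo vi /etc/hostname on each machine to set its hostname; a reboot is needed for the change to take effect.
On m1, m2, s1, and s2, run ifconfig to see the IP address each machine was assigned, then run sudo vi /etc/hosts (my entries are shown in the screenshot) and run "sudo /etc/init.d/networking restart" to apply the change.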
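Every host has to be able to resolve all four names, so it can help to check /etc/hosts before going further. The sketch below uses a hypothetical helper, missing_hosts, and runs it against a sample file. The address 192.168.1.50 for m1 appears in the logs later in this article; the other addresses are made up for the example.

```shell
# missing_hosts FILE NAME...: print each NAME that has no entry in the
# hosts-format FILE (an IP address followed by one or more names).
# (Hypothetical helper for illustration.)
missing_hosts() {
    file=$1
    shift
    for name in "$@"; do
        # Skip comment lines; compare NAME against fields 2..NF.
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$file" || echo "$name"
    done
}

# Sample entries: only the m1 address comes from this article,
# the m2 and s1 addresses are assumptions.
sample=$(mktemp)
cat > "$sample" <<'EOF'
127.0.0.1    localhost
192.168.1.50 m1
192.168.1.51 m2
192.168.1.52 s1
EOF

missing_hosts "$sample" m1 m2 s1 s2   # prints: s2
rm -f "$sample"
```

On a real machine the call would be missing_hosts /etc/hosts m1 m2 s1 s2, and no output means all four names are present.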
3) Configure passwordless sshd login (I use the root account)
Install the SSH tools (skip this if the ssh command is already available):
    root@m1:/home/hadoop# sudo apt-get install ssh openssh-server
Run ssh-keygen on every machine and press Enter at each prompt; this generates id_rsa and id_rsa.pub in the user's .ssh directory.
On m1, run:
    root@m1:/home/hadoop# scp -r root@m2:/root/.ssh/id_rsa.pub ~/.ssh/m2.pub
    root@m1:/home/hadoop# scp -r root@s1:/root/.ssh/id_rsa.pub ~/.ssh/s1.pub
    root@m1:/home/hadoop# scp -r root@s2:/root/.ssh/id_rsa.pub ~/.ssh/s2.pub
    root@m1:/home/hadoop# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    root@m1:/home/hadoop# cat ~/.ssh/m2.pub >> ~/.ssh/authorized_keys
    root@m1:/home/hadoop# cat ~/.ssh/s1.pub >> ~/.ssh/authorized_keys
    root@m1:/home/hadoop# cat ~/.ssh/s2.pub >> ~/.ssh/authorized_keys
    root@m1:/home/hadoop# scp -r ~/.ssh/authorized_keys root@m2:~/.ssh/
    root@m1:/home/hadoop# scp -r ~/.ssh/authorized_keys root@s1:~/.ssh/
    root@m1:/home/hadoop# scp -r ~/.ssh/authorized_keys root@s2:~/.ssh/
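The four cat commands above can be folded into one step that is also safe to run twice. The following is a sketch, not what I actually ran: build_authorized_keys is a hypothetical helper, and it assumes the directory holds at least one .pub file. It also tightens permissions, since sshd typically ignores an authorized_keys file that other users can write to.

```shell
# build_authorized_keys DIR: append every *.pub file in DIR to
# DIR/authorized_keys, drop duplicate lines, and restrict permissions.
# (Hypothetical helper; DIR would be ~/.ssh on m1.)
build_authorized_keys() {
    dir=$1
    touch "$dir/authorized_keys"
    cat "$dir"/*.pub >> "$dir/authorized_keys"
    # sort -u removes keys that were already present from an earlier run.
    sort -u -o "$dir/authorized_keys" "$dir/authorized_keys"
    chmod 700 "$dir"
    chmod 600 "$dir/authorized_keys"
}
```

After the three scp commands that fetch m2.pub, s1.pub, and s2.pub, the call would be build_authorized_keys ~/.ssh, followed by the same three scp commands that distribute authorized_keys.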
3. Install ZooKeeper 3.4.5
1) Configure zoo.cfg (there is no zoo.cfg by default; copy zoo_sample.cfg and name the copy zoo.cfg):
    root@m1:/home/hadoop/zookeeper-3.4.5/conf# vi zoo.cfg
    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    # do not use /tmp for storage, /tmp here is just
    # example sakes.
    dataDir=/home/hadoop/zookeeper-3.4.5/data
    dataLogDir=/home/hadoop/zookeeper-3.4.5/logs
    server.1=m1:2888:3888
    server.2=m2:2888:3888
    server.3=s1:2888:3888
    server.4=s2:2888:3888
    # the port at which the clients will connect
    clientPort=2181
    #
    # Be sure to read the maintenance section of the
    # administrator guide before turning on autopurge.
    #
    # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
    #
    # The number of snapshots to retain in dataDir
    #autopurge.snapRetainCount=3
    # Purge task interval in hours
    # Set to "0" to disable auto purge feature
    #autopurge.purgeInterval=1
2) Copy ZooKeeper from m1 to m2, s1, and s2:
    root@m1:/home/hadoop/zookeeper-3.4.5/conf# scp -r /home/hadoop/zookeeper-3.4.5 root@m2:/home/hadoop
    root@m1:/home/hadoop/zookeeper-3.4.5/conf# scp -r /home/hadoop/zookeeper-3.4.5 root@s1:/home/hadoop
    root@m1:/home/hadoop/zookeeper-3.4.5/conf# scp -r /home/hadoop/zookeeper-3.4.5 root@s2:/home/hadoop
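3) On each of m1, m2, s1, and s2, create a myid file in /home/hadoop/zookeeper-3.4.5/data (the dataDir set in zoo.cfg). Its content is the number that follows "server." in that machine's zoo.cfg entry, and it must contain nothing but that number:
m1: 1
m2: 2
s1: 3
s2: 4
That completes the ZooKeeper configuration.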
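Because the myid value is already encoded in zoo.cfg, it can be derived instead of typed by hand on each machine. This is a sketch with a hypothetical helper, myid_for_host; it assumes short host names like the ones used here (no characters that are special in a sed pattern).

```shell
# myid_for_host CFG HOST: print the N from the "server.N=HOST:..." line
# in the ZooKeeper config CFG; print nothing if HOST is not listed.
# (Hypothetical helper for illustration.)
myid_for_host() {
    sed -n "s/^server\.\([0-9][0-9]*\)=$2:.*/\1/p" "$1"
}

# Demo on a copy of the server lines from the zoo.cfg above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
dataDir=/home/hadoop/zookeeper-3.4.5/data
server.1=m1:2888:3888
server.2=m2:2888:3888
server.3=s1:2888:3888
server.4=s2:2888:3888
EOF

myid_for_host "$sample" s1   # prints: 3
rm -f "$sample"
```

On each machine, the output of myid_for_host /home/hadoop/zookeeper-3.4.5/conf/zoo.cfg "$(hostname)" could then be redirected into /home/hadoop/zookeeper-3.4.5/data/myid.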
4. Install Hadoop 2.2.0
Edit the following seven configuration files:
1) /home/hadoop/hadoop-2.2.0/etc/hadoop/hadoop-env.sh (mainly to set the Java path)
    root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi hadoop-env.sh

    export JAVA_HOME=/usr/lib/jvm/java-7-oracle
    #export JAVA_HOME=${JAVA_HOME}
2) /home/hadoop/hadoop-2.2.0/etc/hadoop/yarn-env.sh (mainly to set the Java path)
    root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi yarn-env.sh

    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements. See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License. You may obtain a copy of the License at
    #
    # http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.

    # User for YARN daemons
    export HADOOP_YARN_USER=${HADOOP_YARN_USER:-yarn}
    export JAVA_HOME=/usr/lib/jvm/java-7-oracle
3) /home/hadoop/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
    root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi hdfs-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->

    <!-- Put site-specific property overrides in this file. -->

    <configuration>
      <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
      </property>
      <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>m1,m2</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.mycluster.m1</name>
        <value>m1:9000</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.mycluster.m2</name>
        <value>m2:9000</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.mycluster.m1</name>
        <value>m1:50070</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.mycluster.m2</name>
        <value>m2:50070</value>
      </property>
      <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://m1:8485;m2:8485/mycluster</value>
      </property>
      <property>
        <name>dfs.ha.automatic-failover.enabled.mycluster</name>
        <value>true</value>
      </property>
      <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
      </property>
      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
      </property>
      <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/hadoop-2.2.0/tmp/journal</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
      </property>
      <property>
        <name>dfs.permissions</name>
        <value>false</value>
      </property>
      <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
      </property>
    </configuration>
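With this many properties it is easy to mistype one, so a quick lookup of a single value can be useful. The sketch below is a plain text match rather than a real XML parser, and get_prop is a hypothetical helper: it assumes that <value> directly follows <name> inside each property and that the property name contains no pattern characters other than ".".

```shell
# get_prop FILE NAME: print the <value> of the property called NAME in a
# Hadoop-style XML configuration file (text match, not an XML parser).
# (Hypothetical helper for illustration.)
get_prop() {
    # Join the file into one line so name and value can sit on separate lines.
    tr -d '\n' < "$1" |
        sed -n "s|.*<name>[[:space:]]*$2[[:space:]]*</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Demo on two of the properties from hdfs-site.xml above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>m1,m2</value>
  </property>
</configuration>
EOF

get_prop "$sample" dfs.ha.namenodes.mycluster   # prints: m1,m2
rm -f "$sample"
```

Once Hadoop is on the PATH, hdfs getconf -confKey dfs.nameservices should report the value Hadoop itself resolves, which is the more reliable check.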
4) /home/hadoop/hadoop-2.2.0/etc/hadoop/mapred-site.xml
    root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi mapred-site.xml

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <description>Execution framework set to Hadoop YARN.</description>
      </property>
    </configuration>
5) /home/hadoop/hadoop-2.2.0/etc/hadoop/core-site.xml
    root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi core-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->

    <!-- Put site-specific property overrides in this file. -->

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
      </property>
      <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
      </property>
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>m1:2181,m2:2181,s1:2181,s2:2181</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoop-2.2.0/tmp</value>
        <description></description>
      </property>
    </configuration>
6) /home/hadoop/hadoop-2.2.0/etc/hadoop/yarn-site.xml
    root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi yarn-site.xml

    <?xml version="1.0"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    <configuration>
      <!-- Site specific YARN configuration properties-->
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>m1</value>
      </property>
    </configuration>
7) /home/hadoop/hadoop-2.2.0/etc/hadoop/slaves
    root@m1:/home/hadoop/hadoop-2.2.0/etc/hadoop# vi slaves

    s1
    s2
That completes the Hadoop configuration.
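The files above were edited on m1 only. This article does not show a copy step for Hadoop, but assuming the other three hosts need the same tree at the same path (as was done for ZooKeeper), the copy could be scripted roughly as follows. copy_cmds is a hypothetical helper that only prints the scp commands, so they can be reviewed before anything runs.

```shell
# copy_cmds SRC_DIR DEST_PARENT HOST...: print one scp command per HOST
# that would copy SRC_DIR into DEST_PARENT on that host as root.
# Nothing is executed; pipe the output to sh to run it.
# (Hypothetical helper; assumes the passwordless root SSH set up earlier.)
copy_cmds() {
    src=$1
    dest=$2
    shift 2
    for host in "$@"; do
        echo "scp -r $src root@$host:$dest"
    done
}

# Print the three commands for the Hadoop tree used in this article.
copy_cmds /home/hadoop/hadoop-2.2.0 /home/hadoop m2 s1 s2
```

Appending "| sh" to the last line would execute the printed commands.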
5. Start ZooKeeper
1) Run the following on all of m1, m2, s1, and s2; the example below was run on m1:
    root@m1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkServer.sh start
    JMX enabled by default
    Using config: /home/hadoop/zookeeper-3.4.5/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    root@m1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkServer.sh status
    JMX enabled by default
    Using config: /home/hadoop/zookeeper-3.4.5/bin/../conf/zoo.cfg
    Mode: follower
    root@m1:/home/hadoop#
2) Run the status command on each machine to check its role; in my case s1 is the leader and the other machines are followers:
    root@s1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkServer.sh start
    JMX enabled by default
    Using config: /home/hadoop/zookeeper-3.4.5/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    root@s1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkServer.sh status
    JMX enabled by default
    Using config: /home/hadoop/zookeeper-3.4.5/bin/../conf/zoo.cfg
    Mode: leader
    root@s1:/home/hadoop#
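When checking four machines, only the Mode line of the status output matters. A small filter such as the following sketch (zk_mode is a hypothetical name) pulls it out; the demo feeds it the s1 output shown above.

```shell
# zk_mode: read `zkServer.sh status` output on stdin and print the mode
# (for example leader or follower); print nothing if no Mode line exists.
# (Hypothetical helper for illustration.)
zk_mode() {
    sed -n 's/^Mode:[[:space:]]*//p'
}

# Demo on the status output shown above for s1.
zk_mode <<'EOF'
JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: leader
EOF
```

On a live host the input would come from /home/hadoop/zookeeper-3.4.5/bin/zkServer.sh status 2>&1 piped into zk_mode. A healthy ensemble should show exactly one leader.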
3) Test whether ZooKeeper started successfully. In the session below, the [zookeeper] reply to ls / near the end indicates success.
    root@m1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkCli.sh
    Connecting to localhost:2181
    2014-07-27 00:27:16,621 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
    2014-07-27 00:27:16,628 [myid:] - INFO [main:Environment@100] - Client environment:host.name=m1
    2014-07-27 00:27:16,628 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.7.0_65
    2014-07-27 00:27:16,629 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
    2014-07-27 00:27:16,629 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
    2014-07-27 00:27:16,630 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/home/hadoop/zookeeper-3.4.5/bin/../build/classes:/home/hadoop/zookeeper-3.4.5/bin/../build/lib/*.jar:/home/hadoop/zookeeper-3.4.5/bin/../lib/slf4j-log4j12-1.6.1.jar:/home/hadoop/zookeeper-3.4.5/bin/../lib/slf4j-api-1.6.1.jar:/home/hadoop/zookeeper-3.4.5/bin/../lib/netty-3.2.2.Final.jar:/home/hadoop/zookeeper-3.4.5/bin/../lib/log4j-1.2.15.jar:/home/hadoop/zookeeper-3.4.5/bin/../lib/jline-0.9.94.jar:/home/hadoop/zookeeper-3.4.5/bin/../zookeeper-3.4.5.jar:/home/hadoop/zookeeper-3.4.5/bin/../src/java/lib/*.jar:/home/hadoop/zookeeper-3.4.5/bin/../conf:/usr/lib/jvm/java-7-oracle/lib
    2014-07-27 00:27:16,630 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=:/usr/local/lib:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
    2014-07-27 00:27:16,631 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
    2014-07-27 00:27:16,631 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
    2014-07-27 00:27:16,632 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
    2014-07-27 00:27:16,632 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
    2014-07-27 00:27:16,632 [myid:] - INFO [main:Environment@100] - Client environment:os.version=3.11.0-15-generic
    2014-07-27 00:27:16,633 [myid:] - INFO [main:Environment@100] - Client environment:user.name=root
    2014-07-27 00:27:16,633 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/root
    2014-07-27 00:27:16,634 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/home/hadoop
    2014-07-27 00:27:16,636 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@19b1ebe5
    Welcome to ZooKeeper!
    2014-07-27 00:27:16,672 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@966] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
    2014-07-27 00:27:16,685 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@849] - Socket connection established to localhost/127.0.0.1:2181, initiating session
    JLine support is enabled
    2014-07-27 00:27:16,719 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1207] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x147737cd5d30000, negotiated timeout = 30000

    WATCHER::

    WatchedEvent state:SyncConnected type:None path:null
    [zk: localhost:2181(CONNECTED) 0] ls /
    [zookeeper]
    [zk: localhost:2181(CONNECTED) 1]
4) On m1, format the HA state in ZooKeeper. The log line "Successfully created /hadoop-ha/mycluster in ZK" near the end indicates that the znode was created.
    root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hdfs zkfc -formatZK
    14/07/27 00:31:59 INFO tools.DFSZKFailoverController: Failover controller configured for NameNode NameNode at m1/192.168.1.50:9000
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:host.name=m1
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_65
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/home/hadoop/hadoop-2.2.0/etc/hadoop:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/guava-11.0.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-codec-1.4.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/hadoop-annotations-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-api-1.7.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-net-3.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/paranamer-2.3.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jasper-compiler-5.5.23.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-math-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-lang-2.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/servlet-api-2.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-logging-1.1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/log4j-1.2.17.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jasper-runtime-5.5.23.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/mockito-all-1.8.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/hadoop-auth-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-digester-1.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jsp-api-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jersey-core-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jettison-1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/xmlenc-0.52.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/netty-3.6.2.Final.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-httpclient-3.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-io-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jsch-0.1.42.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-compress-1.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jackson-jaxrs-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/junit-4.8.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jetty-6.1.26.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-collections-3.2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jsr305-1.3.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jackson-xc-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/asm-3.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jersey-json-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/stax-api-1.0.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jets3t-0.6.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/avro-1.7.4.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-el-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-cli-1.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/commons-configuration-1.6.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jersey-server-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jetty-util-6.1.26.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/activation-1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/zookeeper-3.4.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/xz-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/hadoop-nfs-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0-tests.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/guava-11.0.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-lang-2.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-logging-1.1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jasper-runtime-5.5.23.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jsp-api-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-io-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jsr305-1.3.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/asm-3.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-el-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/hadoop-hdfs-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/hadoop-hdfs-nfs-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/hdfs/hadoop-hdfs-2.2.0-tests.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/hamcrest-core-1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/hadoop-annotations-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/paranamer-2.3.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/snappy-java-1.0.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/log4j-1.2.17.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/guice-3.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-core-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/junit-4.10.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/commons-io-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/javax.inject-1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/aopalliance-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/asm-3.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/avro-1.7.4.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-server-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/lib/xz-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-client-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-tests-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-common-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-common-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-api-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-site-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/hamcrest-core-1.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/hadoop-annotations-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-3.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/junit-4.10.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-io-2.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/javax.inject-1.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/asm-3.2.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/lib/xz-1.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar:/home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:/home/hadoop/hadoop-2.2.0/contrib/capacity-scheduler/*.jar
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/hadoop/hadoop-2.2.0/lib/native
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:os.version=3.11.0-15-generic
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:user.name=root
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop
    14/07/27 00:32:00 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=m1:2181,m2:2181,s1:2181,s2:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@5990054a
    14/07/27 00:32:00 INFO zookeeper.ClientCnxn: Opening socket connection to server m1/192.168.1.50:2181. Will not attempt to authenticate using SASL (unknown error)
    14/07/27 00:32:00 INFO zookeeper.ClientCnxn: Socket connection established to m1/192.168.1.50:2181, initiating session
    14/07/27 00:32:00 INFO zookeeper.ClientCnxn: Session establishment complete on server m1/192.168.1.50:2181, sessionid = 0x147737cd5d30001, negotiated timeout = 5000
    ===============================================
    The configured parent znode /hadoop-ha/mycluster already exists.
    Are you sure you want to clear all failover information from
    ZooKeeper?
    WARNING: Before proceeding, ensure that all HDFS services and
    failover controllers are stopped!
    ===============================================
    Proceed formatting /hadoop-ha/mycluster? (Y or N) 14/07/27 00:32:00 INFO ha.ActiveStandbyElector: Session connected.
    y
    14/07/27 00:32:13 INFO ha.ActiveStandbyElector: Recursively deleting /hadoop-ha/mycluster from ZK...
    14/07/27 00:32:13 INFO ha.ActiveStandbyElector: Successfully deleted /hadoop-ha/mycluster from ZK.
    14/07/27 00:32:13 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK.
    14/07/27 00:32:13 INFO zookeeper.ClientCnxn: EventThread shut down
    14/07/27 00:32:13 INFO zookeeper.ZooKeeper: Session: 0x147737cd5d30001 closed
    root@m1:/home/hadoop#
5) Verify that the zkfc format succeeded: if a new hadoop-ha znode appears under /, it worked.
    root@m1:/home/hadoop# /home/hadoop/zookeeper-3.4.5/bin/zkCli.sh
    [zk: localhost:2181(CONNECTED) 0] ls /
    [hadoop-ha, zookeeper]
    [zk: localhost:2181(CONNECTED) 1]
6. Start the JournalNode cluster
1) Run the following on m1, m2, s1, and s2 in turn:
    root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/hadoop-daemon.sh start journalnode
    starting journalnode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-root-journalnode-m1.out
    root@m1:/home/hadoop# jps
    2884 JournalNode
    2553 QuorumPeerMain
    2922 Jps
    root@m1:/home/hadoop#
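As more daemons are started in the following steps, reading jps output by eye gets tedious. The sketch below, with a hypothetical helper missing_daemons, lists the expected daemons that are absent. The demo uses the jps output shown above, so NameNode is reported as missing because it has not been started yet.

```shell
# missing_daemons NAME...: read `jps` output on stdin ("PID ClassName" per
# line) and print each NAME that does not appear in it.
# (Hypothetical helper for illustration.)
missing_daemons() {
    running=$(awk '{ print $2 }')
    for name in "$@"; do
        # -x: whole-line match, -F: fixed string, -q: quiet.
        echo "$running" | grep -qxF "$name" || echo "$name"
    done
}

# Demo on the jps output shown above for m1.
missing_daemons QuorumPeerMain JournalNode NameNode <<'EOF'
2884 JournalNode
2553 QuorumPeerMain
2922 Jps
EOF
```

On a live host the input would be jps piped into missing_daemons, with whatever daemon names are expected on that machine.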
2) Format one of the cluster's NameNodes (m1). There are two ways to do this; I used the first.
Method 1:
    root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hdfs namenode -format
Method 2:
    root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hdfs namenode -format -clusterId m1
3) On m1, start the NameNode that was just formatted
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/hadoop-daemon.sh start namenode
4) On m2, copy the NameNode metadata from m1 over to m2 by running
root@m2:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hdfs namenode -bootstrapStandby
5) Start the NameNode on m2
root@m2:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/hadoop-daemon.sh start namenode
6) Start all DataNodes by running the following on m1
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/hadoop-daemons.sh start datanode
s2: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-root-datanode-s2.out
s1: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-root-datanode-s1.out
root@m1:/home/hadoop#
7) Start YARN by running the following command on m1
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-root-resourcemanager-m1.out
s1: starting nodemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-root-nodemanager-s1.out
s2: starting nodemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-root-nodemanager-s2.out
root@m1:/home/hadoop#
Then browse http://m1:8088/cluster to see the result.
8) Start the ZooKeeperFailoverController by running the following command on m1 and then on m2. If you browse port 50070 afterwards, m1 has become active while m2 is still standby.
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/sbin/hadoop-daemon.sh start zkfc
starting zkfc, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-root-zkfc-m1.out
root@m1:/home/hadoop#
9) Test that HDFS is usable
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -ls /
Found 2 items
drwx------ - root supergroup 0 2014-07-17 23:54 /tmp
drwxr-xr-x - lion supergroup 0 2014-07-21 00:40 /user
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -mkdir /input
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -ls /
Found 3 items
drwxr-xr-x - root supergroup 0 2014-07-27 01:20 /input
drwx------ - root supergroup 0 2014-07-17 23:54 /tmp
drwxr-xr-x - lion supergroup 0 2014-07-21 00:40 /user
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -ls /input
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -put hadoop.cmd /input
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hdfs dfs -ls /input
Found 1 items
-rw-r--r-- 3 root supergroup 7530 2014-07-27 01:20 /input/hadoop.cmd
root@m1:/home/hadoop/hadoop-2.2.0/bin#
10) Test that YARN is usable with the classic example: count the word frequencies in the hadoop.cmd file that was just put under /input
root@m1:/home/hadoop/hadoop-2.2.0/bin# /home/hadoop/hadoop-2.2.0/bin/hadoop jar /home/hadoop/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /input /output
14/07/27 01:22:41 INFO client.RMProxy: Connecting to ResourceManager at m1/192.168.1.50:8032
14/07/27 01:22:43 INFO input.FileInputFormat: Total input paths to process : 1
14/07/27 01:22:44 INFO mapreduce.JobSubmitter: number of splits:1
14/07/27 01:22:44 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/07/27 01:22:44 INFO Configuration.deprecation: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
14/07/27 01:22:44 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/07/27 01:22:44 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/07/27 01:22:44 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/07/27 01:22:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1406394452186_0001
14/07/27 01:22:46 INFO impl.YarnClientImpl: Submitted application application_1406394452186_0001 to ResourceManager at m1/192.168.1.50:8032
14/07/27 01:22:46 INFO mapreduce.Job: The url to track the job: http://m1:8088/proxy/application_1406394452186_0001/
14/07/27 01:22:46 INFO mapreduce.Job: Running job: job_1406394452186_0001
14/07/27 01:23:10 INFO mapreduce.Job: Job job_1406394452186_0001 running in uber mode : false
14/07/27 01:23:10 INFO mapreduce.Job: map 0% reduce 0%
14/07/27 01:23:31 INFO mapreduce.Job: map 100% reduce 0%
14/07/27 01:23:48 INFO mapreduce.Job: map 100% reduce 100%
14/07/27 01:23:48 INFO mapreduce.Job: Job job_1406394452186_0001 completed successfully
14/07/27 01:23:49 INFO mapreduce.Job: Counters: 43
    File System Counters
        FILE: Number of bytes read=6574
        FILE: Number of bytes written=175057
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=7628
        HDFS: Number of bytes written=5088
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=18062
        Total time spent by all reduces in occupied slots (ms)=14807
    Map-Reduce Framework
        Map input records=240
        Map output records=827
        Map output bytes=9965
        Map output materialized bytes=6574
        Input split bytes=98
        Combine input records=827
        Combine output records=373
        Reduce input groups=373
        Reduce shuffle bytes=6574
        Reduce input records=373
        Reduce output records=373
        Spilled Records=746
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=335
        CPU time spent (ms)=2960
        Physical memory (bytes) snapshot=270057472
        Virtual memory (bytes) snapshot=1990762496
        Total committed heap usage (bytes)=136450048
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=7530
    File Output Format Counters
        Bytes Written=5088
root@m1:/home/hadoop/hadoop-2.2.0/bin#
11) Verify HA failover. When we opened port 50070 on m1 and m2 in the browser earlier, m1 was active and m2 was standby.
a) Kill the NameNode process on m1
root@m1:/home/hadoop/hadoop-2.2.0/bin# jps
5492 Jps
2884 JournalNode
4375 DFSZKFailoverController
2553 QuorumPeerMain
3898 NameNode
4075 ResourceManager
root@m1:/home/hadoop/hadoop-2.2.0/bin# kill -9 3898
root@m1:/home/hadoop/hadoop-2.2.0/bin# jps
2884 JournalNode
4375 DFSZKFailoverController
2553 QuorumPeerMain
4075 ResourceManager
5627 Jps
root@m1:/home/hadoop/hadoop-2.2.0/bin#
b) Browse port 50070 on m1 and m2 again: the page on m1 no longer opens, and m2 is now active.
HDFS and MapReduce keep working through m2 at this point. Although the NameNode process on m1 has been killed, the cluster remains usable, which is exactly the benefit of failover.
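Besides the web UI on port 50070, the states can be checked from the command line with hdfs haadmin -getServiceState, which prints active or standby for a given NameNode ID. The IDs themselves (for example nn1 and nn2) are an assumption here, because they are defined in hdfs-site.xml and not shown in this part of the article. A minimal sketch of a sanity check on the collected states, where `exactly_one_active` is a hypothetical helper:

```shell
# exactly_one_active
# Hypothetical helper: reads one NameNode state per line ("active" or
# "standby") on stdin and succeeds only if exactly one of them is active.
exactly_one_active() {
    # grep -c exits non-zero when the count is 0, so ignore its status.
    active_count=$(grep -cx 'active' || true)
    [ "$active_count" -eq 1 ]
}

# States as they would look for a healthy pair: one standby, one active.
if printf '%s\n' standby active | exactly_one_active; then
    echo "HA state is consistent"
else
    echo "expected exactly one active NameNode"
fi
```

While the NameNode on m1 is down, querying its state would fail rather than print standby, so a real script would have to treat that case separately.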
7. Hbase-0.96.2-hadoop2 (dual-HMaster configuration: m1 is the primary HMaster, m2 is the backup HMaster)
1) Edit hbase-env.sh, mainly to set JAVA_HOME and HBASE_MANAGES_ZK
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# vi hbase-env.sh
#
#/**
# * Copyright 2007 The Apache Software Foundation
# *
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements. See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership. The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License. You may obtain a copy of the License at
# *
# * http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */

# Set environment variables here.

# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)

# The java implementation to use. Java 1.6 required.
export JAVA_HOME=/usr/lib/jvm/java-7-oracle

# Extra Java CLASSPATH elements. Optional.
# export HBASE_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HBASE_HEAPSIZE=1000

# Extra Java runtime options.
# Below are what we set by default. May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.

# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment one of the below three options to enable java garbage collection logging for the client processes.

# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.
# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="
# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.

# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
#
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"

# File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers

# Uncomment and adjust to keep all the Region Server pages mapped to be memory resident
#HBASE_REGIONSERVER_MLOCK=true
#HBASE_REGIONSERVER_UID="hbase"

# File naming hosts on which backup HMaster will run. $HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters

# Extra ssh options. Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"

# Where log files are stored. $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs

# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER

# The scheduling priority for daemon processes. See 'man nice'.
# export HBASE_NICENESS=10

# The directory where pid files are stored. /tmp by default.
# export HBASE_PID_DIR=/var/hadoop/pids

# Seconds to sleep between slave commands. Unset by default. This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1

# Tell HBase whether it should manage it's own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false # false: use the independently started ZooKeeper ensemble; true: use the ZooKeeper bundled with HBase.

# The default log rolling policy is RFA, where the log file is rolled as per the size defined for the
# RFA appender. Please refer to the log4j.properties file to see more details on this appender.
# In case one needs to do log rolling on a date change, one should set the environment property
# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
# For example:
# HBASE_ROOT_LOGGER=INFO,DRFA
# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as
# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.
2) Edit hbase-site.xml
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# vi hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>
  <property>
    <!-- Shared directory of the region servers, used to persist HBase data. The URL must be fully qualified and include the file system scheme. -->
    <name>hbase.rootdir</name>
    <value>hdfs://mycluster/hbase</value> <!-- Must match the setting in Hadoop's core-site.xml -->
  </property>
  <property>
    <!-- HBase run mode: false is standalone, true is distributed. If false, HBase and ZooKeeper run in the same JVM. Default: false -->
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/home/hadoop/hbase-0.96.2-hadoop2/tmp</value>
  </property>
  <property>
    <!-- This is correct: only the port is configured, so that multiple HMasters can be set up -->
    <name>hbase.master</name>
    <value>60000</value>
  </property>
  <property>
    <!-- ZooKeeper quorum -->
    <name>hbase.zookeeper.quorum</name>
    <value>m1,m2,s1,s2</value>
  </property>
  <property>
    <!-- ZooKeeper client port. If hbase.zookeeper.property.clientPort is not set, a default port is used, which may not be one of the usable ports (such as 3351~3353) that your ZooKeeper actually provides. -->
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hadoop/zookeeper-3.4.5/data</value>
  </property>
</configuration>
2) Edit the regionservers file
Machines that host a master usually do not also run a slave, so the two machines s1 and s2 serve as the HBase region servers.
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# vi regionservers
s1
s2
3) Create a symbolic link to Hadoop's hdfs-site.xml in the HBase configuration directory
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# ll
total 40
drwxr-xr-x 2 root root 4096 Jul 27 09:15 ./
drwxr-xr-x 9 root root 4096 Jul 20 21:40 ../
-rw-r--r-- 1 root staff 1026 Mar 25 06:29 hadoop-metrics2-hbase.properties
-rw-r--r-- 1 root staff 4023 Mar 25 06:29 hbase-env.cmd
-rw-r--r-- 1 root staff 7129 Jul 27 08:58 hbase-env.sh
-rw-r--r-- 1 root staff 2257 Mar 25 06:29 hbase-policy.xml
-rw-r--r-- 1 root staff 2550 Jul 27 09:10 hbase-site.xml
-rw-r--r-- 1 root staff 3451 Mar 25 06:29 log4j.properties
-rw-r--r-- 1 root staff 6 Jul 20 21:38 regionservers
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# ln -s /home/hadoop/hadoop-2.2.0/etc/hadoop/hdfs-site.xml hdfs-site.xml
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf# ll
total 40
drwxr-xr-x 2 root root 4096 Jul 27 09:16 ./
drwxr-xr-x 9 root root 4096 Jul 20 21:40 ../
-rw-r--r-- 1 root staff 1026 Mar 25 06:29 hadoop-metrics2-hbase.properties
-rw-r--r-- 1 root staff 4023 Mar 25 06:29 hbase-env.cmd
-rw-r--r-- 1 root staff 7129 Jul 27 08:58 hbase-env.sh
-rw-r--r-- 1 root staff 2257 Mar 25 06:29 hbase-policy.xml
-rw-r--r-- 1 root staff 2550 Jul 27 09:10 hbase-site.xml
lrwxrwxrwx 1 root root 50 Jul 27 09:16 hdfs-site.xml -> /home/hadoop/hadoop-2.2.0/etc/hadoop/hdfs-site.xml*
-rw-r--r-- 1 root staff 3451 Mar 25 06:29 log4j.properties
-rw-r--r-- 1 root staff 6 Jul 20 21:38 regionservers
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/conf#
3) With hbase 0.96.2 there is no need to copy the Hadoop jars over; the official hadoop2 build already bundles them
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/lib# ls | grep hadoop
hadoop-annotations-2.2.0.jar
hadoop-auth-2.2.0.jar
hadoop-client-2.2.0.jar
hadoop-common-2.2.0.jar
hadoop-hdfs-2.2.0.jar
hadoop-hdfs-2.2.0-tests.jar
hadoop-mapreduce-client-app-2.2.0.jar
hadoop-mapreduce-client-common-2.2.0.jar
hadoop-mapreduce-client-core-2.2.0.jar
hadoop-mapreduce-client-jobclient-2.2.0.jar
hadoop-mapreduce-client-jobclient-2.2.0-tests.jar
hadoop-mapreduce-client-shuffle-2.2.0.jar
hadoop-yarn-api-2.2.0.jar
hadoop-yarn-client-2.2.0.jar
hadoop-yarn-common-2.2.0.jar
hadoop-yarn-server-common-2.2.0.jar
hadoop-yarn-server-nodemanager-2.2.0.jar
hbase-client-0.96.2-hadoop2.jar
hbase-common-0.96.2-hadoop2.jar
hbase-common-0.96.2-hadoop2-tests.jar
hbase-examples-0.96.2-hadoop2.jar
hbase-hadoop2-compat-0.96.2-hadoop2.jar
hbase-hadoop-compat-0.96.2-hadoop2.jar
hbase-it-0.96.2-hadoop2.jar
hbase-it-0.96.2-hadoop2-tests.jar
hbase-prefix-tree-0.96.2-hadoop2.jar
hbase-protocol-0.96.2-hadoop2.jar
hbase-server-0.96.2-hadoop2.jar
hbase-server-0.96.2-hadoop2-tests.jar
hbase-shell-0.96.2-hadoop2.jar
hbase-testing-util-0.96.2-hadoop2.jar
hbase-thrift-0.96.2-hadoop2.jar
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/lib#
4) Copy hbase 0.96.2 from m1 to the same directory on m2, s1 and s2
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/lib# scp -r /home/hadoop/hbase-0.96.2-hadoop2 root@m2:/home/hadoop
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/lib# scp -r /home/hadoop/hbase-0.96.2-hadoop2 root@s1:/home/hadoop
root@m1:/home/hadoop/hbase-0.96.2-hadoop2/lib# scp -r /home/hadoop/hbase-0.96.2-hadoop2 root@s2:/home/hadoop
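The three scp commands differ only in the host name, so they can be folded into a loop. The sketch below is illustrative: `sync_hbase` and the `DRY_RUN` switch are hypothetical names introduced here, and by default the commands are only printed so the loop can be checked before anything is copied.

```shell
# Copy the HBase directory from m1 to the other nodes in one loop.
# DRY_RUN is a hypothetical switch added for this sketch: when it is 1
# (the default here), the scp commands are printed instead of executed.
HBASE_DIR=/home/hadoop/hbase-0.96.2-hadoop2
DRY_RUN=${DRY_RUN:-1}

sync_hbase() {
    for host in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "scp -r $HBASE_DIR root@$host:/home/hadoop"
        else
            scp -r "$HBASE_DIR" "root@$host:/home/hadoop" || return 1
        fi
    done
}

sync_hbase m2 s1 s2
```

Setting DRY_RUN=0 before the call would run the real copies and stop at the first host that fails.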
5) Start hbase 0.96.2 on m1 with the following command:
root@m1:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/start-hbase.sh
starting master, logging to /home/hadoop/hbase-0.96.2-hadoop2/bin/../logs/hbase-root-master-m1.out
s1: starting regionserver, logging to /home/hadoop/hbase-0.96.2-hadoop2/bin/../logs/hbase-root-regionserver-s1.out
s2: starting regionserver, logging to /home/hadoop/hbase-0.96.2-hadoop2/bin/../logs/hbase-root-regionserver-s2.out
root@m1:/home/hadoop# jps
6688 NameNode
7540 HMaster
2884 JournalNode
4375 DFSZKFailoverController
2553 QuorumPeerMain
7769 Jps
4075 ResourceManager
root@m1:/home/hadoop#
After running the command, browse http://m1:60010/master-status to see the result.
6) On m1, test the connection to HBase with the shell
root@m1:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase shell
2014-07-27 09:31:07,601 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
0 row(s) in 2.8030 seconds
=> []
hbase(main):002:0> version
0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):003:0> status
2 servers, 0 dead, 1.0000 average load
hbase(main):004:0> create 'test_idoall_org','uid','name'
0 row(s) in 0.5800 seconds
=> Hbase::Table - test_idoall_org
hbase(main):005:0> list
TABLE
test_idoall_org
1 row(s) in 0.0320 seconds
=> ["test_idoall_org"]
hbase(main):006:0> put 'test_idoall_org','10086','name:idoall','idoallvalue'
0 row(s) in 0.1090 seconds
hbase(main):009:0> get 'test_idoall_org','10086'
COLUMN CELL
name:idoall timestamp=1406424831473, value=idoallvalue
1 row(s) in 0.0450 seconds
hbase(main):010:0> scan 'test_idoall_org'
ROW COLUMN+CELL
10086 column=name:idoall, timestamp=1406424831473, value=idoallvalue
1 row(s) in 0.0620 seconds
hbase(main):011:0>
7) Start the HBase master on m2 as well, with the following command:
root@m2:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase-daemon.sh start master
starting master, logging to /home/hadoop/hbase-0.96.2-hadoop2/bin/../logs/hbase-root-master-m2.out
root@m2:/home/hadoop#
After running the command, the HBase status on m2 can also be viewed in a browser: http://m2:60010/master-status
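As an alternative to starting the second master by hand, the hbase-env.sh comments shown earlier mention a conf/backup-masters file that names the hosts on which backup HMasters run. A hedged sketch of creating that file follows; `write_backup_masters` is a hypothetical helper, and whether start-hbase.sh then launches the backup master on m2 automatically is an assumption that should be verified on this HBase version.

```shell
# write_backup_masters CONF_DIR HOST...
# Hypothetical helper: writes one backup HMaster host per line into
# CONF_DIR/backup-masters, the default file named in hbase-env.sh.
write_backup_masters() {
    conf_dir=$1
    shift
    printf '%s\n' "$@" > "$conf_dir/backup-masters"
}

# Demo against a scratch directory; on m1 the real target would be
# /home/hadoop/hbase-0.96.2-hadoop2/conf.
demo_conf=$(mktemp -d)
write_backup_masters "$demo_conf" m2
cat "$demo_conf/backup-masters"
```

The file would need to be present before HBase is started, and copied along with the rest of the conf directory if the other nodes are expected to see it.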
8) Test the master/backup switchover between m1 and m2
a) Open http://m1:60010/master-status and http://m2:60010/master-status in a browser to check the state of each master
b) Stop the HBase master process on m1
root@m1:/home/hadoop# jps
6688 NameNode
7540 HMaster
2884 JournalNode
8645 Jps
4375 DFSZKFailoverController
2553 QuorumPeerMain
4075 ResourceManager
root@m1:/home/hadoop# kill -9 7540
root@m1:/home/hadoop# jps
6688 NameNode
2884 JournalNode
4375 DFSZKFailoverController
2553 QuorumPeerMain
4075 ResourceManager
8655 HMaster
8719 Jps
root@m1:/home/hadoop#
Open the URLs again: the page on m1 can no longer be reached, and the HBase cluster state shown on m2 has changed.
At this point HBase is fully configured, and master failover works.
8. Install mysql 5.5.x on m1 (ubuntu 12.04)
1) apt-get install mysql-server mysql-client mysql-common
During installation a dialog asks for the MySQL root password. I set it to 123456.
After installation, test the connection with: mysql -uroot -p123456
Use service mysql start and service mysql stop to start and stop MySQL.
2) Grant remote access to MySQL
root@m1:/home/hadoop# mysql -uroot -p123456
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 36
Server version: 5.5.22-0ubuntu1 (Ubuntu)
Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> grant all on *.* to 'root'@'%' identified by '123456' WITH GRANT OPTION;
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> quit
Bye
3) If remote connections still fail, open the configuration with vi /etc/mysql/my.cnf, change bind-address=127.0.0.1 to this machine's IP address, and restart MySQL
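The bind-address change can also be made without opening an editor. The sketch below is one possible way, not the method used in this article: `set_bind_address` is a hypothetical helper that prints the modified file to stdout rather than editing in place, and 192.168.1.50 is m1's address as it appears in the logs earlier.

```shell
# set_bind_address FILE IP
# Hypothetical helper: rewrites the bind-address line of a my.cnf-style file
# and prints the result to stdout; the input file itself is left untouched.
set_bind_address() {
    sed -E "s/^([[:space:]]*bind-address[[:space:]]*=).*/\\1 $2/" "$1"
}

# Demo on a scratch file that mimics the relevant part of my.cnf.
demo_cnf=$(mktemp)
printf '[mysqld]\nport = 3306\nbind-address = 127.0.0.1\n' > "$demo_cnf"
set_bind_address "$demo_cnf" 192.168.1.50
```

On m1 the output would be reviewed, written back over /etc/mysql/my.cnf (after keeping a backup copy), and followed by a MySQL restart.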
9. Install hive 0.13.1 (on m1)
1) Extract apache-hive-0.13.1-bin.tar.gz to /home/hadoop/hive-0.13.1
2) Go to Hive's conf directory and copy the template files to the actual configuration files
root@m1:/home/hadoop/hive-0.13.1/conf# cp hive-env.sh.template hive-env.sh
root@m1:/home/hadoop/hive-0.13.1/conf# cp hive-default.xml.template hive-site.xml
3) Edit hive-env.sh, mainly to set the Hadoop directory
root@m1:/home/hadoop/hive-0.13.1/conf# vi hive-env.sh
HADOOP_HOME=/home/hadoop/hadoop-2.2.0
4) Edit hive-site.xml
root@m1:/home/hadoop/hive-0.13.1/conf# vi hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>
  <property>
    <!-- Default storage path for Hive data files, usually a writable HDFS path -->
    <name>hive.metastore.warehouse.dir</name>
    <value>hdfs://mycluster/user/hive/warehouse</value>
  </property>
  <property>
    <!-- HDFS path that stores the execution plans of the map/reduce stages and their intermediate output -->
    <name>hive.exec.scratchdir</name>
    <value>hdfs://mycluster/user/hive/scratchdir</value>
  </property>
  <property>
    <!-- Directory for Hive's live query logs. If the value is empty, no live query log is created. -->
    <name>hive.querylog.location</name>
    <value>/home/hadoop/hive-0.13.1/logs</value>
  </property>
  <property>
    <!-- JDBC connection string. Default: jdbc:derby:;databaseName=metastore_db;create=true; -->
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://m1:3306/hiveMeta?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <!-- JDBC driver. Default: org.apache.derby.jdbc.EmbeddedDriver; -->
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
  <property>
    <!-- Jars for user-defined UDFs or SerDes must be listed here. No default value. -->
    <!-- This is the critical part for HBase integration, so be careful: the value of hive.aux.jars.path must not contain spaces, carriage returns or line breaks. Write it all on one line, otherwise all sorts of errors will appear. -->
    <name>hive.aux.jars.path</name>
    <value>file:///home/hadoop/hive-0.13.1/lib/hbase-hadoop-compat-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/hbase-hadoop2-compat-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/hive-hbase-handler-0.13.1.jar,file:///home/hadoop/hive-0.13.1/lib/protobuf-java-2.5.0.jar,file:///home/hadoop/hive-0.13.1/lib/hbase-client-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/hbase-common-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/hbase-protocol-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/hbase-server-0.96.2-hadoop2.jar,file:///home/hadoop/hive-0.13.1/lib/zookeeper-3.4.5.jar,file:///home/hadoop/hive-0.13.1/lib/guava-11.0.2.jar,file:///home/hadoop/hive-0.13.1/lib/htrace-core-2.04.jar</value>
  </property>
  <property>
    <!-- ZooKeeper address list, empty by default. Without hive.zookeeper.quorum, Hive QL requests cannot run concurrently and data anomalies can occur. -->
    <name>hive.zookeeper.quorum</name>
    <value>m1,m2,s1,s2</value>
    <description>The list of zookeeper servers to talk to. This is only needed for read/write locks.</description>
  </property>
</configuration>
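Because a single stray space or line break in hive.aux.jars.path is enough to break the HBase integration, it can be safer to generate the value than to type it. A minimal sketch, where `build_aux_jars` is a hypothetical helper and the jar names are the ones listed in the configuration above:

```shell
# build_aux_jars LIB_DIR JAR...
# Hypothetical helper: joins jar names into a single comma-separated list of
# file:// URLs, with no spaces or line breaks.
build_aux_jars() {
    lib_dir=$1
    shift
    result=""
    for jar in "$@"; do
        # Prefix a comma only when something is already in the list.
        result="${result:+$result,}file://$lib_dir/$jar"
    done
    printf '%s\n' "$result"
}

build_aux_jars /home/hadoop/hive-0.13.1/lib \
    hbase-hadoop-compat-0.96.2-hadoop2.jar \
    hbase-hadoop2-compat-0.96.2-hadoop2.jar \
    hive-hbase-handler-0.13.1.jar \
    protobuf-java-2.5.0.jar \
    hbase-client-0.96.2-hadoop2.jar \
    hbase-common-0.96.2-hadoop2.jar \
    hbase-protocol-0.96.2-hadoop2.jar \
    hbase-server-0.96.2-hadoop2.jar \
    zookeeper-3.4.5.jar \
    guava-11.0.2.jar \
    htrace-core-2.04.jar
```

The printed line can be pasted as the content of the value element as-is.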
5) Of the jars listed in hive.aux.jars.path in hive-site.xml, hive-hbase-handler-0.13.1.jar and guava-11.0.2.jar ship with Hive by default. The others only need to be copied over from hadoop/zookeeper/hbase with the following commands
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/protobuf-java-2.5.0.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-client-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-common-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-protocol-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-server-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-hadoop2-compat-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-hadoop-compat-0.96.2-hadoop2.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/htrace-core-2.04.jar /home/hadoop/hive-0.13.1/lib
root@m1:/home/hadoop# cp /home/hadoop/zookeeper-3.4.5/dist-maven/zookeeper-3.4.5.jar /home/hadoop/hive-0.13.1/lib
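The copy commands above follow one pattern, so they could be wrapped in a small function that also reports a missing jar instead of silently continuing. This is a sketch only; `copy_jars` is a hypothetical helper and the invocation in the comment simply restates some of the paths used above.

```shell
# copy_jars SRC_DIR DEST_DIR JAR...
# Hypothetical helper: copies each named jar and stops at the first one
# that is missing, so a typo in a file name is noticed immediately.
copy_jars() {
    src=$1
    dest=$2
    shift 2
    for jar in "$@"; do
        if [ ! -f "$src/$jar" ]; then
            echo "missing: $src/$jar" >&2
            return 1
        fi
        cp "$src/$jar" "$dest/" || return 1
    done
}

# Example invocation on m1 (left as a comment because it needs the real paths):
# copy_jars /home/hadoop/hbase-0.96.2-hadoop2/lib /home/hadoop/hive-0.13.1/lib \
#     protobuf-java-2.5.0.jar hbase-client-0.96.2-hadoop2.jar htrace-core-2.04.jar
```

The ZooKeeper jar lives in a different source directory, so it would need its own call.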
6) The MySQL JDBC driver (Connector/J) can be downloaded from http://dev.mysql.com/downloads/connector/j/. After extracting it, copy mysql-connector-java-5.1.31-bin.jar from the extracted directory to /home/hadoop/hive-0.13.1/lib
7) Create the test data, and create the data warehouse directory on Hadoop
root@m1:/home/hadoop/hive-0.13.1/conf# vi /home/hadoop/hive-0.13.1/testdata001.dat
12306,mname,yname
10086,myidoall,youidoall
root@m1:/home/hadoop/hive-0.13.1/conf# /home/hadoop/hadoop-2.2.0/bin/hadoop fs -mkdir -p /user/hive/warehouse
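The table created in the next step has three comma-delimited columns, so every line of the data file should have exactly three fields. A quick check along those lines is sketched below; `check_fields` is a hypothetical helper, and the demo writes the two sample rows to a scratch file rather than to the real path.

```shell
# check_fields FILE COUNT
# Hypothetical helper: succeeds only if every line of FILE has exactly COUNT
# comma-separated fields.
check_fields() {
    awk -F',' -v n="$2" 'NF != n { bad = 1 } END { exit bad }' "$1"
}

# Demo with the two rows used in this article.
demo_dat=$(mktemp)
printf '%s\n' '12306,mname,yname' '10086,myidoall,youidoall' > "$demo_dat"
if check_fields "$demo_dat" 3; then
    echo "test data is well formed"
else
    echo "test data has a malformed line"
fi
```

Running the same check on /home/hadoop/hive-0.13.1/testdata001.dat before loading it catches a missing or extra comma early.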
8) Test Hive from the shell
root@m1:/home/hadoop# /home/hadoop/hive-0.13.1/bin/hive
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/07/27 11:17:35 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
Logging initialized using configuration in jar:file:/home/hadoop/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> show databases;
OK
default
Time taken: 0.464 seconds, Fetched: 1 row(s)
hive> create database testidoall;
OK
Time taken: 0.279 seconds
hive> show databases;
OK
default
testidoall
Time taken: 0.021 seconds, Fetched: 2 row(s)
hive> use testidoall;
OK
Time taken: 0.039 seconds
hive> create external table testtable(uid int,myname string,youname string) row format delimited fields terminated by ',' location '/user/hive/warehouse/testtable';
OK
Time taken: 0.205 seconds
hive> LOAD DATA LOCAL INPATH '/home/hadoop/hive-0.13.1/testdata001.dat' OVERWRITE INTO TABLE testtable;
Copying data from file:/home/hadoop/hive-0.13.1/testdata001.dat
Copying file: file:/home/hadoop/hive-0.13.1/testdata001.dat
Loading data to table testidoall.testtable
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://mycluster/user/hive/warehouse/testtable
Table testidoall.testtable stats: [numFiles=0, numRows=0, totalSize=0, rawDataSize=0]
OK
Time taken: 0.77 seconds
hive> select * from testtable;
OK
12306 mname yname
10086 myidoall youidoall
Time taken: 0.279 seconds, Fetched: 2 row(s)
hive>
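The interactive session above can also be replayed non-interactively by putting the statements in a script file and passing it to the Hive CLI with -f. The snippet below is a sketch under a few assumptions of mine: the HQL_FILE and HIVE_BIN variables are made up for this example, and I added "if not exists" so the script can be run more than once. It only calls hive when the binary is actually present.

```shell
#!/bin/sh
# Assumptions: HQL_FILE and HIVE_BIN are variables introduced for this sketch.
HQL_FILE=${HQL_FILE:-/tmp/testidoall.hql}
HIVE_BIN=${HIVE_BIN:-/home/hadoop/hive-0.13.1/bin/hive}

# The same statements as in the interactive session, made re-runnable
# with "if not exists".
cat > "$HQL_FILE" <<'EOF'
create database if not exists testidoall;
use testidoall;
create external table if not exists testtable(uid int, myname string, youname string)
  row format delimited fields terminated by ','
  location '/user/hive/warehouse/testtable';
load data local inpath '/home/hadoop/hive-0.13.1/testdata001.dat' overwrite into table testtable;
select * from testtable;
EOF

# Run the script only if the hive launcher exists on this machine.
if [ -x "$HIVE_BIN" ]; then
    "$HIVE_BIN" -f "$HQL_FILE"
else
    echo "hive not found at $HIVE_BIN; script written to $HQL_FILE" >&2
fi
```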
At this point, Hive is installed.
10. hive to hbase (import Hive table data into HBase)
1) Create a table that HBase can recognize
root@m1:/home/hadoop# /home/hadoop/hive-0.13.1/bin/hive
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/07/27 11:33:53 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
Logging initialized using configuration in jar:file:/home/hadoop/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> show databases;
OK
default
testidoall
Time taken: 0.45 seconds, Fetched: 2 row(s)
hive> use testidoall;
OK
Time taken: 0.021 seconds
hive> show tables;
OK
testtable
Time taken: 0.032 seconds, Fetched: 1 row(s)
hive> CREATE TABLE hive2hbase_idoall(key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "hive2hbase_idoall");
OK
Time taken: 2.332 seconds
hive> show tables;
OK
hive2hbase_idoall
testtable
Time taken: 0.036 seconds, Fetched: 2 row(s)
hive>
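The hbase.columns.mapping value in the CREATE TABLE statement above is matched against the Hive columns by position: the first entry belongs to the first Hive column, the second entry to the second, and :key stands for the HBase row key. The toy function below only illustrates that pairing; pair_columns is a name I made up, and it does not talk to Hive or HBase at all.

```shell
#!/bin/sh
# Illustration only: pair Hive column names with the entries of an
# hbase.columns.mapping string by position. Hypothetical helper, not a Hive tool.
pair_columns() {
    cols=$1        # space-separated Hive column names
    mapping=$2     # comma-separated hbase.columns.mapping value
    set -- $cols   # Hive columns become $1, $2, ...
    old_ifs=$IFS
    IFS=','
    for entry in $mapping; do
        IFS=$old_ifs
        if [ "$entry" = ":key" ]; then
            echo "$1 -> HBase row key"
        else
            # family is the part before the first colon, qualifier the rest
            echo "$1 -> column ${entry#*:} in family ${entry%%:*}"
        fi
        shift
    done
    IFS=$old_ifs
}

# The columns and mapping from the CREATE TABLE statement above.
pair_columns "key value" ":key,cf1:val"
```

For the table in this step, the Hive column key becomes the row key and value is stored in the cf1:val column, which is what the HBase scan later in this section shows.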
2) Create a local table to hold the data before it is inserted into HBase; it serves as an intermediate table. Then load the earlier test data into it. The intermediate table has only two columns, so the third field of each row in the test file is not loaded.
hive> create table hive2hbase_idoall_middle(foo int,bar string)row format delimited fields terminated by ',';
OK
Time taken: 0.086 seconds
hive> show tables;
OK
hive2hbase_idoall
hive2hbase_idoall_middle
testtable
Time taken: 0.03 seconds, Fetched: 3 row(s)
hive> load data local inpath '/home/hadoop/hive-0.13.1/testdata001.dat' overwrite into table hive2hbase_idoall_middle;
Copying data from file:/home/hadoop/hive-0.13.1/testdata001.dat
Copying file: file:/home/hadoop/hive-0.13.1/testdata001.dat
Loading data to table testidoall.hive2hbase_idoall_middle
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted hdfs://mycluster/user/hive/warehouse/testidoall.db/hive2hbase_idoall_middle
Table testidoall.hive2hbase_idoall_middle stats: [numFiles=1, numRows=0, totalSize=43, rawDataSize=0]
OK
Time taken: 0.683 seconds
hive>
3) Insert the contents of the local intermediate table (hive2hbase_idoall_middle) into the table hive2hbase_idoall; the data is synchronized to HBase automatically.
hive> insert overwrite table hive2hbase_idoall select * from hive2hbase_idoall_middle;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1406394452186_0002, Tracking URL = http://m1:8088/proxy/application_1406394452186_0002/
Kill Command = /home/hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1406394452186_0002
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
2014-07-27 11:44:11,491 Stage-0 map = 0%, reduce = 0%
2014-07-27 11:44:22,684 Stage-0 map = 100%, reduce = 0%, Cumulative CPU 1.51 sec
MapReduce Total cumulative CPU time: 1 seconds 510 msec
Ended Job = job_1406394452186_0002
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 1.51 sec HDFS Read: 288 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 510 msec
OK
Time taken: 25.613 seconds
hive> select * from hive2hbase_idoall;
OK
10086 myidoall
12306 mname
Time taken: 0.179 seconds, Fetched: 2 row(s)
hive> select * from hive2hbase_idoall_middle;
OK
12306 mname
10086 myidoall
Time taken: 0.088 seconds, Fetched: 2 row(s)
hive>
4) Connect to HBase with its shell and check that the data from Hive has arrived
root@m1:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase shell
2014-07-27 11:47:14,454 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
hive2hbase_idoall
test_idoall_org
2 row(s) in 2.9480 seconds
=> ["hive2hbase_idoall", "test_idoall_org"]
hbase(main):002:0> scan "hive2hbase_idoall"
ROW COLUMN+CELL
10086 column=cf1:val, timestamp=1406432660860, value=myidoall
12306 column=cf1:val, timestamp=1406432660860, value=mname
2 row(s) in 0.0540 seconds
hbase(main):003:0> get "hive2hbase_idoall", '12306'
COLUMN CELL
cf1:val timestamp=1406432660860, value=mname
1 row(s) in 0.0110 seconds
hbase(main):004:0>
At this point, the hive to hbase test works as expected.
11. hbase to hive (import HBase table data into Hive)
1) Create the table hbase2hive_idoall in HBase
root@m1:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase shell
2014-07-27 11:54:25,844 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.96.2-hadoop2, r1581096, Mon Mar 24 16:03:18 PDT 2014
hbase(main):001:0> create 'hbase2hive_idoall', 'gid', 'info'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
0 row(s) in 3.4970 seconds
=> Hbase::Table - hbase2hive_idoall
hbase(main):002:0> put 'hbase2hive_idoall', '3344520', 'info:time', '20140704'
0 row(s) in 0.1020 seconds
hbase(main):003:0> put 'hbase2hive_idoall', '3344520', 'info:address', 'HK'
0 row(s) in 0.0090 seconds
hbase(main):004:0> scan 'hbase2hive_idoall'
ROW COLUMN+CELL
3344520 column=info:address, timestamp=1406433302317, value=HK
3344520 column=info:time, timestamp=1406433297567, value=20140704
1 row(s) in 0.0330 seconds
hbase(main):005:0>
2) Create a table in Hive that connects to the table in HBase
root@m1:/home/hadoop# /home/hadoop/hive-0.13.1/bin/hive
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/07/27 11:57:20 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
Logging initialized using configuration in jar:file:/home/hadoop/hive-0.13.1/lib/hive-common-0.13.1.jar!/hive-log4j.properties
hive> show databases;
OK
default
testidoall
Time taken: 0.449 seconds, Fetched: 2 row(s)
hive> use testidoall;
OK
Time taken: 0.02 seconds
hive> show tables;
OK
hive2hbase_idoall
hive2hbase_idoall_middle
testtable
Time taken: 0.026 seconds, Fetched: 3 row(s)
hive> create external table hbase2hive_idoall (key string,gid map<string,string>)STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = "info:") TBLPROPERTIES ("hbase.table.name" = "hbase2hive_idoall");
OK
Time taken: 1.696 seconds
hive> show tables;
OK
hbase2hive_idoall
hive2hbase_idoall
hive2hbase_idoall_middle
testtable
Time taken: 0.034 seconds, Fetched: 4 row(s)
hive> select * from hbase2hive_idoall;
OK
3344520 {"address":"HK","time":"20140704"}
Time taken: 0.701 seconds, Fetched: 1 row(s)
hive>
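At this point, the ubuntu12.04+hadoop2.2.0+zookeeper3.4.5+hbase0.96.2+hive0.13.1 distributed environment deployment described in the title has been fully tested. I ran into a few pitfalls along the way, which are covered under Common Problems below. I hope these test notes help more people.
IV. Common Problems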
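1. If hadoop (namenode/datanode/yarn), hbase, or hive fails to start at any point, be sure to read the relevant log carefully with tail -n 100 ***.log; it usually contains a lot of useful information. The following commands also help trace errors from the command line.
1) Make hadoop print debug output to the console. After running the command below, start the namenode, datanode, and yarn to see the effect.
root@m1:/home/hadoop# export HADOOP_ROOT_LOGGER=DEBUG,console
2) Make hive print debug output to the console
root@m1:/home/hadoop# /home/hadoop/hive-0.13.1/bin/hive --hiveconf hive.root.logger=DEBUG,console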
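When a log is noisy, it helps to narrow the tail output from item 1 down to the lines that usually matter. A small sketch follows; the function name recent_errors and the choice of the patterns ERROR, FATAL and Exception are my own assumptions, so adjust them to whatever your logs actually contain.

```shell
#!/bin/sh
# Hypothetical helper: show only error-looking lines from the end of a log.
# Usage: recent_errors LOG_FILE [LINE_COUNT]   (LINE_COUNT defaults to 100)
recent_errors() {
    log_file=$1
    lines=${2:-100}
    # grep exits non-zero when nothing matches, which simply means "no errors found"
    tail -n "$lines" "$log_file" | grep -E 'ERROR|FATAL|Exception'
}

# Example (commented out; replace the elided name with a real log file):
# recent_errors /home/hadoop/hadoop-2.2.0/logs/***.log 200
```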
2. MySQL once failed at startup with "job failed to start". Reinstalling it with the following commands solved the problem.
rm -r /var/lib/mysql/
rm -r /etc/mysql/
apt-get autoremove --purge 'mysql*'
apt-get remove apparmor
apt-get install mysql-server mysql-client mysql-common
3. If you see the error "dpkg was interrupted, you must manually run sudo dpkg --configure -a to correct the problem", the following commands resolve it.
sudo rm /var/lib/dpkg/updates/*
sudo apt-get update
sudo apt-get upgrade
---------------------------------------
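Author: 迦壹
Reposting notice: you may repost this article, but you must credit the original source, the author, and this copyright notice with a hyperlink. Thank you for your cooperation!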
---------------------------------------