Setting Up a Distributed HBase Environment
I. Prerequisites
- Installation overview
IP | Host Name | Software | Node |
192.168.23.128 | ae01 | JDK 1.7, Zookeeper-3.4.5 | HMaster |
192.168.23.129 | ae02 | JDK 1.7, Zookeeper-3.4.5 | HRegionServer |
192.168.23.130 | ae03 | JDK 1.7, Zookeeper-3.4.5 | HRegionServer |
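- If you are installing on virtual machines, you can install samba and smbfs to make it easier to manage files.
- Operating system: ubuntu-12.04.2-server-amd64
- Installation directory: /usr/local/ae
- JDK installation directory: export JAVA_HOME=/usr/local/ae/jdk1.7.0_51
- ZooKeeper installation directory: export ZOOKEEPER_HOME=/usr/local/ae/zookeeper-3.4.5
- HBase version: hbase-0.94.8
II. Configure passwordless SSH login between the servers
See http://www.cnblogs.com/tannerBG/p/4271831.html
III. Install HBase
- Edit the hosts file and add entries for the three servers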
user@ae01:/usr/local/ae$ sudo vim /etc/hosts
127.0.0.1 localhost
192.168.23.128 ae01
192.168.23.129 ae02
192.168.23.130 ae03
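A mistyped address in the hosts file, such as two hosts sharing one IP, is easy to miss and only shows up later as a node that cannot be reached. The following is a minimal sketch of a sanity check; `find_duplicate_ips` is a hypothetical helper written for this walkthrough, not part of HBase or Ubuntu.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HBase): print every IP address that
# appears on more than one line of a hosts-style file.
find_duplicate_ips() {
  local hosts_file="$1"
  # Skip blank lines and comment lines, keep the first field (the IP),
  # then report each IP that occurs more than once.
  awk 'NF && $1 !~ /^#/ { print $1 }' "$hosts_file" | sort | uniq -d
}
```

Running `find_duplicate_ips /etc/hosts` prints nothing when every address is listed once. A repeated address is not always an error, but in a three-node setup like this one it usually points to a typo.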
- Extract HBase
Copy hbase-0.94.8.tar.gz to /usr/local/ae and extract it
user@ae01:/usr/local/ae$ sudo tar -zxvf hbase-0.94.8.tar.gz
- Configure HBase
Edit $HBASE_HOME/conf/hbase-env.sh: set JAVA_HOME and configure HBase to use a separate ZooKeeper
export JAVA_HOME=/usr/local/ae/jdk1.7.0_51
export HBASE_MANAGES_ZK=false
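The HBASE_MANAGES_ZK variable in conf/hbase-env.sh controls whether HBase manages ZooKeeper itself. It defaults to true, which makes HBase start ZooKeeper whenever HBase starts; it is set to false here because ZooKeeper runs as a separate cluster.
Edit $HBASE_HOME/conf/hbase-site.xml and add the following properties inside the <configuration> node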
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://ae01:9000/hbase</value>
  <description>The directory shared by RegionServers.</description>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
  <description>The mode the cluster will be in. Possible values are
    false: standalone and pseudo-distributed setups with managed Zookeeper
    true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
  </description>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
  <description>Property from ZooKeeper's config zoo.cfg.
    The port at which the clients will connect.
  </description>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>ae01,ae02,ae03</value>
  <description>Comma separated list of servers in the ZooKeeper Quorum.
    For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
    By default this is set to localhost for local and pseudo-distributed modes
    of operation. For a fully-distributed setup, this should be set to a full
    list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
    this is the list of servers which we will start/stop ZooKeeper on.
  </description>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/usr/local/ae/zookeeper-3.4.5/conf</value>
  <description>Property from ZooKeeper's config zoo.cfg.
    The directory where the snapshot is stored.
  </description>
</property>
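hbase.rootdir: The directory shared by the region servers, where HBase persists its data. The URL must be fully qualified and include the filesystem scheme. For example, to point at the '/hbase' directory in HDFS when the NameNode runs on port 9000 of ae01, set it to hdfs://ae01:9000/hbase. By default HBase writes to /tmp, so if this setting is left unchanged, data is lost on restart. Default: file:///tmp/hbase-${user.name}/hbase
hbase.zookeeper.property.clientPort: A property from ZooKeeper's zoo.cfg: the port that clients connect to. Default: 2181
hbase.cluster.distributed: The mode HBase runs in. false means standalone mode and true means distributed mode. If it is false, HBase and ZooKeeper run in the same JVM. Default: false
hbase.zookeeper.quorum: A comma-separated list of the servers in the ZooKeeper quorum, for example "ae01,ae02,ae03". The default, localhost, is meant for pseudo-distributed mode and must be changed for a fully distributed setup. If HBASE_MANAGES_ZK is set in hbase-env.sh, these ZooKeeper nodes are started together with HBase. Default: localhost
hbase.zookeeper.property.dataDir: A property from ZooKeeper's zoo.cfg: the directory where ZooKeeper stores its snapshots.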
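To keep the five properties consistent across nodes, the file can be generated rather than edited by hand. The sketch below is one possible way to do that; `hbase_property` and `render_hbase_site` are hypothetical helper names, and port 9000 and client port 2181 are taken from the values used in this walkthrough.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one <property> element for hbase-site.xml.
# Values are written as-is, so they must not contain XML special characters.
hbase_property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Hypothetical helper: print a complete hbase-site.xml for a fully
# distributed cluster that uses an externally managed ZooKeeper quorum.
#   $1 = NameNode host, $2 = comma-separated quorum, $3 = ZooKeeper dataDir
render_hbase_site() {
  local namenode="$1" quorum="$2" zk_data_dir="$3"
  echo '<?xml version="1.0"?>'
  echo '<configuration>'
  hbase_property hbase.rootdir "hdfs://${namenode}:9000/hbase"
  hbase_property hbase.cluster.distributed true
  hbase_property hbase.zookeeper.property.clientPort 2181
  hbase_property hbase.zookeeper.quorum "$quorum"
  hbase_property hbase.zookeeper.property.dataDir "$zk_data_dir"
  echo '</configuration>'
}
```

For the cluster described here, `render_hbase_site ae01 ae01,ae02,ae03 /usr/local/ae/zookeeper-3.4.5/conf` prints a file that could be redirected to $HBASE_HOME/conf/hbase-site.xml and copied to every node.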
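Configure $HBASE_HOME/conf/regionservers
ae02
ae03
This file lists the nodes that run a RegionServer. In this example ae02 and ae03 both run a RegionServer; the first node, ae01, is left out because it runs the HBase Master and the HDFS NameNode.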
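The rule described above (every node except the master runs a RegionServer) can be expressed as a small filter. This is only a sketch; `regionserver_list` is a hypothetical helper, not an HBase command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every host except the master, one per line,
# which is the format used by conf/regionservers.
#   $1 = master host, remaining arguments = all cluster hosts
regionserver_list() {
  local master="$1"
  shift
  local host
  for host in "$@"; do
    # Print the host unless it is the master.
    [ "$host" = "$master" ] || echo "$host"
  done
}
```

For this cluster, `regionserver_list ae01 ae01 ae02 ae03` prints ae02 and ae03, and the output could be redirected to $HBASE_HOME/conf/regionservers.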
IV. Start HBase
- Start the ZooKeeper cluster
user@ae01:/usr/local/ae/zookeeper-3.4.5/bin$ sudo zkServer.sh start
user@ae02:/usr/local/ae/zookeeper-3.4.5/bin$ sudo zkServer.sh start
user@ae03:/usr/local/ae/zookeeper-3.4.5/bin$ sudo zkServer.sh start
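Before starting HBase it can help to confirm that every quorum member is answering. ZooKeeper 3.4 responds to the four-letter command `ruok` with `imok` on its client port. The sketch below assumes a netcat (`nc`) binary that accepts `-w` for a timeout, which varies between nc variants; `probe_zk` and `check_quorum` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Send ZooKeeper's "ruok" four-letter command to one server and print the
# reply. A healthy server answers "imok".
# Assumption: nc supports -w <seconds>; adjust for your netcat variant.
probe_zk() {
  local host="$1" port="${2:-2181}"
  echo ruok | nc -w 2 "$host" "$port" 2>/dev/null
}

# Hypothetical helper: report the state of each quorum member given as
# an argument, one line per host.
check_quorum() {
  local host
  for host in "$@"; do
    if [ "$(probe_zk "$host")" = "imok" ]; then
      echo "$host: ok"
    else
      echo "$host: not responding"
    fi
  done
}
```

With the hosts from this walkthrough the call would be `check_quorum ae01 ae02 ae03`. Running `zkServer.sh status` on each node is an alternative that also shows whether the node is the leader or a follower.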
- Start HBase
user@ae01:/usr/local/ae$ start-hbase.sh
starting master, logging to /usr/local/ae/hbase-0.94.8/logs/hbase-user-master-ae01.out
ae02: starting regionserver, logging to /usr/local/ae/hbase-0.94.8/bin/../logs/hbase-user-regionserver-ae02.out
ae03: starting regionserver, logging to /usr/local/ae/hbase-0.94.8/bin/../logs/hbase-user-regionserver-ae03.out
user@ae01:/usr/local/ae$
- Verify HBase
ae01
user@ae01:/usr/local/ae$ jps
26239 JobTracker
26158 SecondaryNameNode
39581 HMaster
26468 TaskTracker
25687 NameNode
26718 QuorumPeerMain
39764 Jps
25926 DataNode
user@ae01:/usr/local/ae$
ae02, ae03
user@ae02:/usr/local/ae$ jps
26842 HRegionServer
18999 TaskTracker
19195 QuorumPeerMain
18791 DataNode
26973 Jps
user@ae02:/usr/local/ae$
user@ae03:/usr/local/ae$ jps
3901 DataNode
4280 QuorumPeerMain
11735 HRegionServer
4106 TaskTracker
11847 Jps
user@ae03:/usr/local/ae$
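Reading the jps listings by eye works for three nodes, but the same check can be scripted. The sketch below reads jps output on standard input and reports which expected daemons are absent; `missing_daemons` is a hypothetical helper, and the daemon names passed to it are the ones shown in the listings above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `jps` output on stdin and print a line for each
# expected daemon name (given as arguments) that is not running.
# Returns 1 if any daemon is missing, 0 otherwise.
missing_daemons() {
  local jps_output name missing=0
  jps_output="$(cat)"
  for name in "$@"; do
    # jps prints "<pid> <main class>", so compare against the second column.
    if ! printf '%s\n' "$jps_output" |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

On ae01 the check would be `jps | missing_daemons HMaster QuorumPeerMain`, and on ae02 and ae03 `jps | missing_daemons HRegionServer QuorumPeerMain`.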
V. References
- http://abloz.com/hbase/book.html#important_configurations