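This is the second post in a series of tutorials on installing big-data software myself. The first post covers installing Hadoop: https://www.cnblogs.com/annie666/p/11567690.html
Installing the software is the most basic step in learning big data, and although it is relatively simple, it is still easy to get wrong. I hope this detailed tutorial helps others who want to learn big data avoid a few detours.
Reference: Lin Ziyu's tutorial from Xiamen University: http://dblab.xmu.edu.cn/blog/2139-2/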
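I. Preparation
The biggest risk when installing software is picking the wrong version, so read the basic prerequisites on the official site before installing HBase. The official documentation lists many prerequisites, such as DNS, but this setup is not that complicated: we only need to settle the versions of Hadoop, HBase, and the JDK. Official link: http://hbase.apache.org/book.html#basic.prerequisites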
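1. Hadoop
The official documentation recommends installing Hadoop 2.x because it is faster and more stable.
2. JDK 8
(Presumably everyone already has Java installed, so I will not go into detail here.)
3. Hadoop and HBase version compatibility
Based on the version support matrix in the official documentation, we settle on Hadoop-2.9.2 and HBase-2.2.0. HBase-2.2.0 is not on the Tsinghua mirror; download it from the official site and transfer it to the server over FTP.
Download Hadoop-2.9.2 from the Tsinghua University mirror: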
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz
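If FTP is inconvenient, the HBase tarball can probably be fetched straight onto the server as well. The sketch below builds the download URL for a given version; the archive.apache.org path layout and the helper name hbase_archive_url are my assumptions, not something from the original setup, so check the URL in a browser first.

```shell
# Build the download URL for an HBase binary tarball on the Apache archive.
# Assumption: releases live under /dist/hbase/<version>/ on archive.apache.org.
hbase_archive_url() {
  printf 'https://archive.apache.org/dist/hbase/%s/hbase-%s-bin.tar.gz\n' "$1" "$1"
}

# Usage on the server (needs network access); saves the file into the home directory:
#   wget -P ~ "$(hbase_archive_url 2.2.0)"
```

Saving into the home directory matches the ~/hbase-2.2.0-bin.tar.gz path used in the unpacking step.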
II. Unpacking
sudo tar -zxvf ~/hbase-2.2.0-bin.tar.gz -C /usr/local
cd /usr/local
sudo mv ./hbase-2.2.0 ./hbase
sudo chown -R hadoop:hadoop ./hbase
Check whether the installation succeeded:
/usr/local/hbase/bin/hbase version
On success, it prints the following:
linziyu@iZbp11gznj7n38xkztu64dZ:/usr/local$ /usr/local/hbase/bin/hbase version
2019-10-02 14:00:33,041 INFO [main] util.VersionInfo: HBase 2.2.0
2019-10-02 14:00:33,042 INFO [main] util.VersionInfo: Source code repository git://diocles.local/Volumes/hbase-1.1.5/hbase revision=239b80456118175b340b2e562a5568b5c744252e
2019-10-02 14:00:33,042 INFO [main] util.VersionInfo: Compiled by ndimiduk on Wed Oct 02 14:03:05 CST 2019
2019-10-02 14:00:33,042 INFO [main] util.VersionInfo: From source with checksum 7ad8dc6c5daba19e4aab081181a2457d
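To check the version from a script rather than by eye, the version number can be pulled out of that output. This is a minimal sketch that assumes the log line looks like the "util.VersionInfo: HBase 2.2.0" line above; the helper name hbase_version_from_log is made up for illustration.

```shell
# Read `hbase version` output on stdin and print just the version number.
# Assumption: the output contains a line of the form "... util.VersionInfo: HBase <version>".
hbase_version_from_log() {
  sed -n 's/.*util\.VersionInfo: HBase \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage on a machine with HBase installed:
#   /usr/local/hbase/bin/hbase version 2>&1 | hbase_version_from_log
```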
III. Standalone configuration
The files below are under /usr/local/hbase/conf/.
1. Edit hbase-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 # if you have forgotten your JAVA_HOME, open hadoop-env.sh and see what you set before
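If hadoop-env.sh is not at hand, JAVA_HOME can usually be worked out from where the java binary actually lives. The helper below is a sketch under the assumption that the binary sits at <home>/bin/java or <home>/jre/bin/java, which is the usual layout for JDK 8 packages; java_home_from_bin is a name I made up.

```shell
# Derive a JAVA_HOME candidate from the resolved path of the java binary.
# Assumption: the binary is at <home>/bin/java or <home>/jre/bin/java.
java_home_from_bin() (
  home="${1%/bin/java}"   # strip the trailing /bin/java
  home="${home%/jre}"     # JDK 8 packages often nest java under jre/
  printf '%s\n' "$home"
)

# Usage on a real machine (needs java on PATH):
#   java_home_from_bin "$(readlink -f "$(command -v java)")"
```

Compare the result with the directory you exported before writing it into hbase-env.sh.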
2. Edit hbase-site.xml
Identical to the official quickstart (https://hbase.apache.org/book.html#quickstart); replace testuser with your own user name.
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///home/testuser/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/testuser/zookeeper</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
<description>
Controls whether HBase will check for stream capabilities (hflush/hsync).
Disable this if you intend to run on LocalFileSystem, denoted by a rootdir
with the 'file://' scheme, but be mindful of the NOTE below.
WARNING: Setting this to false blinds you to potential data loss and
inconsistent system state in the event of process and/or node failures. If
HBase is complaining of an inability to use hsync or hflush it's most
likely not a false positive.
</description>
</property>
</configuration>
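After editing, it is worth confirming that HBase will see the value you intended. The function below is a rough sketch that reads one property out of the file with awk; it assumes every <name> and <value> sits on its own line, as in the listing above, and hbase_conf_get is a hypothetical helper, not an HBase tool.

```shell
# Print the <value> that follows a given <name> in a Hadoop-style XML config.
# Assumption: each <name> and each <value> is on its own line.
hbase_conf_get() (
  key="$1"
  file="$2"
  awk -v key="$key" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$file"
)

# Usage:
#   hbase_conf_get hbase.rootdir /usr/local/hbase/conf/hbase-site.xml
```

This is a line-based check, not a real XML parser, so it is only good for a quick sanity check on a file laid out like this one.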
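IV. Test-running HBase
Step 1: Start Hadoop
HBase normally stores its data on HDFS, so start Hadoop first. (With the file:// rootdir configured above, standalone HBase writes to the local filesystem and does not strictly need HDFS, but starting Hadoop does no harm.)
Run jps; unless DataNode, NameNode, and SecondaryNameNode all appear, start Hadoop: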
cd /usr/local/hadoop-2.9.2
./sbin/start-dfs.sh
Step 2: Start HBase
Use ./bin/start-hbase.sh to start HBase:
cd /usr/local/hbase
./bin/start-hbase.sh
Run jps to check the processes; if HMaster appears, the standalone configuration succeeded!
hadoop@iZbp11gznj7n38xkztu64dZ:/usr/local/hbase$ jps
NameNode #1
DataNode #2
SecondaryNameNode #3
HMaster #4
Jps #5
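The jps checks in steps 1 and 2 can be scripted. The sketch below reads jps output on stdin and prints whichever expected daemons are absent; it assumes the usual "<pid> <ClassName>" layout of real jps output, and missing_daemons is a name I invented for this example.

```shell
# Report which of the expected daemons are missing from jps output on stdin.
# Assumption: each jps line ends with the main class name, e.g. "12345 HMaster".
missing_daemons() (
  out=$(cat)
  for name in "$@"; do
    # Compare the last field exactly, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$out" | awk -v n="$name" '$NF == n { ok = 1 } END { exit !ok }' \
      || printf '%s\n' "$name"
  done
)

# Usage:
#   jps | missing_daemons NameNode DataNode SecondaryNameNode HMaster
```

No output means every daemon you listed is running.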
Once start-hbase.sh has run, the HBase web UI is available (by default the Master UI listens on port 16010).
Step 3: Use the HBase shell
cd /usr/local/hbase
./bin/hbase shell
After a short wait, the following appears:
2019-10-02 14:04:04,826 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.2.0, rUnknown, Tue Jun 11 04:30:30 UTC 2019
Took 0.0066 seconds
hbase(main):001:0>
Now start coding!
Basic concepts: row key / column family / column qualifier / timestamp
1. Creating a table:
Syntax: create 'tablename', 'columnfamily'
hbase(main):001:0> create 'test', 'cf'
0 row(s) in 0.4170 seconds
=> Hbase::Table - test
You can then run list 'test' to confirm the table exists:
hbase(main):002:0> list 'test'
TABLE
test
1 row(s) in 0.0180 seconds
=> ["test"]
Or run describe 'test' to see its detailed configuration:
hbase(main):003:0> describe 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
1 row(s)
Took 0.9998 seconds
2. Insert, delete, update, and query
(1) Insert & update
Use the put command to insert or update a cell:
hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.0850 seconds
hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0110 seconds
hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0100 seconds
You can then use the scan command to view all the data in the table:
hbase(main):009:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1569996475448, value=value1
row2 column=cf:b, timestamp=1569996503176, value=value2
row3 column=cf:c, timestamp=1569996607878, value=value3
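(2) Delete
Use drop to delete an entire table. Note: a table must be disabled before it can be altered or dropped.
disable 'test' # step 1: disable the table (required first)
drop 'test' # step 2: drop the table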
(3) Read
Use get to read a single row (run it before dropping the table):
get 'test', 'row2'
COLUMN CELL
cf:b timestamp=1569996503176, value=value2
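The commands in this section can also be run in one go instead of being typed at the prompt. The sketch below only generates the command text; piping it into hbase shell -n (the shell's non-interactive mode) is left as a usage comment because it needs a running HBase. The function name hbase_demo_commands and the choice of commands are mine, arranged so that the reads happen before the table is dropped.

```shell
# Emit the HBase shell commands from this section for a given table and column family.
# Arguments: $1 = table name, $2 = column family name.
hbase_demo_commands() {
  cat <<EOF
create '$1', '$2'
put '$1', 'row1', '${2}:a', 'value1'
put '$1', 'row2', '${2}:b', 'value2'
scan '$1'
get '$1', 'row2'
disable '$1'
drop '$1'
EOF
}

# Usage (assumes HBase is running and installed under /usr/local/hbase):
#   hbase_demo_commands test cf | /usr/local/hbase/bin/hbase shell -n
```

Since the script ends by dropping the table, use a throwaway table name rather than one that holds data you care about.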
Exit with quit: to exit the HBase Shell and disconnect from your cluster, use the quit command. HBase is still running in the background.
V. Stopping HBase
cd /usr/local/hbase
./bin/stop-hbase.sh