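HBase Database Basic Operations
Reference: 数据酷客 <Hadoop基础.Hbase的Shell命令> (Hadoop Basics: HBase Shell Commands)
Last month I wrote a post on basic Hive data warehouse operations, and after all this time I still have not gotten around to reviewing it. Today I learned a whole batch of HBase operations as well, so to avoid mixing the two up and to make later review and lookup faster, here is another post, this time on basic HBase shell operations. My memory is poor, so I have to write things down.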
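Command | Purpose |
---|---|
create | Create a table |
desc | Show table information |
put | Insert data |
get | Query data by row key |
scan | Query data by scanning the table |
alter | Modify a table |
truncate | Empty a table |
drop | Delete a table |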
... | ... |
... | ... |
... | ... |
Once HBase and its dependencies are all running, type hbase shell to enter the HBase client.
[root@namenode opt]# hbase shell
Type help to list the HBase shell commands:
hbase(main):001:0> help
HBase Shell, version 1.2.6, rUnknown, Mon May 29 02:25:32 CDT 2017
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
......
There are many commands; this post covers only a few basic ones.
1. create: create a table
Type help 'create' to see the syntax for creating a table:
hbase(main):002:0> help 'create'
......
Basic syntax: create 'table name' , 'column family name'
A side note: you may run into a znode data == null error. This happens because the user running HBase cannot write files to ZooKeeper, which leaves the znode empty.
Fix: specify ZooKeeper's data directory in hbase-site.xml:
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/opt/zookeeper/data</value>
</property>
When you create a table in HBase, you specify only the table name and the column family names, not column names or types.
For example, create a table named student1 with a column family named cf1:
hbase(main):003:0> create 'student1' , 'cf1'
0 row(s) in 3.4160 seconds
=> Hbase::Table - student1
Create a table named student2 with column families cf1 and cf2:
hbase(main):011:0> create 'student2' , 'cf1' , 'cf2'
0 row(s) in 2.2410 seconds
=> Hbase::Table - student2
Run list to see the tables in HBase:
hbase(main):016:0> list
TABLE
student1
student2
2 row(s) in 0.0080 seconds
=> ["student1", "student2"]
2. desc: show table information
Use desc 'table name' to view a table's details. For student1:
hbase(main):018:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0590 seconds
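The content inside {} describes the column family. The attributes mean the following:
Attribute | Meaning |
---|---|
NAME | Column family name |
VERSIONS | Maximum number of versions kept per cell |
IN_MEMORY | Whether the family's blocks get in-memory priority in the block cache |
TTL | Time to live: how long cells are kept before they expire (FOREVER means they never expire) |
BLOCKSIZE | HFile block size in bytes (65536 = 64 KB) |
REPLICATION_SCOPE | Replication scope (0 means the family is not replicated to other clusters) |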
Now look at student2. It has two column families, so the output contains two {} blocks:
hbase(main):019:0> desc 'student2'
Table student2 is ENABLED
student2
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'cf2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.2050 seconds
3. put: insert data
Syntax: put 'table name' , 'row key' , 'column family:column name' , 'value'
Insert the record (name: July, age: 18, grade: 98, sex: male, stored as M) into student1:
hbase(main):020:0> put 'student1' , '001' , 'cf1:name' , 'July'
0 row(s) in 0.2070 seconds
hbase(main):021:0> put 'student1' , '001' , 'cf1:age' , '18'
0 row(s) in 0.0240 seconds
hbase(main):022:0> put 'student1' , '001' , 'cf1:grade' , '98'
0 row(s) in 0.0210 seconds
hbase(main):023:0> put 'student1' , '001' , 'cf1:sex' , 'M'
0 row(s) in 0.0080 seconds
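In a browser, open namenode:50070 (the HDFS NameNode web UI).
Data inserted into HBase is stored in HDFS.
The storage path is: /hbase/data/default/table/region ID/column family/HDFS file name
default is the default namespace.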
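To make the path layout above concrete, here is a minimal Python sketch that assembles such a path from its parts. The function name hfile_path is hypothetical, and the region ID and file name passed in are placeholders, since the real values are generated by HBase and are not shown in this post.

```python
from posixpath import join


def hfile_path(table, region, family, hfile, namespace="default", root="/hbase"):
    """Assemble root/data/namespace/table/region/family/file, as described above."""
    return join(root, "data", namespace, table, region, family, hfile)


# "REGION_ID" and "FILE_NAME" are placeholders, not real HBase-generated names.
print(hfile_path("student1", "REGION_ID", "cf1", "FILE_NAME"))
# /hbase/data/default/student1/REGION_ID/cf1/FILE_NAME
```

The namespace argument defaults to default, which matches the tables created in this post.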
4. get: query data
Basic syntax: get 'table name' , 'row key'
Fetch the row with row key 001 from student1:
hbase(main):025:0> get 'student1' , '001'
COLUMN       CELL
 cf1:age     timestamp=1589278473070, value=18
 cf1:grade   timestamp=1589278487224, value=98
 cf1:name    timestamp=1589278459460, value=July
 cf1:sex     timestamp=1589278496165, value=M
4 row(s) in 0.0760 seconds
(Here, timestamp is the time the value was written, and value is the stored value.)
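The timestamps in this output are milliseconds since the Unix epoch. As a small sketch of how to read them, the Python below converts one to a UTC date and time; the helper name cell_time is hypothetical and is not part of HBase.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def cell_time(timestamp_ms):
    """Convert a cell timestamp (milliseconds since the Unix epoch) to UTC."""
    return EPOCH + timedelta(milliseconds=timestamp_ms)


# Timestamp of cf1:age from the get output above.
print(cell_time(1589278473070).isoformat())
# 2020-05-12T10:14:33.070000+00:00
```

Integer millisecond arithmetic is used here instead of dividing by 1000, so no floating-point rounding is involved.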
Query the name column in column family cf1 of row 001 in student1:
hbase(main):026:0> get 'student1' , '001' , 'cf1:name'
COLUMN       CELL
 cf1:name    timestamp=1589278459460, value=July
1 row(s) in 0.0210 seconds
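Change the value of the name column in student1 from July to Mary, then query again:
hbase(main):027:0> put 'student1' , '001' , 'cf1:name' , 'Mary'
0 row(s) in 0.0220 seconds
hbase(main):028:0> get 'student1' , '001' , 'cf1:name'
COLUMN       CELL
 cf1:name    timestamp=1589279637308, value=Mary
1 row(s) in 0.0150 seconds
(As the desc output earlier shows, VERSIONS defaults to 1 for a column family here, meaning each column keeps only one value and a later put overwrites the earlier one!)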
To set VERSIONS when creating a table: create 'table name' , {NAME => 'column family name' , VERSIONS => 'number of versions'}
Create table student3 with VERSIONS set to 3 for column family cf1, then check the result:
hbase(main):031:0> create 'student3' , {NAME => 'cf1' , VERSIONS => '3'}
0 row(s) in 8.8800 seconds
=> Hbase::Table - student3
hbase(main):032:0> desc 'student3'
Table student3 is ENABLED
student3
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0410 seconds
Insert data into the same column of student3 three times, then query:
hbase(main):033:0> put 'student3' , '001' , 'cf1:name' , 'July'
0 row(s) in 0.0470 seconds
hbase(main):034:0> put 'student3' , '001' , 'cf1:name' , 'Tom'
0 row(s) in 0.0120 seconds
hbase(main):035:0> put 'student3' , '001' , 'cf1:name' , 'Mary'
0 row(s) in 0.0090 seconds
hbase(main):039:0> get 'student3' , '001' , {COLUMN => 'cf1:name' ,VERSIONS => 3}
COLUMN       CELL
 cf1:name    timestamp=1589280435047, value=Mary
 cf1:name    timestamp=1589280430294, value=Tom
 cf1:name    timestamp=1589280423248, value=July
3 row(s) in 0.2580 seconds
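The difference between VERSIONS => 1 and VERSIONS => 3 can be sketched with a toy Python model of a single cell. This only imitates what a reader sees through get; the class VersionedCell and the helper replay are hypothetical, the timestamps 1, 2, 3 are stand-ins for the real ones, and nothing here reflects how HBase stores data internally.

```python
class VersionedCell:
    """Toy model of one cell that keeps at most max_versions values."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self._versions = []  # (timestamp, value) pairs, newest first

    def put(self, timestamp, value):
        # Add the new value, order newest first, then trim the oldest extras.
        self._versions.append((timestamp, value))
        self._versions.sort(key=lambda pair: pair[0], reverse=True)
        del self._versions[self.max_versions:]

    def get(self, versions=1):
        # Return up to `versions` values, newest first.
        return [value for _, value in self._versions[:versions]]


def replay(max_versions):
    """Replay the three puts from the session above (distinct timestamps assumed)."""
    cell = VersionedCell(max_versions)
    for timestamp, name in [(1, "July"), (2, "Tom"), (3, "Mary")]:
        cell.put(timestamp, name)
    return cell.get(versions=3)


print(replay(1))  # ['Mary']
print(replay(3))  # ['Mary', 'Tom', 'July']
```

With one version kept, asking for three versions still returns only the latest value, which mirrors the student1 behaviour; with three kept, the values come back newest first, as in the student3 output.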
5. scan: query data
get requires a row key, so it cannot query a column across rows directly.
scan can query a given column of a table.
Syntax: scan 'table name' , {COLUMN => 'column family:column name' , VERSIONS => number of versions}
Query the name column of student3:
hbase(main):005:0> scan 'student3' , {COLUMN => 'cf1:name' , VERSIONS => 3}
ROW      COLUMN+CELL
 001     column=cf1:name, timestamp=1589280435047, value=Mary
 001     column=cf1:name, timestamp=1589280430294, value=Tom
 001     column=cf1:name, timestamp=1589280423248, value=July
1 row(s) in 0.1310 seconds
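The contrast between get and scan can be illustrated with a toy Python table held in a dictionary. This is only a sketch of the access pattern: the functions get and scan below are hypothetical stand-ins, and rows 002 and 003 are invented for illustration and do not exist in the tables from this post.

```python
# Toy table: row key -> {"family:qualifier": value}.
# Rows "002" and "003" are made up for this example.
table = {
    "001": {"cf1:name": "July", "cf1:age": "18"},
    "002": {"cf1:name": "Tom"},
    "003": {"cf1:age": "20"},
}


def get(table, row_key, column=None):
    """Look up one row by its key, optionally narrowed to one column."""
    row = table.get(row_key, {})
    if column is None:
        return dict(row)
    return {column: row[column]} if column in row else {}


def scan(table, column):
    """Walk every row in row-key order and keep those that have the column."""
    return [(key, row[column]) for key, row in sorted(table.items()) if column in row]


print(get(table, "001", "cf1:name"))  # {'cf1:name': 'July'}
print(scan(table, "cf1:name"))        # [('001', 'July'), ('002', 'Tom')]
```

get always starts from a row key, while scan visits rows in key order and filters by column, which is why only scan can answer "show me this column for every row".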
6. alter: modify a table
alter can add a column family to a table.
Syntax: alter 'table name' , NAME => 'column family name' , VERSIONS => number of versions
Add column family cf2 to student1 with VERSIONS set to 3, then check the result:
hbase(main):007:0> alter 'student1' , NAME => 'cf2' , VERSIONS => 3
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 5.0950 seconds
hbase(main):008:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'cf2', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0500 seconds
alter can also delete from a table, but only whole column families.
Syntax:
alter 'table name' , NAME => 'column family' , METHOD => 'delete'
or
alter 'table name' , 'delete' => 'column family'
Delete column family cf2 from student1:
hbase(main):009:0> alter 'student1' , 'delete' => 'cf2'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.7580 seconds
hbase(main):010:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0280 seconds
7. truncate: empty a table
Syntax: truncate 'table name'
Empty student3:
hbase(main):011:0> truncate 'student3'
Truncating 'student3' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 4.7220 seconds
hbase(main):014:0> get 'student3' , '001'
COLUMN       CELL
0 row(s) in 0.2540 seconds
When truncating, the system automatically disables the table first and then clears the data.
Once the data is cleared, the system automatically re-enables the table.
Use is_enabled 'table name' to check whether a table is enabled:
hbase(main):015:0> is_enabled 'student3'
true
0 row(s) in 0.0190 seconds
8. drop: delete a table
Syntax: drop 'table name'
An enabled HBase table cannot be dropped directly:
hbase(main):016:0> drop 'student2'
ERROR: Table student2 is enabled. Disable it first.
Here is some help for this command:
Drop the named table. Table must first be disabled:
  hbase> drop 't1'
  hbase> drop 'ns1:t1'
A table must be disabled before it is dropped.
Syntax: disable 'table name'
hbase(main):017:0> disable 'student2'
0 row(s) in 2.3580 seconds
hbase(main):018:0> drop 'student2'
0 row(s) in 1.3430 seconds
hbase(main):019:0> list
TABLE
student1
student3
2 row(s) in 0.0090 seconds
=> ["student1", "student3"]
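The disable-then-drop rule can be summarised as a tiny state check. The Python below is only a toy illustration of that rule; the class ToyTable is hypothetical and has nothing to do with the HBase client API.

```python
class ToyTable:
    """Toy model of the rule that a table must be disabled before it is dropped."""

    def __init__(self, name):
        self.name = name
        self.enabled = True   # new tables start out enabled
        self.dropped = False

    def disable(self):
        self.enabled = False

    def drop(self):
        # Refuse to drop while the table is still enabled.
        if self.enabled:
            raise RuntimeError(f"Table {self.name} is enabled. Disable it first.")
        self.dropped = True


student2 = ToyTable("student2")
try:
    student2.drop()
except RuntimeError as err:
    print("ERROR:", err)  # ERROR: Table student2 is enabled. Disable it first.

student2.disable()
student2.drop()
print(student2.dropped)  # True
```

The first drop fails in the same way as the shell session above, and it succeeds only after disable has been called.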
...
...
...