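Operating HBase from Python
Contents:
1. How it works
2. Install the tools the build environment depends on
3. Install Thrift
4. Copy the Thrift-generated Python API for HBase into the development directory
5. Start the Thrift service
6. Start HBase
7. Testing
1. How it works
2. Install the tools the build environment depends on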
sudo yum install automake libtool flex bison pkgconfig gcc-c++ boost-devel libevent-devel zlib-devel python-devel ruby-devel openssl-devel
sudo yum install boost-devel.x86_64
sudo yum install libevent-devel.x86_64
3. Install Thrift
./configure --with-cpp=no --with-ruby=no
make
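sudo make install (unless make install is run as root, an ordinary user must prefix it with sudo; otherwise the thrift executable cannot be created under /usr/local/bin and the error shown further below occurs)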
4. Generate the HBase API for Python
1. Locate Hbase.thrift. This file is the Thrift interface definition from which the Python API for HBase is generated.
[liangjf@master hbase-0.98.0]$ find . -name Hbase.thrift
./hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
[liangjf@master hbase-0.98.0]$ cd ./hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/
[liangjf@master thrift]$ ls
Hbase.thrift
2. Generate the HBase API for Python
thrift -gen py Hbase.thrift
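3. Copy the generated code into the python_hbase development directory so that the interface files in it can be imported like a module. This relies on how Python's __init__.py works; look that up yourself if it is unfamiliar.
cp -raf gen-py/hbase/ /home/liangjf/big_data/hbase/python_hbase
(The directory is renamed to python_hbase so it is easy to recognize. Use this name when importing it later.)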
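The copied directory can be imported because it contains an __init__.py file, which is what makes Python treat a directory as a package. A minimal sketch of that mechanism follows; it builds a throwaway package in a temporary directory, and the names demo_pkg and greeting are made up for illustration and are not part of HBase or Thrift.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk; "demo_pkg" is a made-up name.
base_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(base_dir, "demo_pkg")
os.mkdir(pkg_dir)

# An __init__.py file, even an empty one, marks the directory as a regular package.
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")

# A module inside the package, playing the role Hbase.py plays in python_hbase.
with open(os.path.join(pkg_dir, "greeting.py"), "w") as f:
    f.write("MESSAGE = 'imported from demo_pkg'\n")

# The directory that contains the package must be on sys.path.
sys.path.insert(0, base_dir)
greeting = importlib.import_module("demo_pkg.greeting")
print(greeting.MESSAGE)
```

The same rule applies to python_hbase: the directory that contains it has to be on sys.path (for example, the directory the scripts are started from) for "from python_hbase import Hbase" to resolve.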
If Thrift was installed in step 3 without sudo make install, the thrift executable is not created:
[liangjf@master thrift]$ thrift -gen py Hbase.thrift
-bash: thrift: command not found
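Re-running the installation reveals the error: there is no permission to create the thrift executable in /usr/local/bin, which is why thrift cannot be run directly from the terminal. The fix is to run sudo make install. The failing run without sudo looks like this: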
[liangjf@master thrift-0.8.0]$ make install
Making install in compiler/cpp
make[1]: Entering directory `/home/liangjf/app/thrift-0.8.0/compiler/cpp'
make install-am
make[2]: Entering directory `/home/liangjf/app/thrift-0.8.0/compiler/cpp'
make[3]: Entering directory `/home/liangjf/app/thrift-0.8.0/compiler/cpp'
test -z "/usr/local/bin" || /bin/mkdir -p "/usr/local/bin"
/bin/sh ../../libtool --mode=install /usr/bin/install -c thrift '/usr/local/bin'
libtool: install: /usr/bin/install -c thrift /usr/local/bin/thrift
/usr/bin/install: cannot create regular file `/usr/local/bin/thrift': Permission denied
make[3]: *** [install-binPROGRAMS] Error 1
make[3]: Leaving directory `/home/liangjf/app/thrift-0.8.0/compiler/cpp'
make[2]: *** [install-am] Error 2
make[2]: Leaving directory `/home/liangjf/app/thrift-0.8.0/compiler/cpp'
make[1]: *** [install] Error 2
make[1]: Leaving directory `/home/liangjf/app/thrift-0.8.0/compiler/cpp'
make: *** [install-recursive] Error 1
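Before generating the bindings, it can help to confirm that the thrift compiler really is on PATH. The following is a small sketch of such a check, not part of Thrift itself; find_on_path is a hypothetical helper name.

```python
import os


def find_on_path(name, search_path=None):
    """Return the full path of an executable called name, or None if absent."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        # Must be a regular file that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


location = find_on_path("thrift")
if location is None:
    print("thrift not found on PATH; re-run 'sudo make install'")
else:
    print("thrift compiler found at " + location)
```

If the install succeeded as described above, the path reported should be /usr/local/bin/thrift.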
5. Start the Thrift service, from the hbase/bin directory
hbase-daemon.sh start thrift
6. Start HBase, from the hbase/bin directory
./start-hbase.sh
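Once both services are started, a quick way to confirm that the Thrift server is accepting connections is to try a plain TCP connection to it. This is a sketch under the assumption that the server listens on host master, port 9090, as the test scripts in this article do; thrift_port_open is a hypothetical helper name.

```python
import socket


def thrift_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, name lookup failure, or timeout.
        return False
    conn.close()
    return True
```

Calling thrift_port_open('master', 9090) should return True when the Thrift service is up. A False result only says the port is unreachable; it does not say why.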
7. Testing
[create_table.py] -------------- create a table
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from python_hbase import Hbase
from python_hbase.ttypes import *
transport = TSocket.TSocket('master', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()
contents = ColumnDescriptor(name='cf:', maxVersions=1)
client.createTable('test', [contents])
------------------------
hbase(main):023:0> list
TABLE
member
test
2 row(s) in 0.0150 seconds
=> ["member", "test"]
-------------------------------------------------------------------
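Running create_table.py a second time is expected to fail, because Hbase.thrift declares an AlreadyExists exception for createTable. One way around this is to ask the server for its table names first. The sketch below assumes the generated client exposes getTableNames and createTable as declared in Hbase.thrift; create_table_if_missing and InMemoryClient are hypothetical names, and InMemoryClient only stands in for Hbase.Client so the example runs without a server.

```python
def create_table_if_missing(client, table_name, column_families):
    """Create table_name unless the server already lists it.

    Returns True if the table was created, False if it already existed.
    """
    if table_name in client.getTableNames():
        return False
    client.createTable(table_name, column_families)
    return True


class InMemoryClient(object):
    """Hypothetical stand-in for Hbase.Client; keeps tables in a dict."""

    def __init__(self):
        self.tables = {}

    def getTableNames(self):
        return list(self.tables)

    def createTable(self, table_name, column_families):
        self.tables[table_name] = column_families


demo_client = InMemoryClient()
print(create_table_if_missing(demo_client, 'test', ['cf:']))  # True: table created
print(create_table_if_missing(demo_client, 'test', ['cf:']))  # False: already there
```

With the real client, the third argument would be the list of ColumnDescriptor objects, for example [ColumnDescriptor(name='cf:', maxVersions=1)].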
[insert_data.py] -------------- insert data
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from python_hbase import Hbase
from python_hbase.ttypes import *
transport = TSocket.TSocket('master', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()
row = 'id-1'
mutations = [Mutation(column="cf:name", value="liangjf")]
client.mutateRow('test', row, mutations, None)
------------------------
hbase(main):022:0> scan 'test'
ROW COLUMN+CELL
id-1 column=cf:name, timestamp=1511079251051, value=liangjf
1 row(s) in 0.0130 seconds
-------------------------------------------------------------------
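The scripts in this article open the transport but never close it. A small context manager can guarantee that close() is called even when a call such as mutateRow raises. This is only a sketch: opened and RecordingTransport are hypothetical names, and RecordingTransport stands in for TTransport.TBufferedTransport so the example runs without a Thrift server.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open transport, hand it to the with-block, and always close it afterwards."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


class RecordingTransport(object):
    """Hypothetical stand-in for a Thrift transport; it only records calls."""

    def __init__(self):
        self.calls = []

    def open(self):
        self.calls.append('open')

    def close(self):
        self.calls.append('close')


demo_transport = RecordingTransport()
with opened(demo_transport):
    # With a real transport, client.mutateRow(...) would go here.
    demo_transport.calls.append('work')
print(demo_transport.calls)
```

With the real objects, the with-block would wrap the client calls, replacing the bare transport.open() line.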
[getRow.py] -------------- fetch one row
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from python_hbase import Hbase
from python_hbase.ttypes import *
transport = TSocket.TSocket('master', 9090)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()
tableName = 'test'
rowKey = 'id-1'
result = client.getRow(tableName, rowKey, None)
print result
for r in result:
    print 'the row is ' , r.row
    print 'the values is ' , r.columns.get('cf:name').value
----------------------
[liangjf@master python_hbase]$ python getRow.py
[TRowResult(sortedColumns=None, columns={'cf:name': TCell(timestamp=1511079251051, value='liangjf')}, row='id-1')]
the row is id-1
the values is liangjf
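Note that r.columns.get('cf:name').value raises AttributeError when the row has no cf:name column, because get then returns None. Flattening the result into plain dictionaries first avoids that. The sketch below relies only on the row, columns and value attributes visible in the output above; rows_to_dicts, FakeRow and FakeCell are hypothetical names, and the two namedtuples stand in for TRowResult and TCell so the example runs without HBase.

```python
from collections import namedtuple


def rows_to_dicts(results):
    """Convert getRow results into {row_key: {column: value}}."""
    table = {}
    for r in results:
        table[r.row] = dict((column, cell.value)
                            for column, cell in r.columns.items())
    return table


# Stand-ins that mimic the attributes of TRowResult and TCell.
FakeCell = namedtuple('FakeCell', ['value', 'timestamp'])
FakeRow = namedtuple('FakeRow', ['row', 'columns'])

sample = [FakeRow(row='id-1',
                  columns={'cf:name': FakeCell(value='liangjf',
                                               timestamp=1511079251051)})]
flat = rows_to_dicts(sample)
print(flat['id-1'].get('cf:name'))        # liangjf
print(flat['id-1'].get('cf:age', 'n/a'))  # n/a (column missing, no exception)
```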