hbaseSink

1. Format:

CREATE TABLE MyResult(
    colFamily:colName colType,
    ...
 )WITH(
    type ='hbase',
    zookeeperQuorum ='ip:port[,ip:port]',
    tableName ='tableName',
    rowKey ='colName[+colName]',
    parallelism ='1',
    zookeeperParent ='/hbase'
 )
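
The zookeeperQuorum value in the template is a comma-separated list of ip:port entries. As a minimal sketch of that format only (the helper name parse_zookeeper_quorum is hypothetical and is not part of the connector), the string can be split and checked like this:

```python
from typing import List, Tuple


def parse_zookeeper_quorum(quorum: str) -> List[Tuple[str, int]]:
    """Split 'host:port[,host:port]' into (host, port) pairs.

    Hypothetical helper for illustration; the connector's own parsing
    is not shown in this document.
    """
    pairs = []
    for entry in quorum.split(","):
        entry = entry.strip()
        # rpartition keeps everything before the last ':' as the host
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs


print(parse_zookeeper_quorum("172.16.10.104:2181,172.16.10.224:2181"))
```

A check of this kind can catch a missing port before the job is submitted.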


2. Supported versions

HBase 2.0

3. Table structure definition

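| Parameter | Description |
| --- | --- |
| tableName | Name used for the table in SQL, i.e. the name registered in the Flink table environment (MyResult in the template above) |
| colFamily:colName | Column family and column name in HBase |
| colType | Column type; see the types supported by colType |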
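
Each column in the result table is declared as colFamily:colName followed by colType. The following is a small sketch of how such a declaration breaks down into its parts; HBaseColumn and parse_column are hypothetical names used only for illustration, and lowercasing the type is an assumption of this sketch.

```python
from typing import NamedTuple


class HBaseColumn(NamedTuple):
    family: str      # HBase column family, e.g. "cf"
    qualifier: str   # HBase column name, e.g. "name"
    col_type: str    # declared SQL type, e.g. "varchar"


def parse_column(definition: str) -> HBaseColumn:
    """Parse one 'colFamily:colName colType' declaration (hypothetical helper)."""
    # Tolerate surrounding spaces and a trailing comma, as in 'cf:name varchar ,'
    name, col_type = definition.strip().rstrip(",").split()
    family, sep, qualifier = name.partition(":")
    if not sep or not family or not qualifier:
        raise ValueError(f"expected colFamily:colName, got {name!r}")
    return HBaseColumn(family, qualifier, col_type.lower())


print(parse_column("cf:name varchar ,"))
```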

4. Parameters:

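| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| type | Output table type [mysql, hbase, elasticsearch] | | |
| zookeeperQuorum | HBase ZooKeeper addresses; separate multiple addresses with commas | | |
| zookeeperParent | ZooKeeper parent path (zkParent) | | |
| tableName | Name of the associated HBase table | | |
| rowKey | Columns that make up the HBase rowkey; join multiple columns with '+' | | |
| updateMode | APPEND: no retraction, only incremental data is emitted. UPSERT: retracted data is deleted first, then the update is written | | APPEND |
| parallelism | Parallelism setting | | 1 |
| kerberosAuthEnable | Whether Kerberos authentication is enabled | | false |
| regionserverPrincipal | Principal of the regionserver; take this value from the hbase.regionserver.kerberos.principal property in hbase-site.xml | | |
| clientKeytabFile | Keytab file of the client | | |
| clientPrincipal | Principal of the client | | |
| zookeeperSaslClient | Value of zookeeper.sasl.client | | true |
| securityKrb5Conf | Value of java.security.krb5.conf | | |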
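In addition, enabling Kerberos authentication requires pointing the JVM at krb5 in the VM options, for example -Djava.security.krb5.conf=/Users/xuchao/Documents/flinkSql/kerberos/krb5.conf
The path of the keytab file must also be added to the addShipfile parameter; see the command parameter description for details.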
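
The difference between the two updateMode values can be pictured with a small in-memory model. This is only a sketch of one reading of the description above, not the connector's code: the change format (is_insert, rowkey, row) and the function apply_changes are assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, str]
# (is_insert, rowkey, row); is_insert=False marks a retraction message.
Change = Tuple[bool, str, Row]


def apply_changes(changes: Iterable[Change], update_mode: str = "APPEND") -> Dict[str, Row]:
    """Apply a change stream to a dict standing in for the HBase table."""
    mode = update_mode.upper()
    if mode not in ("APPEND", "UPSERT"):
        raise ValueError(f"unsupported updateMode: {update_mode!r}")
    table: Dict[str, Row] = {}
    for is_insert, rowkey, row in changes:
        if is_insert:
            table[rowkey] = row          # write or overwrite the row
        elif mode == "UPSERT":
            table.pop(rowkey, None)      # UPSERT: delete the retracted row
        # APPEND: retraction messages are ignored
    return table


changes = [
    (True, "a", {"cf:cnt": "1"}),
    (True, "b", {"cf:cnt": "1"}),
    (False, "b", {"cf:cnt": "1"}),  # retract row "b"
]
print(apply_changes(changes, "APPEND"))
print(apply_changes(changes, "UPSERT"))
```

In this model the retracted row "b" survives under APPEND and is removed under UPSERT.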

5. Examples:

Example statements for a regular result table

CREATE TABLE MyTable(
    name varchar,
    channel varchar,
    age int
 )WITH(
    type ='kafka10',
    bootstrapServers ='172.16.8.107:9092',
    zookeeperQuorum ='172.16.8.107:2181/kafka',
    offsetReset ='latest',
    topic ='mqTest01',
    timezone='Asia/Shanghai',
    updateMode ='append',
    enableKeyPartitions ='false',
    topicIsPattern ='false',
    parallelism ='1'
 );

CREATE TABLE MyResult(
    cf:name varchar,
    cf:channel varchar
 )WITH(
    type ='hbase',
    zookeeperQuorum ='172.16.10.104:2181,172.16.10.224:2181,172.16.10.252:2181',
    zookeeperParent ='/hbase',
    tableName ='myresult',
    partitionedJoin ='false',
    parallelism ='1',
    rowKey ='name+channel'
 );

insert
into
    MyResult
    select
        name,
        channel
    from
        MyTable a

 

Example statements for a result table with Kerberos authentication

CREATE TABLE MyTable(
    name varchar,
    channel varchar,
    age int
 )WITH(
    type ='kafka10',
    bootstrapServers ='172.16.8.107:9092',
    zookeeperQuorum ='172.16.8.107:2181/kafka',
    offsetReset ='latest',
    topic ='mqTest01',
    timezone='Asia/Shanghai',
    updateMode ='append',
    enableKeyPartitions ='false',
    topicIsPattern ='false',
    parallelism ='1'
 );

CREATE TABLE MyResult(
    cf:name varchar,
    cf:channel varchar
 )WITH(
    type ='hbase',
    zookeeperQuorum ='cdh2.cdhsite:2181,cdh4.cdhsite:2181',
    zookeeperParent ='/hbase',
    tableName ='myresult',
    partitionedJoin ='false',
    parallelism ='1',
    rowKey ='name',
    kerberosAuthEnable ='true',
    regionserverPrincipal ='hbase/_HOST@DTSTACK.COM',
    clientKeytabFile ='test.keytab',
    clientPrincipal ='test@DTSTACK.COM',
    securityKrb5Conf ='krb5.conf'
 );

insert
into
    MyResult
    select
        name,
        channel
    from
        MyTable a
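
Before submitting a job with Kerberos enabled, it can help to check that the related options are all filled in. The sketch below assumes that the four Kerberos options set in the example above are the ones expected when kerberosAuthEnable is 'true'; the document does not state which of them are mandatory, and missing_kerberos_options is a hypothetical helper, not a connector API.

```python
from typing import Dict, List

# Assumption: these are the options set in the Kerberos example; the
# document does not say which of them are strictly required.
KERBEROS_OPTIONS = (
    "regionserverPrincipal",
    "clientKeytabFile",
    "clientPrincipal",
    "securityKrb5Conf",
)


def missing_kerberos_options(options: Dict[str, str]) -> List[str]:
    """Return the Kerberos options that are absent or empty.

    Returns an empty list when kerberosAuthEnable is not 'true'.
    """
    enabled = str(options.get("kerberosAuthEnable", "false")).lower() == "true"
    if not enabled:
        return []
    return [name for name in KERBEROS_OPTIONS if not options.get(name)]


with_options = {
    "type": "hbase",
    "kerberosAuthEnable": "true",
    "regionserverPrincipal": "hbase/_HOST@DTSTACK.COM",
    "clientKeytabFile": "test.keytab",
    "clientPrincipal": "test@DTSTACK.COM",
    "securityKrb5Conf": "krb5.conf",
}
print(missing_kerberos_options(with_options))
```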

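6. HBase data

Data description

HBase rowkey construction rule: the values of the fields listed in rowKey form the key, and multiple fields are listed joined with '+'. In the sample below, the two field values appear joined with '-' in the stored key (roc-daishu).

Sample data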

hbase(main):007:0> scan 'myresult'
ROW             COLUMN+CELL
 roc-daishu     column=cf:channel, timestamp=1589183971724, value=daishu
 roc-daishu     column=cf:name, timestamp=1589183971724, value=roc
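
The scan output above can be reproduced with a short sketch of the rowkey rule. The '-' separator is inferred from the sample row roc-daishu and is an assumption, as are the helper names build_rowkey and to_cells; the connector's actual implementation is not shown in this document.

```python
from typing import Dict, List, Tuple


def build_rowkey(row: Dict[str, str], rowkey_spec: str, separator: str = "-") -> str:
    """Join the values of the fields named in 'colName[+colName]'.

    The default separator '-' is an assumption taken from the sample row.
    """
    fields = [name.strip() for name in rowkey_spec.split("+")]
    return separator.join(str(row[name]) for name in fields)


def to_cells(row: Dict[str, str], columns: List[str], rowkey_spec: str) -> List[Tuple[str, str, str]]:
    """Map one row to (rowkey, 'family:qualifier', value) triples."""
    key = build_rowkey(row, rowkey_spec)
    cells = []
    for column in columns:
        qualifier = column.split(":", 1)[1]  # 'cf:name' -> 'name'
        cells.append((key, column, str(row[qualifier])))
    return cells


row = {"name": "roc", "channel": "daishu"}
for key, column, value in sorted(to_cells(row, ["cf:name", "cf:channel"], "name+channel")):
    print(f"{key} column={column}, value={value}")
```

With a single-field rowKey such as 'name', the key in this sketch is just that field's value.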

posted @ 2022-12-02 14:26  三里清风18