Hbase+Phoenix环境部署和使用
功能: Mysql数据迁移至Hbase, 通过Phoenix框架作为操作Hbase的CRUD层, 平滑过渡sql语法使用, 解决Mysql单表数据量过大导致查询效率低的问题.
Hbase的性能自然不用多说, PB级别的数据处理起来也没什么问题. 看一下Phoenix官方给出的性能介绍: http://phoenix.apache.org/faq.html#How_fast_is_Phoenix_Why_is_it_so_fast
1. Hbase Phoenix 环境安装
Hbase环境安装属于大数据领域的知识, Java开发程序员一般不太熟悉, 尤其还要涉及到Hadoop等框架的知识, 安装过程特别繁琐, 容易砸键盘.
所以最好采用docker方式安装, 部署方便, 开发测试的话也够用了. 上生产就交给大牛和运维吧! 你要是大牛, 你必须上 ^_^
docker安装和docker-compose使用, 自行了解.
环境部署:【 原文地址: https://gitee.com/assad/docker-hbase 】
下载仓库
git clone https://gitee.com/assad/docker-hbase.git
docker-compose 构建基础镜像。
make build
运行 Standalone 版本
Standalone 版本依赖结构与 quickstart HBase documentation 一致,只不过使用了 Hadoop 的 HDFS 版本作为底层储存。主要用于开发测试环境的快速搭建。
docker-compose -f docker-compose-standalone.yml up -d
运行 Local Distributed 伪分布式版本
运行该版本会使用一个独立的 Zookeeper,HMaster 节点和 HResgionserver 节点也会分开为独立容器。
docker-compose -f docker-compose-distributed-local.yml up -d
查看docker运行状态: docker ps --format "table {{.ID}}\t{{.Names}}\t{{.Ports}}\t{{.Status}}"
CONTAINER ID NAMES PORTS STATUS
82523e8c14a7 nodemanager 8042/tcp Up 5 hours (healthy)
bf81f61f2df9 datanode 50075/tcp Up 5 hours (healthy)
1a9e2f1439e2 namenode 50070/tcp Up 5 hours (healthy)
1473a0330fc6 historyserver 8188/tcp Up 5 hours (healthy)
65c4e55c4c02 resourcemanager 8088/tcp Up 5 hours (healthy)
977b7ad4f4ef hbase 0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp, 0.0.0.0:16000->16000/tcp, 0.0.0.0:16010->16010/tcp, 0.0.0.0:16020->16020/tcp, 0.0.0.0:16030->16030/tcp Up 5 hours
查看hbase容器日志: docker logs -f hbase
进入hbase容器base: docker exec -it hbase /bin/bash
服务器,添加host映射 vi /etc/hosts
服务器ip hbase
访问hbase管理页面: http://服务器ip:16010/master-status
2. Hbase shell 使用
进入hbase容器base: docker exec -it hbase /bin/bash
启动hbase shell客户端: hbase shell
[root@hbase ~]# docker exec -it hbase /bin/bash
root@hbase:/# hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.0, r7483b111e4da77adbfc8062b3b22cbe7c2cb91c1, Sun Apr 22 20:26:55 PDT 2018
Took 0.0031 seconds
hbase(main):001:0>
基本操作:
- 创建表
create <tableName> , <colFamily> [ ,<colFamily1> ,... ,<colFamilyN> ]
- 添加数据
put <tableName>, <rowKey> , < colFamily : col >,value [ , stampTime ]
- 查询数据
get <tableName>, <rowKey> , < colFamily : col >
- 查询所有数据
scan <tableName>
hbase(main):002:0> create 'user', 'info'
Created table user
Took 2.6171 seconds
=> Hbase::Table - user
hbase(main):003:0> put 'user', '001', 'info:name', 'linyufeng01'
Took 0.2252 seconds
hbase(main):004:0> get 'user', '001'
COLUMN CELL
info:name timestamp=1623618903615, value=linyufeng01
1 row(s)
Took 0.0576 seconds
hbase(main):005:0> put 'user', '001', 'info:age', '12'
Took 0.0079 seconds
hbase(main):006:0> get 'user', '001'
COLUMN CELL
info:age timestamp=1623618951968, value=12
info:name timestamp=1623618903615, value=linyufeng01
1 row(s)
Took 0.0102 seconds
hbase(main):007:0> get 'user', '001', 'info:name'
COLUMN CELL
info:name timestamp=1623618903615, value=linyufeng01
1 row(s)
Took 0.0084 seconds
hbase(main):008:0> scan 'user'
ROW COLUMN+CELL
001 column=info:age, timestamp=1623618951968, value=12
001 column=info:name, timestamp=1623618903615, value=linyufeng01
1 row(s)
Took 0.0138 seconds
hbase(main):009:0>
4. Phoenix shell 使用
下载 http://www.apache.org/dyn/closer.lua/phoenix/apache-phoenix-5.0.0-HBase-2.0/bin/apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz
解压 tar -zxvf apache-phoenix-5.0.0-HBase-2.0-bin.tar.gz
, 进入bin目录
启动 Phoenix shell 客户端 ./sqlline.py localhost
[root@hbase bin]# ./sqlline.py localhost
0: jdbc:phoenix:localhost> !tables
+------------+--------------+--------------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+---------------+-+
| TABLE_CAT | TABLE_SCHEM | TABLE_NAME | TABLE_TYPE | REMARKS | TYPE_NAME | SELF_REFERENCING_COL_NAME | REF_GENERATION | INDEX_STATE | IMMUTABLE_ROWS | SALT_BUCKETS | MULTI_TENANT | |
+------------+--------------+--------------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+---------------+-+
| | | IDX_T_LOG | INDEX | | | | | ACTIVE | false | 16 | false | |
| | | IDX_T_LOG_UID | INDEX | | | | | ACTIVE | false | 16 | false | |
| | SYSTEM | CATALOG | SYSTEM TABLE | | | | | | false | null | false | |
| | SYSTEM | FUNCTION | SYSTEM TABLE | | | | | | false | null | false | |
| | SYSTEM | LOG | SYSTEM TABLE | | | | | | true | 32 | false | |
| | SYSTEM | SEQUENCE | SYSTEM TABLE | | | | | | false | null | false | |
| | SYSTEM | STATS | SYSTEM TABLE | | | | | | false | null | false | |
+------------+--------------+--------------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+---------------+-+
0: jdbc:phoenix:localhost>
基本sql语法
CREATE TABLE IF NOT EXISTS t_user (
name VARCHAR(20) NOT NULL,
age integer(3) NOT NULL,
address VARCHAR,
updatetime timestamp,
inserttime timestamp
CONSTRAINT pk_name_age PRIMARY KEY (name, age)// 联合主键
);
// 添加和更新数据都用 upsert, 没有 insert 和 update
UPSERT INTO t_user (name, age, address) values ('test01',12,'上海市松江区');
UPSERT INTO t_user (name, age, address) values ('test02',13,'上海市金山区');
SELECT * FROM t_user;
SELECT COUNT(*) FROM t_user;
0: jdbc:phoenix:localhost> SELECT * FROM t_user;
+---------+------+----------+-------------+-------------+
| NAME | AGE | ADDRESS | UPDATETIME | INSERTTIME |
+---------+------+----------+-------------+-------------+
| test01 | 12 | 上海市松江区 | | |
| test02 | 13 | 上海市金山区 | | |
+---------+------+----------+-------------+-------------+
4. Hbase Phoenix 数据库连接工具
可视化数据库连接工具势必可少的, 就像Navicat
一样. 这里可以使用 squirrel sql
和 Dbeaver
推荐Dbeaver V6.1.5, 高版本需要Jdk11才能运行
1. 下载安装
下载页面: https://github.com/dbeaver/dbeaver/releases?after=6.2.0
windows 64位下载地址: https://github.com/dbeaver/dbeaver/releases/download/6.1.5/dbeaver-ce-6.1.5-x86_64-setup.exe
安装完成后, 如果启动报错, 在安装目录找到 dbeaver.ini
文件, 在首行添加本机jdk安装目录:
-vm
C:\Program Files\Java\jdk1.8.0_201\bin // 以本机为准
复制 phoenix-5.0.0-HBase-2.0-client.jar
到根目录, 后边需要加载驱动
2. 配置连接
新建连接
配置驱动
选择本地jar包驱动, 不要使用自动下载的Hadoop Phoenix驱动
连接测试
5. SpringBoot集成
乐意用MyBatis就用, 用MyBatisPlus也行
依赖:
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
</dependency>
<dependency>
<groupId>com.baomidou</groupId>
<artifactId>mybatis-plus-boot-starter</artifactId>
<version>3.1.0</version>
</dependency>
<!--phoenix core-->
<dependency>
<groupId>org.apache.phoenix</groupId>
<artifactId>phoenix-core</artifactId>
<version>5.0.0-HBase-2.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.9.2</version>
</dependency>
配置:
spring.datasource.driver-class-name=org.apache.phoenix.jdbc.PhoenixDriver
spring.datasource.url=jdbc:phoenix:服务器ip:2181
spring.datasource.username=
spring.datasource.password=
官方说,不需要数据库连接池 http://phoenix.apache.org/faq.html#Should_I_pool_Phoenix_JDBC_Connections
其他配置跟普通项目一样, bean, dao, service 等等该怎么建就怎么建, 都一样.
唯一的区别就是, 在Phoenix中, 把 insert
和 update
语句合并成了 upsert
语句. 不要再写 insert into
, 用 upsert into
来代替
Bean
@Data
@ToString
@TableName("t_user")
public class User {
private String name;
private Integer age;
private String address;
private Date inserttime;
private Date updatetime;
}
dao
@Mapper
public interface UserMapper {
@Select("select * from t_user")
List<User> list();
@Insert("UPSERT INTO t_user (name, age, address, inserttime, updatetime) " +
"values (#{name}, #{age}, #{address}, #{inserttime}, #{updatetime})")
void insert(User user);
}
配置Hadoop环境
参考: 【 https://blog.csdn.net/weixin_41122339/article/details/81141913 】
也可以采用非Hadoop方案
在项目根目录添加bin
目录, 拷贝winutils.exe
到此目录, 修改 SpringBootApplication
@SpringBootApplication
public class HbasephoenixApplication {
public static void main(String[] args) {
// 或设置 HADOOP_HOME
System.setProperty("hadoop.home.dir", System.getProperty("user.dir"));
SpringApplication.run(HbasephoenixApplication.class, args);
}
}
单元测试
@SpringBootTest
@RunWith(SpringJUnit4ClassRunner.class)
class HbasephoenixApplicationTests {
@Autowired
private UserMapper userMapper;
@Test
void list() {
List<User> users = userMapper.list();
users.forEach(System.out::println);
}
@Test
void add() {
User user = new User();
user.setName("test03");
user.setAge(14);
user.setAddress("SHANGHAI SONGJIANG");
user.setInserttime(new Date());
user.setUpdatetime(new Date());
userMapper.insert(user);
list();
}
}
Bug
1. null\bin\winutils.exe
https://blog.csdn.net/weixin_41122339/article/details/81141913
2. certificate has expired or is not yet valid
https://blog.csdn.net/qq_39562468/article/details/84592139
3. spark-phoenix 一直出现一个 bug?
http://www.hainiubl.com/topics/36024
参考: