实时流处理系统PipeLine单机模拟集群环境
一、Pipeline 安装
rpm或源码安装,请自行查教程
二、Plproxy 安装
make PG_CONFIG=/xxx/xxx/xxx/pg_config //路径必须是pipeline中对应的路径
make install
安装是注意 pipeline版本与plproxy 版本匹配
三、配置plproxy
单机搭建
proxy:pipelinedb:5433
datanode: pipelinedb1:5433
datanode: pipelinedb2:5433
3.1 主库上操作
1. 创建 plpgsql 扩展
pipeline=# create extension plpgsql;
ERROR: extension "plpgsql" already exists
改错误忽略
- 创建 plproxy 扩展
pipeline=# create extension plproxy; ERROR: incompatible library "/user/servers/pipelinedb/lib/pipelinedb/plproxy.so": version mismatch
DETAIL: Server is version 9.5, library is version 9.6.
Plproxy 版本不对,需要重新编译安装 plproxy2.6
- 创建两个数据库 作为子库。
pipeline1
pipeline2
- 创建 schema
create schema plproxy;
- 创建集群对应关系
CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(cluster_name text) RETURNS SETOF text AS $$ BEGIN IF cluster_name = 'cluster' THEN RETURN NEXT 'dbname=pipeline1 host=127.0.0.1 port=5433'; RETURN NEXT 'dbname=pipeline2 host=127.0.0.1 port=5433'; RETURN; END IF; RAISE EXCEPTION 'Unknown cluster'; END; $$ LANGUAGE plpgsql
- 创建version
CREATE OR REPLACE FUNCTION
plproxy.get_cluster_version(cluster_name text)
RETURNS integer AS $$ BEGIN
IF cluster_name = 'cluster' THEN
RETURN 1;
END IF;
RAISE EXCEPTION 'Unknown cluster';
END; $$ LANGUAGE plpgsql;
- 创建配置
CREATE OR REPLACE FUNCTION
plproxy.get_cluster_config(IN cluster_name text, OUT key text, OUT val text)
returns setof record as $$ begin
key := 'statement_timeout';
val := 60;
return next;
return;
end; $$ language plpgsql;
- Ddl
CREATE OR REPLACE FUNCTION ddlexec(query text)
RETURNS SETOF integer AS
$BODY$
CLUSTER 'cluster';
RUN ON ALL;
$BODY$
LANGUAGE plproxy VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION ddlexec(text)
OWNER TO pipeline;
- Dml
CREATE OR REPLACE FUNCTION dmlexec(query text)
RETURNS SETOF integer AS
$BODY$
CLUSTER 'cluster';
RUN ON ANY;
$BODY$
LANGUAGE plproxy VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION dmlexec(text)
OWNER TO pipeline;
- Dpl
CREATE OR REPLACE FUNCTION dqlexec(query text)
RETURNS SETOF record AS
$BODY$
CLUSTER 'cluster';
RUN ON ALL;
$BODY$
LANGUAGE plproxy VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION dqlexec(text)
OWNER TO pipeline;
3.3 从库上操作
在两个从库pipeline1、pipeline2上操作
1) 切库
pipeline1-# LANGUAGE 'plpgsql' VOLATILE
pipeline1-# COST 100;
CREATE FUNCTION
2) ddl
CREATE OR REPLACE FUNCTION ddlexec(query text)
RETURNS integer AS
$BODY$
declare
ret integer;
begin
execute query;
return 1;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE
COST 100;
3) dml
CREATE OR REPLACE FUNCTION dmlexec(query text) RETURNS integer AS $BODY$ declare ret integer; begin execute query; return 1; end; $BODY$ LANGUAGE 'plpgsql' VOLATILE COST 100;
4) dql
CREATE OR REPLACE FUNCTION dqlexec(query text) RETURNS SETOF record AS $BODY$ declare ret record; begin for ret in execute query loop return next ret; end loop; return; end; $BODY$ LANGUAGE 'plpgsql' VOLATILE COST 100 ROWS 1000;
四、测试
1) 创建表
pipeline=# select ddlexec('create table userinfo(name varchar(24), age int, sex char, phone varchar(24))');
ERROR: Pl/Proxy: cluster not found: cluster
出错,把主节点上的行数重新创建一遍,有的可能没有执行成功。
2)Select 一把
pipeline=# select * from dqlexec('select * from userinfo') as (name varchar(24), age int, sex char, phone varchar(24)); NOTICE: PL/Proxy: dropping stale conn
NOTICE: PL/Proxy: dropping stale conn
name | age | sex | phone
------+-----+-----+-------
(0 rows)
pipeline=#
3)insert
select dmlexec('insert into userinfo(name,age,sex,phone) values(''zhangsanfen'',8,''9'',''15008873298'')');
字符字段必须用两个单引号, 语句外用一个单引号
Select
select sum(count) from dqlexec('select count(*) from userinfo') as (count bigint);
insert
select dmlexec('insert into userinfo(name,age,sex,phone) values(''zhangdaoquan'',8,''9'',''15008873298'')');
4)连接查询
select * from dqlexec('select userinfo.name,userinfo.age,userinfo.sex,orderinfo.product_name from userinfo inner join orderinfo on userinfo.name=orderinfo.name') as (name varchar(24), age int, sex char, product_name varchar(24) );
五、流与视图
假设有个订单表
1)创建订单流
select ddlexec('drop stream orderstream CASCADE');
select ddlexec('create stream orderstream(userid int, product_id int, order_id int, order_time int, first_cate int, second_cate int, third_cate int)');
2)创建流视图
select ddlexec('create continuous view con_view as select userid, product_id from orderstream where product_id=100 and arrival_timestamp > clock_timestamp() - interval ''10 minutes''');
3) 查看结果
select * from dqlexec('select * from con_view') as(uid int, product_id int);
select * from dqlexec('select count(*) from con_view') as(cnt bigint);
4)切换到数据节点,查看创建的流及视图
\c pipeline1
pipeline1=# \d con_view
pipeline1=# \d orderstream
5)测试脚本
pipelinedb2]$ cat test.sql
\setrandom id 1 1000000
\setrandom val 1 5000000
insert into orderstream(userid, product_id, order_id, order_time, first_cate, second_cate, third_cate) values (:id, :id/1000, :val, :val, :val, :val, :val);
启动两个分别向两个库里插入
pipelinedb2]$ cat batche1.sql
/user/servers/pipelinedb/bin/pgbench -M prepared -n -r -P 1 -f /user/dev/pipelinedb2/test.sql -p5433 -c 40 -j 10 -T 1000 pipeline1
/user/servers/pipelinedb/bin/pgbench -M prepared -n -r -P 1 -f /user/dev/pipelinedb2/test.sql -p5433 -c 40 -j 10 -T 1000 pipeline2
浙公网安备 33010602011771号