实时流处理系统PipeLine单机模拟集群环境

一、Pipeline 安装

rpm或源码安装,请自行查教程

二、Plproxy 安装

make PG_CONFIG=/xxx/xxx/xxx/pg_config  //路径必须是pipeline中对应的路径

make install

安装是注意 pipeline版本与plproxy 版本匹配

三、配置plproxy

单机搭建

proxy:pipelinedb:5433

datanode: pipelinedb1:5433

datanode: pipelinedb2:5433

 

3.1 主库上操作

1. 创建 plpgsql 扩展

pipeline=# create extension plpgsql;

ERROR:  extension "plpgsql" already exists

改错误忽略

  1. 创建 plproxy 扩展

pipeline=# create extension plproxy;                                                                     ERROR:  incompatible library "/user/servers/pipelinedb/lib/pipelinedb/plproxy.so": version mismatch

DETAIL:  Server is version 9.5, library is version 9.6.

 

Plproxy 版本不对,需要重新编译安装 plproxy2.6

  1. 创建两个数据库 作为子库。

pipeline1

pipeline2

  1. 创建 schema

create schema plproxy;

  1. 创建集群对应关系

CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(cluster_name text) RETURNS SETOF text AS $$  BEGIN IF cluster_name = 'cluster' THEN RETURN NEXT 'dbname=pipeline1 host=127.0.0.1 port=5433'; RETURN NEXT 'dbname=pipeline2 host=127.0.0.1 port=5433'; RETURN; END IF; RAISE EXCEPTION 'Unknown cluster'; END; $$ LANGUAGE plpgsql

  1. 创建version

CREATE OR REPLACE FUNCTION

plproxy.get_cluster_version(cluster_name text)

        RETURNS integer AS $$ BEGIN    

                IF cluster_name = 'cluster' THEN        

                        RETURN 1;    

                END IF;    

                RAISE EXCEPTION 'Unknown cluster';

        END; $$ LANGUAGE plpgsql;

  1. 创建配置

CREATE OR REPLACE FUNCTION

plproxy.get_cluster_config(IN cluster_name text, OUT key text, OUT val text)

        returns setof record as $$ begin

                key := 'statement_timeout';

                val := 60;

                return next;

                return;

        end; $$ language plpgsql;

  1.  Ddl

CREATE OR REPLACE FUNCTION ddlexec(query text)  

  RETURNS SETOF integer AS

$BODY$

CLUSTER 'cluster';

RUN ON ALL;

$BODY$  

  LANGUAGE plproxy VOLATILE  

  COST 100  

  ROWS 1000;

ALTER FUNCTION ddlexec(text)  

  OWNER TO pipeline;

  1. Dml

CREATE OR REPLACE FUNCTION dmlexec(query text)

 RETURNS SETOF integer AS

$BODY$

CLUSTER 'cluster';

RUN ON ANY;

$BODY$

  LANGUAGE plproxy VOLATILE

  COST 100

  ROWS 1000;

ALTER FUNCTION dmlexec(text)

  OWNER TO pipeline;

  1. Dpl

CREATE OR REPLACE FUNCTION dqlexec(query text)

 RETURNS SETOF record AS

$BODY$

CLUSTER 'cluster';

RUN ON ALL;

$BODY$

  LANGUAGE plproxy VOLATILE

  COST 100

  ROWS 1000;

ALTER FUNCTION dqlexec(text)

  OWNER TO pipeline;

3.3 从库上操作

         在两个从库pipeline1、pipeline2上操作

1)       切库

pipeline1-# LANGUAGE 'plpgsql' VOLATILE

pipeline1-# COST 100;

CREATE FUNCTION

2)       ddl

CREATE OR REPLACE FUNCTION ddlexec(query text)

RETURNS integer AS

$BODY$

declare

ret integer;

begin

execute query;

return 1;

end;

$BODY$

LANGUAGE 'plpgsql' VOLATILE

COST 100;

3)       dml

CREATE OR REPLACE FUNCTION dmlexec(query text) RETURNS integer AS $BODY$ declare ret integer; begin execute query; return 1; end; $BODY$ LANGUAGE 'plpgsql' VOLATILE COST 100;

4)       dql

CREATE OR REPLACE FUNCTION dqlexec(query text) RETURNS SETOF record AS $BODY$ declare ret record; begin for ret in execute query loop return next ret; end loop; return; end; $BODY$ LANGUAGE 'plpgsql' VOLATILE COST 100 ROWS 1000;

四、测试

1) 创建表

pipeline=# select ddlexec('create table userinfo(name varchar(24), age int, sex char, phone varchar(24))');      

ERROR:  Pl/Proxy: cluster not found: cluster

 

出错,把主节点上的行数重新创建一遍,有的可能没有执行成功。

        

         2)Select 一把

pipeline=# select * from dqlexec('select * from userinfo') as (name varchar(24), age int, sex char, phone varchar(24));   NOTICE:  PL/Proxy: dropping stale conn

NOTICE:  PL/Proxy: dropping stale conn

 name | age | sex | phone

------+-----+-----+-------

(0 rows)

pipeline=#

 

3)insert

 

select dmlexec('insert into userinfo(name,age,sex,phone) values(''zhangsanfen'',8,''9'',''15008873298'')');

字符字段必须用两个单引号, 语句外用一个单引号

 

Select

select sum(count) from dqlexec('select count(*) from userinfo') as (count bigint);

 

insert

select dmlexec('insert into userinfo(name,age,sex,phone) values(''zhangdaoquan'',8,''9'',''15008873298'')');

 

4)连接查询

select * from dqlexec('select userinfo.name,userinfo.age,userinfo.sex,orderinfo.product_name from userinfo inner join orderinfo on userinfo.name=orderinfo.name') as (name varchar(24), age int, sex char, product_name varchar(24) );

 

五、流与视图

假设有个订单表

1)创建订单流

select ddlexec('drop stream orderstream CASCADE');

select ddlexec('create stream orderstream(userid int, product_id int, order_id int, order_time int, first_cate int, second_cate int, third_cate int)');

 

2)创建流视图

select ddlexec('create continuous view con_view as select userid, product_id from orderstream where  product_id=100 and arrival_timestamp > clock_timestamp() - interval ''10 minutes''');

        

         3) 查看结果

select * from dqlexec('select * from con_view') as(uid int, product_id int);

select * from dqlexec('select count(*) from con_view') as(cnt bigint);

 

4)切换到数据节点,查看创建的流及视图

\c pipeline1

pipeline1=# \d con_view

pipeline1=# \d orderstream

 

        

5)测试脚本

pipelinedb2]$ cat test.sql

\setrandom id 1 1000000   

\setrandom val 1 5000000   

insert into orderstream(userid, product_id, order_id, order_time, first_cate, second_cate, third_cate) values (:id, :id/1000, :val, :val, :val, :val, :val);

 

启动两个分别向两个库里插入

pipelinedb2]$ cat batche1.sql

/user/servers/pipelinedb/bin/pgbench -M prepared -n -r -P 1 -f /user/dev/pipelinedb2/test.sql -p5433 -c 40 -j 10 -T 1000 pipeline1

 

/user/servers/pipelinedb/bin/pgbench -M prepared -n -r -P 1 -f /user/dev/pipelinedb2/test.sql -p5433 -c 40 -j 10 -T 1000 pipeline2

 

posted on 2017-09-05 15:26  dawu  阅读(540)  评论(0编辑  收藏  举报