Postgres分表

这个需求就是一个典型的按时间创建分区表,首先看一下步骤:

  • 创建父表

先创建一张“父表”,所有分区表都从它继承,这个表中没有数据,也不要在这个表上定义任何检查约束及索引,现在我们就先创建这样一张表,但之前先建一个序列:

CREATE SEQUENCE "public"."control_alarm_info_uid_seq"

INCREMENT 1

MINVALUE 1

MAXVALUE 99999999

START 1

CACHE 1;

ALTER TABLE "public"."control_alarm_info_uid_seq " OWNER TO "postgres";

接下来创建“父表”,

-- Table: control_alarm_info

-- DROP TABLE control_alarm_info;

CREATE TABLE control_alarm_info

(

uid bigserial NOT NULL,

status bigint,

create_time timestamp without time zone,

update_time timestamp without time zone,

creator character varying(32),

store_id integer,

store_name character varying(255),

target_id integer,

person_name character varying(255),

telphone character varying(255),

alarm_time character varying(255),

face_picurl character varying(255),

camera_info_id character varying(255),

camera_name character varying(255),

event_log_id character varying(255),

similarity real,

person_code character varying(64),

alarm_type character varying(255),

bkg_picurl character varying(255),

app_key character varying(64)

)

WITH (

OIDS=FALSE

);

ALTER TABLE control_alarm_info4

OWNER TO postgres;

  • 按时间触发,创建n个子表

  • 创建n个子表,每个子表都是继承于父表

由于每个分区表都是从父表继承的,所以分区表不会增加任何字段,下面我们按需求创建4张分区子表,分别用于存放9月、10月、11月和12月的日志数据:

create table control_alarm_info_201809

(CHECK (alarm_time >= '2018-09-01' AND alarm_time < '2018-10-01'))

INHERITS (control_alarm_info);

create table control_alarm_info_201810

(CHECK (alarm_time >= '2018-10-01' AND alarm_time < '2018-11-01'))

INHERITS (control_alarm_info);

create table control_alarm_info_201811

(CHECK (alarm_time >= '2018-11-01' AND alarm_time < '2018-12-01'))

INHERITS (control_alarm_info);

接下来在这4张分区表的每个分区键上建立索引:

在上面的语句中我们添加了一个约束表示只允许插入本月的数据,接下来在这4张分区表的每个分区键上建立索引:

create index control_alarm_info_201809_alarm_time ON control_alarm_info_201809 (alarm_time);

create index control_alarm_info_201810_alarm_time ON control_alarm_info_201810 (alarm_time);

create index control_alarm_info_201811_alarm_time ON control_alarm_info_201811(alarm_time);

查询时查询条件中包含这些索引时才会提高查询效率,如果能定位到一张子表内,效率更高。

  • 定义一个规则(Rule)或触发器(Trigger),把对主表的数据插入重定向到合适的分区表

如何才能让不同日期的数据自动的插入与其对应的分区子表中呢?有两种解决方案,分别是:规则(Rule)和触发器(Trigger),相比触发器,Rule的开销更大,所以我在这里就不做过多介绍了,下面直接介绍Trigger的方式。

Trigger通常会结合自定义函数(Function)来实现分区插入,Function负责根据条件选择插入,而Trigger则负责Function的自动调用。首先定义Function,功能很简单,即根据日期区间insert数据即可:

Drop trigger control_alarm_info_insert_trigger;

CREATE

OR REPLACE FUNCTION control_alarm_info_insert_trigger () RETURNS TRIGGER AS $$

BEGIN

IF (

NEW .alarm_time >= '2018-09-01'

AND NEW .alarm_time < '2018-10-01'

) THEN

INSERT INTO control_alarm_info_201809

VALUES

(NEW .*) ;

ELSEIF (

NEW .alarm_time >= '2018-10-01'

AND NEW .alarm_time < '2018-11-01'

) THEN

INSERT INTO control_alarm_info_201810

VALUES

(NEW .*) ;

ELSEIF (

NEW .alarm_time >= '2018-11-01'

AND NEW .alarm_time < '2018-12-01'

) THEN

INSERT INTO control_alarm_info_201811

VALUES

(NEW .*) ;

ELSE

RAISE EXCEPTION 'Date out of range!' ;

END

IF ; RETURN NULL ;

END ; $$ LANGUAGE plpgsql;

最后再创建触发器用于执行刚才的Function:

CREATE TRIGGER control_alarm_info_insert_trigger  BEFORE INSERT ON control_alarm_info

FOR EACH ROW

EXECUTE PROCEDURE control_alarm_info_insert_trigger();

未分表时200万数据中查询最后一条告警记录,2.505s

分表后200万数据中查询最后一条告警记录,0.221s,提升了10倍

插入性能

未分表时200万数据中插入10万条数据,需要104s

posted @ 2024-04-23 18:41  木头左  阅读(21)  评论(0编辑  收藏  举报