How to use table partitioning to scale PostgreSQL(转发)

原文:

https://www.enterprisedb.com/postgres-tutorials/how-use-table-partitioning-scale-postgresql#:~:text=1%20PostgreSQL%20declarative%20partitioning%20is%20highly%20flexible%20and,may%20be%20cheaper%20or%20slower%20for%20low-use%20data.

 

This article discusses table partitions, the benefits of using them to increase performance, and the types of partitions that can be used in PostgreSQL.

1. Benefits of partitioning

2. When to use partitioning

3. How to use partitioning

a. List partition

b. Range partition

c. Hash partition

d. Multilevel partition

4. Limitations

 

With huge data being stored in databases, performance and scaling are two main factors that are affected.

As table size increases with data load, more data scanning, swapping pages to memory, and other table operation costs also increase.

Partitioning may be a good solution, as It can help divide a large table into smaller tables and thus reduce table scans and memory swap problems, which ultimately increases performance.

 

Partitioning helps to scale PostgreSQL by splitting large logical tables into smaller physical tables that can be stored on different storage media based on uses.

Users can take better advantage of scaling by using declarative partitioning along with foreign tables using postgres_fdw. 

 

Benefits of partitioning 

  • PostgreSQL declarative partitioning is highly flexible and provides good control to users. Users can create any level of partitioning based on need and can modify, use constraints, triggers, and indexes on each partition separately as well as on all partitions together.
  • Query performance can be increased significantly compared to selecting from a single large table.
  • Partition-wise-join and partition-wise-aggregate features increase complex query computation performance as well.
  • Bulk loads and data deletion can be much faster, as based on user requirements these operations can be performed on individual partitions.
  • Each partition can contain data based on its frequency of use and so can be stored on media that may be cheaper or slower for low-use data.

 

When to use partitioning 

Most benefits of partitioning can be enjoyed when a single table is not able to provide them.

So we can say that if a lot of data is going to be written on a single table at some point, users need partitioning.

Apart from data, there may be other factors users should consider, like update frequency of the data, use of data over a time period, how small a range data can be divided, etc.

With good planning and taking all factors into consideration, table partitioning can give a great performance boost and scale your PostgreSQL to larger datasets.

How to use partitioning 

As of PostgreSQL12 release List, Range, Hash and combinations of these partition methods at different levels are supported.

Let’s explore what these are and how users can create different types of partitions with examples. For this article we will use the same table, which can be created by different partition methods. 

LIST PARTITION 

A list partition is created with predefined values to hold in a partitioned table.

A default partition (optional) holds all those values that are not part of any specified partition.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
postgres=# CREATE TABLE customers (id INTEGER, status TEXT, arr NUMERIC) PARTITION BY LIST(status);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_active PARTITION OF customers FOR VALUES IN ('ACTIVE');
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_archived PARTITION OF customers FOR VALUES IN ('EXPIRED');
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_others PARTITION OF customers DEFAULT;
 
CREATE TABLE
 
 
 
postgres=# \d+ customers
 
                           Partitioned table "public.customers"
 
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 
 id     | integer |           |          |         | plain    |              |
 
 status | text    |           |          |         | extended |              |
 
 arr    | numeric |           |          |         | main     |              |
 
Partition key: LIST (status)
 
Partitions: cust_active FOR VALUES IN ('ACTIVE'),
 
            cust_archived FOR VALUES IN ('EXPIRED'),
 
            cust_others DEFAULT
 
 
 
postgres=# \d+ cust_active
 
                                Table "public.cust_active"
 
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 
 id     | integer |           |          |         | plain    |              |
 
 status | text    |           |          |         | extended |              |
 
 arr    | numeric |           |          |         | main     |              |
 
Partition of: customers FOR VALUES IN ('ACTIVE')
 
Partition constraint: ((status IS NOT NULL) AND (status = 'ACTIVE'::text))
 
Access method: heap
 
 
 
postgres=# INSERT INTO customers VALUES (1,'ACTIVE',100), (2,'RECURRING',20), (3,'EXPIRED',38), (4,'REACTIVATED',144);
 
INSERT 0 4
 
postgres=# SELECT tableoid::regclass,* FROM customers;
 
   tableoid    | id |   status    | arr
 
---------------+----+-------------+-----
 
 cust_active   |  1 | ACTIVE      | 100
 
 cust_archived |  3 | EXPIRED     |  38
 
 cust_others   |  2 | RECURRING   |  20
 
 cust_others   |  4 | REACTIVATED | 144
 
(4 rows)

  

RANGE PARTITION 

A range partition is created to hold values within a range provided on the partition key.

Both minimum and maximum values of the range need to be specified, where minimum value is inclusive and maximum value is exclusive.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
postgres=# CREATE TABLE customers (id INTEGER, status TEXT, arr NUMERIC) PARTITION BY RANGE(arr);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_arr_small PARTITION OF customers FOR VALUES FROM (MINVALUE) TO (25);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_arr_medium PARTITION OF customers FOR VALUES FROM (25) TO (75);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_arr_large PARTITION OF customers FOR VALUES FROM (75) TO (MAXVALUE);
 
CREATE TABLE
 
postgres=#
 
postgres=# \d+ customers
 
                           Partitioned table "public.customers"
 
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 
 id     | integer |           |          |         | plain    |              |
 
 status | text    |           |          |         | extended |              |
 
 arr    | numeric |           |          |         | main     |              |
 
Partition key: RANGE (arr)
 
Partitions: cust_arr_large FOR VALUES FROM ('75') TO (MAXVALUE),
 
            cust_arr_medium FOR VALUES FROM ('25') TO ('75'),
 
            cust_arr_small FOR VALUES FROM (MINVALUE) TO ('25')
 
 
 
postgres=# \d+ cust_arr_small
 
                               Table "public.cust_arr_small"
 
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 
 id     | integer |           |          |         | plain    |              |
 
 status | text    |           |          |         | extended |              |
 
 arr    | numeric |           |          |         | main     |              |
 
Partition of: customers FOR VALUES FROM (MINVALUE) TO ('25')
 
Partition constraint: ((arr IS NOT NULL) AND (arr < '25'::numeric))
 
Access method: heap
 
 
 
postgres=#
 
postgres=# INSERT INTO customers VALUES (1,'ACTIVE',100), (2,'RECURRING',20), (3,'EXPIRED',38), (4,'REACTIVATED',144);
 
INSERT 0 4
 
postgres=# SELECT tableoid::regclass,* FROM customers;
 
    tableoid     | id |   status    | arr
 
-----------------+----+-------------+-----
 
 cust_arr_small  |  2 | RECURRING   |  20
 
 cust_arr_medium |  3 | EXPIRED     |  38
 
 cust_arr_large  |  1 | ACTIVE      | 100
 
 cust_arr_large  |  4 | REACTIVATED | 144
 
(4 rows)

  

HASH PARTITION 

A hash partition is created by using modulus and remainder for each partition, where rows are inserted by generating a hash value using these modulus and remainders. 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
postgres=# CREATE TABLE customers (id INTEGER, status TEXT, arr NUMERIC) PARTITION BY HASH(id);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_part1 PARTITION OF customers FOR VALUES WITH (modulus 3, remainder 0);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_part2 PARTITION OF customers FOR VALUES WITH (modulus 3, remainder 1);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_part3 PARTITION OF customers FOR VALUES WITH (modulus 3, remainder 2);
 
CREATE TABLE
 
postgres=#
 
postgres=# \d+ customers
 
                           Partitioned table "public.customers"
 
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 
 id     | integer |           |          |         | plain    |              |
 
 status | text    |           |          |         | extended |              |
 
 arr    | numeric |           |          |         | main     |              |
 
Partition key: HASH (id)
 
Partitions: cust_part1 FOR VALUES WITH (modulus 3, remainder 0),
 
            cust_part2 FOR VALUES WITH (modulus 3, remainder 1),
 
            cust_part3 FOR VALUES WITH (modulus 3, remainder 2)
 
 
 
postgres=# \d+ cust_part1
 
                                 Table "public.cust_part1"
 
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 
 id     | integer |           |          |         | plain    |              |
 
 status | text    |           |          |         | extended |              |
 
 arr    | numeric |           |          |         | main     |              |
 
Partition of: customers FOR VALUES WITH (modulus 3, remainder 0)
 
Partition constraint: satisfies_hash_partition('16475'::oid, 3, 0, id)
 
Access method: heap
 
 
 
postgres=#
 
postgres=# INSERT INTO customers VALUES (1,'ACTIVE',100), (2,'RECURRING',20), (3,'EXPIRED',38), (4,'REACTIVATED',144);
 
INSERT 0 4
 
postgres=# SELECT tableoid::regclass,* FROM customers;
 
  tableoid  | id |   status    | arr
 
------------+----+-------------+-----
 
 cust_part1 |  2 | RECURRING   |  20
 
 cust_part1 |  4 | REACTIVATED | 144
 
 cust_part2 |  3 | EXPIRED     |  38
 
 cust_part3 |  1 | ACTIVE      | 100
 
(4 rows)

  

MULTILEVEL PARTITION 

PostgreSQL multilevel partitions can be created up to N levels.

Partition methods

LIST-LIST, LIST-RANGE, LIST-HASH,

RANGE-RANGE, RANGE-LIST, RANGE-HASH,

HASH-HASH, HASH-LIST, and HASH-RANGE can be created in PostgreSQL declarative partitioning.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
postgres=# CREATE TABLE customers (id INTEGER, status TEXT, arr NUMERIC) PARTITION BY LIST(status);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_active PARTITION OF customers FOR VALUES IN ('ACTIVE','RECURRING','REACTIVATED') PARTITION BY RANGE(arr);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_arr_small PARTITION OF cust_active FOR VALUES FROM (MINVALUE) TO (101) PARTITION BY HASH(id);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_part11 PARTITION OF cust_arr_small FOR VALUES WITH (modulus 2, remainder 0);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_part12 PARTITION OF cust_arr_small FOR VALUES WITH (modulus 2, remainder 1);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_other PARTITION OF customers DEFAULT PARTITION BY RANGE(arr);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_arr_large PARTITION OF cust_other FOR VALUES FROM (101) TO (MAXVALUE) PARTITION BY HASH(id);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_part21 PARTITION OF cust_arr_large FOR VALUES WITH (modulus 2, remainder 0);
 
CREATE TABLE
 
postgres=# CREATE TABLE cust_part22 PARTITION OF cust_arr_large FOR VALUES WITH (modulus 2, remainder 1);
 
CREATE TABLE
 
postgres=#
 
postgres=# \d+ customers
 
                           Partitioned table "public.customers"
 
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 
 id     | integer |           |          |         | plain    |              |
 
 status | text    |           |          |         | extended |              |
 
 arr    | numeric |           |          |         | main     |              |
 
Partition key: LIST (status)
 
Partitions: cust_active FOR VALUES IN ('ACTIVE', 'RECURRING', 'REACTIVATED'), PARTITIONED,
 
            cust_other DEFAULT, PARTITIONED
 
 
 
postgres=# \d+ cust_active
 
                          Partitioned table "public.cust_active"
 
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 
 id     | integer |           |          |         | plain    |              |
 
 status | text    |           |          |         | extended |              |
 
 arr    | numeric |           |          |         | main     |              |
 
Partition of: customers FOR VALUES IN ('ACTIVE', 'RECURRING', 'REACTIVATED')
 
Partition constraint: ((status IS NOT NULL) AND (status = ANY (ARRAY['ACTIVE'::text, 'RECURRING'::text, 'REACTIVATED'::text])))
 
Partition key: RANGE (arr)
 
Partitions: cust_arr_small FOR VALUES FROM (MINVALUE) TO ('101'), PARTITIONED
 
 
 
postgres=# \d+ cust_arr_small
 
                         Partitioned table "public.cust_arr_small"
 
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 
 id     | integer |           |          |         | plain    |              |
 
 status | text    |           |          |         | extended |              |
 
 arr    | numeric |           |          |         | main     |              |
 
Partition of: cust_active FOR VALUES FROM (MINVALUE) TO ('101')
 
Partition constraint: ((status IS NOT NULL) AND (status = ANY (ARRAY['ACTIVE'::text, 'RECURRING'::text, 'REACTIVATED'::text])) AND (arr IS NOT NULL) AND (arr < '101'::numeric))
 
Partition key: HASH (id)
 
Partitions: cust_part11 FOR VALUES WITH (modulus 2, remainder 0),
 
            cust_part12 FOR VALUES WITH (modulus 2, remainder 1)
 
 
 
postgres=# \d+ cust_part11
 
                                Table "public.cust_part11"
 
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description
 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 
 id     | integer |           |          |         | plain    |              |
 
 status | text    |           |          |         | extended |              |
 
 arr    | numeric |           |          |         | main     |              |
 
Partition of: cust_arr_small FOR VALUES WITH (modulus 2, remainder 0)
 
Partition constraint: ((status IS NOT NULL) AND (status = ANY (ARRAY['ACTIVE'::text, 'RECURRING'::text, 'REACTIVATED'::text])) AND (arr IS NOT NULL) AND (arr < '101'::numeric) AND satisfies_hash_partition('16621'::oid, 2, 0, id))
 
Access method: heap
 
 
 
postgres=#
 
postgres=# INSERT INTO customers VALUES (1,'ACTIVE',100), (2,'RECURRING',20), (3,'REACTIVATED',38), (4,'EXPIRED',144);
 
INSERT 0 4
 
postgres=# SELECT tableoid::regclass,* FROM customers;
 
  tableoid   | id |   status    | arr
 
-------------+----+-------------+-----
 
 cust_part11 |  1 | ACTIVE      | 100
 
 cust_part11 |  2 | RECURRING   |  20
 
 cust_part12 |  3 | REACTIVATED |  38
 
 cust_part22 |  4 | EXPIRED     | 144
 
(4 rows)

  

 

Limitations 

Partitioning was introduced in PostgreSQL 10 and continues to be improved and made more stable. Still, there are certain limitations that users may need to consider:

1. Unique constraints on partitioned tables must include all the partition key columns. One work-around is to create unique constraints on each partition instead of a partitioned table.

2. Partition does not support BEFORE ROW triggers on partitioned tables. If necessary, they must be defined on individual partitions, not the partitioned table.

3. Range partition does not allow NULL values.

4. PostgreSQL does not create a system-defined subpartition when not given it explicitly, so if a subpartition is present at least one partition should be present to hold values.

5. In the case of HASH-LIST, HASH-RANGE, and HASH-HASH composite partitions, users need to make sure all partitions are present at the subpartition level as HASH can direct values at any partition based on hash value.

 

For more information, please refer to the PostgreSQL documentation: 

https://www.postgresql.org/docs/current/ddl-partitioning.html

https://www.postgresql.org/docs/current/sql-createtable.html

 

posted @   PanPan003  阅读(112)  评论(0编辑  收藏  举报
编辑推荐:
· 没有源码,如何修改代码逻辑?
· 一个奇形怪状的面试题:Bean中的CHM要不要加volatile?
· [.NET]调用本地 Deepseek 模型
· 一个费力不讨好的项目,让我损失了近一半的绩效!
· .NET Core 托管堆内存泄露/CPU异常的常见思路
阅读排行:
· 微软正式发布.NET 10 Preview 1:开启下一代开发框架新篇章
· DeepSeek R1 简明指南:架构、训练、本地部署及硬件要求
· 没有源码,如何修改代码逻辑?
· NetPad:一个.NET开源、跨平台的C#编辑器
· 面试官:你是如何进行SQL调优的?
点击右上角即可分享
微信分享提示