Replication Between PostgreSQL Versions Using Logical Replication

Reposted from: https://www.percona.com/blog/2019/04/04/replication-between-postgresql-versions-using-logical-replication/

When you need to upgrade PostgreSQL, there are a variety of approaches you can use, but not all of them are suitable if you need to avoid application downtime. When avoiding downtime is essential, you might consider using replication as the means of upgrade. Depending on your scenario, you can approach this task using either logical replication or physical (streaming) replication. Here, we take a look at the difference between logical and physical replication in PostgreSQL. Then we explore in more detail how to accomplish an upgrade using logical replication and, by doing so, avoid application downtime. In a subsequent article, we’ll investigate physical replication.

We have already discussed a few methods for performing PostgreSQL upgrades in our previous blog posts – PostgreSQL Upgrade Using pg_dumpall and PostgreSQL Upgrade Using pg_dump/pg_restore – as part of our Upgrading or Migrating Your Legacy PostgreSQL to Newer PostgreSQL Versions series. However, both of these methods involve application downtime.

Types of logical replication

Here we’ll cover two types of replication you could implement:

  1. Replication between PostgreSQL 10 and 11 using built-in logical replication.
  2. Replication from PostgreSQL 9.4 (or any version older than PG 11) to PostgreSQL 11 using an extension named pglogical.

We might opt to implement replication as a method of upgrade to minimize downtime for the application. Once all the data to date has been replicated to another PostgreSQL server on the latest version, you can simply switch your application to the new server with minimal downtime… though of course this does depend on the complexity of your application stack.

Logical replication in PostgreSQL allows users to replicate a selected set of tables and to open the standby for writes. Physical replication, by contrast, is block-level replication: every database on the master is replicated to the standby, and the standby is not open for writes. Going forward, we’ll refer to physical replication as streaming replication.

With logical replication, a standby can have replication enabled from multiple masters. This could be helpful in situations where you need to replicate data from several PostgreSQL databases (OLTP) to a single PostgreSQL server for reporting and data warehousing.

One of the biggest advantages of logical over streaming replication is that logical replication allows us to replicate changes from an older version of PostgreSQL to a later version. Streaming replication works only when both the master and standby are of the same major version. We’d recommend they have the same minor version too, as a best practice.

Replication between PostgreSQL 10 and 11 versions

Starting from PostgreSQL 10, logical replication is available with the PostgreSQL source by default. So, you could easily replicate a PostgreSQL 10 database to PostgreSQL 11. Logical replication uses a publish and subscribe model: the node that sends the changes becomes a publisher, and the node that subscribes to those changes becomes a subscriber. You may have one or more subscriptions to a publication.

Publication

A publication is a set of changes generated from a group of tables. It is also referred to as a change set or replication set. Publications can only contain tables and cannot contain any other objects. DML on these tables can be replicated, but DDL cannot.

In a publication, you can choose what type of DML to replicate: INSERT or DELETE or UPDATE or ALL. By default, it is ALL. You must have a replica identity set on the table being published to replicate UPDATEs and DELETEs to a subscriber. A replica identity helps in identifying the rows to be updated or deleted.
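As a minimal sketch of restricting the replicated operations, the publish option of CREATE PUBLICATION accepts a comma-separated list of operations. The publication name pub_inserts_only and the table name orders are hypothetical and only serve as illustration.

```sql
-- Hypothetical names: pub_inserts_only, orders.
-- Publish only INSERTs; UPDATEs and DELETEs on the table are not replicated.
CREATE PUBLICATION pub_inserts_only
    FOR TABLE orders
    WITH (publish = 'insert');
```

Because such a publication does not replicate UPDATEs or DELETEs, the table does not need a replica identity for it to work.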

The primary key of a table is its default replica identity. You can also use a unique index on columns that are all NOT NULL as the replica identity. If there is no primary key or suitable unique index, you can set REPLICA IDENTITY to FULL. When the replica identity is set to FULL, postgres uses the entire row as the key. Of course, this may be inefficient.
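The replica identity choices described here might look as follows. This is a sketch under assumptions: the tables orders and audit_log, the column order_no, and the index name are all hypothetical.

```sql
-- Hypothetical tables (orders, audit_log), column (order_no) and index name.

-- Use a unique index as the replica identity. The index must not be partial
-- or deferrable, and its columns must be declared NOT NULL.
CREATE UNIQUE INDEX orders_order_no_idx ON orders (order_no);
ALTER TABLE orders REPLICA IDENTITY USING INDEX orders_order_no_idx;

-- Fall back to the whole row when no suitable key exists.
ALTER TABLE audit_log REPLICA IDENTITY FULL;

-- Inspect the current setting: d = default, i = index, f = full, n = nothing.
SELECT relname, relreplident
FROM pg_class
WHERE relname IN ('orders', 'audit_log');
```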

You might see errors on UPDATE or DELETE operations if a table with no primary key and no other replica identity has been added to a publication that replicates those operations.

Subscription

A subscriber can subscribe to one or more publications. Before adding the subscription, you must ensure that the tables being replicated have been created on the subscriber node. To achieve that, you can take a schema-only dump on the publisher and restore it on the subscriber node.

An example of logical replication

The following example steps work for logical replication between PostgreSQL 10 and 11 only.

On the publishing node, create a publication. You can either add all tables or choose to add selected tables to the publication.
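Both variants of this step can be sketched as below. The publication names and the schema-qualified table names are hypothetical, and the sketch assumes the publisher already runs with wal_level = logical.

```sql
-- Run on the publisher (PostgreSQL 10). Assumes wal_level = logical.
-- All names below are hypothetical.

-- Option 1: publish every table in the current database.
CREATE PUBLICATION percpub FOR ALL TABLES;

-- Option 2: publish selected tables only.
CREATE PUBLICATION percpub_selected
    FOR TABLE scott.employee, scott.departments;
```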

On the subscriber node, create a subscription that refers to the publication on the publisher node. As mentioned above, perform a DDL dump of the tables to the subscriber node before creating the subscription.
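A subscription for this step might be created as in the following sketch. The subscription name, publication name, and every value in the connection string (host, database, user, password) are placeholders rather than values from the original setup.

```sql
-- Run on the subscriber (PostgreSQL 11), after the table definitions exist there.
-- All names and connection values are hypothetical placeholders.
CREATE SUBSCRIPTION percsub
    CONNECTION 'host=publisher.example.com port=5432 dbname=percona user=replicator password=secret'
    PUBLICATION percpub;
```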

By default, creating the subscription also copies the pre-existing data from the tables. If you want to skip that initial copy, create the subscription with the copy_data option set to false. The subscriber then receives only the changes made on the publisher after the subscription is created.
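A sketch of a subscription that skips the initial data copy, using the copy_data option of CREATE SUBSCRIPTION. As before, the names and connection values are hypothetical placeholders.

```sql
-- Run on the subscriber. Names and connection values are hypothetical.
-- copy_data = false skips the initial table copy; only changes made on the
-- publisher after the subscription is created are applied.
CREATE SUBSCRIPTION percsub
    CONNECTION 'host=publisher.example.com port=5432 dbname=percona user=replicator password=secret'
    PUBLICATION percpub
    WITH (copy_data = false);
```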

Monitor the replication by querying the pg_stat_replication view on the publishing node.
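One way to do this is to select a few columns from pg_stat_replication, which lists one row per connected WAL sender, including those serving logical subscriptions. The column choice here is just an illustrative subset.

```sql
-- Run on the publisher. The *_lsn columns exist under these names from
-- PostgreSQL 10 onward.
SELECT application_name,
       state,
       sent_lsn,
       write_lsn,
       flush_lsn,
       replay_lsn
FROM pg_stat_replication;
```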

 

Replication between PostgreSQL 9.4 and PostgreSQL 11

Now, what about the versions that are older than PostgreSQL 10? For this purpose, there is an extension named pglogical that works for versions from 9.4 through 11. Using pglogical, you can easily replicate PostgreSQL 9.4 to PostgreSQL 11.

The following sequence of steps demonstrates a high-level procedure to set up replication between PG 9.4 and PG 11 using the pglogical extension.

Step 1 : Consider pgserver_94 to be the source server, with a database named percona_94 running on PostgreSQL 9.4. Create the pglogical extension in that database; on 9.4 this also requires the pglogical_origin extension.
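The extension setup for this step can be sketched as below. It assumes the pglogical packages are already installed on the server and that PostgreSQL was started with wal_level = logical and pglogical listed in shared_preload_libraries.

```sql
-- Run in the percona_94 database on the source server.
-- Assumes pglogical is installed and preloaded (see the assumptions above).

-- Needed on PostgreSQL 9.4 only.
CREATE EXTENSION IF NOT EXISTS pglogical_origin;

CREATE EXTENSION IF NOT EXISTS pglogical;
```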

Step 2 : Now, you can go ahead and add either selected tables or all the tables in a schema or multiple schemas for replication. Note that adding a table fails with an error if the table has no primary key and the replication set replicates UPDATEs or DELETEs.
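A minimal sketch of this step on the provider: register the node with pglogical first, then add the tables to a replication set. The node name provider1, the use of pgserver_94 as the host name in the DSN, and the table and column in the ALTER TABLE statement are assumptions made for illustration.

```sql
-- Run in the percona_94 database on the source server.
-- Node name, DSN host and the example table/column are hypothetical.

-- Register this database as a pglogical node.
SELECT pglogical.create_node(
    node_name := 'provider1',
    dsn := 'host=pgserver_94 port=5432 dbname=percona_94'
);

-- If a table lacks a primary key, add one before adding it to the set.
ALTER TABLE some_table ADD PRIMARY KEY (id);

-- Add every table of the public schema to the default replication set.
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);
```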

 

Step 3 : On the subscriber node, which is our PostgreSQL 11 database, you can run similar commands: create the extension, register the node, and create a subscription that points at the provider.
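The subscriber side might be set up as in the following sketch. The database name percona_11, the host names in the DSNs, and the node and subscription names are hypothetical. It also assumes the table definitions already exist on the subscriber and that pglogical is installed and preloaded there.

```sql
-- Run in the target database on the PostgreSQL 11 server.
-- All names and DSN values are hypothetical.
CREATE EXTENSION IF NOT EXISTS pglogical;

-- Register the subscriber database as a pglogical node.
SELECT pglogical.create_node(
    node_name := 'subscriber1',
    dsn := 'host=pgserver_11 port=5432 dbname=percona_11'
);

-- Subscribe to the provider; this starts the initial sync and replication.
SELECT pglogical.create_subscription(
    subscription_name := 'subscription1',
    provider_dsn := 'host=pgserver_94 port=5432 dbname=percona_94'
);
```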

Step 4 : You can then validate the replication status by querying the catalog tables that pglogical maintains in the pglogical schema.
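As one possible check, pglogical also provides the show_subscription_status() function, which summarizes each subscription in a single row. This sketch uses that function rather than the underlying tables.

```sql
-- Run on the subscriber. A healthy subscription normally reports the
-- status 'replicating'.
SELECT * FROM pglogical.show_subscription_status();
```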

 

Primary key selection

In step 2, all the tables of the public schema could be added to a replication set only after a primary key was created on the table that did not have one. The primary key I chose may not be the right one for that table, as it was just for demonstration. However, when you choose a primary key, make sure that you are selecting the right one. It must always be unique and use column(s) that never contain NULLs. If you don’t research primary key selection thoroughly, you could cause downtime to your application: for example, if the chosen columns turn out not to be unique, application writes can start failing with duplicate key errors.
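Before adding a primary key, it can help to check the candidate columns against the existing data. The sketch below assumes a hypothetical table orders with candidate key columns customer_id and created_at. Note that it only validates the current rows and says nothing about what the application will write later.

```sql
-- Hypothetical table and columns.

-- Any row returned here is a duplicate that would violate the key.
SELECT customer_id, created_at, count(*) AS copies
FROM orders
GROUP BY customer_id, created_at
HAVING count(*) > 1;

-- A non-zero count means some candidate column contains NULLs.
SELECT count(*) AS rows_with_nulls
FROM orders
WHERE customer_id IS NULL
   OR created_at IS NULL;
```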

So far we have seen how you can use pglogical to set up replication from an older version of PostgreSQL to a newer one. After you have set up replication, you can easily switch your applications to the latest version with less downtime.

posted on 2021-01-13 21:23  荣锋亮