代码改变世界

找出PostgreSQL schema变更差异

2022-12-27 13:33  abce  阅读(151)  评论(0编辑  收藏  举报

管理PostgreSQL的例行操作之一是定期更新数据库系统的架构。PostgreSQL在更新schema、添加type、函数、触发器或修改表添加和删除列、更新列数据类型等方面提供了可靠的方法。但是,没有内置机制来帮助识别差异,更不用说生成必要的SQL,以轻松的方式完成从开发到生产环境的更新。

我们来看看可能的方法。

使用逻辑转储清单
识别从一个数据库到另一个数据库的schema的差异的最简单方法是比较schema转储清单。

下面的例子演示了一种可以在不同数据库上查找schema差异的方法:

例子:

-- create database schemas
create database db01
create database db02

-- db01: version 1
create table t1(
 c1 int,
 c2 text,
 c4 date);
 
create table t2(
 c1 int,
 c2 varchar(3),
 c3 timestamp,
 c4 date
 );

-- db02: version 2
create table t1(
 c1 serial primary key,
 c2 varchar(256),
 c3 date default now()
 );

create table t2(
 c1 serial primary key,
 c2 varchar(3),
 c3 varchar(50),
 c4 timestamp with time zone default now(),
 c5 int references t1(c1)
 );

create index on t2 (c5);

# generate schema dumps
pg_dump -s db01 -Fc > db01.db
pg_dump -s db02 -Fc > db02.db

# generate manifests
pg_restore -l db01.db > db01_manifest.ini
pg_restore -l db02.db > db02_manifest.ini

这段代码演示了通过比较md5校验和来查找差异:

# EX 1: generate checksums
md5sum \
<(tail +16 db01_manifest.ini | cut -d ' ' -f 4-) \
<(tail +16 db02_manifest.ini | cut -d ' ' -f 4-)

# output
2ae75961137c5fc9fc967ed1ed1e5405  /dev/fd/63
084ec23229bced0eb6f7011019b04c37  /dev/fd/62

下面的代码片段区分了两个清单之间的差异,仅标识已更改的对象和属性。请注意,冗余信息,即前16行,被忽略了:

# EX 2: perform diff
diff \
 <(tail +16 db01_manifest.ini | cut -d ' ' -f 4-) \
 <(tail +16 db02_manifest.ini | cut -d ' ' -f 4-)

这个结果差异显示了两个schema之间所做的更改:

2,10c2
<   Integer: 4 bytes
<   Offset: 8 bytes
<   Dumped from database version: 14.1
<   Dumped by pg_dump version: 14.1
< ;
< ;
< Entries:
< ;
< TABLE public t1 postgres
---
> SEQUENCE OWNED BY public t1_c1_seq postgres
11a4,11
> SEQUENCE public t2_c1_seq postgres
> SEQUENCE OWNED BY public t2_c1_seq postgres
> DEFAULT public t1 c1 postgres
> DEFAULT public t2 c1 postgres
> CONSTRAINT public t1 t1_pkey postgres
> CONSTRAINT public t2 t2_pkey postgres
> INDEX public t2_c5_idx postgres
> FK CONSTRAINT public t2 t2_c5_fkey postgres

好消息是,有许多现有的工具可以完成以上的工作,有商业的,也有开源的。

 

 

apgdiff扩展
网址:https://www.apgdiff.com/

Apgdiff可以在Postgres社区存储库中找到。它比较了两个schema转储文件,并创建了一个SQL输出文件,在大多数情况下,它适用于旧schema的升级:

Package: apgdiff

Version: 2.7.0-1.pgdg18.04+1
Architecture: all
Maintainer: Debian PostgreSQL Maintainers <team+postgresql@tracker.debian.org>
Installed-Size: 173
Depends: default-jre-headless | java2-runtime-headless
Homepage: https://www.apgdiff.com/
Priority: optional
Section: database
Filename: pool/main/a/apgdiff/apgdiff_2.7.0-1.pgdg18.04+1_all.deb
Size: 154800
SHA256: 9a83fcf54aed00e1a28c3d00eabe1c166977af1e26e91035e15f88b5215b181b
SHA1: ea713acb55898f07374dadd1bebb09ec2fa4b589
MD5sum: e70a97903cb23b8df8a887da4c54e945

下面的示例演示如何使用apgdiff更新开发环境和生产数据库schema之间的差异。

# EX 1: dump as SQL statements
pg_dump -s db01 -Fp > db01.sql
pg_dump -s db02 -Fp > db02.sql
createdb db03 --template=db01

apgdiff --ignore-start-with db01.sql db02.sql > db01-db02.sql

# “psql -1” encapsulates statements within a transaction
psql -1 -f db01-db02.sql db03

# EX 2: uses logical dumps
# notice the dumps are standard logical dumps and includes data
pg_dump db01 -Fc > db01.db
pg_dump db02 -Fc > db02.db
createdb db03 --template=db01

# this invocation assumes the resultant diff doesn’t require editing
apgdiff --ignore-start-with \
    <(pg_restore -s -f - db01.db) \
    <(pg_restore -s -f - db02.db) \
    | psql -1 db03