postgresql中三种不适合使用if not exists、if exists的场景

2022-05-31 08:37 abce 阅读(8526) 评论(1) 编辑收藏举报

postgresql中，许多ddl语句支持if exists、if not exists。例如：

postgres=# create table if not exists abce();
CREATE TABLE
postgres=# drop table if exists abce;
DROP TABLE
postgres=#

建议只是在必须的时候在ddl中使用if exists、if not exists。以下是三个示例，展示了过度使用他们而产生的负面效应。

示例1：create table if not exists

假设我们使用以下一些工具(如Flyway、Sqitch或嵌入到 ORM/框架中的如 Ruby on Rails Active Record 迁移)部署以下内容：

create table if not exists abce(
  id int8 primary key,
  title text not null,
  created_at timestamptz not null default now()
);

严格来讲，这里的问题是我们没法保证结果：可能会有一个已经存在的表，使用相同的表名，但是不同的表结构。可能是之前的

它可能是由同一工具之前部署的相同数据库schema，或者由另一个工具或手动执行创建的：

postgres=# -- something from the past
postgres=# create table if not exists abce(
  id int8 primary key
);
CREATE TABLE
postgres=#
postgres=# -- our attempt to create a table
postgres=# create table if not exists abce(
  id int8 primary key,
  title text not null,
  created_at timestamptz not null default now()
);
NOTICE:  relation "abce" already exists, skipping
CREATE TABLE
postgres=#
postgres=# -- what do we have?
postgres=# \d abce
              Table "public.mytable"
 Column |  Type  | Collation | Nullable | Default
--------+--------+-----------+----------+---------
 id     | bigint |           | not null |
Indexes:
    "abce_pkey" PRIMARY KEY, btree (id)

为什么工程师在这里加上if not exists呢？因为他们不确定他们的变更是否已经部署到非正式环境：比如测试、开发、阶段性测试环境等。if not exists隐藏了错误，并可以多次部署。然而，这种方法只是隐藏了这些环境管理的不完善。而如不是完善了工作流，我们现在增加了各种schema更改异常甚至生产事件的风险。

示例2：drop table if exists

大多数现代schema版本控制工具，支持do和undo步骤(也被称作部署、撤消)。

有人认为“撤消”步骤是无用的——根据环境、部署工作流程和 DevOps 文化，它们可能或多或少有价值。但是，在某些情况下，我们可以找到以下内容：

drop table if exists abce;

这种方法有什么不好？它可以隐藏schema更改脚本中的逻辑问题。例如，假设我们开发了一些要部署在数千个地方的软件，并且我们的schema更改脚本——它们的“执行”和“撤消”部分——应该在任何地方都可靠地执行。当我们的客户需要恢复升级时，我们依靠“撤消”步骤；它可能是由于与数据库不一定相关的各种问题而引发的。

我们有完善的CI/CD管道以高级形式测试我们的更改，以确保逻辑正确并更早地检测异常——它们总是使用CI/CD管道中的链“do -> undo -> do”进行测试。

现在让我们考虑一下以下情况：

-- "do"
create table abce2(
  id int8 primary key
);
-- "undo"
drop table if exists abce2;
drop table if exists abce; --有人错误将该语句放到了这里

这里会发生什么错误呢？CI/CD管道不会发现任何问题——“do -> undo -> do”链会很好地工作。删除abce 的尝试不会失败；我们只会收到通知：

postgres=# drop table if exists abce;
NOTICE:  table "abce" does not exist, skipping
DROP TABLE

但是，如果abce是很久以前由另一个数据库schema迁移创建的呢？如果需要在某处执行“撤消”步骤，我们将删除该表。这不是我们所期望的！对于这样的错误事件，我们确实需要在CI测试中报出错误——但是if exists会“掩盖”问题。结果，自动化测试无法捕捉到问题，并且这种错误的更改有被发布的风险。

示例3：create index concurrently if not exists

对于负载大的表，建议使用create index concurrently，它会比create index工作更长时间，但不会导致停机。(这里的停机是指短暂的不可用)

经常看到DBA如何在生产数据库上尝试各种索引想法，试图为一些次优查询找到优化，然后，一旦找到，他们建议开发人员将索引包含到迁移中。在这种情况下，如果不存在，则很容易使用create index concurrently if not exists。

然而，问题是create index concurrently不是原子的：如果一次尝试部署失败，那么索引将仍被定义了，处于无效状态。这是一个简单的例子：

postgres=# create table abce(
  id int8 PRIMARY KEY,
  title text NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE

postgres=# insert into abce(title)
  select random()::text
  from generate_series(1, 10000000) i;
INSERT 0 10000000

postgres=# set statement_timeout to '1ms';
SET
postgres=# create index concurrently if not exists abce_title_idx
  on abce
  using btree (title);
ERROR:  canceling statement due to statement timeout
postgres=# reset statement_timeout;
RESET

test=# \d abce
                         Table "public.abce"
   Column   |           Type           | Collation | Nullable | Default
------------+--------------------------+-----------+----------+---------
 id         | bigint                   |           | not null |
 title      | text                     |           | not null |
 created_at | timestamp with time zone |           | not null | now()
Indexes:
    "abce_pkey" PRIMARY KEY, btree (id)
    "abce_title_idx" btree (title) INVALID

在这种情况下，再次尝试部署CREATE INDEX CONCURRENTLY IF NOT EXISTS将不执行任何操作并报告“成功”：

postgres=# create index concurrently if not exists abce_title_idx
  on abce
  using btree (title);
NOTICE:  relation "abce_title_idx" already exists, skipping
CREATE INDEX

有趣的是，在这种情况下——实际上，我们通常会说在大负载下在大表上创建任何索引——if exists 可能会有所帮助。我们可以在“do”步骤中使用它来支持清理——如果存在，则drop index concurrently，让 create index concurrently干净地执行：

postgres=# drop index concurrently if exists abce_title_idx;
DROP INDEX
postgres=#
postgres=# create index concurrently if not exists abce_title_idx
  on abce
  using btree (title);
CREATE INDEX

这可能就足够了。除非我们忘记了dba留下valid索引的操作！（当然，在生产环境中测试索引想法并不是一个好主意）

如何删除被标记为无效的索引？在这种情况下，我们可以实现一些逻辑，仅当索引的 pg_index.indisvalid 为 false 时才同时运行 drop index：

postgres=# select indisvalid from pg_index
where indexrelid = 'abce_title_idx'::regclass::oid;
 indisvalid
------------
 f
(1 row)

不幸的是，这个逻辑不能以pl/pgsql代码的匿名do块的形式编写脚本，因为 drop index concurrently不能在显式定义的事务、函数或匿名do块内执行——所以逻辑需要在外部编码。在这样的块中运行drop index是可能的，但这不适合零停机时间做更改的要求。

何时使用 if not exists / if exists

如果我们的目标是获得可预测的结果，最好避免在数据库schema迁移中使用 if not exists / if exists。

在其他一些情况下，例如用于基准测试的脚本，它们可能非常有用。例如，我们可以编写引导脚本，以幂等的方式创建用于基准测试的db schema，并在多个基准测试迭代中使用它——所以不管它是否是第100次迭代的第一次迭代，结果总是可以预测的：

drop table if exists test_table;
create table test_table(
    ...
);

或者在重新创建之前删除整个“基准测试”schema：

drop schema if exists benchmark cascade;
create schema ...
create ...

在其他情况下，尤其是在处理生产、多个系统、多人时——最好将 if not exists / if exists 的使用减少到最低限度。

刷新页面返回顶部

abce don't dream it , be it