PostgreSQL的pg_stats学习

磨砺技术珠矶，践行数据之道，追求卓越价值

回到上一级页面： PostgreSQL统计信息索引页回到顶级页面：PostgreSQL索引页

对于pg_stas，说明文档在这里：

http://www.postgresql.org/docs/9.1/static/view-pg-stats.html

下面做一个实验：

先建立一个表

postgres=# create table test(id integer);
CREATE TABLE
postgres=# \x
Expanded display is on.
postgres=#

此后，观察 pg_stats 中与test表相关的数据，结果是还没有数据。

postgres=# \d pg_stats;
        View "pg_catalog.pg_stats"
      Column       |   Type   | Modifiers 
-------------------+----------+-----------
 schemaname        | name     | 
 tablename         | name     | 
 attname           | name     | 
 inherited         | boolean  | 
 null_frac         | real     | 
 avg_width         | integer  | 
 n_distinct        | real     | 
 most_common_vals  | anyarray | 
 most_common_freqs | real[]   | 
 histogram_bounds  | anyarray | 
 correlation       | real     | 

postgres=# select * from pg_stats where tablename='test';
(No rows)

然后，插入两条数据后看看有何变化：

postgres=# insert into test values(1);
INSERT 0 1
postgres=# select * from pg_stats where tablename='test';
(No rows)
postgres=# insert into test values(2);
INSERT 0 1
postgres=# select * from pg_stats where tablename='test';
(No rows)

非得anaylize 一下，才可以：

postgres=# analyze;
ANALYZE
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  | 
most_common_freqs | 
histogram_bounds  | {1,2}
correlation       | 1

然后，再插入一条数据，进行对比：也是必须得使用analyze后，才会起变化：

可以看到：当表test中只有 1 和 2 两条数据的时候，其 n_distinct 为 -1，表示它们的值现在是唯一的，就是说没有重复值。

此时:

most_common_vals 和 most_common_freqs的值都是空的。

histogram_bounds 的值是 {1,2}，正好是刚才输入的值。

当再插入一条 id=2 的记录之后，状况发生了变化：

由于此id列的值不再是unique的了{1,2,2}，所以n_distinct 不再是-1了。

由于2出现的最多，所以n_distinct 变为了2 出现的比率: -0.66667，most_common_vals 和 most_common_freqs的值也表明了这一点。

postgres=# insert into test values(2);
INSERT 0 1
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  | 
most_common_freqs | 
histogram_bounds  | {1,2}
correlation       | 1

postgres=# analyze;
ANALYZE
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-----------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -0.666667
most_common_vals  | {2}
most_common_freqs | {0.666667}
histogram_bounds  | 
correlation       | 1

postgres=#

接着观察 correlation :

correlation 表达的是逻辑顺序与物理顺序的关系。

由于我插入数据按由小到大来作的，分别插入了 1，2，2，故逻辑顺序与物理顺序目前线性正相关，所以 correlation 为1。

而当我再插入 10,9,5,6之后，逻辑顺序与物理顺序开始发生不一致。

逻辑顺序：{1,2,2,10,9,5,6}，故correlation 变成了 0.678571

postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-----------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -0.666667
most_common_vals  | {2}
most_common_freqs | {0.666667}
histogram_bounds  | 
correlation       | 1

postgres=# insert into test values(10);
INSERT 0 1
postgres=# insert into test values(9);
INSERT 0 1
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-----------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -0.666667
most_common_vals  | {2}
most_common_freqs | {0.666667}
histogram_bounds  | 
correlation       | 1

postgres=# analyze;
ANALYZE
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+---------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -0.8
most_common_vals  | {2}
most_common_freqs | {0.4}
histogram_bounds  | {1,9,10}
correlation       | 0.9

postgres=# insert into test values(5);
INSERT 0 1
postgres=# insert into test values(6);
INSERT 0 1
postgres=# analyze;
ANALYZE
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-------------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -0.857143
most_common_vals  | {2}
most_common_freqs | {0.285714}
histogram_bounds  | {1,5,6,9,10}
correlation       | 0.678571

postgres=#

回到上一级页面： PostgreSQL统计信息索引页回到顶级页面：PostgreSQL索引页

磨砺技术珠矶，践行数据之道，追求卓越价值

posted @ 2013-08-17 13:32 健哥的数据花园阅读(1792) 评论(0) 收藏举报

刷新页面返回顶部

健哥的数据花园

PostgreSQL的pg_stats学习

公告