PostgreSQL的pg_stats学习

磨砺技术珠矶,践行数据之道,追求卓越价值

回到上一级页面: PostgreSQL统计信息索引页     回到顶级页面:PostgreSQL索引页

对于pg_stas,说明文档在这里:

http://www.postgresql.org/docs/9.1/static/view-pg-stats.html

下面做一个实验:

先建立一个表

postgres=# create table test(id integer);
CREATE TABLE
postgres=# \x
Expanded display is on.
postgres=# 

此后,观察 pg_stats 中与test表相关的数据,结果是还没有数据。

postgres=# \d pg_stats;
        View "pg_catalog.pg_stats"
      Column       |   Type   | Modifiers 
-------------------+----------+-----------
 schemaname        | name     | 
 tablename         | name     | 
 attname           | name     | 
 inherited         | boolean  | 
 null_frac         | real     | 
 avg_width         | integer  | 
 n_distinct        | real     | 
 most_common_vals  | anyarray | 
 most_common_freqs | real[]   | 
 histogram_bounds  | anyarray | 
 correlation       | real     | 

postgres=# select * from pg_stats where tablename='test';
(No rows)

然后,插入两条数据后看看有何变化:

postgres=# insert into test values(1);
INSERT 0 1
postgres=# select * from pg_stats where tablename='test';
(No rows)
postgres=# insert into test values(2);
INSERT 0 1
postgres=# select * from pg_stats where tablename='test';
(No rows)

非得anaylize 一下,才可以:

postgres=# analyze;
ANALYZE
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  | 
most_common_freqs | 
histogram_bounds  | {1,2}
correlation       | 1

后,再插入一条数据,进行对比:也是必须得使用analyze后,才会起变化:

可以看到:当表test中只有 1 和 2 两条数据的时候,其 n_distinct 为 -1,表示它们的值现在是唯一的,就是说没有重复值。

此时:

most_common_vals 和 most_common_freqs的值都是空的。

histogram_bounds 的值是 {1,2},正好是刚才输入的值。

当再插入一条 id=2 的记录之后,状况发生了变化:

由于此id列的值不再是unique的了{1,2,2},所以n_distinct 不再是-1了。

由于2出现的最多,所以n_distinct 变为了2 出现的比率: -0.66667,most_common_vals 和 most_common_freqs的值也表明了这一点。

postgres=# insert into test values(2);
INSERT 0 1
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  | 
most_common_freqs | 
histogram_bounds  | {1,2}
correlation       | 1

postgres=# analyze;
ANALYZE
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-----------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -0.666667
most_common_vals  | {2}
most_common_freqs | {0.666667}
histogram_bounds  | 
correlation       | 1

postgres=# 

 接着观察 correlation :

correlation 表达的是 逻辑顺序与物理顺序的关系。

由于我插入数据按由小到大来作的,分别插入了 1,2,2,故逻辑顺序与物理顺序目前线性正相关,所以 correlation 为1。

而当我再插入 10,9,5,6之后,逻辑顺序与物理顺序开始发生不一致。

逻辑顺序:{1,2,2,10,9,5,6},故correlation 变成了 0.678571

postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-----------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -0.666667
most_common_vals  | {2}
most_common_freqs | {0.666667}
histogram_bounds  | 
correlation       | 1

postgres=# insert into test values(10);
INSERT 0 1
postgres=# insert into test values(9);
INSERT 0 1
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-----------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -0.666667
most_common_vals  | {2}
most_common_freqs | {0.666667}
histogram_bounds  | 
correlation       | 1

postgres=# analyze;
ANALYZE
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+---------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -0.8
most_common_vals  | {2}
most_common_freqs | {0.4}
histogram_bounds  | {1,9,10}
correlation       | 0.9

postgres=# insert into test values(5);
INSERT 0 1
postgres=# insert into test values(6);
INSERT 0 1
postgres=# analyze;
ANALYZE
postgres=# select * from pg_stats where tablename='test';
-[ RECORD 1 ]-----+-------------
schemaname        | public
tablename         | test
attname           | id
inherited         | f
null_frac         | 0
avg_width         | 4
n_distinct        | -0.857143
most_common_vals  | {2}
most_common_freqs | {0.285714}
histogram_bounds  | {1,5,6,9,10}
correlation       | 0.678571

postgres=# 

回到上一级页面: PostgreSQL统计信息索引页     回到顶级页面:PostgreSQL索引页

磨砺技术珠矶,践行数据之道,追求卓越价值

posted @ 2013-08-17 13:32  健哥的数据花园  阅读(1725)  评论(0编辑  收藏  举报