PostgreSQL的pg_stats学习
磨砺技术珠矶,践行数据之道,追求卓越价值
回到上一级页面: PostgreSQL统计信息索引页 回到顶级页面:PostgreSQL索引页
对于pg_stas,说明文档在这里:
http://www.postgresql.org/docs/9.1/static/view-pg-stats.html
下面做一个实验:
先建立一个表
postgres=# create table test(id integer); CREATE TABLE postgres=# \x Expanded display is on. postgres=#
此后,观察 pg_stats 中与test表相关的数据,结果是还没有数据。
postgres=# \d pg_stats; View "pg_catalog.pg_stats" Column | Type | Modifiers -------------------+----------+----------- schemaname | name | tablename | name | attname | name | inherited | boolean | null_frac | real | avg_width | integer | n_distinct | real | most_common_vals | anyarray | most_common_freqs | real[] | histogram_bounds | anyarray | correlation | real | postgres=# select * from pg_stats where tablename='test'; (No rows)
然后,插入两条数据后看看有何变化:
postgres=# insert into test values(1); INSERT 0 1 postgres=# select * from pg_stats where tablename='test'; (No rows) postgres=# insert into test values(2); INSERT 0 1 postgres=# select * from pg_stats where tablename='test'; (No rows)
非得anaylize 一下,才可以:
postgres=# analyze; ANALYZE postgres=# select * from pg_stats where tablename='test'; -[ RECORD 1 ]-----+------- schemaname | public tablename | test attname | id inherited | f null_frac | 0 avg_width | 4 n_distinct | -1 most_common_vals | most_common_freqs | histogram_bounds | {1,2} correlation | 1
然后,再插入一条数据,进行对比:也是必须得使用analyze后,才会起变化:
可以看到:当表test中只有 1 和 2 两条数据的时候,其 n_distinct 为 -1,表示它们的值现在是唯一的,就是说没有重复值。
此时:
most_common_vals 和 most_common_freqs的值都是空的。
histogram_bounds 的值是 {1,2},正好是刚才输入的值。
当再插入一条 id=2 的记录之后,状况发生了变化:
由于此id列的值不再是unique的了{1,2,2},所以n_distinct 不再是-1了。
由于2出现的最多,所以n_distinct 变为了2 出现的比率: -0.66667,most_common_vals 和 most_common_freqs的值也表明了这一点。
postgres=# insert into test values(2); INSERT 0 1 postgres=# select * from pg_stats where tablename='test'; -[ RECORD 1 ]-----+------- schemaname | public tablename | test attname | id inherited | f null_frac | 0 avg_width | 4 n_distinct | -1 most_common_vals | most_common_freqs | histogram_bounds | {1,2} correlation | 1 postgres=# analyze; ANALYZE postgres=# select * from pg_stats where tablename='test'; -[ RECORD 1 ]-----+----------- schemaname | public tablename | test attname | id inherited | f null_frac | 0 avg_width | 4 n_distinct | -0.666667 most_common_vals | {2} most_common_freqs | {0.666667} histogram_bounds | correlation | 1 postgres=#
接着观察 correlation :
correlation 表达的是 逻辑顺序与物理顺序的关系。
由于我插入数据按由小到大来作的,分别插入了 1,2,2,故逻辑顺序与物理顺序目前线性正相关,所以 correlation 为1。
而当我再插入 10,9,5,6之后,逻辑顺序与物理顺序开始发生不一致。
逻辑顺序:{1,2,2,10,9,5,6},故correlation 变成了 0.678571
postgres=# select * from pg_stats where tablename='test'; -[ RECORD 1 ]-----+----------- schemaname | public tablename | test attname | id inherited | f null_frac | 0 avg_width | 4 n_distinct | -0.666667 most_common_vals | {2} most_common_freqs | {0.666667} histogram_bounds | correlation | 1 postgres=# insert into test values(10); INSERT 0 1 postgres=# insert into test values(9); INSERT 0 1 postgres=# select * from pg_stats where tablename='test'; -[ RECORD 1 ]-----+----------- schemaname | public tablename | test attname | id inherited | f null_frac | 0 avg_width | 4 n_distinct | -0.666667 most_common_vals | {2} most_common_freqs | {0.666667} histogram_bounds | correlation | 1 postgres=# analyze; ANALYZE postgres=# select * from pg_stats where tablename='test'; -[ RECORD 1 ]-----+--------- schemaname | public tablename | test attname | id inherited | f null_frac | 0 avg_width | 4 n_distinct | -0.8 most_common_vals | {2} most_common_freqs | {0.4} histogram_bounds | {1,9,10} correlation | 0.9 postgres=# insert into test values(5); INSERT 0 1 postgres=# insert into test values(6); INSERT 0 1 postgres=# analyze; ANALYZE postgres=# select * from pg_stats where tablename='test'; -[ RECORD 1 ]-----+------------- schemaname | public tablename | test attname | id inherited | f null_frac | 0 avg_width | 4 n_distinct | -0.857143 most_common_vals | {2} most_common_freqs | {0.285714} histogram_bounds | {1,5,6,9,10} correlation | 0.678571 postgres=#
回到上一级页面: PostgreSQL统计信息索引页 回到顶级页面:PostgreSQL索引页
磨砺技术珠矶,践行数据之道,追求卓越价值