Postgres的count统计误区

 数据库统计数据量时常常用的是count(1)或者count(*),这两者区别不大,在postgres里面统计数据总量时,要注意count(column_name)与count(*)的区别,有些场景下会造成数据统计不准确的误解。

官方文档上有一段话:

For example, count(*) yields the total number of input rows; 
count(f1) yields the number of input rows in which f1 is non-null, since count ignores nulls; 
and count(distinct f1) yields the number of distinct non-null values of f1.

示例:

[postgres@localhost ~]$ psql
psql (9.2.3)
Type "help" for help.

postgres=# create table t_kenyon(id int);
CREATE TABLE
postgres=# insert into t_kenyon values(1),(2),(2),(3),(null),(null);
INSERT 0 6
postgres=# select * from t_kenyon;
 id 
----
  1
  2
  2
  3
   
   
(6 rows)

postgres=# select count(1),count(*),count(id),count(distinct id) from t_kenyon;
 count | count | count | count 
-------+-------+-------+-------
     6 |     6 |     4 |     3
(1 row)

postgres=# drop table t_kenyon;
DROP TABLE

总结:
count(1)或者count(*)会统计所有行数,包括Null值;
count(字段)会对Null值去掉统计;
count(distinct column)则会对非Null字段进行去重统计。

posted @ 2021-11-01 11:02  数据库集中营  阅读(621)  评论(0编辑  收藏  举报