Analyzing Multi-Dimensional Data Query Performance

Sometimes requirements force us to make a system flexible. The most common form is that attributes cannot be fixed; users must be able to define their own. Such a requirement is usually modeled in the database as a one-to-many relationship:

create table person (
    ...
);

create table person_attribute (
    person_id ...,
    attribute_name ...,
    attribute_value ...,
    ...
);

This kind of modeling works reasonably well as long as nothing needs to query it. But once you have to filter on the user-defined attribute values, performance is usually dreadful. I once worked on the front end of a legacy telecom system in Singapore whose database was modeled exactly like this. Even a simple query against the attribute table required several joins and was painfully slow. Fortunately that project was only doing ETL rather than using such a database directly as the backend, otherwise it would have ended badly. The approach back then was to load all the data into memory and implement attribute filters by iterating over in-memory collections. That only worked because the collections were tiny (a few hundred elements) and the total data volume was small. If the data volume is large, however, the database itself must be able to answer multi-dimensional queries efficiently.
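To make the pain concrete, here is what filtering on just two user-defined attributes looks like against the person / person_attribute sketch above: one self-join of the attribute table per condition. The attribute names and values are made up for illustration, and the id primary key on person is an assumption.

select p.*
from person p
    join person_attribute pa1
        on pa1.person_id = p.id
       and pa1.attribute_name = 'city'
       and pa1.attribute_value = 'Singapore'
    join person_attribute pa2
        on pa2.person_id = p.id
       and pa2.attribute_name = 'plan'
       and pa2.attribute_value = 'prepaid';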

To understand the problem, we need to run a series of experiments. First, let me introduce the test subject.

Assume we have a contacts table, and for each contact a contact_categories table, abbreviated cc:

create table cc(
contact_id integer,
cad_id integer,
value integer);

cad_id is the id of the attribute field, and value is the category value. The table holds about 5.9 million rows. The databases under test are PostgreSQL (raw data size 250MB) and MySQL with the MyISAM engine (raw data size 73MB). The disk is an ordinary laptop hard drive, no RAID, a plain ext4 partition with a peak transfer rate of roughly 70MB/s. Even assuming no resource contention and that the database always loads data from disk at full sequential-read speed, PostgreSQL would need about 4 seconds and MySQL about 1 second just to read everything into memory.
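The test data had been prepared ahead of time; purely for illustration, a dataset of a similar shape could be generated in PostgreSQL along these lines (the contact_id range and the cad_id/value distributions below are assumptions, not the actual data):

insert into cc (contact_id, cad_id, value)
select 800000 + (random() * 4100000)::int            as contact_id,   -- spread contacts over an assumed id range
       (array[6, 101, 102])[1 + (random() * 2)::int] as cad_id,       -- pick one of the cad_ids used in the article
       1 + (random() * 8)::int                       as value         -- small integer category values
from generate_series(1, 5904385);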

So with a disk-based solution there is no way to squeeze the query under one second. Since we need to provide a responsive front end, the data has to be queryable from memory. Maybe not all of it fits, but at the very least the data being queried must stay in memory. The simplest way to move the database into memory is not to tweak database settings but to mount memory directly as a directory:

sudo mount -t ramfs -o size=200000m ramfs /mnt/memory

Then move the database's data directory into /mnt/memory (a sketch of the relocation steps follows the settings below). The database's cache settings still have to be adjusted, though: if PostgreSQL's work_mem is too small, for example, a count(distinct xxx) over a large table will spill intermediate results into temporary storage. So let's make all the caches generous:

work_mem = 1000MB
shared_buffers = 1000MB
temp_buffers = 1000MB
effective_cache_size = 1000MB
wal_buffers = 1000MB
autovacuum = off
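For reference, the data-directory relocation mentioned above might look roughly like this (a sketch only; the service name, PostgreSQL version, and default data path are assumptions, so adjust to the actual install):

sudo service postgresql stop                              # stop the server before touching the data directory
sudo cp -a /var/lib/postgresql/9.1/main /mnt/memory/pg    # copy the data directory onto the RAM-backed mount
sudo chown -R postgres:postgres /mnt/memory/pg
# point data_directory in postgresql.conf at /mnt/memory/pg (or start the server with -D /mnt/memory/pg)
sudo service postgresql start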

Let's see how fast the most basic query can be:

taowen@dmright-perf:~$ time psql postgres -c 'select count(*) from cc;'
count
---------
5904385
(1 row)

real 0m0.448s
user 0m0.024s
sys 0m0.000s

Is count(*) the same as count(contact_id)? Let's find out:

taowen@dmright-perf:~$ time psql postgres -c 'select count(contact_id) from cc;'
count
---------
5904385
(1 row)

real 0m0.537s
user 0m0.016s
sys 0m0.008s

Interesting! It's actually slower than count(*). Maybe that's because there is no index on contact_id. Fine, let's add one:

taowen@dmright-perf:~$ time psql postgres -c 'create index contact_id_idx on cc(contact_id);'
CREATE INDEX

real 0m4.848s
user 0m0.016s
sys 0m0.008s
taowen@dmright-perf:~$ time psql postgres -c 'analyze cc;'
ANALYZE

real 0m0.197s
user 0m0.024s
sys 0m0.004s

Try again:

taowen@dmright-perf:~$ time psql postgres -c 'select count(contact_id) from cc;'
count
---------
5904385
(1 row)

real 0m0.534s
user 0m0.020s
sys 0m0.008s

taowen@dmright-perf:~$ time psql postgres -c 'select count(*) from cc;'
count
---------
5904385
(1 row)

real 0m0.447s
user 0m0.028s
sys 0m0.008s

Basically no change... count(*) just seems to be faster than count(contact_id). Never mind; under 500ms is fast enough. Let's add a condition to the query:

taowen@dmright-perf:~$ time psql postgres -c 'select count(*) from cc where cad_id = 101 and value = 5;'
count
--------
998839
(1 row)

real 0m0.686s
user 0m0.024s
sys 0m0.000s

taowen@dmright-perf:~$ time psql postgres -c 'select count(contact_id) from cc where cad_id = 101 and value = 5;'
count
--------
998839
(1 row)

real 0m0.660s
user 0m0.024s
sys 0m0.000s

As we can see, there is no big difference between count(*) and count(contact_id). But we haven't indexed cad_id and value yet, so let's add that and see:

taowen@dmright-perf:~$ time psql postgres -c 'create index cad_id_value_idx on cc(cad_id, value);'
CREATE INDEX

real 0m10.069s
user 0m0.020s
sys 0m0.008s

taowen@dmright-perf:~$ time psql postgres -c 'analyze cc;'
ANALYZE

real 0m0.199s
user 0m0.016s
sys 0m0.012s

And try again:

taowen@dmright-perf:~$ time psql postgres -c 'select count(contact_id) from cc where cad_id = 101 and value = 5;'
count
--------
998839
(1 row)

real 0m0.283s
user 0m0.020s
sys 0m0.012s

Now that is fast! To recap, we now have indexes on both contact_id and (cad_id, value).
Next, let's make the query more complex by adding a second condition with AND. There are three possible ways to write it:
1. INTERSECT
2. INNER JOIN
3. IN + subquery
Let's try each one.

taowen@dmright-perf:~$ time psql postgres -c 'select count(contact_id) from (select contact_id from cc where cad_id = 101 and value = 5 intersect select contact_id from cc where cad_id = 102 and value = 7) as temp'
count
--------
164788
(1 row)

real 0m1.159s
user 0m0.028s
sys 0m0.008s

taowen@dmright-perf:~$ time psql postgres -c 'select count(*) from (select contact_id from cc where cad_id = 101 and value = 5 intersect select contact_id from cc where cad_id = 102 and value = 7) as temp'
count
--------
164788
(1 row)

real 0m1.148s
user 0m0.032s
sys 0m0.000s

Clearly this is slow. Would an INNER JOIN be any faster?

taowen@dmright-perf:~$ time psql postgres -c 'select count(a1.contact_id) from cc as a1 inner join cc as a2 on a1.contact_id = a2.contact_id and a2.cad_id = 102 and a2.value = 7 where a1.cad_id = 101 and a1.value = 5;'
count
--------
164788
(1 row)

real 0m1.162s
user 0m0.036s
sys 0m0.000s

That is about the same as INTERSECT, perhaps marginally slower. What about IN + subquery?

taowen@dmright-perf:~$ time psql postgres -c 'select count(a1.contact_id) from cc as a1 where a1.cad_id = 101 and a1.value = 5 and a1.contact_id in (select contact_id from cc as a2 where a2.cad_id = 102 and a2.value = 7)'
count
--------
164788
(1 row)

real 0m2.645s
user 0m0.024s
sys 0m0.004s

Damn, even slower. To sum up: with two conditions, INTERSECT appears to be the fastest, but even so it takes over a second. Why? Where does the time go?

taowen@dmright-perf:~$ time psql postgres -c 'explain analyze select count(contact_id) from (select contact_id from cc where cad_id = 101 and value = 5 intersect select contact_id from cc where cad_id = 102 and value = 7) as temp'
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=107853.91..107853.92 rows=1 width=4) (actual time=1471.907..1471.907 rows=1 loops=1)
-> Subquery Scan on temp (cost=7467.63..107375.65 rows=191301 width=4) (actual time=1366.131..1459.419 rows=164788 loops=1)
-> HashSetOp Intersect (cost=7467.63..105462.64 rows=191301 width=4) (actual time=1366.129..1439.781 rows=164788 loops=1)
-> Append (cost=7467.63..103326.69 rows=854380 width=4) (actual time=69.436..797.478 rows=1829408 loops=1)
-> Subquery Scan on "*SELECT* 2" (cost=7467.63..48180.53 rows=351876 width=4) (actual time=69.435..350.031 rows=830569 loops=1)
-> Bitmap Heap Scan on cc (cost=7467.63..44661.77 rows=351876 width=4) (actual time=69.434..264.538 rows=830569 loops=1)
Recheck Cond: ((cad_id = 102) AND (value = 7))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..7379.66 rows=351876 width=0) (actual time=64.162..64.162 rows=830569 loops=1)
Index Cond: ((cad_id = 102) AND (value = 7))
-> Subquery Scan on "*SELECT* 1" (cost=10667.56..55146.16 rows=502504 width=4) (actual time=69.846..331.544 rows=998839 loops=1)
-> Bitmap Heap Scan on cc (cost=10667.56..50121.12 rows=502504 width=4) (actual time=69.845..233.794 rows=998839 loops=1)
Recheck Cond: ((cad_id = 101) AND (value = 5))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..10541.94 rows=502504 width=0) (actual time=64.501..64.501 rows=998839 loops=1)
Index Cond: ((cad_id = 101) AND (value = 5))
Total runtime: 1477.484 ms
(15 rows)

First, the INTERSECT is not executed in parallel. The two subqueries each take more than 300ms; together about 800ms goes into collecting the contact_ids. Second, the set intersection itself takes over 600ms, and the rest of the time is spent counting the members of the resulting set. One thing worth noting: according to http://postgresql.1045698.n5.nabble.com/ANTI-JOIN-needs-table-index-scan-not-possible-td3425340.html both the index scan and the heap scan actually make use of the index. Yet the SELECT still spends an extra 100ms on top of that; my guess is that it goes back to the underlying table to fetch the rows and obtain the contact_id values. On disk this would be even slower, because it would turn into random seeks.
Having looked at INTERSECT, let's analyze the join:

taowen@dmright-perf:~$ time psql postgres -c 'explain analyze select count(a1.contact_id) from cc as a1 inner join cc as a2 on a1.contact_id = a2.contact_id and a2.cad_id = 102 and a2.value = 7 where a1.cad_id = 101 and a1.value = 5;'
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=180366.31..180366.32 rows=1 width=4) (actual time=1280.964..1280.964 rows=1 loops=1)
-> Merge Join (cost=174783.00..180104.02 rows=104912 width=4) (actual time=1043.879..1270.197 rows=164788 loops=1)
Merge Cond: (a1.contact_id = a2.contact_id)
-> Sort (cost=97705.18..98961.44 rows=502504 width=4) (actual time=586.735..626.292 rows=998839 loops=1)
Sort Key: a1.contact_id
Sort Method: quicksort Memory: 71397kB
-> Bitmap Heap Scan on cc a1 (cost=10667.56..50121.12 rows=502504 width=4) (actual time=75.377..257.403 rows=998839 loops=1)
Recheck Cond: ((cad_id = 101) AND (value = 5))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..10541.94 rows=502504 width=0) (actual time=69.565..69.565 rows=998839 loops=1)
Index Cond: ((cad_id = 101) AND (value = 5))
-> Sort (cost=77077.83..77957.52 rows=351876 width=4) (actual time=457.131..492.428 rows=830569 loops=1)
Sort Key: a2.contact_id
Sort Method: quicksort Memory: 63509kB
-> Bitmap Heap Scan on cc a2 (cost=7467.63..44661.77 rows=351876 width=4) (actual time=60.039..186.422 rows=830569 loops=1)
Recheck Cond: ((cad_id = 102) AND (value = 7))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..7379.66 rows=351876 width=0) (actual time=54.929..54.929 rows=830569 loops=1)
Index Cond: ((cad_id = 102) AND (value = 7))
Total runtime: 1286.846 ms
(18 rows)

Most of the time seems to go into sorting, and it is not executed in parallel either; running top in the background shows only one core actually busy.
What if we create more indexes? Would that help?

taowen@dmright-perf:~$ time psql postgres -c 'create index cad_id_value_contact_id on cc(cad_id, value, contact_id);'
CREATE INDEX

real 0m10.683s
user 0m0.020s
sys 0m0.004s
taowen@dmright-perf:~$ time psql postgres -c 'explain analyze select count(a1.contact_id) from cc as a1 inner join cc as a2 on a1.contact_id = a2.contact_id and a2.cad_id = 102 and a2.value = 7 where a1.cad_id = 101 and a1.value = 5;'
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=180366.31..180366.32 rows=1 width=4) (actual time=1338.478..1338.478 rows=1 loops=1)
-> Merge Join (cost=174783.00..180104.02 rows=104912 width=4) (actual time=1090.136..1327.312 rows=164788 loops=1)
Merge Cond: (a1.contact_id = a2.contact_id)
-> Sort (cost=97705.18..98961.44 rows=502504 width=4) (actual time=615.843..657.438 rows=998839 loops=1)
Sort Key: a1.contact_id
Sort Method: quicksort Memory: 71397kB
-> Bitmap Heap Scan on cc a1 (cost=10667.56..50121.12 rows=502504 width=4) (actual time=80.926..275.431 rows=998839 loops=1)
Recheck Cond: ((cad_id = 101) AND (value = 5))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..10541.94 rows=502504 width=0) (actual time=75.816..75.816 rows=998839 loops=1)
Index Cond: ((cad_id = 101) AND (value = 5))
-> Sort (cost=77077.83..77957.52 rows=351876 width=4) (actual time=474.279..510.866 rows=830569 loops=1)
Sort Key: a2.contact_id
Sort Method: quicksort Memory: 63509kB
-> Bitmap Heap Scan on cc a2 (cost=7467.63..44661.77 rows=351876 width=4) (actual time=65.335..198.655 rows=830569 loops=1)
Recheck Cond: ((cad_id = 102) AND (value = 7))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..7379.66 rows=351876 width=0) (actual time=60.314..60.314 rows=830569 loops=1)
Index Cond: ((cad_id = 102) AND (value = 7))
Total runtime: 1346.587 ms
(18 rows)

Still no good! If the merge join is the slow part, would upgrading to 9.1 and forcing a hash join be any better?
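The post doesn't show exactly how the hash join was forced; one common way (an assumption on my part, not necessarily what was done here) is to switch off the competing plan types for the session before running the query:

set enable_mergejoin = off;  -- steer the planner away from the merge join seen in the previous plan
set enable_nestloop = off;   -- optionally rule out a nested-loop plan as well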

postgres@dmright-perf:/mnt/memory/pg$ time psql -c 'explain analyze select count(a1.contact_id) from cc as a1 inner join cc as a2 on a1.contact_id = a2.contact_id where a2.cad_id = 102 and a2.value = 7 and a1.cad_id = 101 and a1.value = 5';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=902356.85..902356.86 rows=1 width=4) (actual time=1693.137..1693.137 rows=1 loops=1)
-> Hash Join (cost=59599.78..902135.88 rows=88389 width=4) (actual time=461.788..1682.718 rows=164788 loops=1)
Hash Cond: (a1.contact_id = a2.contact_id)
-> Bitmap Heap Scan on cc a1 (cost=10833.24..50406.32 rows=510472 width=4) (actual time=76.337..224.571 rows=998839 loops=1)
Recheck Cond: ((cad_id = 101) AND (value = 5))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..10705.62 rows=510472 width=0) (actual time=71.028..71.028 rows=998839 loops=1)
Index Cond: ((cad_id = 101) AND (value = 5))
-> Hash (cost=44444.15..44444.15 rows=345792 width=4) (actual time=385.256..385.256 rows=830569 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 29200kB
-> Bitmap Heap Scan on cc a2 (cost=7341.27..44444.15 rows=345792 width=4) (actual time=64.778..258.059 rows=830569 loops=1)
Recheck Cond: ((cad_id = 102) AND (value = 7))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..7254.82 rows=345792 width=0) (actual time=59.675..59.675 rows=830569 loops=1)
Index Cond: ((cad_id = 102) AND (value = 7))
Total runtime: 1698.207 ms
(14 rows)

There are two reasons for this:
1. Two conditions mean two scans, and the scans are not executed in parallel.
2. The join itself is expensive, whether it is a sort-merge join or a hash join.

If two conditions are already this bad, what about three? Let's see, starting with INTERSECT:

postgres@dmright-perf:/mnt/memory/pg$ time psql postgres -c 'select count(contact_id) from (select contact_id from cc where cad_id = 101 and value = 5 intersect select contact_id from cc where cad_id = 102 and value = 7 intersect select contact_id from cc where cad_id = 6 and value = 1) as temp'
count
-------
6748
(1 row)

real 0m1.350s
user 0m0.020s
sys 0m0.008s

Then INNER JOIN:

postgres@dmright-perf:/mnt/memory/pg$ time psql -c 'select count(a1.contact_id) from cc as a1 inner join cc as a2 on a1.contact_id = a2.contact_id inner join cc as a3 on a1.contact_id = a3.contact_id where a2.cad_id = 102 and a2.value = 7 and a1.cad_id = 101 and a1.value = 5 and a3.cad_id = 6 and a3.value = 1';
count
-------
6748
(1 row)

real 0m0.756s
user 0m0.028s
sys 0m0.000s

Then IN + subquery:

postgres@dmright-perf:/mnt/memory/pg$ time psql -c 'select count(a1.contact_id) from cc as a1 where a1.cad_id = 101 and a1.value = 5 and a1.contact_id in (select contact_id from cc as a2 where a2.cad_id = 102 and a2.value = 7 and a2.contact_id in (select contact_id from cc as a3 where a3.cad_id = 6 and a3.value = 1))';
count
-------
6748
(1 row)

real 0m7.320s
user 0m0.024s
sys 0m0.004s

This time INNER JOIN is the fastest. Why?

postgres@dmright-perf:/mnt/memory/pg$ time psql postgres -c 'explain analyze select count(contact_id) from (select contact_id from cc where cad_id = 101 and value = 5 intersect select contact_id from cc where cad_id = 102 and value = 7 intersect select contact_id from cc where cad_id = 6 and value = 1) as temp'
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=132357.21..132357.22 rows=1 width=4) (actual time=1659.666..1659.666 rows=1 loops=1)
-> Subquery Scan on temp (cost=306.85..132335.71 rows=8602 width=4) (actual time=1648.818..1659.243 rows=6748 loops=1)
-> HashSetOp Intersect (cost=306.85..132249.69 rows=8602 width=4) (actual time=1648.817..1658.543 rows=6748 loops=1)
-> Append (cost=306.85..131691.89 rows=223118 width=4) (actual time=33.797..1547.811 rows=372016 loops=1)
-> Subquery Scan on "*SELECT* 3" (cost=306.85..26138.13 rows=14239 width=4) (actual time=33.797..153.230 rows=207228 loops=1)
-> Bitmap Heap Scan on cc (cost=306.85..25995.74 rows=14239 width=4) (actual time=33.795..134.292 rows=207228 loops=1)
Recheck Cond: ((cad_id = 6) AND (value = 1))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..303.29 rows=14239 width=0) (actual time=28.558..28.558 rows=207228 loops=1)
Index Cond: ((cad_id = 6) AND (value = 1))
-> Result (cost=7341.27..105553.76 rows=208879 width=4) (actual time=1282.426..1371.255 rows=164788 loops=1)
-> HashSetOp Intersect (cost=7341.27..105553.76 rows=208879 width=4) (actual time=1282.418..1353.422 rows=164788 loops=1)
-> Append (cost=7341.27..103413.10 rows=856264 width=4) (actual time=57.550..734.282 rows=1829408 loops=1)
-> Subquery Scan on "*SELECT* 2" (cost=7341.27..47902.07 rows=345792 width=4) (actual time=57.550..292.595 rows=830569 loops=1)
-> Bitmap Heap Scan on cc (cost=7341.27..44444.15 rows=345792 width=4) (actual time=57.549..218.162 rows=830569 loops=1)
Recheck Cond: ((cad_id = 102) AND (value = 7))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..7254.82 rows=345792 width=0) (actual time=52.606..52.606 rows=830569 loops=1)
Index Cond: ((cad_id = 102) AND (value = 7))
-> Subquery Scan on "*SELECT* 1" (cost=10833.24..55511.04 rows=510472 width=4) (actual time=69.129..330.737 rows=998839 loops=1)
-> Bitmap Heap Scan on cc (cost=10833.24..50406.32 rows=510472 width=4) (actual time=69.128..242.416 rows=998839 loops=1)
Recheck Cond: ((cad_id = 101) AND (value = 5))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..10705.62 rows=510472 width=0) (actual time=64.161..64.161 rows=998839 loops=1)
Index Cond: ((cad_id = 101) AND (value = 5))
Total runtime: 1665.691 ms
(23 rows)

For INTERSECT, the process is much the same as with two conditions; the sets are just bigger.

postgres@dmright-perf:/mnt/memory/pg$ time psql -c 'explain analyze select count(a1.contact_id) from cc as a1 inner join cc as a2 on a1.contact_id = a2.contact_id inner join cc as a3 on a1.contact_id = a3.contact_id where a2.cad_id = 102 and a2.value = 7 and a1.cad_id = 101 and a1.value = 5 and a3.cad_id = 6 and a3.value = 1';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=118979.47..118979.48 rows=1 width=4) (actual time=771.393..771.393 rows=1 loops=1)
-> Nested Loop (cost=49073.39..118977.89 rows=630 width=4) (actual time=399.835..770.587 rows=6748 loops=1)
-> Hash Join (cost=49073.39..97159.97 rows=2466 width=8) (actual time=399.814..691.519 rows=34219 loops=1)
Hash Cond: (a3.contact_id = a2.contact_id)
-> Bitmap Heap Scan on cc a3 (cost=306.85..25995.74 rows=14239 width=4) (actual time=25.655..83.526 rows=207228 loops=1)
Recheck Cond: ((cad_id = 6) AND (value = 1))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..303.29 rows=14239 width=0) (actual time=20.570..20.570 rows=207228 loops=1)
Index Cond: ((cad_id = 6) AND (value = 1))
-> Hash (cost=44444.15..44444.15 rows=345792 width=4) (actual time=373.969..373.969 rows=830569 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 29200kB
-> Bitmap Heap Scan on cc a2 (cost=7341.27..44444.15 rows=345792 width=4) (actual time=59.271..250.932 rows=830569 loops=1)
Recheck Cond: ((cad_id = 102) AND (value = 7))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..7254.82 rows=345792 width=0) (actual time=54.030..54.030 rows=830569 loops=1)
Index Cond: ((cad_id = 102) AND (value = 7))
-> Index Scan using cad_id_value_contact_id_idx on cc a1 (cost=0.00..8.83 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=34219)
Index Cond: ((cad_id = 101) AND (value = 5) AND (contact_id = a2.contact_id))
Total runtime: 774.588 ms
(17 rows)

For the INNER JOIN, the indexes are all put to use. It doesn't need to fetch the contact_ids and then run set operations on them. The conclusion is that hashing and probing on row identifiers is faster than set operations on the extracted values.
Writing it without an explicit INNER JOIN and letting the planner decide the join order gives the same plan:

postgres@dmright-perf:/mnt/memory/pg$ time psql -c 'explain analyze select count(a1.contact_id) from cc as a1, cc as a2, cc as a3 where a1.contact_id = a2.contact_id and a2.contact_id = a3.contact_id and a2.cad_id = 102 and a2.value = 7 and a1.cad_id = 101 and a1.value = 5 and a3.cad_id = 6 and a3.value = 1';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=118979.47..118979.48 rows=1 width=4) (actual time=762.969..762.970 rows=1 loops=1)
-> Nested Loop (cost=49073.39..118977.89 rows=630 width=4) (actual time=398.554..762.206 rows=6748 loops=1)
-> Hash Join (cost=49073.39..97159.97 rows=2466 width=8) (actual time=398.531..684.425 rows=34219 loops=1)
Hash Cond: (a3.contact_id = a2.contact_id)
-> Bitmap Heap Scan on cc a3 (cost=306.85..25995.74 rows=14239 width=4) (actual time=34.802..91.672 rows=207228 loops=1)
Recheck Cond: ((cad_id = 6) AND (value = 1))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..303.29 rows=14239 width=0) (actual time=29.524..29.524 rows=207228 loops=1)
Index Cond: ((cad_id = 6) AND (value = 1))
-> Hash (cost=44444.15..44444.15 rows=345792 width=4) (actual time=363.537..363.537 rows=830569 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 29200kB
-> Bitmap Heap Scan on cc a2 (cost=7341.27..44444.15 rows=345792 width=4) (actual time=57.799..245.467 rows=830569 loops=1)
Recheck Cond: ((cad_id = 102) AND (value = 7))
-> Bitmap Index Scan on cad_id_value_idx (cost=0.00..7254.82 rows=345792 width=0) (actual time=52.349..52.349 rows=830569 loops=1)
Index Cond: ((cad_id = 102) AND (value = 7))
-> Index Scan using cad_id_value_contact_id_idx on cc a1 (cost=0.00..8.83 rows=1 width=4) (actual time=0.002..0.002 rows=0 loops=34219)
Index Cond: ((cad_id = 101) AND (value = 5) AND (contact_id = a2.contact_id))
Total runtime: 766.107 ms
(17 rows)

The result is the same.

That about wraps up the PostgreSQL experiments; there is not much room left for improvement. Some googling reveals that PostgreSQL's MVCC machinery updates hint bits even during operations as simple as SELECT COUNT, which may be an important reason it is slow here. MySQL's MyISAM engine is famous for not bothering with transactions at all, so it should offer a bigger gain than PostgreSQL. In the next part we repeat the same experiments with MySQL.









Last time we analyzed PostgreSQL's query performance under the attached-attribute-table design. Given PostgreSQL's well-known, so-called "performance" issues, it is worth repeating the experiments with MySQL using the MyISAM engine. After analyzing these two common open-source databases in detail, the discussion will move on to how row-oriented storage and indexes affect query performance. The MySQL used in the experiments below is the MariaDB distribution. As before, we start with the table:

MariaDB [veil]> show create table cc2;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| cc2 | CREATE TABLE `cc2` (
`contact_id` int(11) NOT NULL,
`cad_id` int(11) NOT NULL,
`value` int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MariaDB [veil]> select count(*) from cc2;
+----------+
| count(*) |
+----------+
| 5904385 |
+----------+
1 row in set (0.00 sec)

MySQL caches the count(*) result at the table level (MyISAM keeps an exact row count in the table metadata), so this query costs nothing. Now create the necessary indexes:

MariaDB [veil]> create index contact_id_idx on cc2(contact_id);
Query OK, 5904385 rows affected (3.11 sec)
Records: 5904385 Duplicates: 0 Warnings: 0

MariaDB [veil]> create index cad_id_value_idx on cc2(cad_id, value);
Query OK, 5904385 rows affected (8.17 sec)
Records: 5904385 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from cc2 where cad_id = 101 and value = 5;
+----------+
| count(*) |
+----------+
| 998839 |
+----------+
1 row in set (0.18 sec)

That is seriously fast: about 4x faster than the same query in PostgreSQL. However:

MariaDB [veil]> select count(contact_id) from cc2 where cad_id = 101 and value = 5;
+-------------------+
| count(contact_id) |
+-------------------+
| 998839 |
+-------------------+
1 row in set (0.41 sec)

This query is slower than the previous one because count(*) only counts the matching rows (effectively their row identifiers), while count(contact_id) counts the actual contact_id column, whose values are not included in cad_id_value_idx. With a wider index, it gets faster again:

MariaDB [veil]> create index cad_id_value_contact_id_idx on cc2(cad_id, value, contact_id);
Query OK, 5904385 rows affected (13.37 sec)
Records: 5904385 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(contact_id) from cc2 where cad_id = 101 and value = 5;
+-------------------+
| count(contact_id) |
+-------------------+
| 998839 |
+-------------------+
1 row in set (0.21 sec)

In MySQL this is called a covering index. I'd guess the index-only scans planned for the not-yet-released PostgreSQL 9.2 amount to the same idea.
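A quick way to check whether a MySQL query is served entirely from an index is to EXPLAIN it and look at the Extra column: 'Using index' means the result is produced without reading any table rows. For example (plan output omitted here):

explain select count(contact_id) from cc2 where cad_id = 101 and value = 5;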
Now, let's see whether MySQL can beat PostgreSQL with two conditions. First, try INTERSECT:

MariaDB [veil]> select count(*) from (select contact_id from cc as a1 where a1.cad_id = 101 and a1.value = 5 intersect select contact_id from cc as a2 where a2.cad_id = 102 and a2.value = 7) as temp;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'intersect select contact_id from cc as a2 where a2.cad_id = 102 and a2.value = 7' at line 1

Whoa, MySQL doesn't even support INTERSECT. Since we already know IN + subquery will only be slower, that leaves INNER JOIN as the only way to write it.
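For completeness, INTERSECT can be emulated with a correlated EXISTS, as sketched below. This was not benchmarked in the original and is most likely no faster than the join; it also assumes (as the data here appears to) at most one row per (contact_id, cad_id) pair, otherwise the count can differ from a true INTERSECT.

select count(*)
from cc as a1
where a1.cad_id = 101 and a1.value = 5
  and exists (select 1
              from cc as a2
              where a2.contact_id = a1.contact_id
                and a2.cad_id = 102
                and a2.value = 7);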

MariaDB [veil]> select count(*) from cc as a1, cc as a2 where a1.contact_id = a2.contact_id and a1.cad_id = 101 and a1.value = 5 and a2.cad_id = 102 and a2.value = 7;
+----------+
| count(*) |
+----------+
| 164788 |
+----------+
1 row in set (6.56 sec)

MariaDB [veil]> select count(a1.contact_id) from cc as a1, cc as a2 where a1.contact_id = a2.contact_id and a1.cad_id = 101 and a1.value = 5 and a2.cad_id = 102 and a2.value = 7;
+----------------------+
| count(a1.contact_id) |
+----------------------+
| 164788 |
+----------------------+
1 row in set (6.67 sec)

You read that right: the result really is this dreadful. It is even slower than PostgreSQL's IN + subquery version. Let's see what is actually going on:

MariaDB [veil]> explain extended select count(a1.contact_id) from cc as a1, cc as a2 where a1.contact_id = a2.contact_id and a1.cad_id = 101 and a1.value = 5 and a2.cad_id = 102 and a2.value = 7;
+----+-------------+-------+------+------------------------------------+-------------------------+---------+--------------------------------+--------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+------------------------------------+-------------------------+---------+--------------------------------+--------+----------+--------------------------+
| 1 | SIMPLE | a1 | ref | contact_id,cad_id_value_contact_id | cad_id_value_contact_id | 10 | const,const | 808542 | 100.00 | Using where; Using index |
| 1 | SIMPLE | a2 | ref | contact_id,cad_id_value_contact_id | cad_id_value_contact_id | 15 | const,const,veil.a1.contact_id | 1 | 100.00 | Using index |
+----+-------------+-------+------+------------------------------------+-------------------------+---------+--------------------------------+--------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)

The output is not as detailed as PostgreSQL's, but it at least shows that the indexes really are being used. So my impression is that MySQL is simply not good at complex joins. And that is the problem: if joins are not handled well, multi-condition queries cannot be supported well either.

From this first round of experiments we can draw a few conclusions:

  1. A disk-based solution is not viable; the data has to live in memory.
  2. No condition: PostgreSQL is slow, and MySQL is slow too (its instant count(*) above comes from MyISAM's cached row count).
  3. One condition: PostgreSQL is fast, MySQL is faster (thanks to the covering index).
  4. Two conditions: PostgreSQL is slow, MySQL is even slower. INNER JOIN is slightly slower than INTERSECT.
  5. Three conditions: PostgreSQL beats MySQL, and INNER JOIN beats INTERSECT.

Overall, keeping the response time under one second is really hard, so the modeling approach has to change. Let's see whether pivoting the rows into columns does any better:

MariaDB [veil]> show create table spike;
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| spike | CREATE TABLE `spike` (
`contact_id` int(11) NOT NULL,
`a1` int(11) DEFAULT NULL,
`a2` int(11) DEFAULT NULL,
`a3` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MariaDB [veil]> select * from spike limit 10;
+------------+------+------+------+
| contact_id | a1 | a2 | a3 |
+------------+------+------+------+
| 800001 | 4 | NULL | NULL |
| 800003 | 5 | 7 | 1 |
| 800004 | 3 | NULL | NULL |
| 800005 | 3 | NULL | NULL |
| 800006 | 5 | NULL | NULL |
| 800007 | NULL | 6 | NULL |
| 800008 | 5 | NULL | NULL |
| 800009 | 4 | 8 | NULL |
| 800011 | NULL | 7 | NULL |
| 800012 | 3 | 6 | 1 |
+------------+------+------+------+
10 rows in set (0.01 sec)

Column a1 holds the value for cad_id=101, a2 the value for cad_id=102, and a3 the value for cad_id=6.
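How the spike table was populated is not shown; one way to build it from cc (a sketch, again assuming at most one value per (contact_id, cad_id) pair) would be:

insert into spike (contact_id, a1, a2, a3)
select contact_id,
       max(case when cad_id = 101 then value end) as a1,   -- NULL when the contact has no cad_id=101 row
       max(case when cad_id = 102 then value end) as a2,
       max(case when cad_id = 6   then value end) as a3
from cc
group by contact_id;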

First, let's try a single condition:

MariaDB [veil]> select count(*) from spike where a1 = 5;
+----------+
| count(*) |
+----------+
| 998839 |
+----------+
1 row in set (0.26 sec)

Not bad, even though we haven't indexed a1 yet. That's because in row-oriented databases like PostgreSQL and MySQL, sequential-scan performance largely depends on the width of the row. If the table had 100 columns, it certainly wouldn't be this fast. Don't believe it? Let's try:

MariaDB [veil]> alter table spike add column dummy1 TEXT;
Query OK, 4074980 rows affected (1.28 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from spike where a1 = 5;
+----------+
| count(*) |
+----------+
| 998839 |
+----------+
1 row in set (0.64 sec)

As expected, merely adding one TEXT column slows the query down this much. To keep row-store queries from getting slower and slower as more columns are added, an index becomes essential:

MariaDB [veil]> create index a1_idx on spike(a1);        
Query OK, 4074980 rows affected (4.97 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from spike where a1 = 5;
+----------+
| count(*) |
+----------+
| 998839 |
+----------+
1 row in set (0.22 sec)

The query time is back to normal. Because a1_idx is a covering index for this query, the count never has to read the rows; scanning the index is enough to know how many match. And unlike PostgreSQL, MyISAM has no MVCC hint bits to update on the underlying rows, so the query no longer depends on row width at all:

MariaDB [veil]> alter table spike add column dummy2 TEXT;
Query OK, 4074980 rows affected (5.00 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> alter table spike add column dummy3 TEXT;
Query OK, 4074980 rows affected (5.30 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from spike where a1 = 5;
+----------+
| count(*) |
+----------+
| 998839 |
+----------+
1 row in set (0.23 sec)

See? Told you.
Now what about two conditions? First, restore the table to its normal shape and drop all the indexes:

MariaDB [veil]> alter table spike drop column dummy1;
Query OK, 4074980 rows affected (5.19 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> alter table spike drop column dummy2;
Query OK, 4074980 rows affected (4.98 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> alter table spike drop column dummy3;
Query OK, 4074980 rows affected (3.41 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> alter table spike drop index a1_idx;
Query OK, 4074980 rows affected (0.46 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> show create table spike;
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| spike | CREATE TABLE `spike` (
`contact_id` int(11) NOT NULL,
`a1` int(11) DEFAULT NULL,
`a2` int(11) DEFAULT NULL,
`a3` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Now query with two conditions:

MariaDB [veil]> select count(*) from spike where a1 = 5 and a2 = 7;
+----------+
| count(*) |
+----------+
| 164788 |
+----------+
1 row in set (0.30 sec)

Not bad at all! Adding an index on a1 should make it even faster, right?

MariaDB [veil]> create index a1_idx on spike(a1);                  
Query OK, 4074980 rows affected (3.06 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from spike where a1 = 5 and a2 = 7;
+----------+
| count(*) |
+----------+
| 164788 |
+----------+
1 row in set (0.45 sec)

Damn, it got slower... and the reason is not that the index wasn't used; it is slower precisely because the index was used:

MariaDB [veil]> explain select count(*) from spike where a1 = 5 and a2 = 7;
+----+-------------+-------+------+---------------+--------+---------+-------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+--------+---------+-------+---------+-------------+
| 1 | SIMPLE | spike | ref | a1_idx | a1_idx | 5 | const | 1420338 | Using where |
+----+-------------+-------+------+---------------+--------+---------+-------+---------+-------------+
1 row in set (0.01 sec)

The a1 index is indeed used, but a2 is not in the same index. So it is not a covering index for this query, and the lookup cannot be answered from the index alone; it has to go back to the row store to check a2 = 7.

MariaDB [veil]> create index a2_idx on spike(a2);
Query OK, 4074980 rows affected (5.71 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from spike where a1 = 5 and a2 = 7;
+----------+
| count(*) |
+----------+
| 164788 |
+----------+
1 row in set (0.40 sec)

Creating a separate index on a2 doesn't help either, because while scanning a1_idx there is still no a2 available; the query still has to go back to the row store to fetch it.

MariaDB [veil]> create index a1_a2_idx on spike(a1, a2);
Query OK, 4074980 rows affected (9.47 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from spike where a1 = 5 and a2 = 7;
+----------+
| count(*) |
+----------+
| 164788 |
+----------+
1 row in set (0.06 sec)

Now that is what speed looks like! The query runs entirely off the index.

MariaDB [veil]> alter table spike add column dummy1 TEXT;
Query OK, 4074980 rows affected (12.33 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> alter table spike add column dummy2 TEXT;
Query OK, 4074980 rows affected (12.87 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> alter table spike add column dummy3 TEXT;
Query OK, 4074980 rows affected (13.15 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from spike where a1 = 5 and a2 = 7;
+----------+
| count(*) |
+----------+
| 164788 |
+----------+
1 row in set (0.06 sec)

Widening the rows no longer matters, because the real rows are never read. But as soon as the covering index is dropped, things fall right back to the old sorry state:

MariaDB [veil]> alter table spike drop index a1_a2_idx;
Query OK, 4074980 rows affected (8.74 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from spike where a1 = 5 and a2 = 7;
+----------+
| count(*) |
+----------+
| 164788 |
+----------+
1 row in set (0.62 sec)

MariaDB [veil]> alter table spike drop index a1_idx;
Query OK, 4074980 rows affected (5.31 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> alter table spike drop index a2_idx;
Query OK, 4074980 rows affected (1.93 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from spike where a1 = 5 and a2 = 7;
+----------+
| count(*) |
+----------+
| 164788 |
+----------+
1 row in set (0.72 sec)

So yes, a wide-table design really can be a problem: as soon as the indexes stop covering the query, things get ugly.

If covering indexes are this great, why not build one for every query? Wouldn't that solve the problem once and for all?

Brilliant idea. Let's try it:

MariaDB [veil]> create index a1_a2_a3_idx on spike(a1, a2, a3);
Query OK, 4074980 rows affected (7.03 sec)
Records: 4074980 Duplicates: 0 Warnings: 0

MariaDB [veil]> select count(*) from spike where a1 = 5 and a2 = 7 and a3 = 1;
+----------+
| count(*) |
+----------+
| 6748 |
+----------+
1 row in set (0.01 sec)

It works! The covering index delivers again. It seems that once such an index exists, every query problem is solved.

MariaDB [veil]> select count(*) from spike where a1 = 5;                      
+----------+
| count(*) |
+----------+
| 998839 |
+----------+
1 row in set (0.28 sec)

See, it really does work!

MariaDB [veil]> select count(*) from spike where a3 = 1;
+----------+
| count(*) |
+----------+
| 207228 |
+----------+
1 row in set (1.06 sec)

Damn, celebrated too early. The reason is the index column order: a1, a2, a3. A query on a1 matches the leading columns of the index, so it can use it, and likewise a query on a1 and a2. But a3 alone does not match the leading order, so the index is of no use to it. Truly covering every case turns into a combinatorial problem, and for a table with a dozen or more columns the number of indexes needed is far more than the system can bear.
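A rough lower bound (my own back-of-the-envelope arithmetic, assuming every filter is an equality over some subset of the columns): for an index to answer a set of equality conditions with a direct seek, those columns must form a leading prefix of the index, so a single index helps at most one column combination of each size. A table with n filterable columns has C(n, n/2) combinations of size n/2 alone, so at least that many indexes would be needed to cover every case: C(12, 6) = 924 for twelve columns, C(16, 8) = 12870 for sixteen. Maintaining hundreds or thousands of indexes on a multi-million-row table is obviously a non-starter.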

To sum up: whether on disk or in memory, whether with an attached attribute table or with attributes modeled as columns, neither PostgreSQL nor MySQL can deliver responsive queries here. Traditional row-oriented databases cannot meet the need for fast multi-dimensional queries.

 

Original post: http://www.cnblogs.com/taowen/archive/2012/02/26/2368768.html
