A PostgreSQL/LightDB technique 5x faster than window functions: optimizing top-N-per-group queries with LATERAL joins instead of window functions

  The conventional way to optimize a top-N-per-group query is to replace aggregation with window (analytic) functions, which usually yields a large speedup because the main table is scanned only once. A full sort (ORDER BY) is still unavoidable, however. For example:

CREATE TABLE tags (
  id serial PRIMARY KEY,
  name VARCHAR(255)
);

CREATE TABLE movies (
  id serial PRIMARY KEY,
  name VARCHAR(255),
  tag_id int NOT NULL,
  created_at timestamp NOT NULL DEFAULT NOW(),
  FOREIGN KEY (tag_id) REFERENCES tags(id) ON UPDATE CASCADE
);

CREATE INDEX movies_tag_id_index ON movies (tag_id);
-- Genres
INSERT INTO "tags"("name") VALUES('Action');
INSERT INTO "tags"("name") VALUES('Animation');
INSERT INTO "tags"("name") VALUES('Sci-Fi');

-- Movies
INSERT INTO "movies"("name", "tag_id", "created_at") VALUES('The Matrix', (SELECT id FROM "tags" where "name" = 'Action'), '1999-05-21');
INSERT INTO "movies"("name", "tag_id", "created_at") VALUES('Tenet', (SELECT id FROM "tags" where "name" = 'Action'), '2020-10-29');
INSERT INTO "movies"("name", "tag_id", "created_at") VALUES('Wonder Woman 1984', (SELECT id FROM "tags" where "name" = 'Action'), '2020-12-25');

INSERT INTO "movies"("name", "tag_id", "created_at") VALUES('Toy Story', (SELECT id FROM "tags" where "name" = 'Animation'), '1995-12-22');
INSERT INTO "movies"("name", "tag_id", "created_at") VALUES('Monsters Inc.', (SELECT id FROM "tags" where "name" = 'Animation'), '2001-11-14');
INSERT INTO "movies"("name", "tag_id", "created_at") VALUES('Finding Nemo', (SELECT id FROM "tags" where "name" = 'Animation'), '2003-07-04');

INSERT INTO "movies"("name", "tag_id", "created_at") VALUES('Arrival', (SELECT id FROM "tags" where "name" = 'Sci-Fi'), '2016-10-24');
INSERT INTO "movies"("name", "tag_id", "created_at") VALUES('Minority Report', (SELECT id FROM "tags" where "name" = 'Sci-Fi'), '2002-08-02');
INSERT INTO "movies"("name", "tag_id", "created_at") VALUES('The Midnight Sky', (SELECT id FROM "tags" where "name" = 'Sci-Fi'), '2020-12-23');
-- Generate 3,000,000 filler movies (1,000,000 per tag; created_at defaults to NOW())
INSERT INTO "movies"("name", "tag_id")
SELECT
   generate_series(1,1000000) as "name",
   (SELECT id FROM "tags" where "name" = 'Action')
;

INSERT INTO "movies"("name", "tag_id")
SELECT
   generate_series(1,1000000) as "name",
   (SELECT id FROM "tags" where "name" = 'Animation')
;

INSERT INTO "movies"("name", "tag_id")
SELECT
   generate_series(1,1000000) as "name",
   (SELECT id FROM "tags" where "name" = 'Sci-Fi')
;
zjh@postgres=# explain analyze
zjh@postgres-# with movies_by_tags (tag_id, name, created_at, rank) as (
zjh@postgres(#   SELECT
zjh@postgres(#     tag_id,
zjh@postgres(#     name,
zjh@postgres(#     created_at,
zjh@postgres(#     ROW_NUMBER() OVER(PARTITION BY tag_id ORDER BY tag_id, created_at DESC)
zjh@postgres(#   FROM movies
zjh@postgres(# )
zjh@postgres-# select *
zjh@postgres-# from movies_by_tags mbt
zjh@postgres-# where mbt.rank < 3
zjh@postgres-# ;
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan on mbt  (cost=118008.82..127372.23 rows=89175 width=536) (actual time=2348.745..7563.132 rows=6 loops=1)
   Filter: (mbt.rank < 3)
   Rows Removed by Filter: 2999994
   ->  WindowAgg  (cost=118008.82..124028.15 rows=267526 width=536) (actual time=2348.740..7169.455 rows=3000000 loops=1)
         ->  Sort  (cost=118008.82..118677.63 rows=267526 width=528) (actual time=1804.146..2247.689 rows=3000000 loops=1)
               Sort Key: movies.tag_id, movies.created_at DESC
               Sort Method: external merge  Disk: 96624kB
               ->  Seq Scan on movies  (cost=0.00..21784.26 rows=267526 width=528) (actual time=0.021..589.504 rows=3000000 loops=1)
 Planning Time: 0.205 ms
 Execution Time: 7580.399 ms
(10 rows)
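The plan shows why this is slow: all 3,000,000 rows are sorted (spilling 96MB to disk) before the filter discards everything but 6 rows. The query shape itself is portable to any engine with window functions; a minimal, self-contained sketch of the same ROW_NUMBER() technique using Python's bundled SQLite (tiny illustrative data set, not the benchmark above; requires SQLite >= 3.25):

```python
import sqlite3

# Mirror the tags/movies schema above with a handful of rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE movies (id INTEGER PRIMARY KEY, name TEXT,
                     tag_id INTEGER NOT NULL, created_at TEXT);
INSERT INTO tags(name) VALUES ('Action'), ('Animation');
INSERT INTO movies(name, tag_id, created_at) VALUES
  ('The Matrix', 1, '1999-05-21'),
  ('Tenet', 1, '2020-10-29'),
  ('Wonder Woman 1984', 1, '2020-12-25'),
  ('Toy Story', 2, '1995-12-22'),
  ('Monsters Inc.', 2, '2001-11-14');
""")

# Rank movies within each tag by recency, then keep the top 2 per tag.
rows = conn.execute("""
  WITH movies_by_tags AS (
    SELECT tag_id, name,
           ROW_NUMBER() OVER (PARTITION BY tag_id
                              ORDER BY created_at DESC) AS rn
    FROM movies
  )
  SELECT tag_id, name FROM movies_by_tags
  WHERE rn < 3
  ORDER BY tag_id, rn
""").fetchall()
print(rows)
# [(1, 'Wonder Woman 1984'), (1, 'Tenet'), (2, 'Monsters Inc.'), (2, 'Toy Story')]
```

The cost structure is the same at any scale: the window function must rank every row in the table before the outer filter throws almost all of them away.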
The same top-2-per-tag query can be rewritten with a LATERAL join, so that the ordered, LIMIT-ed inner query runs once per tag row:
zjh@postgres=# explain analyze
SELECT *
FROM tags t
JOIN LATERAL (
  SELECT m.*
  FROM movies m
  WHERE m.tag_id = t.id
  ORDER BY m.created_at DESC
  FETCH FIRST 2 ROWS ONLY
) e1 ON true
;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1350.22..189048.15 rows=280 width=1052) (actual time=449.138..1347.527 rows=6 loops=1)
   ->  Seq Scan on tags t  (cost=0.00..11.40 rows=140 width=520) (actual time=0.010..0.014 rows=3 loops=1)
   ->  Limit  (cost=1350.22..1350.22 rows=2 width=532) (actual time=449.163..449.163 rows=2 loops=3)
         ->  Sort  (cost=1350.22..1353.56 rows=1338 width=532) (actual time=449.159..449.160 rows=2 loops=3)
               Sort Key: m.created_at DESC
               Sort Method: top-N heapsort  Memory: 25kB
               ->  Index Scan using movies_tag_id_index on movies m  (cost=0.42..1336.84 rows=1338 width=532) (actual time=0.028..302.523 rows=1000000 loops=3)
                     Index Cond: (tag_id = t.id)
 Planning Time: 0.163 ms
 Execution Time: 1347.567 ms
(10 rows)
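The Nested Loop in this plan makes the execution model explicit: the inner sort-plus-limit runs once per tag row (loops=3), and each iteration is a cheap top-N heapsort instead of one giant external sort. In engines without LATERAL, the same result can be produced by looping over tags in the application and issuing the correlated top-2 query per tag; a minimal Python/SQLite sketch of that equivalence (illustrative data, not the benchmark):

```python
import sqlite3

# Same small schema/data as the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE movies (id INTEGER PRIMARY KEY, name TEXT,
                     tag_id INTEGER NOT NULL, created_at TEXT);
INSERT INTO tags(name) VALUES ('Action'), ('Animation');
INSERT INTO movies(name, tag_id, created_at) VALUES
  ('The Matrix', 1, '1999-05-21'),
  ('Tenet', 1, '2020-10-29'),
  ('Wonder Woman 1984', 1, '2020-12-25'),
  ('Toy Story', 2, '1995-12-22'),
  ('Monsters Inc.', 2, '2001-11-14');
""")

# Emulate the Nested Loop: for each tag, fetch only its own
# two most recent movies with an independent ordered query.
result = []
for (tag_id,) in conn.execute("SELECT id FROM tags ORDER BY id"):
    result.extend(conn.execute(
        "SELECT tag_id, name FROM movies "
        "WHERE tag_id = ? ORDER BY created_at DESC LIMIT 2",
        (tag_id,)))
print(result)
# [(1, 'Wonder Woman 1984'), (1, 'Tenet'), (2, 'Monsters Inc.'), (2, 'Toy Story')]
```

LATERAL pushes this loop down into the executor, so the per-group work happens in one query instead of one round trip per group.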

 

https://www.cybertec-postgresql.com/en/understanding-lateral-joins-in-postgresql/#

https://amandasposito.com/postgresql/performance/2021/01/04/postgres-lateral-join.html

https://blog.csdn.net/horses/article/details/86510905 (supported in MySQL 8.0.14+)

https://vladmihalcea.com/sql-lateral-join/ (supported since Oracle 12c)

https://blog.csdn.net/carcarrot/article/details/107174468 (supported since PostgreSQL 9.3)

SQL Server provides the equivalent functionality through CROSS APPLY / OUTER APPLY.

posted @ zhjh256