Using hints for Postgresql
本文转自:http://pghintplan.osdn.jp/pg_hint_plan.html
pg_hint_plan 1.1
Name
pg_hint_plan -- controls execution plan with hinting phrases in comment of special form.
Synopsis
PostgreSQL uses cost based optimizer, which utilizes data statistics, not static rules. The planner (optimizer) esitimates costs of each possible execution plans for a SQL statement then the execution plan with the lowest cost finally be executed. The planner does its best to select the best best execution plan, but not perfect, since it doesn't count some properties of the data, for example, correlation between columns.
pg_hint_plan makes it possible to tweak execution plans using so-called "hints", which are simple descriptions in the SQL comment of special form.
Description
Basic Usage
pg_hint_plan reads hinting phrases in a comment of special form given with the target SQL statement. The special form is beginning by the character sequence "/*+" and ends with "*/". Hint phrases are consists of hint name and following parameters enclosed by parentheses and delimited by spaces. Each hinting phrases can be delimited by new lines for readability.
In the example below , hash join is selected as the joning method and scanning pgbench_accounts by sequential scan method.
postgres=# /*+
postgres*# HashJoin(a b)
postgres*# SeqScan(a)
postgres*# */
postgres-# EXPLAIN SELECT *
postgres-# FROM pgbench_branches b
postgres-# JOIN pgbench_accounts a ON b.bid = a.bid
postgres-# ORDER BY a.aid;
QUERY PLAN
---------------------------------------------------------------------------------------
Sort (cost=31465.84..31715.84 rows=100000 width=197)
Sort Key: a.aid
-> Hash Join (cost=1.02..4016.02 rows=100000 width=197)
Hash Cond: (a.bid = b.bid)
-> Seq Scan on pgbench_accounts a (cost=0.00..2640.00 rows=100000 width=97)
-> Hash (cost=1.01..1.01 rows=1 width=100)
-> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=100)
(7 rows)
postgres=#
The types of hints
Hinting phrases are classified into four types based on what kind of object they can affect. Scaning methods, join methods, joining order, row number correction and GUC setting. You will see the lists of hint phrases of each type in Hint list.
Hints for scan methods
Scan method hints enforce the scanning method on the table specified as parameter. pg_hint_plan recognizes the target table by alias names if any. They are 'SeqScan' , 'IndexScan' and so on.
Scan hints are effective on ordinary tables, inheritance tables, UNLOGGED tables, temporary tables and system catalogs. It cannot be applicable on external(foreign) tables, table functions, VALUES command results, CTEs, Views and Sub-enquiries.
Hints for join methods
Join method hints enforces the join methods of the joins consists of tables specified as parameters.
Ordinary tables, inheritance tables, UNLOGGED tables, temporary tables, external (foreign) tables, system catalogs, table functions, VALUES command results and CTEs are allowed to be in the parameter list. But views and sub query are not.
Hint for joining order
Joining in specific order can be enforced using the "Leading" hint. The objects are joined in the order of the objects in the parameter list.
Hint for row number correction
From the restriction of the planner's capability, it misestimates the number of results on some conditions. This type of hint corrects it.
GUC parameters temporarily setting
'Set' hint changes GUC parameters just while planning. GUC parameter shown in Query Planning can have the expected effects on planning unless any other hint conflicts with the planner method configuration parameters. The last one among hints on the same GUC parameter makes effect. GUC parameters for pg_hint_plan are also settable by this hint but it won't work as your expectation. See Restrictions for details.
GUC parameters for pg_hint_plan
GUC parameters below affect the behavior of pg_hint_planpg_hint_plan.
Parameter name | discription | Default |
---|---|---|
pg_hint_plan.enable_hint | Enbles or disables the function of pg_hint_plan. | on |
pg_hint_plan.debug_print | Enables and select the verbosity of the debug output of pg_hint_plan. off, on, detailed and verbose are valid. | off |
pg_hint_plan.message_level | Specifies the message level of debug prints. error, warning, notice, info, log, debug are valid and fatal and panic are inhibited. | info |
PostgreSQL 9.1 requires a custom variable class to be defined for those GUC parameters. See custom_variable_classes for details.
Installation
This section describes the installation steps.
building binary module
Simplly run "make" in the top of the source tree, then "make install" as appropriate user. The PATH environment variable should be set properly for the target PostgreSQL for this process.
$ tar xzvf pg_hint_plan-1.x.x.tar.gz $ cd pg_hint_plan-1.x.x $ make $ su # make install
Loding pg_hint_plan
Basically pg_hint_plan does not requires CREATE EXTENSION. Simplly loading it by LOAD command will activate it and of course you can load it globally by setting shared_preload_libraries in postgresql.conf. Or you might be interested in ALTER USER SET/ALTER DATABASE SET for automatic loading for specific sessions.
postgres=# LOAD 'pg_hint_plan'; LOAD postgres=#
Do CREATE EXTENSION and SET pg_hint_plan.enable_hint_tables TO on if you are planning to hint tables.
Unistallation
"make uninstall" in the top directory of source tree will uninstall the installed files if you installed from the source tree and it is left available.
$ cd pg_hint_plan-1.x.x $ su # make uninstall
Hint descriptions
This section explains how to spell each type of hints.
Scan method hints
Scan hints have basically has one parameter to specify the target object. This additional parameter for scans using indexes is preferable index name. The target object should be specified by its alias name if any. In the following example, table1 is scanned by sequential scan and table2 is scanned using the primary key index.
postgres=# /*+ postgres*# SeqScan(t1) postgres*# IndexScan(t2 t2_pkey) postgres*# */ postgres-# SELECT * FROM table1 t1 JOIN table table2 t2 ON (t1.key = t2.key);
Join method hints
Join hints have two or more objects which compose the join as parameters. If three objects are specified, the hint will be applied when joining any one of them after joining other two objects. In the following example, table1 and table2 are joined fisrt using nested loop and the result is joined against table3 using merge join.
postgres=# /*+ postgres*# NestLoop(t1 t2) postgres*# MergeJoin(t1 t2 t3) postgres*# Leading(t1 t2 t3) postgres*# */ postgres-# SELECT * FROM table1 t1 postgres-# JOIN table table2 t2 ON (t1.key = t2.key) postgres-# JOIN table table3 t3 ON (t2.key = t3.key);
Joining order hints
Although there might be the case that table2 and 3 are joined first and table1 after that and the NestLoop hint won't be in effect after all. "Leading" hint enforces the joining order for the cases. The Leading hint in the above example enforces the joining order to table1, 2, 3 then both join method hints will be effective.
The above form of Leading hint enforces joining order but joining direction (inner/outer or driven/driving assignment) is left to the planner. If you want to also enforce joining directions, the second form of this hint will help.
postgres=# /*+ Leading((t1 (t2 t3))) */ SELECT...
Every pair of parentheses enclose two elements which are an object or nested parentheses. The first element in a pair of parentheses is the driver or outer table and the second is the driven or inner.
Row number correction hints
Planner misestimates the number of the records for joins on some condition. This hint can corrects the number by several methods, which are absolute value, addition/subtraction and multiplication. The parameters are the list of objects compose the targetted join then operation. The following example shows notations to correct the number of the join on a and b by the four correction methods.
postgres=# /*+ Rows(a b #10) */ SELECT... ; Sets rows of join result to 10 postgres=# /*+ Rows(a b +10) */ SELECT... ; Increments row number by 10 postgres=# /*+ Rows(a b -10) */ SELECT... ; Subtracts 10 from the row number. postgres=# /*+ Rows(a b *10) */ SELECT... ; Makes the number 10 times larger.
GUC temporarily setting
"Set" hint sets GUC parameter values during the target statement is under plannning. In the following example, planning for the query is done with random_page_cost is 2.0.
postgres=# /*+ postgres*# Set(random_page_cost 2.0) postgres*# */ postgres-# SELECT * FROM table1 t1 WHERE key = 'value'; ...
Hint syntax
Hint comment location
pg_hint_plan reads hints from only the first block comment, and any characters except alphabets, digits, spaces, underscores, commas and parentheses are not allowed before the comment. In the following example HashJoin(a b) and SeqScan(a) are recognized as Hint and IndexScan(a) and MergeJoin(a b) is not.
postgres=# /*+
postgres*# HashJoin(a b)
postgres*# SeqScan(a)
postgres*# */
postgres-# /*+ IndexScan(a) */
postgres-# EXPLAIN SELECT /*+ MergeJoin(a b) */ *
postgres-# FROM pgbench_branches b
postgres-# JOIN pgbench_accounts a ON b.bid = a.bid
postgres-# ORDER BY a.aid;
QUERY PLAN
---------------------------------------------------------------------------------------
Sort (cost=31465.84..31715.84 rows=100000 width=197)
Sort Key: a.aid
-> Hash Join (cost=1.02..4016.02 rows=100000 width=197)
Hash Cond: (a.bid = b.bid)
-> Seq Scan on pgbench_accounts a (cost=0.00..2640.00 rows=100000 width=97)
-> Hash (cost=1.01..1.01 rows=1 width=100)
-> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=100)
(7 rows)
postgres=#
Escaping special chacaters in object names
The objects as the hint parameter should be enclosed by double quotes if they includes parentheses, double quotes and white spaces. The escaping rule is the same as PostgreSQL.
Distinction among table occurrences with the same name
Target name duplication caused by multiple occurrences of the same object or objects with the same name in different name spaces can be avoided by giving alias names for each occurrence in the target query and using them in hint phases. The example below, the first SQL statement results in error from using a table name appeared twice in the target query, while the second example works since each occurrence of table t1 is given a distinct alias name and specified in the HashJoin hint using it.
postgres=# /*+ HashJoin(t1 t1)*/
postgres-# EXPLAIN SELECT * FROM s1.t1
postgres-# JOIN public.t1 ON (s1.t1.id=public.t1.id);
INFO: hint syntax error at or near "HashJoin(t1 t1)"
DETAIL: Relation name "t1" is ambiguous.
QUERY PLAN
------------------------------------------------------------------
Merge Join (cost=337.49..781.49 rows=28800 width=8)
Merge Cond: (s1.t1.id = public.t1.id)
-> Sort (cost=168.75..174.75 rows=2400 width=4)
Sort Key: s1.t1.id
-> Seq Scan on t1 (cost=0.00..34.00 rows=2400 width=4)
-> Sort (cost=168.75..174.75 rows=2400 width=4)
Sort Key: public.t1.id
-> Seq Scan on t1 (cost=0.00..34.00 rows=2400 width=4)
(8 行)
postgres=# /*+ HashJoin(pt st) */
postgres-# EXPLAIN SELECT * FROM s1.t1 st
postgres-# JOIN public.t1 pt ON (st.id=pt.id);
QUERY PLAN
---------------------------------------------------------------------
Hash Join (cost=64.00..1112.00 rows=28800 width=8)
Hash Cond: (st.id = pt.id)
-> Seq Scan on t1 st (cost=0.00..34.00 rows=2400 width=4)
-> Hash (cost=34.00..34.00 rows=2400 width=4)
-> Seq Scan on t1 pt (cost=0.00..34.00 rows=2400 width=4)
(5 行)
postgres=#
Restrictions
Limitations on multiple VALUES lists in FROM clauses
All occurences of VALUES lists in FROM clauses in a query has the same name "*VALUES*" irrespective of aliases syntactically given to them or shown in explain descriptions. So it cannot be hinted at all if appeares twice or more in a target query.
Hinting on inheritance children
Inheritnce children cannot be hinted individually. They share the same hints on their parent.
Setting pg_hint_plan parameters by Set hints
pg_hint_plan paramters changes the behavior of itself so some parameters doesn't work as expected.
- Hints to change enable_hint, enable_hint_tables are ignored, but they are reported as "used hints" in debug logs.
- Setting debug_print and message_level works from midst of the processing of the target query.
Technics to hint on desired targets
Hinting on objecects implicitly used in the target query
Hints are effective on any objects with the target name even if they aren't aparent in the query, specifically objects in views. For that reason, you should create different views in which targetted objects have distinct aliases if you want to hint them differently from the first view.
In the following examples, the first query is assigning the same name "t1" on the two occurrences of the table1 so the hint SeqScan(t1) affects both scans. On the other hand the second assignes the different name 't3' on the one of them so the hint affects only on the rest one.
This mechanism also applies on rewritten queries by rules.
postgres=# CREATE VIEW view1 AS SELECT * FROM table1 t1;
CREATE TABLE
postgres=# /*+ SeqScan(t1) */
postgres=# EXPLAIN SELECT * FROM table1 t1 JOIN view1 t2 ON (t1.key = t2.key) WHERE t2.key = 1;
QUERY PLAN
-----------------------------------------------------------------
Nested Loop (cost=0.00..358.01 rows=1 width=16)
-> Seq Scan on table1 t1 (cost=0.00..179.00 rows=1 width=8)
Filter: (key = 1)
-> Seq Scan on table1 t1 (cost=0.00..179.00 rows=1 width=8)
Filter: (key = 1)
(5 rows)
postgres=# /*+ SeqScan(t3) */
postgres=# EXPLAIN SELECT * FROM table1 t3 JOIN view1 t2 ON (t1.key = t2.key) WHERE t2.key = 1;
QUERY PLAN
--------------------------------------------------------------------------------
Nested Loop (cost=0.00..187.29 rows=1 width=16)
-> Seq Scan on table1 t3 (cost=0.00..179.00 rows=1 width=8)
Filter: (key = 1)
-> Index Scan using foo_pkey on table1 t1 (cost=0.00..8.28 rows=1 width=8)
Index Cond: (key = 1)
(5 rows)
Hinting on the hinheritance children
Hints targeted on inheritance parents automatically affect on all their own children. Child tables cannot have their own hint specified.
Scope of hints on multistatement
One multistatement description can have exactly one hint comment and the hints affects all of the individual statement in the multistatement. Notice that the seemingly multistatement on the interactive interface of psql is internally a sequence of single statements so hints affects only on the statement just following. Conversely, every single statement have their own hint comments affect on them.
Subqueries in some contexts
Subqueries in the following context also can be hinted.
IN (SELECT ... {LIMIT | OFFSET ...} ...) = ANY (SELECT ... {LIMIT | OFFSET ...} ...) = SOME (SELECT ... {LIMIT | OFFSET ...} ...)
For these syntaxes, planner internally assigns the name of "ANY_subquery" to the subquery when planning joins including it, so join hints are applicable on such joins using the implicit name.
postgres=# /*+HashJoin(a1 ANY_subquery)*/ postgres=# EXPLAIN SELECT * postgres=# FROM pgbench_accounts a1 postgres=# WHERE aid IN (SELECT bid FROM pgbench_accounts a2 LIMIT 10); QUERY PLAN --------------------------------------------------------------------------------------------- Hash Semi Join (cost=0.49..2903.00 rows=1 width=97) Hash Cond: (a1.aid = a2.bid) -> Seq Scan on pgbench_accounts a1 (cost=0.00..2640.00 rows=100000 width=97) -> Hash (cost=0.36..0.36 rows=10 width=4) -> Limit (cost=0.00..0.26 rows=10 width=4) -> Seq Scan on pgbench_accounts a2 (cost=0.00..2640.00 rows=100000 width=4) (6 rows)
Using IndexOnlyScan hint (PostgreSQL 9.2 and later)
You shoud explicitly specify an index that can perform index only scan if you put IndexOnlyScan hint on a table that have other indexes that cannot perform index only scan. Or pg_hint_plan may select them.
Precaution points for NoIndexScan hint (PostgreSQL 9.2 and later)
NoIndexScan hint involes NoIndexOnlyScan.
Errors of hints
pg_hint_plan stops parsing on any error and uses hints already parsed on the most cases. Followings are the typical errors.
Syntax errors
Any syntactical errors or wrong hint names are reported as an syntax error. These errors are reported in the server log with the message level which specified by pg_hint_plan.message_level if pg_hint_plan.debug_print is on and aboves.
Object misspecifications
Object misspecifications results silent ingorance of the hints. This kind of error is reported as "not used hints" in the server log by the same condtion to syntax errors.
Redundant or conflicting hints
The last hint will be active when redundant hints or hints conflicting with each other. This kind of error is reported as "duplication hints" in the server log by the same condition to syntax errors.
Nested comments
Hint comment cannot include another block comment within. If pg_hint_plan finds it, differently from other erros, it stops parsing and abandans all hints already parsed. This kind of error is reported in the same manner as other errors.
Functional limitations
Influences of some planner GUC parameters
The planner does not try to consider joining order for FROM clause entries more than from_collapse_limit. pg_hint_plan cannot affect joining order as expected for the case.
Cases that pg_hint_plan essentially cannot affect
By the nature of pg_hint_plan, it cannot affect some cases that out of scope of the planner like following.
- FULL OUTER JOIN to use nested loop
- To use indexes that does not have columns used in quals
- To do TID scans for queries without ctid conditions
Queries in ECPG
ECPG removes comments in queries written as embedded SQLs so hints cannot be passed form those queries. The only exception is that EXECUTE command passes given string unmodifed. Please consider hint tables for this case.
Effects on query fingerprints
The same queries with different commnets yields the same fingerprint by pg_stat_statements on PostgreSQL 9.2 and later, but they yield different fingerprints on 9.1 and earlier, so the same queires with different hints are summerized as separate queries on such versions.
Requirements
- PostgreSQL versions tested
- Version 9.1, 9.2, 9.3, 9.4
- OS versions tested
- RHEL 6.5, 7.0
See also
PostgreSQL documents
----------------------------------------------------------
《高性能SQL调优精要与案例解析》
blog1:http://www.cnblogs.com/lhdz_bj
blog2:http://blog.itpub.net/8484829
blog3:http://blog.csdn.net/tuning_optmization