06 2013 档案
摘要:当PostgreSQL需要insert 一条记录的时候,它会把记录头放入xmin,xmax等字段。xmin的值,就是当前的Transaction的TransactionId。这是为了满足MVCC的需要。跟踪程序进行了解:/* * Allocate the next XID for a new transaction or subtransaction. * * The new XID is also stored into MyProc before returning. * * Note: when this is called, we are actually already inside
阅读全文
摘要:对于PostgreSQL的 "create index concurrently". 我个人认为其中存在一个bug。我的验证过程如下:我有两个表,tab01和 tab02,这两个表之间没有任何关联。我认为 对 tab02执行 "create index concurrently" 不会对 访问tab01的事务有任何影响,然而事实并非尽然。我第一程序的表现: 通过ecpg执行事务,再通过 "create index concurrently" 给tab02建立索引,成功。我第二程序的表现:通过ecpg执行事务,再通过 "cre
阅读全文
摘要:我给PostgreSQL的源代码加入了调试信息以后,会有如下表现:我执行Prepare:postgres=# prepare s(int) as select * from tst01 t where id < $1;PREPAREpostgres=# 背后的反应:** In PostgresMain In exec_simple_query loop for parsetree_list++++++++++++++++++++++++++Before pg_plan_queries***************In pg_plan_queries -------start.......
阅读全文
摘要:/* * get_baserel_parampathinfo * Get the ParamPathInfo for a parameterized path for a base relation, * constructing one if we don't have one already. * * This centralizes estimating the rowcounts for parameterized paths. * We need to cache those to be sure we use the same rowcount for ...
阅读全文
摘要:看官方文档:http://www.postgresql.org/docs/current/static/sql-prepare.htmlPREPARE creates a prepared statement. A prepared statement is a server-side object that can be used to optimize performance. When the PREPARE statement is executed, the specified statement is parsed, analyzed, and rewritten. When an
阅读全文
摘要:Improve the planner's ability to use nested loops with inner index scans (Tom Lane)The new "parameterized path" mechanism allows inner index scans to use values from relations that are more than one join level up from the scan. This can greatly improve performance in situations where s
阅读全文
摘要:代码:import java.sql.Connection;import java.sql.DriverManager;import java.sql.PreparedStatement;import java.sql.ResultSet;public class Test02 { public static void main(String argsv[]){ try { Class.forName("org.postgresql.Driver").newInstance(); String url = "jdbc...
阅读全文
摘要:Java 代码,在数据库端,并没有当成 prepared statetment 被处理。C代码通过libpq 访问数据库端,被当成了 prepared statement 处理。也许是因PostgreSQL对JDBC的支持毕竟是后期出现的:下面看代码和运行结果:Java 代码:import java.sql.*; public class Test01 { public static void main(String argsv[]){ try { Class.forName("org.postgresql.Driver").n...
阅读全文
摘要:代码如下:[root@lex tst]# cat testlibpq.c/* * testlibpq.c * Test the C version of LIBPQ, the POSTGRES frontend library. */#include <stdio.h>#include <stdlib.h>#include "libpq-fe.h"static voidexit_nicely(PGconn *conn){ PQfinish(conn); exit(EXIT_SUCCESS);}intmain(){ int nFields; int i
阅读全文
摘要:程序:[root@lex tst]# cat testlibpq.c/* * testlibpq.c * Test the C version of LIBPQ, the POSTGRES frontend library. */#include <stdio.h>#include <stdlib.h>#include "libpq-fe.h"static voidexit_nicely(PGconn *conn){ PQfinish(conn); exit(EXIT_SUCCESS);}intmain(){ char *pghost, *pgpor
阅读全文
摘要:http://www.postgresql.org/docs/9.2/static/sql-prepare.htmlPREPARE creates a prepared statement. A prepared statement is a server-side object that can be used to optimize performance. When the PREPARE statement is executed, the specified statement is parsed, analyzed, and rewritten. When an EXECUTE c
阅读全文
摘要:继续分析/* * final_cost_hashjoin * Final estimate of the cost and result size of a hashjoin path. * * Note: the numbatches estimate is also saved into 'path' for use later * * 'path' is already filled in except for the rows and cost fields and * num_batches * 'workspace' is the r
阅读全文
摘要:看hash join 里的概念:numbuckets与numbatches:numbuckets是 在内存中,hash表里面,"桶"的个数。numbatches是 如果hash表过大,在内存中放不下,则要分多次进行。voidfinal_cost_hashjoin(PlannerInfo *root, HashPath *path, JoinCostWorkspace *workspace, SpecialJoinInfo *sjinfo, SemiAntiJoinFactors *...
阅读全文
摘要:Path的定义:/* * Type "Path" is used as-is for sequential-scan paths, as well as some other * simple plan types that we don't need any extra information in the path for. * For other path types it is the first component of a larger struct. * * "pathtype" is the NodeTag of the Plan
阅读全文
摘要:此处,分析 add_paths_to_joinrel:/* * add_paths_to_joinrel * Given a join relation and two component rels from which it can be made, * consider all possible paths that use the two component rels as outer * and inner rel respectively. Add these paths to the join rel's pathlist * if the...
阅读全文
摘要:其代码结构:/* * build_join_rel * Returns relation entry corresponding to the union of two given rels, * creating a new relation entry if none already exists. * * 'joinrelids' is the Relids set that uniquely identifies the join * 'outer_rel' and 'inner_rel' are relation nodes for t
阅读全文
摘要:对RelOptInfo *make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2) 函数进行研究:看看 inner join 时候发生的事情:RelOptInfo *make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2){ ... SpecialJoinInfo *sjinfo; ... /* Check validity and determine join type. */ if (!join_is_legal...
阅读全文
摘要:在PostgreSQL的源代码中,有如下调用关系:query_planner -->generate_base_implied_equalities -->generate_base_implied_qualities_const -->process_implied_equality -->distribute_qual_to_relsdistribute_qual_to_rels的函数定义如下:static voiddistribute_qual_to_rels(PlannerInfo *root, Node *clause, ...
阅读全文
摘要:以A表和B表的连接而言:A.col=B.colJOIN_INNER: 就是等值连接。找出A中有,B中也有,A和B的对应字段相等的记录的信息。postgres=# select * from sales s inner join customers c on s.cust_id = c.cust_id; cust_id | item | cust_id | cust_name ---------+----------+---------+----------- 2 | camera | 2 | John Doe 3 | computer | ...
阅读全文
摘要:对Join的成本计算的调用路径:make_one_rel ->make_rel_from_joinlist ->standard_join_search -> join_search_one_level -> make_join_rel -> add_paths_to_joinrel ->hash_inner_and_outer ->try_hashjoin_path ->initial_cost_hashjoin在make_join_rel函...
阅读全文
摘要:开始用 explain来看hash join:postgres=# explain select dept.no_emps,emp.age from dept,emp where emp.name = dept.mgr and dept.dept_name = 'shoe'; QUERY PLAN ------------------------------------------------------------------ Hash Join (cost=19.30..45.07...
阅读全文
摘要:由于用单纯的SQL语句来探查代码,看得还是不够清楚。所以我再采用如下的方法:postgres=# explain select dept.no_emps,emp.age from dept,emp where emp.name = dept.mgr and dept.dept_name = 'shoe'; QUERY PLAN ------------------------------------------------------------------ Hash Join (co...
阅读全文
摘要:Oracle 的SQL文,可以强制指定各种 hint。但是在PostgreSQL中是不支持的。其wiki 是这样说的:http://wiki.postgresql.org/wiki/OptimizerHintsDiscussionWe are not interested in implementing hints in the exact ways they are commonly implemented on other databases. Proposals based on "because they've got them" will not be w
阅读全文
摘要:看到了其源代码中的一段注释,似乎认识又提高了一层:/* * INTERFACE ROUTINES * ExecInitNode - initialize a plan node and its subplans * ExecProcNode - get a tuple by executing the plan node * ExecEndNode - shut down a plan node and its subplans * * NOTES * T...
阅读全文
摘要:PostgreSQL中,有一种 tidscan。当我在使用cursor的时候,会用到此种scan:[postgres@lex pgsql]$ cat ./data/test.sqlCREATE OR REPLACE Function FindCourse ( name_in IN varchar ) RETURNS integer LANGUAGE plpgsql AS $$DECLARE cnumber integer; c1 CURSOR FOR SELECT course_number, instructor from course_t...
阅读全文
摘要:postgres=# select FindCourse('aaaa');ERROR: cannot begin/end transactions in PL/pgSQLHINT: Use a BEGIN block with an EXCEPTION clause instead.CONTEXT: PL/pgSQL function findcourse(character varying) line 21 at SQL statement出现这种错误的原因是:看我改后的代码:[postgres@lex pgsql]$ cat ./data/test.sqlCREATE OR
阅读全文
摘要:接着追踪->pathlist:初始化可能是在这里完成的?/* * add_path * Consider a potential implementation path for the specified parent rel, * and add it to the rel's pathlist if it is worthy of consideration. * A path is worthy if it has a better sort order (better pathkeys) or * cheaper cost (on either ...
阅读全文
摘要:接前面,继续分析 : cheapest_startup_path = cheapest_total_path = NULL; have_parameterized_paths = false; foreach(p, parent_rel->pathlist) { Path *path = (Path *) lfirst(p); int cmp; /* We only consider unparameterized paths in this step */ if (path->p...
阅读全文
摘要:接前面。从 cheapest_total_path 来看:query_planner 会先调用 build_simple_rel,这里完成了 cheapest_total_path 的初始化。(query_planner --> add_base_rels_to_query --> query_planner)然后,query_planner 再在后面调用 create_plan,最终设置 plan_rows。那么,是在何处 设置了 cheapest_total_path 的值呢?经过查找,看到了如下的调用关系:make_one_rel--> set_base_rel_pat
阅读全文
摘要:接前面,从 cheapeast_path 的角度,关注 query_planner 函数,对其进行简化:voidquery_planner(PlannerInfo *root, List *tlist, double tuple_fraction, double limit_tuples, Path **cheapest_path, Path **sorted_path, double *num_groups){ Query *parse = root->parse; List *...
阅读全文
摘要:接前面。从PostgreSQL的 log中,看到 计划树中有一个 plan_rows的东西。分析它的来源和来龙去脉:grouping_planner --> create_plan --> create_plan_recurse --> create_scan_plan--> create_seqscan_plan --> copy_path_costsize而copy_path_costsize 中:/* * Copy cost and size info from a Path node to the Plan node created from it. *
阅读全文
摘要:为了少走弯路,有必要了解如下概念:http://www.postgresql.org/docs/9.2/static/querytree.htmlThe rule system is located between the parser and the planner. It takes the output of the parser, one query tree, and the user-defined rewrite rules, which are also query trees with some extra information, and creates zero or m
阅读全文
摘要:在看PostgreSQL的源代码的时候,总是看到 CTE。所谓CTE,就是 common table express。这里有一个小例子:WITH test(x) AS (SELECT 1 UNION SELECT 2)SELECT * FROM test;这个是官方说明:http://www.postgresql.org/docs/9.2/static/queries-with.htmlWITH provides a way to write auxiliary statements for use in a larger query. These statements, which are.
阅读全文
摘要:开始 /* * Ready to do the primary planning. */ final_rel = make_one_rel(root, joinlist);展开:/* * make_one_rel * Finds all possible access paths for executing a query, returning a * single rel that represents the join of all base rels in the query. */RelOptInfo *make_one_rel(Plan...
阅读全文
摘要:继续分析 query_planner:/* * query_planner * Generate a path (that is, a simplified plan) for a basic query, * which may involve joins but not any fancier features. * * Since query_planner does not handle the toplevel processing (grouping, * sorting, etc) it cannot select the best path by itsel...
阅读全文
摘要:接前面: Query *parse = root->parse; List *tlist = parse->targetList; int64 offset_est = 0; int64 count_est = 0; double limit_tuples = -1.0; Plan *result_plan; List *current_pathkeys; double dNumGroups = 0; bool use_ha...
阅读全文
摘要:接前面,继续对 subquery_planner来分析:下面这一段都是对 表达式进行处理的,对我的简单查询,可以忽略。 /* * Do expression preprocessing on targetlist and quals, as well as other * random expressions in the querytree. Note that we do not need to * handle sort/group expressions explicitly, because they are actually * part o...
阅读全文
摘要:接着,分析:/*-------------------- * subquery_planner * Invokes the planner on a subquery. We recurse to here for each * sub-SELECT found in the query tree. * * glob is the global state for the current planner run. * parse is the querytree produced by the parser & rewriter. * parent_root is the...
阅读全文
摘要:再次上溯:/*-------------------- * subquery_planner * Invokes the planner on a subquery. We recurse to here for each * sub-SELECT found in the query tree. * * glob is the global state for the current planner run. * parse is the querytree produced by the parser & rewriter. * parent_root is the ...
阅读全文
摘要:接前面,再上溯:set_base_rel_pathlists -->set_rel_pathlist/* * set_base_rel_pathlists * Finds all paths available for scanning each base-relation entry. * Sequential scan and any available indices are considered. * Each useful path is attached to its relation's 'pathlist' field. */static void
阅读全文
摘要:现在,将要进入最为重要的部分:分析如何制作执行计划的。先把它简化,看看NodeTag 在 PortalStart 之前,是如何得到的。先找到第一个相关程序:每一种执行计划(结点单位),都有一个对应的path。/***************************************************************************** * PATH NODE CREATION ROUTINES *****************************************************************************//*...
阅读全文
摘要:回溯:PortalRun --> PortalRunSelectboolPortalRun(Portal portal, long count, bool isTopLevel, DestReceiver *dest, DestReceiver *altdest, char *completionTag){ ... portal->status = PORTAL_ACTIVE; ... PG_TRY(); { ActivePortal = portal; CurrentResourceOwner = ...
阅读全文
摘要:再次上溯:可以知道,在 ExecutePlan入口参数里面,start_block 就已经是0了。/* ---------------------------------------------------------------- * ExecutePlan * * Processes the query plan until we have processed 'numberTuples' tuples, * moving in the specified direction. * * Runs to completion if nu...
阅读全文
摘要:基本上是没有声明头文件造成可以 gcc 加 -E 参数然后,用文本编辑器打开生成的 .o文件,看其具体缺少哪些信息的状况。
阅读全文
摘要:接前面,再次上溯。可以知道:ExecProcNode: 会根据情况不同,将PlanNode转为各种类型Node再参与运算:TupleTableSlot *ExecProcNode(PlanState *node){ //ExecProcNode fprintf(stderr,"ExecProcNode:node->ss_currentScanDesc->rs_startblock is: %d by process %d\n", ((SeqScanState *) node)->ss_currentScanDesc->rs_startblock,ge
阅读全文
摘要:接前面,看被SeqNext所调用的heap_getnext:HeapTupleheap_getnext(HeapScanDesc scan, ScanDirection direction){ /* Note: no locking manipulations needed */ HEAPDEBUG_1; /* heap_getnext( info ) */ if (scan->rs_pageatatime) heapgettup_pagemode(scan, direction,scan->rs_nkeys, scan->rs_k...
阅读全文