05 2013 Archive
Summary: Continuing from the previous post, a look at the SeqNext function: /* ---------------------------------------------------------------- * SeqNext * * This is a workhorse for ExecSeqScan * ---------------------------------------------------------------- */ static TupleTableSlot *SeqNext(SeqScanState *node) { HeapTuple tuple; Heap...
Summary: Continuing from the previous post, this time focusing on ExecScan. Inside its for loop: for (;;) { TupleTableSlot *slot; CHECK_FOR_INTERRUPTS(); slot = ExecScanFetch(node, accessMtd, recheckMtd); /* * if the slot returned by the accessMtd contains NULL, then it means * there is nothing more to scan so we j...
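The loop shape quoted above (fetch a row, stop on NULL, otherwise keep going) can be sketched in isolation. This is a minimal illustration rather than PostgreSQL's code: `DemoCursor`, `demo_fetch`, `demo_qual` and `demo_exec_scan` are hypothetical names, rows are plain ints, and the qualification check stands in for a WHERE clause.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical scan state: an array of rows and a read position. */
typedef struct DemoCursor { const int *rows; size_t nrows; size_t pos; } DemoCursor;

/* Fetch step: return the next row, or NULL when nothing is left. */
const int *demo_fetch(DemoCursor *c)
{
    return c->pos < c->nrows ? &c->rows[c->pos++] : NULL;
}

/* Hypothetical qualification: keep only even values. */
bool demo_qual(const int *row)
{
    return *row % 2 == 0;
}

/* Return the next qualifying row, or NULL once the scan is exhausted. */
const int *demo_exec_scan(DemoCursor *c)
{
    for (;;)
    {
        const int *row = demo_fetch(c);

        if (row == NULL)
            return NULL;        /* nothing more to scan */
        if (demo_qual(row))
            return row;         /* row passes the filter */
        /* otherwise loop around and fetch the next row */
    }
}
```

Each call resumes where the previous one stopped, which is what lets a caller pull rows one at a time.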
Summary: Function pointers again. The header executor.h defines ExecScanAccessMtd, that is, a typedef for a pointer to a function with the ExecScanAccessMtd prototype: /* * prototypes from functions in execScan.c */ typedef TupleTableSlot *(*ExecScanAccessMtd) (ScanState *node); Then nodeSeqscan.c provides an implementation: /* ---------------------------------------------------------------- * SeqNex...
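For readers new to this idiom, here is a small sketch of a function-pointer typedef being used as a pluggable access method. Everything prefixed with `Demo`/`demo_` is a hypothetical stand-in (the real types are ScanState and TupleTableSlot); only the typedef shape mirrors ExecScanAccessMtd.

```c
#include <stddef.h>

/* Hypothetical stand-in for a scan state. */
typedef struct DemoScan { const int *rows; size_t nrows; size_t pos; } DemoScan;

/* Pointer-to-function type, in the same style as ExecScanAccessMtd. */
typedef const int *(*DemoAccessMtd)(DemoScan *scan);

/* One concrete access method: next row, or NULL at the end. */
const int *demo_seq_next(DemoScan *scan)
{
    if (scan->pos >= scan->nrows)
        return NULL;
    return &scan->rows[scan->pos++];
}

/* Generic driver: it only knows the function-pointer type,
 * not which access method it was handed. */
long demo_sum_all(DemoScan *scan, DemoAccessMtd next)
{
    long total = 0;
    const int *row;

    while ((row = next(scan)) != NULL)
        total += *row;
    return total;
}
```

Calling `demo_sum_all(&scan, demo_seq_next)` passes the function itself as an argument, which is how one generic loop can serve several scan types.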
Summary: Take a look: TupleTableSlot *ExecProcNode(PlanState *node) { TupleTableSlot *result; CHECK_FOR_INTERRUPTS(); if (node->chgParam != NULL) /* something changed */ ExecReScan(node); /* let ReScan handle this */ if (node->instrument) InstrStartNode(node->instrument); switch (node...
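The excerpt breaks off at a switch statement; the general pattern of dispatching on a node's type tag can be sketched as below. This is an assumption-laden toy, not the executor's code: the tags, the `DemoNode` struct and the per-node functions are all hypothetical.

```c
/* Hypothetical node tags and node struct. */
typedef enum DemoTag { DEMO_SEQSCAN, DEMO_LIMIT } DemoTag;
typedef struct DemoNode { DemoTag tag; int value; } DemoNode;

/* Hypothetical per-node-type routines. */
int demo_exec_seqscan(DemoNode *n) { return n->value; }
int demo_exec_limit(DemoNode *n)   { return n->value > 10 ? 10 : n->value; }

/* Dispatcher: pick the routine that matches the node's tag. */
int demo_exec_proc_node(DemoNode *n)
{
    switch (n->tag)
    {
        case DEMO_SEQSCAN:
            return demo_exec_seqscan(n);
        case DEMO_LIMIT:
            return demo_exec_limit(n);
    }
    return -1;                  /* unrecognized tag */
}
```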
Summary: Continuing from the previous post: /* ---------------------------------------------------------------- * ExecutorRun * * This is the main routine of the executor module. It accepts * the query descriptor from the traffic cop and executes the * query plan. * * ExecutorStart must have been called ...
Summary: Continuing from the previous post, a closer look at PortalRun. My initial judgment is that its core lies in: bool PortalRun(Portal portal, long count, bool isTopLevel, DestReceiver *dest, DestReceiver *altdest, char *completionTag) { ... portal->status = PORTAL_ACTIVE; ... PG_TRY(); { ActivePortal = portal; CurrentResourceOwner = por...
Summary: Continuing from the previous post, tracing the call chain back: exec_simple_query --> PortalStart --> ExecutorStart --> standard_ExecutorStart --> InitPlan. Now back to exec_simple_query. Known in advance: the file name for table tst04 is 16393. postgres=# select oid from pg_class where relname='tst04'; oid ------- 16393 (1 row) postgres=# Looking at exec_simple_query with some debugging output added: static
Summary: Continuing from the previous post: PortalStart calls ExecutorStart, and ExecutorStart in turn calls InitPlan: /* ---------------------------------------------------------------- * InitPlan * * Initializes the query plan: open files, allocate storage * and start up the rule manager * -----------------------------------------...
Summary: Continuing from the previous post, still looking at PortalStart, which contains: /* * Create QueryDesc in portal's context; for the moment, set * the destination to DestNone. */ queryDesc = CreateQueryDesc((PlannedStmt *) linitial(portal->stmts), ...
Summary: Back up one level, continuing with what PortalStart does: void PortalStart(Portal portal, ParamListInfo params, int eflags, bool use_active_snapshot) { ... PG_TRY(); { ... /* * Determine the portal execution strategy */ portal->strategy = ChoosePortalStrategy(portal->stmts); ...
Summary: Continuing from the previous post, more analysis: PortalStrategy ChoosePortalStrategy(List *stmts) { int nSetTag; ListCell *lc; /* * PORTAL_ONE_SELECT and PORTAL_UTIL_SELECT need only consider the * single-statement case, since there are no rewrite rules that can add * auxiliary queries to a SELECT or a util...
Summary: Continuing from the previous post, more analysis of ChoosePortalStrategy: /* * ChoosePortalStrategy * Select portal execution strategy given the intended statement list. * * The list elements can be Querys, PlannedStmts, or utility statements. * That's more general than portals need, but plancache.c uses this too. * * See the comments ...
Summary: The exec_simple_query function contains this line: plantree_list = pg_plan_queries(querytree_list, 0, NULL); So what exactly is inside plantree_list? Let me take it apart. plantree_list is of type List * (a pointer to a List): typedef struct List { NodeTag type; /* T_List, T_IntList, or T_OidList */ int length; ListCell *head; ...
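To get a feel for a header-plus-cells list like the one excerpted above, here is a simplified sketch. `DemoList` and `DemoCell` are hypothetical and carry only the fields needed for the illustration; the real List and ListCell definitions are truncated in the excerpt and are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical cell: a payload pointer plus a link to the next cell. */
typedef struct DemoCell { void *ptr_value; struct DemoCell *next; } DemoCell;

/* Hypothetical list header: cached length and pointer to the first cell. */
typedef struct DemoList { int length; DemoCell *head; } DemoList;

/* Walk the cells from the head and count them.  For a well-formed
 * list the result should equal list->length. */
int demo_list_walk(const DemoList *list)
{
    int n = 0;

    if (list == NULL)
        return 0;               /* treat a NULL pointer as the empty list */
    for (const DemoCell *c = list->head; c != NULL; c = c->next)
        n++;
    return n;
}
```

Keeping the length in the header makes "how many elements" a constant-time question, at the cost of updating it on every insert or delete.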
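Summary: Continuing from the previous post, look closely at this one: add_base_rels_to_query is a recursive function. Imagine something like select * from tst01 where id in (select sid from tst02) or id in (select sid from tst03): the function descends level by level, building up a binary-tree-shaped syntax tree. void add_base_rels_to_query(PlannerInfo *root, Node *jtnode) { if (jtnode == NULL) return; if (IsA(jtnode, RangeTblRef)) ...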
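The recursive descent described above can be sketched on a toy binary tree. This is only an illustration of the recursion pattern: `DemoJt` and its two node kinds are hypothetical, and the sketch counts leaves where the real function would do per-relation setup.

```c
#include <stddef.h>

/* Hypothetical join-tree node: either a leaf (a base relation)
 * or a join with a left and a right input. */
typedef enum DemoKind { DEMO_LEAF, DEMO_JOIN } DemoKind;
typedef struct DemoJt
{
    DemoKind       kind;
    int            rtindex;     /* meaningful for leaves only */
    struct DemoJt *larg;
    struct DemoJt *rarg;
} DemoJt;

/* Recursively visit the tree and count the base relations found. */
int demo_count_base_rels(const DemoJt *jt)
{
    if (jt == NULL)
        return 0;               /* empty subtree */
    if (jt->kind == DEMO_LEAF)
        return 1;               /* reached a base relation */
    /* join node: descend into both inputs */
    return demo_count_base_rels(jt->larg) + demo_count_base_rels(jt->rarg);
}
```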
Summary: Continuing from the previous post, going up one more level to read the code (planmain.c: query_planner): void query_planner(PlannerInfo *root, List *tlist, double tuple_fraction, double limit_tuples, Path **cheapest_path, Path **sorted_path, double *num_groups) { ... /* * Make a flattened version of the rangetable...
Summary: Going over what build_simple_rel does once more: /* * build_simple_rel * Construct a new RelOptInfo for a base relation or 'other' relation. */ RelOptInfo *build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind) { RelOptInfo *rel; RangeTblEntry *rte; /* Rel should not exist already */ Assert...
Summary: Continuing from the previous post, back to the get_relation_info function (plancat.c): relation is obtained from a call to heap_open. void get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent, RelOptInfo *rel) { Index varno = rel->relid; Relation relation; bool hasindex; List *indexinfos = NI...
Summary: Back to estimate_rel_size once more. I found that in the input parameter rel, the value of rel->rd_rel->reltuples is already fully prepared: /* * estimate_rel_size - estimate # pages and # tuples in a table or index * * We also estimate the fraction of the pages that are marked all-visible in * the visibility map, for use in estimation of index-only scans. * *
Summary: Continuing from the previous post, back to the call chain: estimate_rel_size -> RelationGetNumberOfBlocks -> RelationGetNumberOfBlocksInFork -> smgrnblocks -> mdnblocks... All this effort just to estimate the size of one table. So what unit is the block count we get back actually in? BlockNumber mdnblocks(SMgrRelation reln, ForkNumber forknum) { MdfdVec *v = mdopen(reln, forknum, EXTENSION_FAIL); BlockNumbe
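On the question of units: a block here is one fixed-size page, so a block count is the file length in bytes divided by the page size. The sketch below assumes the default 8192-byte page size (BLCKSZ is a compile-time setting, so other builds may differ); `DEMO_BLCKSZ` and `demo_nblocks` are hypothetical names.

```c
/* Assumed page size: 8192 bytes is the default, but it is configurable
 * at build time, so treat this constant as an assumption. */
#define DEMO_BLCKSZ 8192L

/* Number of whole blocks contained in a file of the given length.
 * A trailing partial block is not counted. */
long demo_nblocks(long file_size_bytes)
{
    return file_size_bytes / DEMO_BLCKSZ;
}
```

Under that assumption, a 24576-byte table file holds 3 blocks.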
Summary: Continuing from the previous post, back to mdopen: who calls mdopen, and what do they get from it? /* * mdnblocks() -- Get the number of blocks stored in a relation. * * Important side effect: all active segments of the relation are opened * and added to the mdfd_chain list. If this routine has not been * called, then only segments up ...
Summary: Reading the code: /* * mdopen() -- Open the specified relation. * * Note we only open the first segment, when there are multiple segments. * * If first segment is not present, either ereport or return NULL according * to "behavior". We treat EXTENSION_CREATE the same as EXTENSION_FAIL; * EXTENSION_CREAT
Summary: Something I learned online, noted here for reference: [root@lex tst]# cat gao3.c #include <stdio.h> #include <string.h> char * function1 (char *p) { printf("In function1 %s\n",p); return p; } char * function2 (char *p) { printf("In function2 %s\n",p); return p; } char * function3 (char *p) { printf("In function3 %s\
Summary: Looking back again at: /* * open a file in an arbitrary directory * * NB: if the passed pathname is relative (which it usually is), * it will be interpreted relative to the process' working directory * (which should always be $PGDATA when this code is running). */ File PathNameOpenFile(FileName fileName, int fileF
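Summary: Look at this code in PostgreSQL: /* Debugging.... */ #ifdef FDDEBUG #define DO_DB(A) A #else #define DO_DB(A) /* A */ #endif After this, when FDDEBUG is not defined, executing DO_DB(function1()); is equivalent to /*function1();*/, that is, nothing is executed. Moreover, searching the PostgreSQL source turns up no place where FDDEBUG is defined; presumably a developer adds it by hand when needed: [root@lex ttt]# find ./ | xargs grep "FDDEBUG" Bina...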
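The compile-time switch described above can be tried in a few lines. The names below (`DEMO_FDDEBUG`, `DEMO_DO_DB`, `demo_trace`, `demo_run`) are hypothetical; the macro pair follows the same pattern as the DO_DB definition quoted in the excerpt.

```c
/* Uncomment the next line (or pass -DDEMO_FDDEBUG to the compiler)
 * to compile the debug calls in. */
/* #define DEMO_FDDEBUG */

#ifdef DEMO_FDDEBUG
#define DEMO_DO_DB(A) A
#else
#define DEMO_DO_DB(A) /* A */
#endif

int demo_calls = 0;

/* Stand-in for a debug routine: just records that it was called. */
int demo_trace(void)
{
    demo_calls++;
    return 0;
}

/* With DEMO_FDDEBUG undefined the macro argument is discarded, so the
 * statement below shrinks to a lone semicolon and demo_trace never runs. */
int demo_run(void)
{
    DEMO_DO_DB(demo_trace());
    return demo_calls;
}
```

Because the argument is dropped by the preprocessor, the disabled debug calls cost nothing at run time.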
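Summary: Continuing from the previous post: having seen the call to BasicOpenFile, a natural question is whether two processes accessing the same table (that is, the same file) at the same time would conflict or hurt performance. I opened two psql clients; after they connected, two backend processes were spawned, and each ran select * from tst01 while I observed... The processes did not interfere with each other. My method was to add debugging code: /* * BasicOpenFile --- same as open(2) except can free other FDs if needed * * This is exported for use by places that really want a plain kernel F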
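Summary: In PostgreSQL the cluster is the root, a directory on disk, and the databases live under its base subdirectory. Under base, each database has its own directory: every time a database is created, a new directory appears under base. When a table is created in a database, a file corresponding to that table is created in that database's directory. How to look it up: suppose I have a table named tst01; its oid can be queried: postgres=# select oid from pg_class where relname='tst01'; oid ------- 16384 (1 row) postgres=# Listing the directory under base shows a file with the same name: [root@lex 12788]# ls /usr/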
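The layout described above can be expressed as a small path-building helper. This is a sketch with a hypothetical function name (`demo_rel_path`); the data directory is passed in as an assumption. One caveat: the on-disk file name is the relation's relfilenode, which starts out equal to the oid but can change later, so looking up relfilenode in pg_class is the safer query.

```c
#include <stdio.h>

/* Build "<datadir>/base/<database oid>/<relfilenode>" into buf.
 * Returns the value of snprintf: the length the full path needs. */
int demo_rel_path(char *buf, size_t buflen, const char *datadir,
                  unsigned db_oid, unsigned relfilenode)
{
    return snprintf(buf, buflen, "%s/base/%u/%u",
                    datadir, db_oid, relfilenode);
}
```

For example, a data directory of /usr/local/pgsql/data, database oid 12788 and relfilenode 16384 produce /usr/local/pgsql/data/base/12788/16384.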
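Summary: VFDs (virtual file descriptors) exist to work around the limit on file handles, so that OS-level file handles are not used up. I originally thought the VFD array was shared among all processes, but observation shows that every process has its own VFD array pointer. See the following two pieces of code with debugging output added. InitFileAccess: from VfdCache = (Vfd *) malloc(sizeof(Vfd)) one can more or less conclude that shared memory is not used. /* * InitFileAccess --- initialize this module during backend startup * * This is called during either normal or standalone backend st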
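Summary: While PostgreSQL was running, tracing its code showed that processes kept being spawned; each one called InitFileAccess and died after roughly 20 seconds. This repeated over and over, with the process IDs steadily increasing. Tracing further, it appears these processes are created for autovacuum and then exit on their own. Added to the PostgreSQL 9.2 source code: InitFileAccess(void) { fprintf(stderr,"In %s ...by Process %d\n", __FUNCTION__,getpid()); fprintf(stderr,"------------------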
Summary: My personal understanding: in essence it is somewhat like a hash table in Java. In C it exists to get around the fact that arrays cannot grow. Example: how PostgreSQL handles VFDs. Initialization: /* * Virtual File Descriptor array pointer and size. This grows as * needed. 'File' values are indexes into this array. * Note that VfdCache[0] is not a usable VFD, just a list header. */ static Vfd *VfdCache; static S
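A common way to get an array that "grows as needed" in C is to track size and capacity and reallocate when full. The sketch below shows that general technique with capacity doubling; it is not the VFD code (which is truncated in the excerpt), and `DemoArr` and `demo_arr_push` are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical growable array of ints. */
typedef struct DemoArr { int *items; size_t size; size_t cap; } DemoArr;

/* Append v, doubling the capacity when the array is full.
 * Returns 0 on success, -1 if memory could not be obtained. */
int demo_arr_push(DemoArr *a, int v)
{
    if (a->size == a->cap)
    {
        size_t newcap = a->cap ? a->cap * 2 : 4;
        int   *p = realloc(a->items, newcap * sizeof *p);

        if (p == NULL)
            return -1;          /* the old array is still valid */
        a->items = p;
        a->cap = newcap;
    }
    a->items[a->size++] = v;
    return 0;
}
```

Start from `DemoArr a = { NULL, 0, 0 };` and release the storage with `free(a.items)` when done. Since realloc may move the block, callers should hold indexes into the array rather than pointers to its elements, which matches the excerpt's note that 'File' values are indexes.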
Summary: Continuing: /* * estimate_rel_size - estimate # pages and # tuples in a table or index * * We also estimate the fraction of the pages that are marked all-visible in * the visibility map, for use in estimation of index-only scans. * * If attr_widths isn't NULL, it points to the zero-index entry of the * r
Summary: Continuing the analysis with the build_simple_rel function: /* * build_simple_rel * Construct a new RelOptInfo for a base relation or 'other' relation. */ RelOptInfo *build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind) { ... /* Check type of rtable entry */ switch (rte->rtekind) { case RTE_RELAT...
Summary: So, continuing from the previous post, a look at the add_base_rels_to_query function: /* * add_base_rels_to_query * * Scan the query's jointree and create baserel RelOptInfos for all * the base relations (ie, table, subquery, and function RTEs) * appearing in the jointree. * * The initial invocation must pass root->parse->jointree as t...
Summary: Could Path simply mean the physical access path? /* * query_planner * Generate a path (that is, a simplified plan) for a basic query, * which may involve joins but not any fancier features. * * Since query_planner does not handle the toplevel processing (grouping, * sorting, etc) it cannot select the best path by itself...
Summary: Continuing: /*-------------------- * grouping_planner * Perform planning steps related to grouping, aggregation, etc. * This primarily means adding top-level processing to the basic * query plan produced by query_planner. * * tuple_fraction is the fraction of tuples we expect will be retrieved...
Summary: Continuing from the previous post, a further analysis of subquery_planner: /*-------------------- * subquery_planner * Invokes the planner on a subquery. We recurse to here for each * sub-SELECT found in the query tree. * * glob is the global state for the current planner run. * parse is the querytree produced by the parser & rewr...
Summary: Continuing the analysis from the previous post. As noted earlier, actual physical disk access happens while the planner function runs. /***************************************************************************** * * Query optimizer entry point * * To support loadable plugins that monitor or modify planner behavior, * we provide a hook variable that lets a plugin get control before...
Summary: Back to the exec_simple_query function. /* * exec_simple_query * * Execute a "simple Query" protocol message. */ static void exec_simple_query(const char *query_string) { ... start_xact_command(); ... parsetree_list = pg_parse_query(query_string); ... /* * Run through the raw parsetree(s) an...
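Summary: I ran a small experiment today and noticed something odd. I currently have this data: postgres=# select * from tab01; id | val ----+----- 1 | 100 2 | 200 3 | 300 (3 rows) postgres=# By querying the data dictionary I know that the file for tab01 is: /usr/local/pgsql/data/base/12788/16385 Then I restarted the database and opened two psql clients. The first client runs: postgres=# select * from tab01 where id=1; id | val ----+----- 1 | 100 (1 r...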
Summary: Call chain: PortalRun -> PortalRunSelect -> ExecutorRun. ExecutorRun actually goes on to run standard_ExecutorRun -> ExecutePlan: /* ---------------------------------------------------------------- * ExecutorRun * * This is the main routine of the executor module. It accepts * the query descriptor from the...
Summary: PortalRun calls PortalRunSelect; the process, abridged, is as follows: /* * PortalRunSelect * Execute a portal's query in PORTAL_ONE_SELECT mode, and also * when fetching from a completed holdStore in PORTAL_ONE_RETURNING, * PORTAL_ONE_MOD_WITH, and PORTAL_UTIL_SELECT cases. * * This handles simple N-rows-fo...
Summary: As noted earlier, exec_simple_query runs PortalStart and PortalRun. PortalRun is arguably the main event: the actual execution of the SQL happens here. /* * PortalRun * Run a portal's query or queries. * * count <= 0 is interpreted as a no-op: the destination gets started up * and shut down, but nothing else happens. Also, count == FETCH_ALL is * interprete.
Summary: As noted earlier, once PortalStart has settled on the execution strategy it calls ExecutorStart. So what does ExecutorStart actually do? Abridged below: /* ---------------------------------------------------------------- * ExecutorStart * * This routine must be called at the beginning of any execution of any * query plan * * Takes a QueryDesc previously crea...
Summary: After the portal has been defined, PortalStart must be run. Its main job is to settle on the execution strategy and then call ExecutorStart. The code is long, so it is abridged: void PortalStart(Portal portal, ParamListInfo params, int eflags, bool use_active_snapshot) { ... PG_TRY(); { ActivePortal = portal; CurrentResourceOwner = portal->resowner; PortalContext ...
Summary: A look at what happens once the portal has been created (PortalDefineQuery): /* * Create unnamed portal to run the query or queries in. If there * already is one, silently drop it. */ portal = CreatePortal("", true, true); /* Don't display the portal in pg_cursors */ ...
Summary: As noted earlier, the SQL statement is executed inside exec_simple_query. Specifically, a portal is constructed and then PortalStart and PortalRun are run... First, how the portal is constructed. exec_simple_query contains this passage: /* * Create unnamed portal to run the query or queries in. If there * already is one, silently drop it. */ portal = CreateP...
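Summary: I have used a number of open-source packages at work, with some good experiences and some bad ones. To contributors who work on purely open-source projects with no commercial considerations, I am grateful for their hard work, yet I still hope their work can get better. To the various organizations that draw on the open-source community with a commercial aim, I am likewise grateful for their hard work and also hope their work can get better. One thing I want to make clear: the fact that a piece of software is open source, and has not taken money directly out of my pocket, does not mean I may not criticize its flaws and shortcomings. There may be particular cases where certain individuals and organizations deliberately write obscure, hard-to-read documentation for open-source software in order to push customers into buying technical support. When cost-cutting becomes a strong pressure, and customers and development organizations require developers and users to adopt some popular open-source software, the shortcomings of that software need to be taken seriously by all parties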
Summary: In exec_simple_query the code is as follows: /* * exec_simple_query * * Execute a "simple Query" protocol message. */ static void exec_simple_query(const char *query_string) { CommandDest dest = whereToSendOutput; MemoryContext oldcontext; List *parsetree_list; ...
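Summary: If I open a psql window and type in a SQL statement, where in the database is it parsed, and where is it "really" processed? In postgres.c, in the function int PostgresMain(int argc, char *argv[], const char *username): the for loop of PostgresMain calls static void exec_simple_query(const char *query_string), which builds the syntax tree and carries out the SQL processing. The "raw" syntax tree is produced by pg_parse_query; once it exists, the rest of exec_simple_query uses it to execute the statement and access the database. Call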
Summary: A small example. Define a macro: #define mysleep(_sec) fprintf(stderr,"sleep AT line:%d\n",__LINE__);sleep(_sec);fprintf(stderr,"after sleep\n"); Then use it in the program: mysleep(10); .... mysleep(10);
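One caveat with a macro that expands to several statements: under an unbraced if, only the first statement is governed by the condition. A common remedy is to wrap the body in do { ... } while (0). The sketch below is a variant of the macro above written that way; `DEMO_MYSLEEP` and `demo_maybe_sleep` are hypothetical names, and sleep comes from the POSIX header unistd.h.

```c
#include <stdio.h>
#include <unistd.h>

/* The do/while(0) wrapper turns the three statements into a single
 * statement, so the macro is safe under an unbraced if/else. */
#define DEMO_MYSLEEP(sec) \
    do { \
        fprintf(stderr, "sleep AT line:%d\n", __LINE__); \
        sleep(sec); \
        fprintf(stderr, "after sleep\n"); \
    } while (0)

/* Returns 1 if the macro body ran, 0 if it was skipped. */
int demo_maybe_sleep(int enabled)
{
    if (enabled)
        DEMO_MYSLEEP(0);        /* zero seconds: returns immediately */
    else
        return 0;
    return 1;
}
```

With the unwrapped three-statement form, the same if/else would not compile, because the else would no longer be attached to the if.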
Summary: Continuing from the previous example (a small example of calling Bison and Flex together from a program): we need a global variable that stores the syntax-tree structure: [root@lex ~]# cd /soft/total [root@lex total]# ls lexer.l lex.yy.c myparser myparser.c myparser.h parser.y y.tab.c y.tab.h [root@lex total]# cat myparser.h typedef struct ABlock{ int left; int right; }AB; typedef struct MB...
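Summary: Following on from the earlier post "A small example of calling Bison and Flex together from a program", here is something new: http://www.cnblogs.com/gaojian/archive/2013/05/17/3083662.html In real work, things are never as simple as a reverse-Polish calculator that does the computation right inside the parser. Usually you need to build a syntax tree and pass it back out, which is how the various database systems handle SQL statements. The program: [root@lex total]# cat lexer.l %{ #include "y.tab.h" #include <stdio.h>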
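Summary: Many programs found online are a bit complicated for someone meeting Bison and Flex for the first time; the simplest example is a better starting point: http://stackoverflow.com/questions/1920604/how-to-make-yy-input-point-to-a-string-rather-than-stdin-in-lex-yacc-solaris I modified it slightly and describe my own understanding here, also as a note to self. The Flex program: [root@lex total]# cat lexer.l %{ #include "y.tab.h" #include <stdio.h> #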
Summary: Most examples online combine yacc and lex. I wanted one that uses lex alone and can be called from my own main program. The program: step one, write the flex file example.flex. Found online, it counts lines and characters. [root@cop01 tst]# cat example.flex /* name: example.flex */ %option noyywrap %{ int num_lines =0, num_chars=0; %} %% \n ++num_lines; ++num_chars; . ++num_chars; %% ...
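To make the two flex rules concrete, here is a hand-written C function that performs the same counting on an in-memory string: a newline bumps both counters, any other character bumps only the character count. It is a plain-C illustration of what the rules compute, not generated scanner code; `DemoCounts` and `demo_count` are hypothetical names.

```c
/* Hypothetical result type holding both counters. */
typedef struct DemoCounts { int num_lines; int num_chars; } DemoCounts;

/* Count newlines and total characters in a NUL-terminated string. */
DemoCounts demo_count(const char *text)
{
    DemoCounts c = { 0, 0 };

    for (const char *p = text; *p != '\0'; p++)
    {
        if (*p == '\n')
            c.num_lines++;      /* the "\n" rule */
        c.num_chars++;          /* both rules count the character */
    }
    return c;
}
```

The flex version gets the same effect declaratively: each pattern is matched against the input and its action runs on every match.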