Overview how plpgsql function gets executed

postgre version 17(commit hash: a17aa50d67bad4ee39a94988c679d2c2fed0e934)

The SQL

CREATE FUNCTION process_orders()
RETURNS void AS $$
DECLARE 
   c record;
BEGIN
  FOR c IN SELECT * FROM orders LOOP
    RAISE NOTICE 'Processing: % %', c.id, c.total; 
  END LOOP;
END;
$$ LANGUAGE plpgsql;

create table orders(id int, total int);
-- if you don't insert some rows, statements
-- inside the `for` loop will not be handled by plpgsql.
insert into orders values(1, 10);

select process_orders();

Before Execution

On creating function process_orders, only syntax is checked for plpgsql body. If the function does not have any syntax errors(even the table does not exist on creating function), one row for process_orders is inserted into table pg_proc. Here's code path into syntax checking:

ProcessUtilitySlow // step in
CreateFunction
ProcedureCreate
OidFunctionCall1Coll
FunctionCall1Coll
// the following functions are defined inside `plpgsql` extension
plpgsql_validator
plpgsql_compile
do_compile
plpgsql_yyparse
**check_sql_expr**

A note regarding // step in: it suggests that the line following the current line is considered as the child of the current line. In this context, each function, placed on its own line, is executed from within the function listed above it.

plpgsql_yyparse processes plpgsql statements. In short words, it could extract sql related structure from the plpgsql statements. In this context, it extracts three sql expressions,

  1. SELECT * FROM orders
  2. c.id
  3. c.total

All of them are passed to check_sql_expr. If you want to quickly find how plpgsql statements are parsed by bison, you could add code here into pl_gram.y.

On Execution

when seelct process_orders() gets executed, it goes through the normal query processing flow:

  1. analyze
  2. rewrite // ignored here
  3. plan
  4. execute // ignored here

The first 3 steps are processed by SPI API SPI_prepare_extended.

analyze

For the expression c.id and c.total, plpgsql creates two select statements, i.e.

select c.id;
select c.total;

These two expressions are analyzed by function parse_analyze_withcb and finally are passed to function transformColumnRef. In contrast to the typical processing logic, the parsing logic of the two expressions is redirected back to the plpgsql extension through p_post_columnref_hook. In this context, the post hook function is resolve_column_ref inside which c.id gets resolved.

Here's the code path for this phase,

ExecInterpExpr // step in
plpgsql_call_handler
plpgsql_exec_function
exec_toplevel_block
exec_stmt_block
exec_stmts
exec_stmt_fors
exec_run_select
exec_prepare_plan
SPI_prepare_extended
_SPI_prepare_plan
pg_analyze_and_rewrite_withcb
parse_analyze_withcb

plan

After analyzing, plpgsql extension will finally invoke pg_plan_queries to generate the execution plan.

For select * from orders, plpgsql invokes SPI_cursor_open_with_paramlist,

ExecInterpExpr // step in
plpgsql_call_handler
plpgsql_exec_function
exec_toplevel_block
exec_stmt_block
exec_stmts
exec_stmt_fors
exec_run_select
SPI_cursor_open_with_paramlist
SPI_cursor_open_internal
GetCachedPlan
BuildCachedPlan
pg_plan_queries

In this context, params passed to SPI_cursor_open_with_paramlist is NULL.

For c.id and c.total, plpgsql invokes SPI_plan_get_cached_plan.

exec_for_query // step in
exec_stmts
exec_stmt_raise
exec_eval_expr
exec_prepare_plan
exec_simple_check_plan
SPI_plan_get_cached_plan
GetCachedPlan
BuildCachedPlan
pg_plan_queries

posted on 2024-01-30 20:06  winter-loo  阅读(6)  评论(0编辑  收藏  举报

导航