openGauss Source Code Analysis (99)

openGauss Source Code Analysis: SQL Engine Source Code Analysis (14)

(2) Legal joins.

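The number of RelOptInfo combinations makes the search space balloon, so running the final legality check on every pair of RelOptInfos straight away would make the search take too long. This is why the preliminary check and the precise check are performed first: they cut the search time and in effect "prune" the search.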

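The main code for the legality check is in join_is_legal. Its job is to decide whether two RelOptInfos can be joined to produce physical paths, and its input is the two RelOptInfos. For two candidate RelOptInfos, the logical join relationship between them is not yet known: it may be an Inner Join, a Left Join, or a Semi Join, or there may be no legal logical join relationship at all. The join relationship therefore has to be determined at this point, in two main steps.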

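Step 1: Traverse the SpecialJoinInfo entries in the join_info_list list of root and check whether a "legal" SpecialJoinInfo can be found. Every logical join relationship other than Inner Join generates a corresponding SpecialJoinInfo, and the SpecialJoinInfo also records the legal join order.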

Step 2: Check the Lateral relationships in the RelOptInfos to see whether the SpecialJoinInfo that was found satisfies the join order required by the Lateral semantics.
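To make step 1 more concrete, the following is a minimal sketch of the idea under heavy simplification, not the openGauss implementation. Relation sets are modeled as bitmasks, and `ToySpecialJoin`, `ToyJoinDecision` and `toy_join_is_legal` are hypothetical names introduced only for illustration. Step 2 (the Lateral check) and the special cases for unique-ified semi joins are not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model: a set of base relations is a bitmask (bit k = relation k).
using RelSet = std::uint32_t;

enum class JoinKind { Inner, Left, Semi };

// Hypothetical, reduced stand-in for SpecialJoinInfo.
struct ToySpecialJoin {
    RelSet min_lefthand;   // relations that must be on the left side
    RelSet min_righthand;  // relations that must be on the right side
    JoinKind kind;
};

struct ToyJoinDecision {
    bool legal;     // false: the two inputs must not be joined directly
    JoinKind kind;  // logical join type to use when legal
    bool reversed;  // true: rel2 plays the left-hand role
};

static bool is_subset(RelSet a, RelSet b) { return (a & ~b) == 0; }
static bool overlaps(RelSet a, RelSet b) { return (a & b) != 0; }

ToyJoinDecision toy_join_is_legal(RelSet rel1, RelSet rel2,
                                  const std::vector<ToySpecialJoin>& join_info_list)
{
    assert(!overlaps(rel1, rel2));  // the two inputs must be disjoint
    const RelSet joinrelids = rel1 | rel2;
    const ToyJoinDecision illegal = {false, JoinKind::Inner, false};
    // If no special join applies, the pair is joined as a plain inner join.
    ToyJoinDecision match = {true, JoinKind::Inner, false};
    bool found = false;

    for (const ToySpecialJoin& sj : join_info_list) {
        const RelSet both = sj.min_lefthand | sj.min_righthand;
        // Irrelevant unless its right-hand side overlaps the proposed join.
        if (!overlaps(sj.min_righthand, joinrelids)) continue;
        // Irrelevant if the proposed join lies entirely inside the right-hand side.
        if (is_subset(joinrelids, sj.min_righthand)) continue;
        // Irrelevant if it was already performed inside one of the inputs.
        if (is_subset(both, rel1) || is_subset(both, rel2)) continue;

        const bool forward =
            is_subset(sj.min_lefthand, rel1) && is_subset(sj.min_righthand, rel2);
        const bool backward =
            is_subset(sj.min_lefthand, rel2) && is_subset(sj.min_righthand, rel1);
        if (!forward && !backward) return illegal;  // violates the recorded order
        if (found) return illegal;                  // two special joins at once
        match = ToyJoinDecision{true, sj.kind, backward};
        found = true;
    }
    return match;
}
```

With relations A, B, C and a single entry recording "A LEFT JOIN B", this sketch accepts (A, B) and (B, A) as a left join, accepts (A, C) as an inner join, and rejects (C, B) because B would be joined before its left-hand side A is present.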

2) Building join paths

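At this point two RelOptInfos that satisfy the conditions have been selected, so the next step is to build physical join relationships between their paths. There are usually three kinds of physical join path: NestLoop, Merge Join, and Hash Join. They are generated mainly by the sort_inner_and_outer, match_unsorted_outer, and hash_inner_and_outer functions.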

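The sort_inner_and_outer function mainly generates MergeJoin paths. It assumes that the paths of both the inner table and the outer table are unsorted, so both must be sorted explicitly; for each side it is enough to pick the path with the lowest total cost. The match_unsorted_outer function covers the case where the outer table is already sorted: only the inner table needs an explicit sort to produce a MergeJoin path, and the function can also produce NestLoop paths and parameterized paths. The last option, handled by hash_inner_and_outer, is to build a HashJoin path for the two tables, which means building a hash table.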
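As a reminder of why a MergeJoin needs sorted inputs, here is a toy sort-merge equi-join on plain integer keys. It only illustrates the execution idea that sort_inner_and_outer plans for (sort both sides explicitly, then merge); `toy_sort_merge_join` is a hypothetical name and nothing here is taken from the openGauss executor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns every matching (outer key, inner key) pair of an equi-join.
std::vector<std::pair<int, int>> toy_sort_merge_join(std::vector<int> outer,
                                                     std::vector<int> inner)
{
    // Both inputs are assumed unsorted, so both are sorted explicitly.
    std::sort(outer.begin(), outer.end());
    std::sort(inner.begin(), inner.end());

    std::vector<std::pair<int, int>> result;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < outer.size() && j < inner.size()) {
        if (outer[i] < inner[j]) {
            ++i;  // outer key too small, advance the outer side
        } else if (outer[i] > inner[j]) {
            ++j;  // inner key too small, advance the inner side
        } else {
            // Emit every inner row in the current key group for this outer row.
            for (std::size_t k = j; k < inner.size() && inner[k] == outer[i]; ++k) {
                result.emplace_back(outer[i], inner[k]);
            }
            ++i;  // j stays put so a duplicate outer key rescans the same group
        }
    }
    // Sanity check: a join never yields more rows than the cross product.
    assert(result.size() <= outer.size() * inner.size());
    return result;
}
```

If the outer input already arrives in the required order, the first `std::sort` call is the part that can be skipped, which is the situation match_unsorted_outer exploits.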

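To make MergeJoin paths easier to build, the restriction clauses are processed first: the clauses that are usable for a MergeJoin are filtered out (by the select_mergejoin_clauses function), so that both sort_inner_and_outer and match_unsorted_outer can use these mergejoinable join conditions. The code is as follows: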

//Extract the clauses that can be used for a Merge Join
foreach (l, restrictlist) {
    RestrictInfo* restrictinfo = (RestrictInfo*)lfirst(l);

    //For an outer join, ignore clauses that are pushed-down filter conditions
    if (isouterjoin && restrictinfo->is_pushed_down)
        continue;

    //Preliminary test of whether the join clause can be used for a Merge Join
    //restrictinfo->can_join and restrictinfo->mergeopfamilies are both set in distribute_qual_to_rels
    if (!restrictinfo->can_join || restrictinfo->mergeopfamilies == NIL) {
        //A constant clause is not recorded as non-mergejoinable (the FULL JOIN ON FALSE case)
        if (!restrictinfo->clause || !IsA(restrictinfo->clause, Const))
            have_nonmergeable_joinclause = true;
        continue; /* not mergejoinable */
    }

    //Check whether the clause has the form "outer op inner" or "inner op outer"
    if (!clause_sides_match_join(restrictinfo, outerrel, innerrel)) {
        have_nonmergeable_joinclause = true;
        continue; /* no good for these input relations */
    }

    //Update the clause to use the final equivalence classes
    //The pathkeys are "canonicalized", so the clause must be matchable against them
    update_mergeclause_eclasses(root, restrictinfo);
    if (EC_MUST_BE_REDUNDANT(restrictinfo->left_ec) || EC_MUST_BE_REDUNDANT(restrictinfo->right_ec)) {
        have_nonmergeable_joinclause = true;
        continue; /* can't handle redundant eclasses */
    }

    result_list = lappend(result_list, restrictinfo);
}
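The "outer op inner or inner op outer" test in the loop above can be pictured with a small model. This is a sketch under stated assumptions: relation sets are bitmasks, `ToyClause` is a hypothetical reduction of RestrictInfo to the three fields the check touches, and `toy_clause_sides_match_join` only mirrors the idea of clause_sides_match_join rather than reproducing its source.

```cpp
#include <cassert>
#include <cstdint>

using RelSet = std::uint32_t;  // toy bitmask of base relations

// Hypothetical, reduced stand-in for RestrictInfo.
struct ToyClause {
    RelSet left_relids;   // relations referenced by the left operand
    RelSet right_relids;  // relations referenced by the right operand
    bool outer_is_left;   // filled in when the check succeeds
};

static bool is_subset(RelSet a, RelSet b) { return (a & ~b) == 0; }

// True if the clause has the shape "outer op inner" or "inner op outer".
bool toy_clause_sides_match_join(ToyClause& clause, RelSet outerrel, RelSet innerrel)
{
    assert((outerrel & innerrel) == 0);  // join inputs are disjoint
    assert(clause.left_relids != 0 && clause.right_relids != 0);

    if (is_subset(clause.left_relids, outerrel) &&
        is_subset(clause.right_relids, innerrel)) {
        clause.outer_is_left = true;   // "outer op inner"
        return true;
    }
    if (is_subset(clause.left_relids, innerrel) &&
        is_subset(clause.right_relids, outerrel)) {
        clause.outer_is_left = false;  // "inner op outer"
        return true;
    }
    return false;  // an operand mixes both sides or refers to neither
}
```

A clause such as `a.x + b.y = c.z` fails this test when a is on the outer side and b, c are on the inner side, because its left operand draws on both inputs and so cannot be sorted on one side alone.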

(1) The sort_inner_and_outer function.

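The sort_inner_and_outer function is mainly used to generate MergeJoin paths. It explicitly sorts both child RelOptInfos, and only the paths in each child RelOptInfo's cheapest_total_path need to be considered. The pathkeys are generated from the mergejoinable join conditions (those that can be used to generate a Merge Join); the order of the individual pathkeys within them is then adjusted repeatedly to obtain different pathkeys sets, and each ordering determines the innerkeys of the inner table and the outerkeys of the outer table. The code is as follows: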

//Try every pairing of an outer-table path with an inner-table path
foreach (lc1, outerrel->cheapest_total_path) {
    Path* outer_path_orig = (Path*)lfirst(lc1);
    Path* outer_path = NULL;
    j = 0;
    foreach (lc2, innerrel->cheapest_total_path) {
        Path* inner_path = (Path*)lfirst(lc2);
        outer_path = outer_path_orig;

        //Parameterized paths cannot be used to generate a MergeJoin path
        if (PATH_PARAM_BY_REL(outer_path, innerrel) ||
            PATH_PARAM_BY_REL(inner_path, outerrel))
            return;

        //A pair is always accepted if either path is the first cheapest path;
        //otherwise it is skipped unless join_used marks it
        if (outer_path != linitial(outerrel->cheapest_total_path) &&
            inner_path != linitial(innerrel->cheapest_total_path)) {
            if (!join_used[(i - 1) * num_inner + j - 1]) {
                j++;
                continue;
            }
        }

        //Generate unique-ified paths
        jointype = save_jointype;
        if (jointype == JOIN_UNIQUE_OUTER) {
            outer_path = (Path*)create_unique_path(root, outerrel, outer_path, sjinfo);
            jointype = JOIN_INNER;
        } else if (jointype == JOIN_UNIQUE_INNER) {
            inner_path = (Path*)create_unique_path(root, innerrel, inner_path, sjinfo);
            jointype = JOIN_INNER;
        }

        //From the clauses extracted earlier, determine the set of pathkeys usable for MergeJoin paths
        all_pathkeys = select_outer_pathkeys_for_merge(root, mergeclause_list, joinrel);

        //For each pathkey in the set above, try to generate a MergeJoin path
        foreach (l, all_pathkeys) {
            ……
            //Build the pathkeys of the inner table
            innerkeys = make_inner_pathkeys_for_merge(root, cur_mergeclauses, outerkeys);
            //Build the pathkeys that describe the sort order of the join result
            merge_pathkeys = build_join_pathkeys(root, joinrel, jointype, outerkeys);
            //Generate the MergeJoin path from the pathkeys and the inner and outer paths
            try_mergejoin_path(root, ……,innerkeys);
        }
        j++;
    }
    i++;
}
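The reordering of pathkeys described for this function can be sketched as follows. The excerpt above elides the part of the loop that derives outerkeys, so this example rests on an assumption: each pathkey in turn is moved to the front while the others keep their relative order, which is how upstream PostgreSQL's sort_inner_and_outer enumerates orderings. Whether openGauss does exactly the same in the elided lines is not confirmed by the excerpt. `ToyPathKeys` and `toy_enumerate_outer_orderings` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: one string per pathkey, a vector per sort ordering.
using ToyPathKeys = std::vector<std::string>;

// For each pathkey, build an ordering that puts it first and keeps the
// remaining pathkeys in their original relative order.
std::vector<ToyPathKeys> toy_enumerate_outer_orderings(const ToyPathKeys& all_pathkeys)
{
    std::vector<ToyPathKeys> orderings;
    for (std::size_t front = 0; front < all_pathkeys.size(); ++front) {
        ToyPathKeys outerkeys;
        outerkeys.push_back(all_pathkeys[front]);
        for (std::size_t k = 0; k < all_pathkeys.size(); ++k) {
            if (k != front) {
                outerkeys.push_back(all_pathkeys[k]);
            }
        }
        // Every ordering is a permutation of the same pathkeys.
        assert(outerkeys.size() == all_pathkeys.size());
        orderings.push_back(outerkeys);
    }
    return orderings;
}
```

For n pathkeys this yields n orderings rather than all n! permutations, which keeps the number of candidate MergeJoin paths small while still giving each pathkey a chance to be the leading sort column.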

posted @ 2024-04-30 10:35 openGauss-bot
