fxjwind Calcite Analysis - The Volcano Model
Reference: https://matt33.com/2019/03/17/apache-calcite-planner/
Using the Volcano model involves the following steps:
//1. Initialize the planner
VolcanoPlanner planner = new VolcanoPlanner();
//2. Register trait definitions (addRelTraitDef)
planner.addRelTraitDef(ConventionTraitDef.INSTANCE);
planner.addRelTraitDef(RelDistributionTraitDef.INSTANCE);
//3. Add rules (logical to logical)
planner.addRule(FilterJoinRule.FilterIntoJoinRule.FILTER_ON_JOIN);
planner.addRule(ReduceExpressionsRule.PROJECT_INSTANCE);
//4. Add ConverterRules (logical to physical)
planner.addRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE);
planner.addRule(EnumerableRules.ENUMERABLE_SORT_RULE);
//5. Register the RelNode tree via setRoot
planner.setRoot(relNode);
//6. find best plan
relNode = planner.findBestExp();
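In practice, steps 5 and 6 usually look slightly more complete than the snippet above: the root is first rewritten to the target physical convention via changeTraits, otherwise findBestExp cannot end up with a finite-cost plan in that convention. A hedged sketch (EnumerableConvention here is an assumption; use whatever convention your physical rules produce):
// Hedged sketch: ask the planner for a plan in the Enumerable convention.
RelTraitSet desired = relNode.getTraitSet().replace(EnumerableConvention.INSTANCE);
planner.setRoot(planner.changeTraits(relNode, desired));
RelNode best = planner.findBestExp();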
Steps 1 and 2: initialization
addRelTraitDef simply adds the trait definition to the following structure:
/** Holds the currently registered RelTraitDefs. */
private final List<RelTraitDef> traitDefs = new ArrayList<>();
3. Adding rules
public boolean addRule(RelOptRule rule) {
//add the rule to the rule set
final boolean added = ruleSet.add(rule);
mapRuleDescription(rule);
// Each of this rule's operands is an 'entry point' for a rule call.
// Register each operand against all concrete sub-classes that could match
// it.
for (RelOptRuleOperand operand : rule.getOperands()) {
for (Class<? extends RelNode> subClass
: subClasses(operand.getMatchedClass())) {
classOperands.put(subClass, operand);
}
}
// If this is a converter rule, check that it operates on one of the
// kinds of trait we are interested in, and if so, register the rule
// with the trait.
if (rule instanceof ConverterRule) {
ConverterRule converterRule = (ConverterRule) rule;
final RelTrait ruleTrait = converterRule.getInTrait();
final RelTraitDef ruleTraitDef = ruleTrait.getTraitDef();
if (traitDefs.contains(ruleTraitDef)) {
ruleTraitDef.registerConverterRule(this, converterRule);
}
}
return true;
}
a. Updating classOperands
This records which operands (and hence rules) can match a given RelNode class.
It is a multimap: one RelNode class may map to the operands of several rules.
/**
* Operands that apply to a given class of {@link RelNode}.
*
* <p>Any operand can be an 'entry point' to a rule call, when a RelNode is
* registered which matches the operand. This map allows us to narrow down
* operands based on the class of the RelNode.</p>
*/
private final Multimap<Class<? extends RelNode>, RelOptRuleOperand>
classOperands = LinkedListMultimap.create();
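As a quick illustration of the Multimap semantics relied on here (plain Guava, nothing Calcite-specific; the string keys and values are just stand-ins for RelNode classes and operands):
import com.google.common.collect.LinkedListMultimap;
import com.google.common.collect.Multimap;

public class MultimapDemo {
  public static void main(String[] args) {
    // One key keeps all values that were put for it, instead of overwriting.
    Multimap<String, String> classOperands = LinkedListMultimap.create();
    classOperands.put("LogicalFilter", "FilterIntoJoinRule.operand");
    classOperands.put("LogicalFilter", "ReduceExpressionsRule.operand");
    System.out.println(classOperands.get("LogicalFilter")); // prints both entries
  }
}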
First, the rule's operands are retrieved:
/**
* Operand that determines whether a {@link RelOptRule}
* can be applied to a particular expression.
*
* <p>For example, the rule to pull a filter up from the left side of a join
* takes operands: <code>Join(Filter, Any)</code>.</p>
*
* <p>Note that <code>children</code> means different things if it is empty or
* it is <code>null</code>: <code>Join(Filter <b>()</b>, Any)</code> means
* that, to match the rule, <code>Filter</code> must have no operands.</p>
*/
public class RelOptRuleOperand {
//~ Instance fields --------------------------------------------------------
private RelOptRuleOperand parent;
private RelOptRule rule;
private final Predicate<RelNode> predicate;
// REVIEW jvs 29-Aug-2004: some of these are Volcano-specific and should be
// factored out
public int[] solveOrder;
public int ordinalInParent;
public int ordinalInRule;
public final RelTrait trait;
private final Class<? extends RelNode> clazz;
private final ImmutableList<RelOptRuleOperand> children;
getOperands() returns all of the rule's operands, in flattened form.
An operand describes the kind of relational expression a rule can be applied to; it is essentially a pattern over RelNode classes.
For example, for Join(Filter, Any) the flattened operand list contains the Join, Filter and Any operands.
operand.getMatchedClass() returns the RelNode class that the operand matches, e.g. Filter.
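A hedged sketch of how such an operand tree could be declared, using the older static RelOptRule.operand/any builders (deprecated in recent Calcite versions); the Join-over-Filter shape is purely illustrative:
import org.apache.calcite.plan.RelOptRule;
import org.apache.calcite.plan.RelOptRuleOperand;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.core.Filter;
import org.apache.calcite.rel.core.Join;

class OperandSketch {
  // Matches Join(Filter, Any); the rule's flattened operand list then holds three
  // operands, whose getMatchedClass() returns Join.class, Filter.class and RelNode.class.
  static final RelOptRuleOperand JOIN_OPERAND =
      RelOptRule.operand(Join.class,
          RelOptRule.operand(Filter.class, RelOptRule.any()),   // left child must be a Filter
          RelOptRule.operand(RelNode.class, RelOptRule.any())); // right child can be anything
}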
subClasses is the following function:
/** Returns sub-classes of relational expression. */
public Iterable<Class<? extends RelNode>> subClasses(
final Class<? extends RelNode> clazz) {
return Util.filter(classes, c -> {
// RelSubset must be exact type, not subclass
if (c == RelSubset.class) { // for RelSubset the type must match exactly
return c == clazz;
}
return clazz.isAssignableFrom(c); // keep c if it is clazz or a subclass of clazz
});
}
classes is a structure that records the class of every RelNode the planner has seen; it is defined in AbstractRelOptPlanner:
private final Set<Class<? extends RelNode>> classes = new HashSet<>();
At initialization time the abstract base classes are added first:
// Add abstract RelNode classes. No RelNodes will ever be registered with
// these types, but some operands may use them.
classes.add(RelNode.class);
classes.add(RelSubset.class);
Further classes are registered via registerClass, which is called from registerImpl whenever a new RelNode is registered; subClasses then finds which of the classes known to the planner are compatible with the operand's matched class.
In theory, if addRule is called before setRoot, classes only contains the two base classes at this point.
So the logic here is: find which operand classes of the rules can match the RelNode classes present in the plan,
and record that correspondence in classOperands. With this mapping, when we later walk the plan we immediately know which rules might match a given RelNode, which narrows the search space.
Note that what gets recorded is the operand, not the rule; the rule can be obtained from the operand itself.
b. Registering ConverterRules with their RelTraitDef
// If this is a converter rule, check that it operates on one of the
// kinds of trait we are interested in, and if so, register the rule
// with the trait.
if (rule instanceof ConverterRule) {
ConverterRule converterRule = (ConverterRule) rule;
final RelTrait ruleTrait = converterRule.getInTrait();
final RelTraitDef ruleTraitDef = ruleTrait.getTraitDef();
if (traitDefs.contains(ruleTraitDef)) {
ruleTraitDef.registerConverterRule(this, converterRule);
}
}
First, what is a ConverterRule?
My understanding: it converts from one trait to another while preserving semantics, which is why inTrait and outTrait must share the same RelTraitDef.
For example, a collation can only be converted into another collation; the choice of sort order does not affect semantics, and the same holds for distribution.
/**
* Abstract base class for a rule which converts from one calling convention to
* another without changing semantics.
*/
public abstract class ConverterRule
extends RelRule<ConverterRule.Config> {
//~ Instance fields --------------------------------------------------------
private final RelTrait inTrait;
private final RelTrait outTrait;
protected final Convention out;
//~ Constructors -----------------------------------------------------------
/** Creates a <code>ConverterRule</code>. */
protected ConverterRule(Config config) {
super(config);
this.inTrait = Objects.requireNonNull(config.inTrait());
this.outTrait = Objects.requireNonNull(config.outTrait());
// Source and target traits must have same type
assert inTrait.getTraitDef() == outTrait.getTraitDef();
// Most sub-classes are concerned with converting one convention to
// another, and for them, the "out" field is a convenient short-cut.
this.out = outTrait instanceof Convention ? (Convention) outTrait
: null;
}
So the check above simply means: if the ConverterRule's inTrait has a RelTraitDef that was registered with the planner, the rule is additionally registered with that trait definition.
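As a hedged sketch of what such a rule looks like with the Config-based API quoted above (MY_CONVENTION is declared the same way Convention.NONE is, MyPhysicalSort is a hypothetical physical operator, and withConversion/withRuleFactory follow recent Calcite versions, so details may differ in older ones):
import org.apache.calcite.plan.Convention;
import org.apache.calcite.plan.RelTraitSet;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.convert.ConverterRule;
import org.apache.calcite.rel.core.Sort;

class MySortConverterRule extends ConverterRule {
  static final Convention MY_CONVENTION =
      new Convention.Impl("MY_CONVENTION", RelNode.class);

  static final Config CONFIG = Config.INSTANCE
      .withConversion(Sort.class, Convention.NONE, MY_CONVENTION, "MySortConverterRule")
      .withRuleFactory(MySortConverterRule::new);

  MySortConverterRule(Config config) {
    super(config); // the constructor asserts that inTrait and outTrait share one RelTraitDef
  }

  @Override public RelNode convert(RelNode rel) {
    final Sort sort = (Sort) rel;
    // Same logical sort, different convention trait; the input is converted as well.
    final RelTraitSet traits = sort.getTraitSet().replace(out);
    return new MyPhysicalSort(sort.getCluster(), traits,
        convert(sort.getInput(), traits), sort.getCollation());
  }
}
When such a rule is passed to addRule, the check above registers it with ConventionTraitDef, because its inTrait (Convention.NONE) belongs to that trait definition.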
5. SetRoot
public void setRoot(RelNode rel) {
// We've registered all the rules, and therefore RelNode classes,
// we're interested in, and have not yet started calling metadata providers.
// So now is a good time to tell the metadata layer what to expect.
registerMetadataRels();
this.root = registerImpl(rel, null);
if (this.originalRoot == null) {
this.originalRoot = rel;
}
rootConvention = this.root.getConvention(); // the root's trait for the Convention trait definition
ensureRootConverters();
}
The core call is registerImpl.
/**
* Registers a new expression <code>exp</code> and queues up rule matches.
* If <code>set</code> is not null, makes the expression part of that
* equivalence set. If an identical expression is already registered, we
* don't need to register this one and nor should we queue up rule matches.
*
* @param rel relational expression to register. Must be either a
* {@link RelSubset}, or an unregistered {@link RelNode}
* @param set set that rel belongs to, or <code>null</code>
* @return the equivalence-set
*/
private RelSubset registerImpl(
RelNode rel,
RelSet set) {
//set is null on the first call; recursive calls pass in the set the rel should belong to. If rel is an already-registered RelSubset, the two sets are merged.
if (rel instanceof RelSubset) {
return registerSubset(set, (RelSubset) rel);
}
// Ensure that its sub-expressions are registered.
// 1. Recursively register the sub-expressions
rel = rel.onRegister(this);
// 2. Record which rule call produced this RelNode
if (ruleCallStack.isEmpty()) {
provenanceMap.put(rel, Provenance.EMPTY);
} else {
final VolcanoRuleCall ruleCall = ruleCallStack.peek();
provenanceMap.put(
rel,
new RuleProvenance(
ruleCall.rule,
ImmutableList.copyOf(ruleCall.rels),
ruleCall.id));
}
// 3. Register the RelNode's class and traits
registerClass(rel);
registerCount++;
//4. Add the RelNode to a RelSet
final int subsetBeforeCount = set.subsets.size();
RelSubset subset = addRelToSet(rel, set);
final RelNode xx = mapDigestToRel.put(key, rel);
// 5. Update importance
if (rel == this.root) {
ruleQueue.subsetImportances.put(
subset,
1.0); // the root's importance is fixed at 1
}
//link each input's RelSet back to this rel as a parent
for (RelNode input : rel.getInputs()) {
RelSubset childSubset = (RelSubset) input;
childSubset.set.parents.add(rel);
// the RelSubset structure changed, so recompute its importance
ruleQueue.recompute(childSubset);
}
// 6. Fire rules
fireRules(rel, true);
// It's a new subset.
if (set.subsets.size() > subsetBeforeCount) {
fireRules(subset, true);
}
return subset;
}
5.1 rel = rel.onRegister(this)
The purpose of onRegister is to call registerImpl recursively on every node of the RelNode tree; note that because of the recursion, every RelNode ends up being registered.
It takes the RelNode's inputs (bottom-up; for a Join the inputs are its left and right children)
and calls ensureRegistered on each input.
The function finally returns the node itself (possibly a copy) whose inputs have been replaced by registered RelSubsets, so the whole subtree below it is registered as well.
public RelNode onRegister(RelOptPlanner planner) {
List<RelNode> oldInputs = getInputs();
List<RelNode> inputs = new ArrayList<>(oldInputs.size());
for (final RelNode input : oldInputs) {
RelNode e = planner.ensureRegistered(input, null);
inputs.add(e);
}
RelNode r = this;
if (!Util.equalShallow(oldInputs, inputs)) {
r = copy(getTraitSet(), inputs);
}
r.recomputeDigest();
return r;
}
ensureRegistered
getSubset looks the RelNode up in IdentityHashMap<RelNode, RelSubset> mapRel2Subset; because it is an IdentityHashMap, keys are compared by object identity (==) rather than by equals()/hashCode().
public RelSubset ensureRegistered(RelNode rel, RelNode equivRel) {
RelSubset result;
final RelSubset subset = getSubset(rel);
if (subset != null) { // the node already has a subset, i.e. it has been registered before
if (equivRel != null) {
final RelSubset equivSubset = getSubset(equivRel); // if an equivRel is given, fetch its subset
if (subset.set != equivSubset.set) {
merge(equivSubset.set, subset.set); // merge into equivSubset's RelSet
}
}
result = canonize(subset); // follow the RelSet's equivalentSet chain to the leader and return its subset
} else { // no subset yet, i.e. never registered; register it now
result = register(rel, equivRel);
}
return result;
}
public RelSubset register(
RelNode rel,
RelNode equivRel) {
final RelSet set;
if (equivRel == null) {
set = null;
} else { // an equivRel was supplied
equivRel = ensureRegistered(equivRel, null); // make sure the equivalent rel is registered first
set = getSet(equivRel);
}
return registerImpl(rel, set); // register simply delegates to registerImpl, passing along the equivalence set
}
5.2 A series of checks
a. Provenance marks where a RelNode came from: UnknownProvenance (source unknown), DirectProvenance (copied directly from another node), or RuleProvenance (produced by firing a rule).
Here we record which rule call produced this RelNode;
ruleCallStack holds the rule call currently being fired, and if it is non-empty the newly produced RelNode must have been generated by that rule.
if (ruleCallStack.isEmpty()) {
provenanceMap.put(rel, Provenance.EMPTY); // EMPTY, i.e. UnknownProvenance
} else {
final VolcanoRuleCall ruleCall = ruleCallStack.peek();
provenanceMap.put( // associate the RelNode with the rule call
rel,
new RuleProvenance(
ruleCall.rule,
ImmutableList.copyOf(ruleCall.rels),
ruleCall.id));
}
b. Checking whether an equivalent expression already exists
// If it is equivalent to an existing expression, return the set that
// the equivalent expression belongs to.
RelDigest digest = rel.getRelDigest();
RelNode equivExp = mapDigestToRel.get(digest); // is there already a RelNode with the same digest?
if (equivExp == null) {
// do nothing
} else if (equivExp == rel) { // same object: already registered, just return its subset
return getSubset(rel);
} else {
checkPruned(equivExp, rel); // different object: if the equivalent rel has been pruned, prune this one too
RelSet equivSet = getSet(equivExp);
if (equivSet != null) {
return registerSubset(set, getSubset(equivExp)); // merge the two equivalence sets
}
}
c. The Converter check
What is a Converter?
It is a kind of RelNode that does not change semantics, only a physical property, i.e. a trait; for simplicity a converter changes only one trait at a time.
As the Javadoc says, the planner groups RelNodes that are logically equivalent but carry different physical traits into a single RelSet.
That is why we keep seeing RelSet merges above and below: anything logically equivalent may live in the same RelSet.
My understanding is that the ConverterRules discussed above typically produce Converter nodes that perform the actual trait conversion.
/**
 * A relational expression implements the interface <code>Converter</code> to
 * indicate that it converts a physical attribute, or
 * {@link org.apache.calcite.plan.RelTrait trait}, of a relational expression
 * from one value to another.
 *
 * <p>Sometimes this conversion is expensive; for example, to convert a
 * non-distinct to a distinct object stream, we have to clone every object in
 * the input.</p>
 *
 * <p>A converter does not change the logical expression being evaluated; after
 * conversion, the number of rows and the values of those rows will still be the
 * same. By declaring itself to be a converter, a relational expression is
 * telling the planner about this equivalence, and the planner groups
 * expressions which are logically equivalent but have different physical traits
 * into groups called <code>RelSet</code>s.
 *
 * <p>In principle one could devise converters which change multiple traits
 * simultaneously (say change the sort-order and the physical location of a
 * relational expression). In which case, the method {@link #getInputTraits()}
 * would return a {@link org.apache.calcite.plan.RelTraitSet}. But for
 * simplicity, this class only allows one trait to be converted at a
 * time; all other traits are assumed to be preserved.</p>
 */
public interface Converter extends RelNode
// Converters are in the same set as their children.
if (rel instanceof Converter) {
final RelNode input = ((Converter) rel).getInput();
final RelSet childSet = getSet(input);
if ((set != null)
&& (set != childSet)
&& (set.equivalentSet == null)) {
merge(set, childSet); // because a converter does not change the data, the input's set is merged with the current one: they are logically equivalent
// During the mergers, the child set may have changed, and since
// we're not registered yet, we won't have been informed. So
// check whether we are now equivalent to an existing
// expression.
if (fixUpInputs(rel)) {
digest = rel.getRelDigest();
RelNode equivRel = mapDigestToRel.get(digest);
if ((equivRel != rel) && (equivRel != null)) {
// make sure this bad rel didn't get into the
// set in any way (fixupInputs will do this but it
// doesn't know if it should so it does it anyway)
set.obliterateRelNode(rel); //
// There is already an equivalent expression. Use that
// one, and forget about this one.
return getSubset(equivRel); //
}
}
} else {
set = childSet; // if set is null, just use the input's set, since they are logically equivalent
}
}
5.3 Creating a RelSet
If we get here and the RelSet is still null, this equivalence class is appearing for the first time.
A new RelSet is created and added to allSets, a List that exists only for debugging.
I was initially unsure why the equivalentSet chain exists (if two RelSets are equivalent, why not just merge them?); the code comment below suggests the answer: while several sets are being merged at the same time, an obsolete set keeps a forwarding pointer to its "live" replacement, and canonize() follows that chain to the leader.
// Place the expression in the appropriate equivalence set.
if (set == null) {
set = new RelSet( //
nextSetId++,
Util.minus(
RelOptUtil.getVariablesSet(rel),
rel.getVariablesSet()),
RelOptUtil.getVariablesUsed(rel));
this.allSets.add(set); //
}
// Chain to find 'live' equivalent set, just in case several sets are
// merging at the same time.
while (set.equivalentSet != null) { // follow the equivalentSet chain until the live (leader) set is found
set = set.equivalentSet;
}
5.4 registerClass
This registers the RelNode's class and traits into the corresponding structures, recording which RelNode classes and traits the planner has seen.
private final Set<Class<? extends RelNode>> classes = new HashSet<>();
//private final Set<RelTrait> traits = new HashSet<>();
private final Set<Convention> conventions = new HashSet<>();
In older versions the RelTrait itself was registered here; the new version only registers the Convention (Convention being just one kind of trait).
public void registerClass(RelNode node) {
final Class<? extends RelNode> clazz = node.getClass();
if (classes.add(clazz)) {
onNewClass(node);
}
if (conventions.add(node.getConvention())) {
node.getConvention().register(this);
}
}
5.5 addRelToSet
RelSubset subset = addRelToSet(rel, set);
set.add is called first, adding the RelNode to an existing subset or creating a new one.
private RelSubset addRelToSet(RelNode rel, RelSet set) {
RelSubset subset = set.add(rel);
mapRel2Subset.put(rel, subset);
// While a tree of RelNodes is being registered, sometimes nodes' costs
// improve and the subset doesn't hear about it. You can end up with
// a subset with a single rel of cost 99 which thinks its best cost is
// 100. We think this happens because the back-links to parents are
// not established. So, give the subset another change to figure out
// its cost.
final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
subset.propagateCostImprovements(this, mq, rel, new HashSet<>());
return subset;
}
The main job is to record which RelSubset each RelNode belongs to (mapRel2Subset).
Note again that this is an IdentityHashMap, so RelNodes are compared by reference rather than by hashCode/equals; distinct RelNode objects therefore get their own entries.
A given RelNode object belongs to exactly one RelSet, but different RelSets may well contain RelNodes that look the same (e.g. each containing a Join), so identity comparison is exactly what we want.
/**
 * Map each registered expression ({@link RelNode}) to its equivalence set
 * ({@link RelSubset}).
 *
 * <p>We use an {@link IdentityHashMap} to simplify the process of merging
 * {@link RelSet} objects. Most {@link RelNode} objects are identified by
 * their digest, which involves the set that their child relational
 * expressions belong to. If those children belong to the same set, we have
 * to be careful, otherwise it gets incestuous.</p>
 */
private final IdentityHashMap<RelNode, RelSubset> mapRel2Subset =
new IdentityHashMap<>();
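A tiny plain-JDK illustration of the identity-versus-equality point (the string keys are just stand-ins for RelNodes with equal digests):
import java.util.HashMap;
import java.util.IdentityHashMap;

public class IdentityMapDemo {
  public static void main(String[] args) {
    String a = new String("digest");
    String b = new String("digest"); // a.equals(b) is true, but a != b
    HashMap<String, Integer> byEquals = new HashMap<>();
    byEquals.put(a, 1);
    byEquals.put(b, 2);
    IdentityHashMap<String, Integer> byIdentity = new IdentityHashMap<>();
    byIdentity.put(a, 1);
    byIdentity.put(b, 2);
    System.out.println(byEquals.size());   // 1 -- keys compared with equals()
    System.out.println(byIdentity.size()); // 2 -- keys compared with ==
  }
}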
propagateCostImprovements: since the RelSet has changed, a new best cost may exist, so the change is propagated to the other nodes so that they can update their costs and check whether they, too, get a new best cost.
void propagateCostImprovements(VolcanoPlanner planner, RelMetadataQuery mq,
RelNode rel, Set<RelSubset> activeSet) {
Queue<Pair<RelSubset, RelNode>> propagationQueue = new ArrayDeque<>();
for (RelSubset subset : set.subsets) { //
if (rel.getTraitSet().satisfies(subset.traitSet)) { // find the subsets of this set whose traits the rel satisfies and add them to the propagation queue
propagationQueue.offer(Pair.of(subset, rel));
}
}
while (!propagationQueue.isEmpty()) { //
Pair<RelSubset, RelNode> p = propagationQueue.poll();
p.left.propagateCostImprovements0(planner, mq, p.right, activeSet, propagationQueue); //
}
}
As long as the queue is not empty, propagateCostImprovements0 keeps being called; activeSet is passed in to avoid cycles, and propagationQueue is passed in because the improvement also has to be propagated upwards: parent subsets with satisfying traits are pushed onto the same queue.
void propagateCostImprovements0(VolcanoPlanner planner, RelMetadataQuery mq,
RelNode rel, Set<RelSubset> activeSet,
Queue<Pair<RelSubset, RelNode>> propagationQueue) {
if (!activeSet.add(this)) { // already added, so we have a cycle
LOGGER.trace("cyclic: {}", this);
return;
}
try {
RelOptCost cost = planner.getCost(rel, mq); // compute this RelNode's cost via the metadata layer
// Update subset best cost when we find a cheaper rel or the current
// best's cost is changed
if (cost.isLt(bestCost)) { // a new best cost has been found
bestCost = cost;
best = rel;
upperBound = bestCost;
// since best was changed, cached metadata for this subset should be removed
mq.clearCache(this);
// Propagate cost change to parents
for (RelNode parent : getParents()) { // propagate upwards
// removes parent cached metadata since its input was changed
mq.clearCache(parent);
final RelSubset parentSubset = planner.getSubset(parent);
// parent subset will clear its cache in propagateCostImprovements0 method itself
for (RelSubset subset : parentSubset.set.subsets) {
if (parent.getTraitSet().satisfies(subset.traitSet)) { // if the parent satisfies the subset's traits, queue that subset as well
propagationQueue.offer(Pair.of(subset, parent));
}
}
}
}
} finally {
activeSet.remove(this);
}
}
Finally the digest is registered; if it was already present, we simply return.
final RelNode xx = mapDigestToRel.putIfAbsent(digest, rel);
// This relational expression may have been registered while we
// recursively registered its children. If this is the case, we're done.
if (xx != null) {
return subset;
}
===============OLD========================================
5.5 importance
importance expresses a RelSubset's priority: the higher the importance, the earlier it gets optimized.
Inside RuleQueue, the following structure holds each subset's importance:
/** The importance of each subset. */
final Map<RelSubset, Double> subsetImportances = new HashMap<>();
Importance is computed as follows (quoting the Javadoc):
"Computes the importance of a node. Importance is defined as follows:
the root RelSubset has an importance of 1"
It is actually quite simple: roughly, a child's importance is its parent's importance scaled by the child's share of the parent's (cumulative) cost.
For example, if the root's own cost is 3 and its two children have costs 2 and 5, the root's cumulative cost is 10 and its importance is 1,
so the two children get importances 2/10 = 0.2 and 5/10 = 0.5.
In other words, the closer a node is to the top, and the higher its cost, the higher its importance.
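A small worked version of the numbers above (just arithmetic; the formula follows the referenced article: child importance = parent importance x child cost / parent cumulative cost):
// Worked example for the importance numbers quoted above.
double rootImportance = 1.0;
double rootCost = 3 + 2 + 5;                            // self cost 3 plus the two children
double leftImportance = rootImportance * 2 / rootCost;  // 0.2
double rightImportance = rootImportance * 5 / rootCost; // 0.5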
==============================================================
5.6 fireRules
fireRules here normally uses a DeferringRuleCall, so the rule is not executed immediately (that would be inefficient); it is executed only when it is actually needed.
void fireRules(RelNode rel) {
for (RelOptRuleOperand operand : classOperands.get(rel.getClass())) {
if (operand.matches(rel)) {
final VolcanoRuleCall ruleCall;
ruleCall = new DeferringRuleCall(this, operand);
ruleCall.match(rel);
}
}
}
classOperands stores, for each RelNode class, all the rule operands that can match it.
classOperands only tells us that an operand matches this RelNode; whether the RelNode's subtree matches the whole rule still has to be checked.
As you can see, the match then recurses; the matching logic is long, so we will not walk through it here.
Each successful match advances solve by one, and when solve == operands.size() the whole rule has been matched,
at which point onMatch is called.
/**
* Applies this rule, with a given relational expression in the first slot.
*/
void match(RelNode rel) {
assert getOperand0().matches(rel) : "precondition";
final int solve = 0;
int operandOrdinal = getOperand0().solveOrder[solve];
this.rels[operandOrdinal] = rel;
matchRecurse(solve + 1);
}
/**
* Recursively matches operands above a given solve order.
*
* @param solve Solve order of operand (> 0 and ≤ the operand count)
*/
private void matchRecurse(int solve) {
assert solve > 0;
assert solve <= rule.operands.size();
final List<RelOptRuleOperand> operands = getRule().operands;
if (solve == operands.size()) {
// We have matched all operands. Now ask the rule whether it
// matches; this gives the rule chance to apply side-conditions.
// If the side-conditions are satisfied, we have a match.
if (getRule().matches(this)) {
onMatch();
}
} else {......}}
For DeferringRuleCall,
the onMatch logic simply wraps everything into a VolcanoRuleMatch and drops it into the RuleQueue;
the rule's own onMatch is not actually executed, which is what "deferring" means.
In fact the concepts of RuleQueue, RuleMatch and importance all exist to implement this deferral; if rules were fired directly, the mechanism would be much simpler.
/**
* Rather than invoking the rule (as the base method does), creates a
* {@link VolcanoRuleMatch} which can be invoked later.
*/
protected void onMatch() {
final VolcanoRuleMatch match =
new VolcanoRuleMatch(
volcanoPlanner,
getOperand0(),
rels,
nodeInputs);
volcanoPlanner.ruleDriver.getRuleQueue().addMatch(match);
}
}
6. findBestExp
The new version of findBestExp:
/**
* Finds the most efficient expression to implement the query given via
* {@link org.apache.calcite.plan.RelOptPlanner#setRoot(org.apache.calcite.rel.RelNode)}.
*
* @return the most efficient RelNode tree found for implementing the given
* query
*/
public RelNode findBestExp() {
ensureRootConverters();
registerMaterializations();
ruleDriver.drive(); //
RelNode cheapest = root.buildCheapestPlan(this); //
return cheapest;
}
6.1 RuleDriver
RuleDriver is abstracted out so that different search algorithms can be implemented easily.
Its core pieces are the RuleQueue and the drive method.
/**
* A rule driver applies rules with designed algorithms.
*/
interface RuleDriver {
/**
* Gets the rule queue.
*/
RuleQueue getRuleQueue();
/**
* Apply rules.
*/
void drive();
/**
* Callback when new RelNodes are added into RelSet.
* @param rel the new RelNode
* @param subset subset to add
*/
void onProduce(RelNode rel, RelSubset subset);
/**
* Callback when RelSets are merged.
* @param set the merged result set
*/
void onSetMerged(RelSet set);
/**
* Clear this RuleDriver.
*/
void clear();
}
Two implementations are provided:
IterativeRuleDriver, which simply keeps applying rule matches until the queue is exhausted, and
TopDownRuleDriver, which walks the RelNodes top-down and fires the corresponding matches from the queue.
Let's look at the drive implementation.
The logic is to push tasks onto the task stack and then call task.perform.
tasks is a stack, so a task pushed earlier is executed later; keep that in mind (see the small demo below).
private Stack<Task> tasks = new Stack<>();
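A tiny plain-JDK demo of that LIFO ordering (the task names are just labels): the task pushed last is performed first, which is why the OptimizeMExpr tasks pushed by exploreMaterializationRoots run before the root OptimizeGroup.
import java.util.Stack;

public class TaskStackDemo {
  public static void main(String[] args) {
    Stack<String> tasks = new Stack<>();
    tasks.push("OptimizeGroup(root)");           // pushed first, performed last
    tasks.push("OptimizeMExpr(mv exploration)"); // pushed later, performed first
    while (!tasks.isEmpty()) {
      System.out.println(tasks.pop());
    }
  }
}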
By default an OptimizeGroup task for the root is pushed first:
@Override public void drive() {
TaskDescriptor description = new TaskDescriptor();
// Starting from the root's OptimizeGroup task.
tasks.push(new OptimizeGroup(planner.root, planner.infCost));
// ensure materialized view roots get explored.
// Note that implementation rules or enforcement rules are not applied
// unless the mv is matched
exploreMaterializationRoots();
try {
// Iterates until the root is fully optimized
while (!tasks.isEmpty()) {
Task task = tasks.pop();
description.log(task);
task.perform(); //
}
} catch (VolcanoTimeoutException ex) {
LOGGER.warn("Volcano planning times out, cancels the subsequent optimization.");
}
}
exploreMaterializationRoots pushes OptimizeMExpr tasks;
planner.explorationRoots means "extra roots for explorations".
private void exploreMaterializationRoots() {
for (RelSubset extraRoot : planner.explorationRoots) {
RelSet rootSet = VolcanoPlanner.equivRoot(extraRoot.set);
if (rootSet == planner.root.set) {
continue;
}
for (RelNode rel : extraRoot.set.rels) {
if (planner.isLogical(rel)) {
tasks.push(new OptimizeMExpr(rel, extraRoot, true));
}
}
}
}
Because of the stack order, those OptimizeMExpr tasks are executed before the root OptimizeGroup; we will look at that task below.
Now the OptimizeGroup task. Task itself is an interface whose key method is perform.
OptimizeGroup.perform mainly produces three kinds of tasks:
/**
* Optimize a RelSubset.
* It schedule optimization tasks for RelNodes in the RelSet.
*/
private class OptimizeGroup implements Task {
private final RelSubset group;
private RelOptCost upperBound;
@Override public void perform() {
RelOptCost winner = group.getWinnerCost(); // checks taskState; if the group is already fully optimized, return its best cost
if (winner != null) {
return;
}
if (group.taskState != null && upperBound.isLe(group.upperBound)) {
// either this group failed to optimize before or it is a ring (already attempted before?)
return;
}
group.startOptimize(upperBound); // set the state to optimizing
// cannot decide an actual lower bound before MExpr are fully explored
// so delay the lower bound checking
// a gate keeper to update context
tasks.push(new GroupOptimized(group)); // push a GroupOptimized task, which marks the RelSubset optimized; it is pushed first, so it runs last
// optimize mExprs in group
List<RelNode> physicals = new ArrayList<>();
for (RelNode rel : group.set.rels) {
if (planner.isLogical(rel)) {
tasks.push(new OptimizeMExpr(rel, group, false)); // for a logical node, push an OptimizeMExpr task
} else if (rel.isEnforcer()) {
// Enforcers have lower priority than other physical nodes
physicals.add(0, rel);
} else {
physicals.add(rel);
}
}
// always apply O_INPUTS first so as to get an valid upper bound
for (RelNode rel : physicals) {
Task task = getOptimizeInputTask(rel, group); // for a physical node, push an optimize-input task
if (task != null) {
tasks.add(task);
}
}
}
Let's look at these three kinds of tasks in turn.
getOptimizeInputTask is consulted first; it decides how a physical node should be optimized.
It in turn produces one of three sub-tasks:
// Decide how to optimize a physical node.
private Task getOptimizeInputTask(RelNode rel, RelSubset group) {
boolean unProcess = false;
for (RelNode input : rel.getInputs()) { // check whether any input has not been optimized yet
RelOptCost winner = ((RelSubset) input).getWinnerCost(); // by looking at each input's state
if (winner == null) {
unProcess = true;
break;
}
}
// If the inputs are all processed, only DeriveTrait is required.
if (!unProcess) {
return new DeriveTrait(rel, group); // all inputs optimized: push a DeriveTrait task (apply enforcement rules)
}
// If part of the inputs are not optimized, schedule for the node an OptimizeInput task,
// which tried to optimize the inputs first and derive traits for further execution.
if (rel.getInputs().size() == 1) {
return new OptimizeInput1(rel, group); // a single input that is not yet optimized
}
return new OptimizeInputs(rel, group); // several inputs not yet optimized
}
We will not expand OptimizeInputs here; it mainly pushes the following two tasks, at which point the recursion reaches OptimizeGroup again:
tasks.push(new CheckInput(null, mExpr, input, 0, upperForInput));
tasks.push(new OptimizeGroup(input, upperForInput));
Next comes OptimizeMExpr, which is the main task:
/**
* Optimize a logical node, including exploring its input and applying rules for it.
*/
private class OptimizeMExpr implements Task {
private final RelNode mExpr;
private final RelSubset group;
// when true, only apply transformation rules for mExpr
private final boolean explore;
OptimizeMExpr(RelNode mExpr,
RelSubset group, boolean explore) {
this.mExpr = mExpr;
this.group = group;
this.explore = explore;
}
@Override public void perform() {
if (explore && group.isExplored()) {
return;
}
// 1. explore inputs
// 2. apply other rules
tasks.push(new ApplyRules(mExpr, group, explore));
for (int i = mExpr.getInputs().size() - 1; i >= 0; --i) {
tasks.push(new ExploreInput(mExpr, i));
}
}
It mainly pushes two kinds of tasks.
First, for each input it pushes an ExploreInput task,
whose main logic is to run OptimizeMExpr (in explore mode) on the input's logical rels and then mark the group as explored.
/**
* Explore an input for a RelNode.
*/
private class ExploreInput implements Task {
private final RelSubset group;
private final RelNode parent;
private final int inputOrdinal;
@Override public void perform() {
if (!group.explore()) { // already explored: return
return;
}
tasks.push(new EnsureGroupExplored(group, parent, inputOrdinal)); //set explored
for (RelNode rel : group.set.rels) {
if (planner.isLogical(rel)) {
tasks.push(new OptimizeMExpr(rel, group, true)); //
}
}
}
Then comes ApplyRules:
/**
* Extract rule matches from rule queue and add them to task stack.
*/
private class ApplyRules implements Task {
private final RelNode mExpr;
private final RelSubset group;
private final boolean exploring;
@Override public void perform() {
Pair<RelNode, Predicate<VolcanoRuleMatch>> category =
exploring ? Pair.of(mExpr, planner::isTransformationRule)
: Pair.of(mExpr, m -> true); // the Predicate (single argument, boolean result) filters matches: when exploring, only transformation rules pass; otherwise everything passes
VolcanoRuleMatch match = ruleQueue.popMatch(category); //
while (match != null) {
tasks.push(new ApplyRule(match, group, exploring)); //
match = ruleQueue.popMatch(category); //
}
}
The logic is to keep popping matches from the ruleQueue and pushing ApplyRule tasks until nothing more can be popped.
The popMatch implementation:
public VolcanoRuleMatch popMatch(Pair<RelNode, Predicate<VolcanoRuleMatch>> category) {
List<VolcanoRuleMatch> queue = matches.get(category.left); // matches is a Map<RelNode, List<VolcanoRuleMatch>>; fetch all matches for this RelNode
if (queue == null) {
return null;
}
Iterator<VolcanoRuleMatch> iterator = queue.iterator();
while (iterator.hasNext()) {
VolcanoRuleMatch next = iterator.next();
if (category.right != null && !category.right.test(next)) { // check with the predicate; skip the match if it does not pass
continue;
}
iterator.remove(); // take the item out of the queue
if (!skipMatch(next)) { // if this match has not been pruned
return next;
}
}
return null;
}
Finally, the ApplyRule task:
/**
* Apply a rule match.
*/
private class ApplyRule implements GeneratorTask {
private final VolcanoRuleMatch match;
private final RelSubset group;
private final boolean exploring;
@Override public void perform() {
applyGenerator(this, match::onMatch); //
}
private void applyGenerator(GeneratorTask task, Procedure proc) {
GeneratorTask applying = this.applying; // save the task that is currently applying
this.applying = task;
try {
proc.exec(); // run the procedure, i.e. the rule's onMatch
} finally {
this.applying = applying; // restore the previously applying task
}
}
===============================The old version of findBestExp============================================
public RelNode findBestExp() {
int cumulativeTicks = 0; // total number of ticks; one tick is one optimization step, i.e. firing one rule match
//this for loop effectively runs only once, because rule matches are only added for the OPTIMIZE phase; the other phases are empty
//in RuleQueue.addMatch, phaseRuleSet != ALL_RULES filters matches out of the other phases
for (VolcanoPlannerPhase phase : VolcanoPlannerPhase.values()) {
setInitialImportance(); // initialize importance
RelOptCost targetCost = costFactory.makeHugeCost(); // target cost, initially huge
int tick = 0; // if the for loop runs only once, this equals cumulativeTicks
int firstFiniteTick = -1; // tick at which the first implementable plan was found
int giveUpTick = Integer.MAX_VALUE; // tick at which to give up optimizing
while (true) {
++tick; // start one more optimization step
++cumulativeTicks;
if (root.bestCost.isLe(targetCost)) { // bestCost <= targetCost: an implementable plan has been found
if (firstFiniteTick < 0) { // the first time a plan has been found
firstFiniteTick = cumulativeTicks; // record it
clearImportanceBoost(); // clear the importance boost; RelSubset has a boolean field 'boosted' indicating whether it was boosted
}
if (ambitious) {
// keep trying to find an even better plan
targetCost = root.bestCost.multiplyBy(0.9); // lower the target cost a bit
//if impatient, giveUpTick needs to be set
//giveUpTick starts at MAX_VALUE and is only set once a plan has been found
if (impatient) {
if (firstFiniteTick < 10) { // the first plan was found within 10 ticks
//give up if no better plan is found within another 25 ticks
giveUpTick = cumulativeTicks + 25;
} else {
//for more complex plans, allow proportionally more ticks
giveUpTick =
cumulativeTicks
+ Math.max(firstFiniteTick / 10, 25);
}
}
} else {
break; // not ambitious: any usable plan is good enough, stop
}
} else if (cumulativeTicks > giveUpTick) { // give up optimizing
// We haven't made progress recently. Take the current best.
break;
} else if (root.bestCost.isInfinite() && ((tick % 10) == 0)) {
//every ten ticks, if no usable plan has been found yet
//(bestCost starts out infinite)
injectImportanceBoost(); // boost the importance of some RelSubsets to bring the cost down faster
}
VolcanoRuleMatch match = ruleQueue.popMatch(phase); // pop the match with the highest importance from the RuleQueue
if (match == null) {
break;
}
match.onMatch(); // fire the match
// The root may have been merged with another
// subset. Find the new root subset.
root = canonize(root);
}
ruleQueue.phaseCompleted(phase);
}
RelNode cheapest = root.buildCheapestPlan(this);
return cheapest;
}
injectImportanceBoost
It boosts the importance of RelSubsets that contain only Convention.NONE rels, so that those subsets get optimized first.
/**
* Finds RelSubsets in the plan that contain only rels of
* {@link Convention#NONE} and boosts their importance by 25%.
*/
private void injectImportanceBoost() {
final Set<RelSubset> requireBoost = new HashSet<>();
SUBSET_LOOP:
for (RelSubset subset : ruleQueue.subsetImportances.keySet()) {
for (RelNode rel : subset.getRels()) {
if (rel.getConvention() != Convention.NONE) {
continue SUBSET_LOOP;
}
}
requireBoost.add(subset);
}
ruleQueue.boostImportance(requireBoost, 1.25);
}
Convention.NONE rels all have infinite cost, so optimizing them first brings the overall cost down more effectively.
public interface Convention extends RelTrait {
/**
* Convention that for a relational expression that does not support any
* convention. It is not implementable, and has to be transformed to
* something else in order to be implemented.
*
* <p>Relational expressions generally start off in this form.</p>
*
* <p>Such expressions always have infinite cost.</p>
*/
Convention NONE = new Impl("NONE", RelNode.class);
PopMatch
It finds the match with the highest importance and returns it.
/**
* Removes the rule match with the highest importance, and returns it.
*
* <p>Returns {@code null} if there are no more matches.</p>
*
* <p>Note that the VolcanoPlanner may still decide to reject rule matches
* which have become invalid, say if one of their operands belongs to an
* obsolete set or has importance=0.
*
* @throws java.lang.AssertionError if this method is called with a phase
* previously marked as completed via
* {@link #phaseCompleted(VolcanoPlannerPhase)}.
*/
VolcanoRuleMatch popMatch(VolcanoPlannerPhase phase) {
PhaseMatchList phaseMatchList = matchListMap.get(phase);
final List<VolcanoRuleMatch> matchList = phaseMatchList.list;
VolcanoRuleMatch match;
for (;;) {
if (matchList.isEmpty()) {
return null;
}
if (LOGGER.isTraceEnabled()) {
//...
} else {
match = null;
int bestPos = -1;
int i = -1;
//find the match with the highest importance
for (VolcanoRuleMatch match2 : matchList) {
++i;
if (match == null
|| MATCH_COMPARATOR.compare(match2, match) < 0) {
bestPos = i;
match = match2;
}
}
match = matchList.remove(bestPos);
}
if (skipMatch(match)) {
LOGGER.debug("Skip match: {}", match);
} else {
break;
}
}
// A rule match's digest is composed of the operand RelNodes' digests,
// which may have changed if sets have merged since the rule match was
// enqueued.
match.recomputeDigest();
phaseMatchList.matchMap.remove(
planner.getSubset(match.rels[0]), match);
return match;
}
onMatch
/**
* Called when all operands have matched.
*/
protected void onMatch() {
volcanoPlanner.ruleCallStack.push(this);
try {
getRule().onMatch(this);
} finally {
volcanoPlanner.ruleCallStack.pop();
}
}
ruleCallStack records the rule call that is currently executing;
eventually the concrete rule's onMatch is invoked, which performs the actual transformation.
========================================================================================
6.2 buildCheapestPlan
/**
* Recursively builds a tree consisting of the cheapest plan at each node.
*/
RelNode buildCheapestPlan(VolcanoPlanner planner) {
CheapestPlanReplacer replacer = new CheapestPlanReplacer(planner);
final RelNode cheapest = replacer.visit(this, -1, null);
return cheapest;
}
The logic is fairly simple: walk the tree of RelSubsets and, from top to bottom, pick each subset's best RelNode to form the new tree.
/**
* Visitor which walks over a tree of {@link RelSet}s, replacing each node
* with the cheapest implementation of the expression.
*/
static class CheapestPlanReplacer {
VolcanoPlanner planner;
CheapestPlanReplacer(VolcanoPlanner planner) {
super();
this.planner = planner;
}
public RelNode visit(
RelNode p,
int ordinal,
RelNode parent) {
if (p instanceof RelSubset) {
RelSubset subset = (RelSubset) p;
RelNode cheapest = subset.best; // take the subset's best rel
p = cheapest; // and substitute it for the subset
}
List<RelNode> oldInputs = p.getInputs();
List<RelNode> inputs = new ArrayList<>();
for (int i = 0; i < oldInputs.size(); i++) {
RelNode oldInput = oldInputs.get(i);
RelNode input = visit(oldInput, i, p); // recursively visit the inputs
inputs.add(input); // collect the new input
}
if (!inputs.equals(oldInputs)) {
final RelNode pOld = p;
p = p.copy(p.getTraitSet(), inputs); // create a new node over the new inputs
planner.provenanceMap.put(
p, new VolcanoPlanner.DirectProvenance(pOld));
}
return p;
}
}
}
How does bestCost change?
Each RelSubset records
its bestCost and its best plan:
/**
* cost of best known plan (it may have improved since)
*/
RelOptCost bestCost;
/**
* The set this subset belongs to.
*/
final RelSet set;
/**
* best known plan
*/
RelNode best;
Initialization
When a RelSubset is created, computeBestCost is executed:
private void computeBestCost(RelOptPlanner planner) {
bestCost = planner.getCostFactory().makeInfiniteCost(); // bestCost is initialized to Double.POSITIVE_INFINITY
final RelMetadataQuery mq = getCluster().getMetadataQuery();
for (RelNode rel : getRels()) {
final RelOptCost cost = planner.getCost(rel, mq);
if (cost.isLt(bestCost)) {
bestCost = cost;
best = rel;
}
}
}
getCost
public RelOptCost getCost(RelNode rel, RelMetadataQuery mq) {
if (rel instanceof RelSubset) {
return ((RelSubset) rel).bestCost; // for a RelSubset return its bestCost directly; thanks to dynamic programming we reuse earlier results instead of recomputing
}
if (noneConventionHasInfiniteCost // Convention.NONE has infinite cost, return it directly
&& rel.getTraitSet().getTrait(ConventionTraitDef.INSTANCE) == Convention.NONE) {
return costFactory.makeInfiniteCost();
}
RelOptCost cost = mq.getNonCumulativeCost(rel); // compute this node's own cost
if (!zeroCost.isLt(cost)) {
// cost must be positive, so nudge it
cost = costFactory.makeTinyCost(); // if the computed cost is not positive, substitute a tiny cost (1.0)
}
for (RelNode input : rel.getInputs()) {
cost = cost.plus(getCost(input, mq)); // recursively accumulate the cost of the whole subtree up to this node
}
return cost;
}
getNonCumulativeCost
/**
* Estimates the cost of executing a relational expression, not counting the
* cost of its inputs. (However, the non-cumulative cost is still usually
* dependent on the row counts of the inputs.) The default implementation
* for this query asks the rel itself via {@link RelNode#computeSelfCost},
* but metadata providers can override this with their own cost models.
*
* @return estimated cost, or null if no reliable estimate can be
* determined
*/
RelOptCost getNonCumulativeCost();
/** Handler API. */
interface Handler extends MetadataHandler<NonCumulativeCost> {
RelOptCost getNonCumulativeCost(RelNode r, RelMetadataQuery mq);
}
getNonCumulativeCost ultimately calls RelNode#computeSelfCost.
That is an abstract method whose implementation differs per RelNode; here is the relatively simple Filter implementation:
@Override public RelOptCost computeSelfCost(RelOptPlanner planner,
RelMetadataQuery mq) {
double dRows = mq.getRowCount(this);
double dCpu = mq.getRowCount(getInput());
double dIo = 0;
return planner.getCostFactory().makeCost(dRows, dCpu, dIo);
}
This implementation simply expresses the cost in terms of row counts: for example, with 1,000 input rows and 100 output rows the cost would be (rows = 100, cpu = 1,000, io = 0).
makeCost just wraps those numbers into a VolcanoCost object.
When nodes change
Wherever a new RelNode is produced, register, ensureRegistered or registerImpl is called to register it.
public RelSubset register(
RelNode rel,
RelNode equivRel) {
final RelSet set;
if (equivRel == null) {
set = null;
} else {
set = getSet(equivRel);
}
final RelSubset subset = registerImpl(rel, set);
return subset;
}
public RelSubset ensureRegistered(RelNode rel, RelNode equivRel) {
final RelSubset subset = getSubset(rel);
if (subset != null) {
if (equivRel != null) {
final RelSubset equivSubset = getSubset(equivRel);
if (subset.set != equivSubset.set) {
merge(equivSubset.set, subset.set);
}
}
return subset;
} else {
return register(rel, equivRel);
}
}
registerImpl calls addRelToSet (registerImpl itself was shown earlier):
private RelSubset addRelToSet(RelNode rel, RelSet set) {
RelSubset subset = set.add(rel);
mapRel2Subset.put(rel, subset);
// While a tree of RelNodes is being registered, sometimes nodes' costs
// improve and the subset doesn't hear about it. You can end up with
// a subset with a single rel of cost 99 which thinks its best cost is
// 100. We think this happens because the back-links to parents are
// not established. So, give the subset another change to figure out
// its cost.
final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
subset.propagateCostImprovements(this, mq, rel, new HashSet<>());
return subset;
}
propagateCostImprovements
/**
* Checks whether a relexp has made its subset cheaper, and if it so,
* recursively checks whether that subset's parents have gotten cheaper.
*
* @param planner Planner
* @param mq Metadata query
* @param rel Relational expression whose cost has improved
* @param activeSet Set of active subsets, for cycle detection
*/
void propagateCostImprovements(VolcanoPlanner planner, RelMetadataQuery mq,
RelNode rel, Set<RelSubset> activeSet) {
for (RelSubset subset : set.subsets) {
if (rel.getTraitSet().satisfies(subset.traitSet)) {
subset.propagateCostImprovements0(planner, mq, rel, activeSet);
}
}
}
void propagateCostImprovements0(VolcanoPlanner planner, RelMetadataQuery mq,
RelNode rel, Set<RelSubset> activeSet) {
++timestamp;
if (!activeSet.add(this)) { // a cycle has been detected
// This subset is already in the chain being propagated to. This
// means that the graph is cyclic, and therefore the cost of this
// relational expression - not this subset - must be infinite.
LOGGER.trace("cyclic: {}", this);
return;
}
try {
final RelOptCost cost = planner.getCost(rel, mq); // compute the cost
if (cost.isLt(bestCost)) {
bestCost = cost;
best = rel;
// Lower cost means lower importance. Other nodes will change
// too, but we'll get to them later.
planner.ruleQueue.recompute(this); // the cost changed, so the importance must be recomputed
//recursively propagate the improvement to the parents
for (RelNode parent : getParents()) {
final RelSubset parentSubset = planner.getSubset(parent);
parentSubset.propagateCostImprovements(planner, mq, parent,
activeSet);
}
planner.checkForSatisfiedConverters(set, rel);
}
} finally {
activeSet.remove(this);
}
}