Analysis of Execution Plan Optimization in the Hive QL Module
Transform
The base class for query optimizations. Each optimization is implemented by a subclass that overrides the transform(ParseContext pctx) method.
/**
 * Optimizer interface. All the rule-based optimizations implement this
 * interface. All the transformations are invoked sequentially. They take the
 * current parse context (which contains the operator tree among other things),
 * perform all the optimizations, and then return the updated parse context.
 */
public abstract class Transform {
  /**
   * All transformation steps implement this interface.
   *
   * @param pctx
   *          input parse context
   * @return ParseContext
   * @throws SemanticException
   */
  public abstract ParseContext transform(ParseContext pctx) throws SemanticException;

  public void beginPerfLogging() {
    PerfLogger perfLogger = SessionState.getPerfLogger();
    perfLogger.perfLogBegin(this.getClass().getName(), PerfLogger.OPTIMIZER);
  }

  public void endPerfLogging() {
    PerfLogger perfLogger = SessionState.getPerfLogger();
    perfLogger.perfLogEnd(this.getClass().getName(), PerfLogger.OPTIMIZER);
  }

  public void endPerfLogging(String additionalInfo) {
    PerfLogger perfLogger = SessionState.getPerfLogger();
    perfLogger.perfLogEnd(this.getClass().getName(), PerfLogger.OPTIMIZER, additionalInfo);
  }
}
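To make the contract above concrete, here is a minimal sketch of how a rule might subclass Transform and how a driver could invoke the rules one after another. It does not use the real Hive classes: ParseContext, SemanticException and Transform below are reduced stand-ins so the example compiles on its own, and RemoveNoopTransform and OptimizerSketch are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-in for Hive's ParseContext: here it only carries a flat list of operator names.
class ParseContext {
    final List<String> operators = new ArrayList<>();
}

// Stand-in for Hive's SemanticException.
class SemanticException extends Exception {
    SemanticException(String message) {
        super(message);
    }
}

// Reduced copy of the Transform contract (perf-logging helpers omitted).
abstract class Transform {
    public abstract ParseContext transform(ParseContext pctx) throws SemanticException;
}

// Hypothetical rule: removes every operator named "NOOP" from the plan.
class RemoveNoopTransform extends Transform {
    @Override
    public ParseContext transform(ParseContext pctx) throws SemanticException {
        if (pctx == null) {
            throw new SemanticException("parse context must not be null");
        }
        pctx.operators.removeIf("NOOP"::equals);
        return pctx;
    }
}

// Hypothetical driver: applies the transformations sequentially,
// feeding each one the parse context returned by the previous one.
class OptimizerSketch {
    static ParseContext optimize(ParseContext pctx, List<Transform> transforms)
            throws SemanticException {
        for (Transform t : transforms) {
            pctx = t.transform(pctx);
        }
        return pctx;
    }

    public static void main(String[] args) throws SemanticException {
        ParseContext pctx = new ParseContext();
        pctx.operators.addAll(Arrays.asList("TS", "NOOP", "SEL", "FS"));
        List<Transform> rules = Collections.<Transform>singletonList(new RemoveNoopTransform());
        System.out.println(optimize(pctx, rules).operators); // prints [TS, SEL, FS]
    }
}
```

The point of the sketch is the shape of the contract: each rule receives the current parse context and returns the (possibly modified) context, so rules can be chained without knowing about each other.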
SimpleFetchOptimizer
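Inspects the execution plan and decides, based on the data volume and on whether operations such as group by are required, whether the data can be returned directly from HDFS, which avoids running a MapReduce job.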
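The decision described above can be illustrated with a small sketch. This is not the logic of SimpleFetchOptimizer itself, whose checks are more involved; QuerySummary, FetchDecisionSketch and canConvertToFetch are hypothetical names, and the size limit is passed in as a plain parameter (in Hive the corresponding setting is hive.fetch.task.conversion.threshold).

```java
// Hypothetical summary of the facts the decision depends on; not Hive's real data model.
class QuerySummary {
    final long inputBytes;     // total size of the table/partition data to read
    final boolean hasGroupBy;  // stands in for group by and similar operations

    QuerySummary(long inputBytes, boolean hasGroupBy) {
        this.inputBytes = inputBytes;
        this.hasGroupBy = hasGroupBy;
    }
}

class FetchDecisionSketch {
    // Returns true when the query could be served by reading rows directly
    // from storage instead of launching a MapReduce job.
    static boolean canConvertToFetch(QuerySummary q, long thresholdBytes) {
        if (q.hasGroupBy) {
            return false; // aggregation needs a real job
        }
        return q.inputBytes <= thresholdBytes; // only small inputs are fetched directly
    }

    public static void main(String[] args) {
        long threshold = 1L << 30; // 1 GiB, an assumed limit for this example
        System.out.println(canConvertToFetch(new QuerySummary(10L << 20, false), threshold)); // true
        System.out.println(canConvertToFetch(new QuerySummary(10L << 20, true), threshold));  // false
        System.out.println(canConvertToFetch(new QuerySummary(5L << 30, false), threshold));  // false
    }
}
```

Checking the structural condition first keeps the size check from mattering for queries that can never be converted anyway.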
/**
 * Tries to convert simple fetch query to single fetch task, which fetches rows directly
 * from location of table/partition.
 */