A Brief Look at Dremio's ClassCompilerSelector
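At its core, ClassCompilerSelector picks a class compiler according to a configured policy and then compiles the generated source into byte arrays.
Two compilers are currently available: one based on the JDK and one based on Janino.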
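The policy-based choice described above can be sketched roughly as follows. This is only an illustration: the names SketchSelector, SketchPolicy and SketchCompiler are hypothetical, the compilers are stand-ins that only report a name, and the source-size threshold used by the DEFAULT policy is an assumption made for the example, not something taken from Dremio's code.

```java
import java.util.Locale;

// Hypothetical sketch: every name here is illustrative, not Dremio's real API.
final class SketchSelector {
  static final SketchCompiler JDK_COMPILER = () -> "jdk";
  static final SketchCompiler JANINO_COMPILER = () -> "janino";

  private final SketchPolicy policy;
  private final int janinoMaxSize; // assumed size threshold used by DEFAULT

  SketchSelector(String configuredPolicy, int janinoMaxSize) {
    // Parse the configured value case-insensitively, e.g. "janino" -> JANINO.
    this.policy = SketchPolicy.valueOf(configuredPolicy.trim().toUpperCase(Locale.ROOT));
    this.janinoMaxSize = janinoMaxSize;
  }

  // Pick a compiler for one piece of generated source.
  SketchCompiler select(String sourceCode) {
    boolean useJdk = policy == SketchPolicy.JDK
        || (policy == SketchPolicy.DEFAULT && sourceCode.length() > janinoMaxSize);
    return useJdk ? JDK_COMPILER : JANINO_COMPILER;
  }

  public static void main(String[] args) {
    String source = "class A {}"; // 10 characters
    System.out.println(new SketchSelector("jdk", 100).select(source).name());
    System.out.println(new SketchSelector("default", 100).select(source).name());
    System.out.println(new SketchSelector("default", 5).select(source).name());
  }
}

// The three policies assumed for this sketch.
enum SketchPolicy { DEFAULT, JDK, JANINO }

// Stand-in for a class compiler; a real one would turn source into byte[].
interface SketchCompiler {
  String name();
}
```

With an explicit policy the choice is fixed; only the assumed DEFAULT policy looks at the size of the generated source.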
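ClassCompiler implementation class diagram
Classes involved
The direct users are mainly CodeCompiler and QueryClassLoader, and many more classes depend on it indirectly. Its job is to compile generated code; the code generation itself relies on the codemodel package.
The call relationships below also show that the operators of sabot, Dremio's core engine, rely heavily on dynamic code generation (including the execution of Dremio's actual tasks).
- Reference call stacks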
Affect(class count: 1 , method count: 9) cost in 413 ms, listenerId: 29
ts=2022-12-28 05:38:35;thread_name=e1 - 1c5429a4-5f93-2ddc-5df9-a480c18a6500:frag:0:0;id=dd;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
@com.dremio.exec.compile.CodeCompiler.getInstances()
at com.dremio.exec.compile.CodeCompiler.getImplementationClass(CodeCompiler.java:79)
at com.dremio.exec.compile.CodeCompiler.getImplementationClass(CodeCompiler.java:63)
at com.dremio.exec.expr.CodeGenerator.getImplementationClass(CodeGenerator.java:161)
at com.dremio.exec.store.CoercionReader.setupProjector(CoercionReader.java:164)
at com.dremio.exec.store.CoercionReader.newSchema(CoercionReader.java:136)
at com.dremio.exec.store.CoercionReader.setup(CoercionReader.java:119)
at com.dremio.sabot.op.scan.ScanOperator.setupReaderAsCorrectUser(ScanOperator.java:343)
at com.dremio.sabot.op.scan.ScanOperator.setupReader(ScanOperator.java:334)
at com.dremio.sabot.op.scan.ScanOperator.setup(ScanOperator.java:298)
at com.dremio.sabot.driver.SmartOp$SmartProducer.setup(SmartOp.java:592)
at com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer(Pipe.java:79)
at com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer(Pipe.java:63)
at com.dremio.sabot.driver.SmartOp$SmartProducer.accept(SmartOp.java:562)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.Pipeline.setup(Pipeline.java:71)
at com.dremio.sabot.exec.fragment.FragmentExecutor.setupExecution(FragmentExecutor.java:598)
at com.dremio.sabot.exec.fragment.FragmentExecutor.run(FragmentExecutor.java:430)
at com.dremio.sabot.exec.fragment.FragmentExecutor.access$1700(FragmentExecutor.java:106)
at com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run(FragmentExecutor.java:973)
at com.dremio.sabot.task.AsyncTaskWrapper.run(AsyncTaskWrapper.java:121)
at com.dremio.sabot.task.slicing.SlicingThread.mainExecutionLoop(SlicingThread.java:247)
at com.dremio.sabot.task.slicing.SlicingThread.run(SlicingThread.java:171)
ts=2022-12-28 05:38:35;thread_name=e1 - 1c5429a4-5f93-2ddc-5df9-a480c18a6500:frag:0:0;id=dd;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
@com.dremio.exec.compile.CodeCompiler.getImplementationClass()
at com.dremio.exec.compile.CodeCompiler.getImplementationClass(CodeCompiler.java:63)
at com.dremio.exec.expr.CodeGenerator.getImplementationClass(CodeGenerator.java:161)
at com.dremio.exec.store.CoercionReader.setupProjector(CoercionReader.java:164)
at com.dremio.exec.store.CoercionReader.newSchema(CoercionReader.java:136)
at com.dremio.exec.store.CoercionReader.setup(CoercionReader.java:119)
at com.dremio.sabot.op.scan.ScanOperator.setupReaderAsCorrectUser(ScanOperator.java:343)
at com.dremio.sabot.op.scan.ScanOperator.setupReader(ScanOperator.java:334)
at com.dremio.sabot.op.scan.ScanOperator.setup(ScanOperator.java:298)
at com.dremio.sabot.driver.SmartOp$SmartProducer.setup(SmartOp.java:592)
at com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer(Pipe.java:79)
at com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer(Pipe.java:63)
at com.dremio.sabot.driver.SmartOp$SmartProducer.accept(SmartOp.java:562)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.StraightPipe.setup(StraightPipe.java:102)
at com.dremio.sabot.driver.Pipeline.setup(Pipeline.java:71)
at com.dremio.sabot.exec.fragment.FragmentExecutor.setupExecution(FragmentExecutor.java:598)
at com.dremio.sabot.exec.fragment.FragmentExecutor.run(FragmentExecutor.java:430)
at com.dremio.sabot.exec.fragment.FragmentExecutor.access$1700(FragmentExecutor.java:106)
at com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run(FragmentExecutor.java:973)
at com.dremio.sabot.task.AsyncTaskWrapper.run(AsyncTaskWrapper.java:121)
at com.dremio.sabot.task.slicing.SlicingThread.mainExecutionLoop(SlicingThread.java:247)
at com.dremio.sabot.task.slicing.SlicingThread.run(SlicingThread.java:171)
Both the code generation and the compilation of that code are driven by a template, TemplateClassDefinition.
The constructor looks like this:
// A TemplateClassDefinition must be supplied; the system already ships with quite a few of them
CodeGenerator(CodeCompiler compiler, MappingSet mappingSet, TemplateClassDefinition<T> definition, FunctionContext functionContext) {
  Preconditions.checkNotNull(definition.getSignature(),
      "The signature for defintion %s was incorrectly initialized.", definition);
  this.definition = definition;
  this.compiler = compiler;
  this.className = definition.getExternalInterface().getSimpleName() + "Gen" + definition.getNextClassNumber();
  this.fqcn = PACKAGE_NAME + "." + className;
  try {
    // Built on codeModel
    this.model = new JCodeModel();
    JDefinedClass clazz = model._package(PACKAGE_NAME)._class("GenericGenerated");
    clazz = clazz._extends(model.directClass(definition.getTemplateClassName()));
    clazz.constructor(JMod.PUBLIC).body().invoke(SignatureHolder.INIT_METHOD);
    rootGenerator = new ClassGenerator<>(this, mappingSet, definition.getSignature(), new EvaluationVisitor(functionContext), clazz, model);
    this.functionContext = functionContext;
  } catch (JClassAlreadyExistsException e) {
    throw new IllegalStateException(e);
  }
}
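To make the generate-then-compile idea more concrete, here is a minimal sketch that uses only the JDK's javax.tools API. It is not how Dremio does it: the source is built with plain string concatenation instead of codemodel, the generated class implements java.util.function.IntUnaryOperator instead of extending a Dremio template, and GenerateAndCompileSketch with its nested classes are hypothetical names. Only the "<Interface>Gen<N>" naming mirrors the constructor above. It needs a JDK (not a bare JRE) at run time and assumes the generated source contains a single class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.util.List;
import java.util.function.IntUnaryOperator;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

// Hypothetical sketch using only the JDK; none of these names come from Dremio.
final class GenerateAndCompileSketch {

  // Generated source kept in memory instead of on disk.
  static final class StringSource extends SimpleJavaFileObject {
    private final String code;
    StringSource(String className, String code) {
      super(URI.create("string:///" + className + Kind.SOURCE.extension), Kind.SOURCE);
      this.code = code;
    }
    @Override
    public CharSequence getCharContent(boolean ignoreEncodingErrors) {
      return code;
    }
  }

  // Receives the bytecode the compiler writes for the single generated class.
  static final class BytecodeSink extends SimpleJavaFileObject {
    final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    BytecodeSink(String className) {
      super(URI.create("bytes:///" + className + Kind.CLASS.extension), Kind.CLASS);
    }
    @Override
    public OutputStream openOutputStream() {
      return bytes;
    }
  }

  // Defines a class from raw bytes, loosely playing the role of a per-query class loader.
  static final class BytesClassLoader extends ClassLoader {
    BytesClassLoader(ClassLoader parent) {
      super(parent);
    }
    Class<?> define(String name, byte[] bytecode) {
      return defineClass(name, bytecode, 0, bytecode.length);
    }
  }

  // Compiles one in-memory source file into a byte array.
  static byte[] compile(String className, String source) {
    JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
    if (javac == null) {
      throw new IllegalStateException("No system Java compiler; run on a JDK");
    }
    BytecodeSink sink = new BytecodeSink(className);
    try (JavaFileManager files = new ForwardingJavaFileManager<StandardJavaFileManager>(
        javac.getStandardFileManager(null, null, null)) {
      @Override
      public JavaFileObject getJavaFileForOutput(JavaFileManager.Location location, String name,
          JavaFileObject.Kind kind, FileObject sibling) {
        return sink; // route compiler output into memory
      }
    }) {
      boolean ok = javac.getTask(null, files, null, null, null,
          List.of(new StringSource(className, source))).call();
      if (!ok) {
        throw new IllegalStateException("Compilation failed for " + className);
      }
      return sink.bytes.toByteArray();
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
  }

  // Generates source for "<Interface>Gen<N>", compiles it, loads it and instantiates it.
  static IntUnaryOperator generate(int classNumber, int factor) {
    String className = IntUnaryOperator.class.getSimpleName() + "Gen" + classNumber;
    String source = "public class " + className
        + " implements java.util.function.IntUnaryOperator {\n"
        + "  public int applyAsInt(int x) { return x * " + factor + "; }\n"
        + "}\n";
    byte[] bytecode = compile(className, source);
    try {
      Class<?> clazz = new BytesClassLoader(GenerateAndCompileSketch.class.getClassLoader())
          .define(className, bytecode);
      return (IntUnaryOperator) clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    IntUnaryOperator triple = generate(0, 3);
    System.out.println(triple.getClass().getName() + " -> " + triple.applyAsInt(14));
  }
}
```

The caller only ever sees the interface (here IntUnaryOperator), which is the same shape as asking a CodeGenerator for an implementation of its external interface.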
Actual invocation
public T getImplementationClass(){
  // Loaded through CodeCompiler
  return compiler.getImplementationClass(this);
}
public List<T> getImplementationClass(final int instanceCount){
  // Loaded through CodeCompiler
  return compiler.getImplementationClass(this, instanceCount);
}
Actual class compilation
// Compilation goes through a cache; QueryClassLoader works together with ClassCompilerSelector to do the actual class compilation
private class GeneratedCodeToCompiledClazzCacheLoader extends CacheLoader<CodeGenerator<?>, GeneratedClassEntry> {
  @Override
  public GeneratedClassEntry load(final CodeGenerator<?> cg) throws Exception {
    logger.debug("In Cache load; Compile code");
    final QueryClassLoader loader = new QueryClassLoader(selector);
    final Class<?> c = transformer.getImplementationClass(loader, cg.getDefinition(),
        cg.getGeneratedCode(), cg.getMaterializedClassName());
    logger.debug("Exit Cache load");
    return new GeneratedClassEntry(c);
  }
}
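The point of the cache loader above is that identical generated code is compiled only once. A minimal sketch of that load-on-miss behaviour, assuming a plain ConcurrentHashMap in place of the Guava cache and a hypothetical class name CompiledClassCacheSketch, could look like this. The key here is just the source string and the "compiler" is any function passed in; both are simplifications for illustration, and unlike a real cache this sketch never evicts entries.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of compile-once caching; not Dremio's CodeCompiler.
final class CompiledClassCacheSketch<K, V> {
  private final Map<K, V> cache = new ConcurrentHashMap<>();
  private final Function<K, V> loader;                 // stands in for the expensive compile step
  private final AtomicInteger loads = new AtomicInteger();

  CompiledClassCacheSketch(Function<K, V> loader) {
    this.loader = loader;
  }

  // Returns the cached value, invoking the loader only on a cache miss.
  V get(K key) {
    return cache.computeIfAbsent(key, k -> {
      loads.incrementAndGet();
      return loader.apply(k);
    });
  }

  // Number of times the loader actually ran.
  int loadCount() {
    return loads.get();
  }

  public static void main(String[] args) {
    CompiledClassCacheSketch<String, Integer> cache =
        new CompiledClassCacheSketch<String, Integer>(String::length);
    cache.get("class A {}");
    cache.get("class A {}"); // served from the cache, no second load
    cache.get("class B {}");
    System.out.println("loads: " + cache.loadCount());
  }
}
```

Keying on the generated code means repeated queries with the same shape can reuse an already compiled class instead of paying for compilation again.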
- Example usage
For example, CoercionReader
protected void setupProjector(VectorContainer projectorOutput) {
  if (DEBUG_PRINT) {
    debugPrint(projectorOutput);
  }
  if (incoming.getSchema() == null || incoming.getSchema().getFieldCount() == 0) {
    return;
  }
  // Classes are created and compiled through the ClassProducer of the OperatorContext; each kind of processing brings its own template definition
  // For Projector it is: TemplateClassDefinition<Projector> TEMPLATE_DEFINITION = new TemplateClassDefinition<Projector>(Projector.class, ProjectorTemplate.class);
  final ClassGenerator<Projector> cg = context.getClassProducer().createGenerator(Projector.TEMPLATE_DEFINITION).getRoot();
  final IntHashSet transferFieldIds = new IntHashSet();
  final List<TransferPair> transfers = Lists.newArrayList();
  try {
    splitter = ProjectOperator.createSplitterWithExpressions(incoming, exprs, transfers, cg,
        transferFieldIds, context, projectorOptions, projectorOutput, targetSchema);
    // This step also covers the GANDIVA mode
    splitter.setupProjector(projectorOutput, javaCodeGenWatch, gandivaCodeGenWatch);
  } catch (Exception e) {
    throw Throwables.propagate(e);
  }
  javaCodeGenWatch.start();
  this.projector = cg.getCodeGenerator().getImplementationClass();
  this.projector.setup(context.getFunctionContext(), incoming, projectorOutput, transfers, name -> null);
  javaCodeGenWatch.stop();
  OperatorStats stats = context.getStats();
  stats.addLongStat(ScanOperator.Metric.JAVA_BUILD_TIME_NS, javaCodeGenWatch.elapsed(TimeUnit.NANOSECONDS));
  stats.addLongStat(ScanOperator.Metric.GANDIVA_BUILD_TIME_NS, gandivaCodeGenWatch.elapsed(TimeUnit.NANOSECONDS));
  gandivaCodeGenWatch.reset();
  javaCodeGenWatch.reset();
  // when individual fields of a struct column are projected, currently it results
  // in setting schema changed flag. Resetting the flag in iceberg flow, since
  // schema learning should not happen in iceberg flow
  outputMutator.getAndResetSchemaChanged();
}
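Notes
Dremio's actual execution plans make heavy use of dynamic code generation; if you inspect a running instance with a profiler such as JProfiler, you will also see a good deal of class loading.
References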
sabot/kernel/src/main/java/com/dremio/exec/compile/ClassCompilerSelector.java
sabot/kernel/src/main/java/com/dremio/exec/expr/CodeGenerator.java
sabot/kernel/src/main/java/com/dremio/exec/expr/SingleClassStringWriter.java
sabot/kernel/src/main/java/com/dremio/exec/expr/ClassGenerator.java
sabot/kernel/src/main/java/com/dremio/sabot/exec/context/OperatorContext.java
sabot/kernel/src/main/java/com/dremio/exec/expr/ClassProducer.java
sabot/kernel/src/main/java/com/dremio/exec/compile/TemplateClassDefinition.java