A brief note on how Dremio decides the queue type

Anyone who has used Dremio knows that it has the concept of queues. This post briefly walks through how Dremio decides which queue a query is placed in.

Currently defined queue types

public enum QueueType {
  // TODO figure out split between capacities for below queues
  SMALL(30D),
  LARGE(30D),
  REFLECTION_SMALL(25D),
  REFLECTION_LARGE(15D);
 
  private double capacity;
 
  QueueType(double capacity) {
    this.capacity = capacity;
  }
 
  public double getCapacity() {
    return capacity;
  }
}
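
As a quick sanity check on the enum above, the four capacities add up to 100. The sketch below copies the values into a standalone enum and sums them. The names DemoQueueType and QueueCapacityDemo are made up for this illustration and are not Dremio classes. The listing does not show how Dremio consumes getCapacity(), so reading the values as percentage shares is only an assumption.

```java
// Standalone copy of the capacities from the QueueType listing; not Dremio code.
enum DemoQueueType {
  SMALL(30D),
  LARGE(30D),
  REFLECTION_SMALL(25D),
  REFLECTION_LARGE(15D);

  private final double capacity;

  DemoQueueType(double capacity) {
    this.capacity = capacity;
  }

  double getCapacity() {
    return capacity;
  }
}

class QueueCapacityDemo {
  // Sums the capacity of every queue type.
  static double totalCapacity() {
    double total = 0;
    for (DemoQueueType type : DemoQueueType.values()) {
      total += type.getCapacity();
    }
    return total;
  }

  public static void main(String[] args) {
    for (DemoQueueType type : DemoQueueType.values()) {
      System.out.println(type + " -> " + type.getCapacity());
    }
    System.out.println("total = " + totalCapacity());
  }
}
```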

Internal decision logic

  • How the resource allocator handles queues
    The allocate method in BasicResourceAllocator
public ResourceSchedulingResult allocate(
    final ResourceSchedulingContext queryContext,
    final ResourceSchedulingProperties resourceSchedulingProperties,
    final ResourceSchedulingObserver resourceSchedulingObserver,
    final Consumer<ResourceSchedulingDecisionInfo> schedulingDecisionInfoConsumer) {
 
  final ResourceSchedulingDecisionInfo resourceSchedulingDecisionInfo =
      new ResourceSchedulingDecisionInfo();
  // the queue type is decided here
  final QueueType queueType =
      getQueueNameFromSchedulingProperties(queryContext, resourceSchedulingProperties);
  resourceSchedulingDecisionInfo.setQueueName(queueType.name());
  resourceSchedulingDecisionInfo.setQueueId(queueType.name());
  resourceSchedulingDecisionInfo.setWorkloadClass(
      queryContext.getQueryContextInfo().getPriority().getWorkloadClass());
  schedulingDecisionInfoConsumer.accept(resourceSchedulingDecisionInfo);
 
  resourceSchedulingObserver.beginQueueWait();
  final Pointer<DistributedSemaphore.DistributedLease> lease = new Pointer();
  ListenableFuture<ResourceSet> futureAllocation =
      executorService.submit(
          () -> {
            lease.value = acquireQuerySemaphoreIfNecessary(queryContext, queueType);
 
            // update query limit based on the queueType
            final OptionManager options = queryContext.getOptions();
            final boolean memoryControlEnabled =
                options.getOption(BasicResourceConstants.ENABLE_QUEUE_MEMORY_LIMIT);
            // TODO REFLECTION_SMALL, REFLECTION_LARGE was not there before - was it a bug???
            final long memoryLimit =
                (queueType == QueueType.SMALL || queueType == QueueType.REFLECTION_SMALL)
                    ? options.getOption(BasicResourceConstants.SMALL_QUEUE_MEMORY_LIMIT)
                    : options.getOption(BasicResourceConstants.LARGE_QUEUE_MEMORY_LIMIT);
            long queryMaxAllocation = queryContext.getQueryContextInfo().getQueryMaxAllocation();
            if (memoryControlEnabled && memoryLimit > 0) {
              queryMaxAllocation = Math.min(memoryLimit, queryMaxAllocation);
            }
            final UserBitShared.QueryId queryId = queryContext.getQueryId();
            final long queryMaxAllocationFinal = queryMaxAllocation;
 
            final ResourceSet resourceSet =
                new BasicResourceSet(
                    queryId, lease.value, queryMaxAllocationFinal, queueType.name());
 
            return resourceSet;
          });
  Futures.addCallback(
      futureAllocation,
      new FutureCallback<ResourceSet>() {
        @Override
        public void onSuccess(@Nullable ResourceSet resourceSet) {
          // don't need to do anything additional
        }
 
        @Override
        public void onFailure(Throwable throwable) {
          // need to close lease
          releaseLease(lease.value);
        }
      },
      executorService);
 
  final ResourceSchedulingResult resourceSchedulingResult =
      new ResourceSchedulingResult(resourceSchedulingDecisionInfo, futureAllocation);
  return resourceSchedulingResult;
}
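
Apart from choosing the queue, the allocate method above also caps the query's memory: when the queue memory limit option is enabled and the configured limit is positive, the query's maximum allocation becomes the smaller of the two values. A minimal standalone sketch of that arithmetic follows. QueryMemoryLimitDemo and effectiveLimit are hypothetical names, and the plain parameters stand in for the option lookups in the real method.

```java
// Hypothetical standalone sketch of the memory-limit arithmetic in allocate(); not Dremio code.
class QueryMemoryLimitDemo {
  /**
   * Returns the allocation cap applied to a query.
   *
   * @param memoryControlEnabled stands in for the ENABLE_QUEUE_MEMORY_LIMIT option
   * @param queueMemoryLimit stands in for SMALL_QUEUE_MEMORY_LIMIT or LARGE_QUEUE_MEMORY_LIMIT
   * @param queryMaxAllocation stands in for getQueryMaxAllocation()
   */
  static long effectiveLimit(
      boolean memoryControlEnabled, long queueMemoryLimit, long queryMaxAllocation) {
    if (memoryControlEnabled && queueMemoryLimit > 0) {
      // The queue limit can only lower the cap, never raise it.
      return Math.min(queueMemoryLimit, queryMaxAllocation);
    }
    return queryMaxAllocation;
  }

  public static void main(String[] args) {
    // Made-up byte counts, only to exercise the branches.
    System.out.println(effectiveLimit(true, 4_000L, 10_000L));
    System.out.println(effectiveLimit(true, 0L, 10_000L));
    System.out.println(effectiveLimit(false, 4_000L, 10_000L));
  }
}
```

Note that a limit of 0 (or a negative value) leaves the query's own maximum allocation untouched, even when memory control is enabled.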
  • How getQueueNameFromSchedulingProperties works
protected QueueType getQueueNameFromSchedulingProperties(
    final ResourceSchedulingContext queryContext,
    final ResourceSchedulingProperties resourceSchedulingProperties) {
  final Double cost = resourceSchedulingProperties.getQueryCost();
 
  Preconditions.checkNotNull(cost, "Queue Cost is not provided, Unable to determine " + "queue.");
  // as shown here, the decision is driven by the query cost
  final long queueThreshold =
      queryContext.getOptions().getOption(BasicResourceConstants.QUEUE_THRESHOLD_SIZE);
  final QueueType queueType;
  if (queryContext
      .getQueryContextInfo()
      .getPriority()
      .getWorkloadClass()
      .equals(UserBitShared.WorkloadClass.BACKGROUND)) {
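    // BACKGROUND workloads are reflections; anything else is a regular request.
    // Small vs. large is decided by comparing the cost with the configured threshold.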
    queueType = (cost > queueThreshold) ? QueueType.REFLECTION_LARGE : QueueType.REFLECTION_SMALL;
  } else {
    queueType = (cost > queueThreshold) ? QueueType.LARGE : QueueType.SMALL;
  }
  return queueType;
}
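
The decision in getQueueNameFromSchedulingProperties boils down to two inputs: whether the workload class is BACKGROUND, and whether the cost is strictly greater than the threshold. Below is a minimal standalone sketch of that rule. QueueSelectionDemo, its nested Queue enum, and the threshold value used in main are all hypothetical and chosen only for illustration; the real threshold comes from the QUEUE_THRESHOLD_SIZE option.

```java
// Hypothetical standalone sketch of the queue decision; not Dremio code.
class QueueSelectionDemo {
  enum Queue { SMALL, LARGE, REFLECTION_SMALL, REFLECTION_LARGE }

  /**
   * Picks a queue from the plan cost.
   *
   * @param cost total plan cost
   * @param threshold stands in for the QUEUE_THRESHOLD_SIZE option
   * @param background true when the workload class is BACKGROUND (reflections)
   */
  static Queue pick(double cost, long threshold, boolean background) {
    // Strict comparison: a cost equal to the threshold still counts as small.
    final boolean large = cost > threshold;
    if (background) {
      return large ? Queue.REFLECTION_LARGE : Queue.REFLECTION_SMALL;
    }
    return large ? Queue.LARGE : Queue.SMALL;
  }

  public static void main(String[] args) {
    final long threshold = 1_000L; // made-up value for the demo
    System.out.println(pick(500, threshold, false));
    System.out.println(pick(1_000, threshold, false));
    System.out.println(pick(5_000, threshold, false));
    System.out.println(pick(5_000, threshold, true));
  }
}
```

Because the comparison is `>` rather than `>=`, a query whose cost equals the threshold exactly is still routed to the small queue.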
  • How resourceSchedulingProperties.getQueryCost() is populated
    This method simply reads a field from a POJO; the value is set from the physical plan.
    The allocate method in ResourceTracker.java
void allocate(PhysicalPlan physicalPlan, MaestroObserver observer)
    throws ExecutionSetupException, ResourceAllocationException {
  // cost obtained from the physical plan
  final double planCost = physicalPlan.getCost();
  ResourceSchedulingProperties resourceSchedulingProperties = new ResourceSchedulingProperties();
  resourceSchedulingProperties.setQueryCost(planCost);
  resourceSchedulingProperties.setRoutingQueue(context.getSession().getRoutingQueue());
  resourceSchedulingProperties.setRoutingTag(context.getSession().getRoutingTag());
  resourceSchedulingProperties.setQueryType(
      Utilities.getHumanReadableWorkloadType(context.getWorkloadType()));
  resourceSchedulingProperties.setRoutingEngine(context.getSession().getRoutingEngine());
  resourceSchedulingProperties.setQueryLabel(context.getSession().getQueryLabel());
  // ... (the rest of the method is omitted in this excerpt)
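  • How physicalPlan.getCost works
    This is where the costs estimated while Dremio turns the logical plan into physical operators are added up. The plan is later serialized to JSON, and the nodes can read the relevant information directly from it.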
public double getCost() {
  double totalCost = 0;
  for (final PhysicalOperator ops : getSortedOperators()) {
    totalCost += ops.getProps().getCost();
  }
  return totalCost;
}
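
One consequence of getCost() being a plain sum over all operators is that a plan made of many individually cheap operators can still end up above the queue threshold. The sketch below illustrates this with made-up numbers. PlanCostDemo, the per-operator costs, and the threshold are hypothetical, and the varargs of doubles stands in for the costs returned by each operator's getProps().getCost().

```java
// Hypothetical standalone sketch of summing operator costs; not Dremio code.
class PlanCostDemo {
  // Adds up the per-operator costs, mirroring the loop in PhysicalPlan.getCost().
  static double totalCost(double... operatorCosts) {
    double total = 0;
    for (final double cost : operatorCosts) {
      total += cost;
    }
    return total;
  }

  // True when the summed cost is strictly above the threshold.
  static boolean exceeds(long threshold, double... operatorCosts) {
    return totalCost(operatorCosts) > threshold;
  }

  public static void main(String[] args) {
    final long threshold = 1_000L; // made-up value for the demo
    // Each operator is well below the threshold, but the plan as a whole is not.
    final double[] costs = {400.0, 350.0, 300.0};
    System.out.println("total = " + totalCost(costs));
    System.out.println("above threshold = " + exceeds(threshold, costs));
  }
}
```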

Notes

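The above is only a brief walkthrough. How the cost of each physical operator is computed is not covered here; I will fill that in later with a concrete analysis.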

References

sabot/kernel/src/main/java/com/dremio/exec/maestro/ResourceTracker.java
sabot/kernel/src/main/java/com/dremio/exec/physical/PhysicalPlan.java
services/resourcescheduler/src/main/java/com/dremio/resource/basic/BasicResourceAllocator.java

posted on 2024-06-05 08:00  荣锋亮