A brief look at Dremio's automatic promotion of metadata to physical datasets

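Dremio includes a feature that automatically promotes metadata to physical datasets. For file-system sources this means no manual format promotion is needed and Dremio can query the files directly. The option is set in the source's metadata configuration.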


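The feature does require that the data format can be discovered automatically (Dremio's easy format capability). What follows is a short walkthrough of the internal handling.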

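A reference call chain (the first trace below comes from query planning, the second from the background metadata refresh)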

stack com.dremio.exec.store.DatasetRetrievalOptions autoPromote

Affect(class count: 3 , method count: 1) cost in 687 ms, listenerId: 1
ts=2024-02-14 01:54:43;thread_name=1a33e29b-b1aa-abdf-ab00-a4b882830200/0:foreman-planning;id=256;is_daemon=true;priority=10;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
    @com.dremio.exec.store.DatasetRetrievalOptions.autoPromote()
        at com.dremio.exec.store.dfs.FileSystemPlugin.getDatasetWithFormat(FileSystemPlugin.java:723)
        at com.dremio.exec.store.dfs.FileSystemPlugin.getDatasetHandle(FileSystemPlugin.java:1772)
        at com.dremio.exec.catalog.ManagedStoragePlugin.getDatasetHandle(ManagedStoragePlugin.java:1012)
        at com.dremio.exec.catalog.DatasetManager.getTableFromPlugin(DatasetManager.java:428)
        at com.dremio.exec.catalog.DatasetManager.getTable(DatasetManager.java:244)
        at com.dremio.exec.catalog.CatalogImpl.getTableHelper(CatalogImpl.java:880)
        at com.dremio.exec.catalog.CatalogImpl.getTable(CatalogImpl.java:245)
        at com.dremio.exec.catalog.CatalogImpl.getTableForQuery(CatalogImpl.java:907)
        at com.dremio.exec.catalog.SourceAccessChecker.lambda$getTableForQuery$5(SourceAccessChecker.java:164)
        at com.dremio.exec.catalog.SourceAccessChecker.getIfVisible(SourceAccessChecker.java:114)
        at com.dremio.exec.catalog.SourceAccessChecker.getTableForQuery(SourceAccessChecker.java:164)
        at com.dremio.exec.catalog.DelegatingCatalog.getTableForQuery(DelegatingCatalog.java:125)
        at com.dremio.exec.catalog.CachingCatalog.lambda$getTableForQuery$6(CachingCatalog.java:189)
        at com.dremio.exec.catalog.CachingCatalog.timedGet(CachingCatalog.java:246)
        at com.dremio.exec.catalog.CachingCatalog.getTableForQuery(CachingCatalog.java:189)
        at com.dremio.exec.ops.PlannerCatalogImpl.getValidatedTableWithSchema(PlannerCatalogImpl.java:108)
        at com.dremio.exec.ops.DremioCatalogReader.getTable(DremioCatalogReader.java:106)
        at com.dremio.exec.ops.DremioCatalogReader.getTable(DremioCatalogReader.java:82)
        at org.apache.calcite.sql.validate.DremioEmptyScope.resolveTable(DremioEmptyScope.java:44)
        at org.apache.calcite.sql.validate.DremioEmptyScope.resolveTable(DremioEmptyScope.java:34)
        at org.apache.calcite.sql.validate.DelegatingScope.resolveTable(DelegatingScope.java:203)
        at org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl(IdentifierNamespace.java:129)
        at org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:199)
        at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:982)
        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:963)
        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3212)
        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3194)
        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3471)
        at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
        at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:982)
        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:963)
        at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:247)
        at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:938)
        at com.dremio.exec.planner.sql.SqlValidatorImpl.validate(SqlValidatorImpl.java:129)
        at com.dremio.exec.planner.sql.SqlValidatorAndToRelContext.validate(SqlValidatorAndToRelContext.java:80)
        at com.dremio.exec.planner.sql.handlers.SqlToRelTransformer.validateNode(SqlToRelTransformer.java:165)
        at com.dremio.exec.planner.sql.handlers.SqlToRelTransformer.validateAndConvert(SqlToRelTransformer.java:140)
        at com.dremio.exec.planner.sql.handlers.SqlToRelTransformer.validateAndConvert(SqlToRelTransformer.java:102)
        at com.dremio.exec.planner.sql.handlers.query.NormalHandler.getPlan(NormalHandler.java:73)
        at com.dremio.exec.planner.sql.handlers.commands.HandlerToExec.plan(HandlerToExec.java:59)
        at com.dremio.exec.work.foreman.AttemptManager.plan(AttemptManager.java:561)
        at com.dremio.exec.work.foreman.AttemptManager.lambda$run$4(AttemptManager.java:458)
        at com.dremio.service.commandpool.ReleasableBoundCommandPool.lambda$getWrappedCommand$3(ReleasableBoundCommandPool.java:140)
        at com.dremio.service.commandpool.CommandWrapper.run(CommandWrapper.java:70)
        at com.dremio.context.RequestContext.run(RequestContext.java:109)
        at com.dremio.common.concurrent.ContextMigratingExecutorService.lambda$decorate$4(ContextMigratingExecutorService.java:227)
        at com.dremio.common.concurrent.ContextMigratingExecutorService$ComparableRunnable.run(ContextMigratingExecutorService.java:207)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
 
ts=2024-02-14 01:55:02;thread_name=metadata-refresh-modifiable-scheduler-23;id=291;is_daemon=true;priority=10;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
    @com.dremio.exec.store.DatasetRetrievalOptions.autoPromote()
        at com.dremio.exec.store.dfs.FileSystemPlugin.getDatasetHandle(FileSystemPlugin.java:1758)
        at com.dremio.exec.catalog.NamespaceListing$TransformingIterator.populateNextHandle(NamespaceListing.java:135)
        at com.dremio.exec.catalog.NamespaceListing$TransformingIterator.hasNext(NamespaceListing.java:91)
        at com.dremio.exec.catalog.MetadataSynchronizer.synchronizeDatasets(MetadataSynchronizer.java:199)
        at com.dremio.exec.catalog.MetadataSynchronizer.go(MetadataSynchronizer.java:136)
        at com.dremio.exec.catalog.SourceMetadataManager$RefreshRunner.refreshFull(SourceMetadataManager.java:466)
        at com.dremio.exec.catalog.SourceMetadataManager$BackgroundRefresh.run(SourceMetadataManager.java:580)
        at com.dremio.exec.catalog.SourceMetadataManager.wakeup(SourceMetadataManager.java:288)
        at com.dremio.exec.catalog.SourceMetadataManager.access$300(SourceMetadataManager.java:100)
        at com.dremio.exec.catalog.SourceMetadataManager$WakeupWorker.run(SourceMetadataManager.java:227)
        at com.dremio.service.scheduler.LocalSchedulerService$CancellableTask.run(LocalSchedulerService.java:252)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)

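How it is handled

The actual work happens during catalog validation of the SQL query. For source plugins whose metadata configuration supports this option, retrieval of the dataset handle needs special handling.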

  • Reference code
    From the call chain above, the catalog's getTable runs first, followed by getTable in DatasetManager.
    Catalog getTable handling
  public DremioTable getTable(NamespaceKey key) {
    final NamespaceKey resolvedKey = resolveToDefault(key);
 
    if (resolvedKey != null) {
      final DremioTable table = getTableHelper(resolvedKey);
      if (table != null) {
        return table;
      }
    }
 
    return getTableHelper(key);
  }
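
The two-step lookup above (try the key resolved against the default schema, then fall back to the key as given) can be sketched in isolation. This is only a minimal illustration: TwoStepLookup, register and the string "tables" are hypothetical stand-ins, not Dremio's CatalogImpl or DremioTable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a catalog; not Dremio's CatalogImpl.
final class TwoStepLookup {
  private final Map<List<String>, String> tables = new HashMap<>();
  private final List<String> defaultSchema; // null when no default schema is set

  TwoStepLookup(List<String> defaultSchema) {
    this.defaultSchema = defaultSchema;
  }

  void register(List<String> path, String table) {
    tables.put(new ArrayList<>(path), table);
  }

  // Try defaultSchema + key first, then the key exactly as given.
  String getTable(List<String> key) {
    if (defaultSchema != null) {
      List<String> resolved = new ArrayList<>(defaultSchema);
      resolved.addAll(key);
      String table = tables.get(resolved);
      if (table != null) {
        return table;
      }
    }
    return tables.get(new ArrayList<>(key));
  }
}
```

A key that only exists relative to the default schema is found in the first step; a fully qualified key is found by the fallback.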

getTableHelper calls DatasetManager's getTable and, once the table is found, also updates the metadata

  private DremioTable getTableHelper(NamespaceKey key) {
    Span.current().setAttribute("dremio.namespace.key.schemapath", key.getSchemaPath());
    final DremioTable table = datasets.getTable(key, options, false);
    if (table != null) {
      // once found, the catalog metadata is updated as needed (for a ViewTable)
      return updateTableIfNeeded(key, table);
    }
    return null;
  }

DatasetManager getTable handling

   public DremioTable getTable(
      NamespaceKey key,
      MetadataRequestOptions options,
      boolean ignoreColumnCount
      ){
 
    final DatasetConfig config = getConfig(key);
 
    if(config != null) {
      // canonicalize the path.
      key = new NamespaceKey(config.getFullPathList());
    }
 
    if(isAmbiguousKey(key)) {
      key = getCanonicalKey(key);
    }
 
    String pluginName = key.getRoot();
    final ManagedStoragePlugin plugin = plugins.getPlugin(pluginName, false);
 
    if (config == null) {
      logger.debug("Got a null config");
    } else {
      logger.debug("Got config id {}", config.getId());
    }
 
    if(plugin != null) {
 
      // if we have a plugin and the info isn't a vds (this happens in home, where VDS are intermingled with plugin datasets).
      if(config == null || config.getType() != DatasetType.VIRTUAL_DATASET) {
        return getTableFromPlugin(key, config, plugin, options, ignoreColumnCount);
      }
    }
 
    if(config == null) {
      return null;
    }
 
    // at this point, we should only be looking at virtual datasets.
    if(config.getType() != DatasetType.VIRTUAL_DATASET) {
      // if we're not looking at a virtual dataset, it must mean that we hit a race condition where the source has been removed but the dataset was retrieved just before.
      return null;
    }
 
    return createTableFromVirtualDataset(config, options);
  }
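
The branching in DatasetManager.getTable boils down to a small decision table. The sketch below restates it under simplifying assumptions: TableDispatch, Route and the two-value DatasetType enum are hypothetical names for illustration, and a null configType models "no DatasetConfig saved in the namespace yet", which is exactly the case auto-promotion cares about.

```java
// Hypothetical restatement of the routing decision; not Dremio code.
final class TableDispatch {
  enum DatasetType { VIRTUAL_DATASET, PHYSICAL_DATASET }

  enum Route { FROM_PLUGIN, FROM_VIEW, NOT_FOUND }

  // configType == null means the namespace has no config for this key yet.
  static Route route(boolean pluginExists, DatasetType configType) {
    // A plugin exists and the entry is not a view: ask the plugin for the table.
    if (pluginExists && configType != DatasetType.VIRTUAL_DATASET) {
      return Route.FROM_PLUGIN;
    }
    // No plugin (or a view) and nothing saved: there is nothing to return.
    if (configType == null) {
      return Route.NOT_FOUND;
    }
    // A physical dataset without a plugin: the source was removed in a race.
    if (configType != DatasetType.VIRTUAL_DATASET) {
      return Route.NOT_FOUND;
    }
    return Route.FROM_VIEW;
  }
}
```

Note that `route(true, null)` yields FROM_PLUGIN: a path that has never been promoted still reaches the plugin, which is what gives auto-promotion a chance to run.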
  • Inside getTableFromPlugin
    final Optional<DatasetHandle> handle;
    try {
      // uses the managed storage plugin's getDatasetHandle; for a file system this is the plugin's own internal handling
      handle = plugin.getDatasetHandle(key, datasetConfig, retrievalOptions);
    } catch (ConnectorException e) {
      throw UserException.validationError(e)
          .message("Failure while retrieving dataset [%s].", key)
          .build(logger);
    }
  • Inside ManagedStoragePlugin
    Reference handling
  public Optional<DatasetHandle> getDatasetHandle(
      NamespaceKey key,
      DatasetConfig datasetConfig,
      DatasetRetrievalOptions retrievalOptions
  ) throws ConnectorException {
    try (AutoCloseableLock ignored = readLock()) {
      checkState();
      final EntityPath entityPath;
      if(datasetConfig != null) {
        entityPath = new EntityPath(datasetConfig.getFullPathList());
      } else {
        entityPath = MetadataObjectsUtils.toEntityPath(key);
      }
 
      // include the full path of the dataset
      // calls the method on the wrapped FileSystemPlugin
      Span.current().setAttribute("dremio.dataset.path", PathUtils.constructFullPath(entityPath.getComponents()));
      return plugin.getDatasetHandle(entityPath,
          retrievalOptions.asGetDatasetOptions(datasetConfig));
    }
  }
  • Handling in FileSystemPlugin, the file-system storage plugin
 public Optional<DatasetHandle> getDatasetHandle(EntityPath datasetPath, GetDatasetOption... options)
      throws ConnectorException {
    BatchSchema currentSchema = CurrentSchemaOption.getSchema(options);
    FileConfig fileConfig = FileConfigOption.getFileConfig(options);
    List<String> sortColumns = SortColumnsOption.getSortColumns(options);
    List<Field> droppedColumns = CurrentSchemaOption.getDroppedColumns(options);
    List<Field> updatedColumns = CurrentSchemaOption.getUpdatedColumns(options);
    boolean isSchemaLearningEnabled = CurrentSchemaOption.isSchemaLearningEnabled(options);
 
    FormatPluginConfig formatPluginConfig = null;
    if (fileConfig != null) {
      formatPluginConfig = PhysicalDatasetUtils.toFormatPlugin(fileConfig, Collections.<String>emptyList());
    }
 
    InternalMetadataTableOption internalMetadataTableOption = InternalMetadataTableOption.getInternalMetadataTableOption(options);
    if (internalMetadataTableOption != null) {
      TimeTravelOption.TimeTravelRequest timeTravelRequest = Optional.ofNullable(TimeTravelOption.getTimeTravelOption(options))
        .map(TimeTravelOption::getTimeTravelRequest)
        .orElse(null);
      return getDatasetHandleForInternalMetadataTable(datasetPath, formatPluginConfig, timeTravelRequest, internalMetadataTableOption);
    }
 
    Optional<DatasetHandle> handle = Optional.empty();
 
    try {
      handle = getDatasetHandleForNewRefresh(
        MetadataObjectsUtils.toNamespaceKey(datasetPath),
        fileConfig,
        DatasetRetrievalOptions.of(options));
    } catch (AccessControlException e) {
      if (!DatasetRetrievalOptions.of(options).ignoreAuthzErrors()) {
        logger.debug(e.getMessage());
        throw UserException.permissionError(e)
          .message("Not authorized to read table %s at path ", datasetPath)
          .build(logger);
      }
    } catch (IOException e) {
      logger.debug("Failed to create table {}", datasetPath, e);
    }
 
    if(handle.isPresent()) {
      // handle is UnlimitedSplitsDatasetHandle, dataset is parquet
      if(DatasetRetrievalOptions.of(options).autoPromote() ) {
        // autoPromote will allow this handle to work, regardless whether dataset is/is-not promoted
        return handle;
      } else if(fileConfig != null){
        // dataset has already been promoted
        return handle;
      } else {
        // dataset not promoted, handle cannot be used without incorrectly triggering auto-promote
        return Optional.empty();
      }
    }
    // On the first access this is the path actually taken: the handle obtained above is still empty, so getDatasetWithFormat is called
    final PreviousDatasetInfo pdi = new PreviousDatasetInfo(fileConfig, currentSchema, sortColumns, droppedColumns, updatedColumns, isSchemaLearningEnabled);
    try {
      return Optional.ofNullable(getDatasetWithFormat(MetadataObjectsUtils.toNamespaceKey(datasetPath), pdi,
          formatPluginConfig, DatasetRetrievalOptions.of(options), SystemUser.SYSTEM_USERNAME));
    } catch (Exception e) {
      Throwables.propagateIfPossible(e, ConnectorException.class);
      throw new ConnectorException(e);
    }
  }
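
The three-way check on an already-found handle is the core of the promotion guard, so it is worth isolating. The following is a minimal sketch, not the plugin's code: HandleGate and gate are hypothetical names, and the boolean alreadyPromoted stands in for `fileConfig != null`.

```java
import java.util.Optional;

// Hypothetical distillation of the guard in getDatasetHandle; not Dremio code.
final class HandleGate {
  // alreadyPromoted models "a FileConfig exists for this dataset".
  static <H> Optional<H> gate(Optional<H> handle, boolean autoPromote, boolean alreadyPromoted) {
    if (!handle.isPresent()) {
      // Nothing found yet; the caller falls through to format detection.
      return Optional.empty();
    }
    if (autoPromote || alreadyPromoted) {
      // Either promotion is allowed or it already happened: the handle is usable.
      return handle;
    }
    // Using the handle here would promote the dataset without permission.
    return Optional.empty();
  }
}
```

The point of the last branch is that a handle can exist for a path that was never promoted; returning it with autoPromote off would promote the dataset as a side effect.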
  • Inside FileSystemPlugin getDatasetWithFormat

When autoPromote is set, a format plugin is chosen according to the actual file format

if (datasetAccessor == null &&
          retrievalOptions.autoPromote()) {
        boolean formatFound = false;
        for (final FormatMatcher matcher : matchers) {
          try {
            final FileSelectionProcessor fileSelectionProcessor = matcher.getFormatPlugin().getFileSelectionProcessor(fs, fileSelection);
            if (matcher.matches(fs, fileSelection, codecFactory)) {
              formatFound = true;
              final DatasetType type = fs.isDirectory(Path.of(fileSelection.getSelectionRoot()))
                      ? DatasetType.PHYSICAL_DATASET_SOURCE_FOLDER : DatasetType.PHYSICAL_DATASET_SOURCE_FILE;
 
              final FileSelection normalizedFileSelection = fileSelectionProcessor.normalizeForPlugin(fileSelection);
              final FileUpdateKey updateKey = fileSelectionProcessor.generateUpdateKey();
 
              datasetAccessor = matcher.getFormatPlugin()
                  .getDatasetAccessor(type, oldConfig, fs, normalizedFileSelection, this, datasetPath,
                      updateKey, retrievalOptions.maxMetadataLeafColumns(), retrievalOptions.getTimeTravelRequest());
              if (datasetAccessor != null) {
                break;
              }
            }
          } catch (IOException e) {
            logger.debug("File read failed.", e);
          }
        }
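
The loop above is a first-match probe over the registered format matchers, gated by autoPromote. A stripped-down sketch of that pattern follows. Everything in it is an assumption made for illustration: FormatProbe, Matcher, bySuffix and detect are hypothetical, and matching on a file-name suffix is a deliberate simplification, since Dremio's FormatMatcher inspects the file system and the file selection rather than just a name.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical first-match format probe; not Dremio's FormatMatcher API.
final class FormatProbe {
  interface Matcher {
    boolean matches(String fileName);

    String formatName();
  }

  // Simplified matcher that looks only at the file-name suffix.
  static Matcher bySuffix(final String suffix, final String formatName) {
    return new Matcher() {
      @Override
      public boolean matches(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(suffix);
      }

      @Override
      public String formatName() {
        return formatName;
      }
    };
  }

  // Returns the first matching format, or null when promotion is off or nothing matches.
  static String detect(String fileName, List<Matcher> matchers, boolean autoPromote) {
    if (!autoPromote) {
      return null;
    }
    for (Matcher matcher : matchers) {
      if (matcher.matches(fileName)) {
        return matcher.formatName();
      }
    }
    return null;
  }
}
```

With matchers for ".json" and ".csv" registered, `detect("events.JSON", matchers, true)` picks the JSON format, while the same call with autoPromote set to false returns null and the file stays unpromoted.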

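Notes

The above is a brief walkthrough of automatic metadata promotion for file-system sources. Beyond that, once the dataset handle has been obtained, the dataset still has to be saved and some statistics updated.
See getTableFromPlugin in DatasetManager, which afterwards returns a NamespaceTable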

 
   boolean opportunisticSave = (datasetConfig == null);
    if (opportunisticSave) {
      datasetConfig = MetadataObjectsUtils.newShallowConfig(handle.get());
    }
    logger.debug("Attempting inline refresh for  key : {} , canonicalKey : {} ", key, canonicalKey);
    try {
      plugin.getSaver()
          .save(datasetConfig, handle.get(), plugin.unwrap(StoragePlugin.class), opportunisticSave, retrievalOptions,
                userName);
    } catch (ConcurrentModificationException cme) {
      // Some other query, or perhaps the metadata refresh, must have already created this dataset. Re-obtain it
      // from the namespace
      assert opportunisticSave : "Non-opportunistic saves should have already handled a CME";
      try {
        datasetConfig = userNamespaceService.getDataset(canonicalKey);
      } catch (NamespaceException e) {
        // We got a concurrent modification exception because a dataset existed. It shouldn't be the case that it
        // no longer exists. In the very rare case of this code racing with both another update *and* a dataset deletion
        // we should act as if the delete won
        logger.warn("Unable to obtain dataset {}. Likely race with dataset deletion", canonicalKey);
        return null;
      }
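
The save step is optimistic: when no config existed, the code tries to create one and, if another writer got there first, reloads whatever is now stored. A minimal sketch of that pattern, assuming an in-memory map in place of the namespace service (OpportunisticSave, createOnly and saveOrReload are hypothetical names, not Dremio APIs):

```java
import java.util.ConcurrentModificationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory model of the opportunistic save; not Dremio code.
final class OpportunisticSave {
  private final ConcurrentMap<String, String> namespace = new ConcurrentHashMap<>();

  // Create-only save: fails if another writer already stored this key.
  void createOnly(String key, String config) {
    if (namespace.putIfAbsent(key, config) != null) {
      throw new ConcurrentModificationException("dataset already exists: " + key);
    }
  }

  // Returns the config that ends up stored, or null if it vanished (a delete won the race).
  String saveOrReload(String key, String freshConfig) {
    try {
      createOnly(key, freshConfig);
      return freshConfig;
    } catch (ConcurrentModificationException cme) {
      // Someone else created the dataset first; use their version.
      return namespace.get(key);
    }
  }
}
```

Two queries hitting the same unpromoted file therefore converge on one stored config: the first call wins and the second silently adopts the first call's result.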

References

sabot/kernel/src/main/java/com/dremio/exec/catalog/CatalogImpl.java
sabot/kernel/src/main/java/com/dremio/exec/ops/PlannerCatalog.java
sabot/kernel/src/main/java/com/dremio/exec/catalog/EntityExplorer.java
sabot/kernel/src/main/java/com/dremio/exec/ops/PlannerCatalogImpl.java
sabot/kernel/src/main/java/com/dremio/exec/ops/DremioCatalogReader.java
sabot/kernel/src/main/java/com/dremio/exec/catalog/DatasetManager.java
sabot/kernel/src/main/java/com/dremio/exec/store/dfs/FileSystemPlugin.java
sabot/kernel/src/main/java/com/dremio/exec/store/dfs/easy/EasyFormatPlugin.java
sabot/kernel/src/main/java/com/dremio/exec/store/easy/json/JSONFormatPlugin.java

posted on 2024-02-26 08:00  荣锋亮