A quick note on how Dremio auto-promotion handles partition fields

Dremio auto-promotion can turn each folder level into a column and use it to filter queries, which is a very handy feature. For example, in data-archiving applications we can partition the data by event date, and auto-promotion then makes it easy to query.
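As a concrete (hypothetical) example, suppose an archive source is laid out by event date into folders such as events/2024-02-27/. After auto-promotion, the first directory level shows up as the implicit column dir0, so pruning by date is an ordinary WHERE clause. The sketch below queries it over JDBC; the source and table names (archive.events), host, and credentials are placeholders, and Dremio's JDBC driver is assumed to be on the classpath:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public final class DirColumnQuery {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
            "jdbc:dremio:direct=localhost:31010", "admin", "password");
         Statement stmt = conn.createStatement();
         // dir0 is the implicit column Dremio derives from the first folder level,
         // so filtering by event date is a plain predicate on it
         ResultSet rs = stmt.executeQuery(
             "SELECT dir0, COUNT(*) AS cnt FROM archive.events "
                 + "WHERE dir0 = '2024-02-27' GROUP BY dir0")) {
      while (rs.next()) {
        System.out.println(rs.getString("dir0") + " -> " + rs.getLong("cnt"));
      }
    }
  }
}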

Results

  • Query result (screenshot)

  • Debug view (screenshot)

Internal processing

I have covered Dremio auto-promotion before; the core of it is really the handling of the partition fields (which, on the query side, is the table metadata).

  • Save handling

The getTableFromPlugin method in DatasetManager:

try {
  plugin.getSaver()
      .save(datasetConfig, handle.get(), plugin.unwrap(StoragePlugin.class), opportunisticSave, retrievalOptions,
            userName);
} catch (ConcurrentModificationException cme) {
  // Some other query, or perhaps the metadata refresh, must have already created this dataset. Re-obtain it
  // from the namespace
  assert opportunisticSave : "Non-opportunistic saves should have already handled a CME";
  try {
    datasetConfig = userNamespaceService.getDataset(canonicalKey);
  } catch (NamespaceException e) {
    // We got a concurrent modification exception because a dataset existed. It shouldn't be the case that it
    // no longer exists. In the very rare case of this code racing with both another update *and* a dataset deletion
    // we should act as if the delete won
    logger.warn("Unable to obtain dataset {}. Likely race with dataset deletion", canonicalKey);
    return null;
  }
  final NamespaceTable namespaceTable = getTableFromNamespace(key, datasetConfig, plugin, accessUserName, options);
  options.getStatsCollector()
      .addDatasetStat(canonicalKey.getSchemaPath(), MetadataAccessType.CACHED_METADATA.name(),
          stopwatch.elapsed(TimeUnit.MILLISECONDS));
  return namespaceTable;
} catch (AccessControlException ignored) {
  return null;
}

options.getStatsCollector()
    .addDatasetStat(canonicalKey.getSchemaPath(), MetadataAccessType.PARTIAL_METADATA.name(),
        stopwatch.elapsed(TimeUnit.MILLISECONDS));

plugin.checkAccess(canonicalKey, datasetConfig, accessUserName, options);

// TODO: use MaterializedSplitsPointer if metadata is not too big!
final TableMetadata tableMetadata = new TableMetadataImpl(plugin.getId(), datasetConfig,
    accessUserName, DatasetSplitsPointer.of(userNamespaceService, datasetConfig),
    getPrimaryKey(plugin.getPlugin(), datasetConfig, schemaConfig, key,
        false /* The metadata for the table is either incomplete, missing or out of date.
               Storing in namespace can cause issues if the metadata is missing. Don't save here. */));
return new NamespaceTable(tableMetadata, optionManager.getOption(FULL_NESTED_SCHEMA_SUPPORT));
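The ConcurrentModificationException branch above is the opportunistic-save pattern: try to create the dataset entry, and if another query or the metadata refresh got there first, fall back to reading what they wrote. A generic, standalone illustration of the same idea (plain Java, not Dremio code):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class OpportunisticSave {
  private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

  public String saveOrReuse(String key, String value) {
    // putIfAbsent plays the role of the save that may fail on concurrent modification
    String existing = store.putIfAbsent(key, value);
    // null means our save won; otherwise reuse the concurrently created entry,
    // mirroring the re-read from the namespace in getTableFromPlugin
    return existing == null ? value : existing;
  }
}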
  • How the internal save extends the DatasetConfig

The saveUsingV1Flow method in DatasetSaverImpl:

private void saveUsingV1Flow(DatasetHandle handle,
                               DatasetRetrievalOptions options,
                               DatasetConfig datasetConfig,
                               SourceMetadata sourceMetadata,
                               boolean opportunisticSave,
                               Function<DatasetConfig, DatasetConfig> datasetMutator,
                               NamespaceKey canonicalKey,
                               NamespaceAttribute... attributes) {
    if (datasetConfig.getId() == null) {
      // this is a new dataset, otherwise save will fail
      datasetConfig.setId(new EntityId(UUID.randomUUID().toString()));
    }
 
    NamespaceService.SplitCompression splitCompression = NamespaceService.SplitCompression.valueOf(optionManager.getOption(CatalogOptions.SPLIT_COMPRESSION_TYPE).toUpperCase());
    try (DatasetMetadataSaver saver = systemNamespace.newDatasetMetadataSaver(canonicalKey, datasetConfig.getId(), splitCompression,
            optionManager.getOption(CatalogOptions.SINGLE_SPLIT_PARTITION_MAX), optionManager.getOption(NamespaceOptions.DATASET_METADATA_CONSISTENCY_VALIDATE))) {
      final PartitionChunkListing chunkListing = sourceMetadata.listPartitionChunks(handle,
              options.asListPartitionChunkOptions(datasetConfig));
 
      final long recordCountFromSplits = saver == null || chunkListing == null ? 0 :
              CatalogUtil.savePartitionChunksInSplitsStores(saver, chunkListing);
      // Fetches the dataset metadata from the actual dataset handle, e.g. the
      // EasyFormatDatasetAccessor handle for txt/json formats
      final DatasetMetadata datasetMetadata = sourceMetadata.getDatasetMetadata(handle, chunkListing,
              options.asGetMetadataOptions(datasetConfig));
 
      Optional<ByteString> readSignature = Optional.empty();
      if (sourceMetadata instanceof SupportsReadSignature) {
        final BytesOutput output = ((SupportsReadSignature) sourceMetadata).provideSignature(handle, datasetMetadata);
        //noinspection ObjectEquality
        if (output != BytesOutput.NONE) {
          readSignature = Optional.of(MetadataObjectsUtils.toProtostuff(output));
        }
      }
      // This step extends the dataset config with the partition information
      MetadataObjectsUtils.overrideExtended(datasetConfig, datasetMetadata, readSignature,
              recordCountFromSplits, options.maxMetadataLeafColumns());
      datasetConfig = datasetMutator.apply(datasetConfig);
 
      saver.saveDataset(datasetConfig, opportunisticSave, attributes);
      updateListener.metadataUpdated(canonicalKey);
    } catch (DatasetMetadataTooLargeException e) {
      datasetConfig.setRecordSchema(null);
      datasetConfig.setReadDefinition(null);
      try {
        systemNamespace.addOrUpdateDataset(canonicalKey, datasetConfig);
      } catch (NamespaceException ignored) {
      }
      throw UserException.validationError(e)
              .build(logger);
    } catch (NamespaceException | IOException e) {
      throw UserException.validationError(e)
              .build(logger);
    }
  }
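What does the extended partition information end up as? My reading is that after MetadataObjectsUtils.overrideExtended runs, the partition columns discovered from the folder layout are recorded on the DatasetConfig's ReadDefinition. The sketch below reads them back; the class and accessor names follow Dremio's protostuff model as I understand it and should be treated as assumptions:

import java.util.Collections;
import java.util.List;

import com.dremio.service.namespace.dataset.proto.DatasetConfig;
import com.dremio.service.namespace.dataset.proto.ReadDefinition;

public final class PartitionColumnsPeek {
  // Returns the partition columns recorded during the save, e.g. [dir0, dir1]
  // for a two-level folder layout (accessor names are assumptions, see above)
  public static List<String> partitionColumns(DatasetConfig config) {
    ReadDefinition readDefinition = config.getReadDefinition();
    return readDefinition == null
        ? Collections.emptyList()
        : readDefinition.getPartitionColumnsList();
  }
}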
  • EasyFormatDatasetAccessor as a reference implementation

getBatchSchema is the core implementation here; this is where the schema extension is done:

private BatchSchema getBatchSchema(BatchSchema oldSchema, final FileSelection selection, final FileSystem dfs) throws Exception {
    final SabotContext context = formatPlugin.getContext();
    try (
        BufferAllocator sampleAllocator = context.getAllocator().newChildAllocator("sample-alloc", 0, Long.MAX_VALUE);
        OperatorContextImpl operatorContext = new OperatorContextImpl(context.getConfig(), context.getDremioConfig(), sampleAllocator, context.getOptionManager(), 1000, context.getExpressionSplitCache());
        SampleMutator mutator = new SampleMutator(sampleAllocator)
    ) {
      // Locate the implicit (hidden) columns: directory levels, file name, modification time
      final ImplicitFilesystemColumnFinder explorer = new ImplicitFilesystemColumnFinder(context.getOptionManager(),
          dfs, GroupScan.ALL_COLUMNS, ImplicitFilesystemColumnFinder.Mode.ALL_IMPLICIT_COLUMNS);
 
      Optional<FileAttributes> fileName = selection.getFileAttributesList().stream().filter(input -> input.size() > 0).findFirst();
      final FileAttributes file = fileName.orElse(selection.getFileAttributesList().get(0));
 
      EasyDatasetSplitXAttr dataset = EasyDatasetSplitXAttr.newBuilder()
          .setStart(0L)
          .setLength(Long.MAX_VALUE)
          .setPath(file.getPath().toString())
          .build();
      try (RecordReader reader = new AdditionalColumnsRecordReader(operatorContext, ((EasyFormatPlugin) formatPlugin)
          .getRecordReader(operatorContext, dfs, dataset, GroupScan.ALL_COLUMNS), explorer.getImplicitFieldsForSample(selection), sampleAllocator)) {
        reader.setup(mutator);
        Map<String, ValueVector> fieldVectorMap = new HashMap<>();
        int i = 0;
        for (VectorWrapper<?> vw : mutator.getContainer()) {
          fieldVectorMap.put(vw.getField().getName(), vw.getValueVector());
          if (++i > maxLeafColumns) {
            throw new ColumnCountTooLargeException(maxLeafColumns);
          }
        }
        reader.allocate(fieldVectorMap);
        reader.next();
        mutator.getContainer().buildSchema(BatchSchema.SelectionVectorMode.NONE);
        // Merge the sampled explicit fields with the implicit schema; this is where
        // the implicit folder fields discussed above come in
        return getMergedSchema(oldSchema,
          mutator.getContainer().getSchema(),
          oldConfig.isSchemaLearningEnabled(),
          oldConfig.getDropColumns(),
          oldConfig.getModifiedColumns(),
          context.getOptionManager().getOption(ExecConstants.ENABLE_INTERNAL_SCHEMA),
          tableSchemaPath.getPathComponents(),
          file.getPath().toString());
      }
    }
  }
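To make the implicit folder fields concrete: every directory level between the promoted root and the data file becomes one dirN value. A standalone sketch of that mapping (plain Java, not Dremio code):

import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

public final class DirFieldSketch {
  // Maps each folder level between the table root and the file to dir0, dir1, ...
  public static Map<String, String> dirColumns(Path tableRoot, Path dataFile) {
    Path relative = tableRoot.relativize(dataFile);
    Map<String, String> columns = new LinkedHashMap<>();
    // every path component except the file name itself becomes a dirN column
    for (int i = 0; i < relative.getNameCount() - 1; i++) {
      columns.put("dir" + i, relative.getName(i).toString());
    }
    return columns;
  }

  public static void main(String[] args) {
    // prints {dir0=2024, dir1=02}
    System.out.println(dirColumns(
        Paths.get("/data/archive"),
        Paths.get("/data/archive/2024/02/events.json")));
  }
}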

A small tip

As the screenshots above suggest, a couple of other hidden fields are also available (the file name and the modification time). They are disabled by default but can be exposed through configuration; the relevant options are listed below, followed by a sketch of how to enable them.

  • Related configuration, in ImplicitFilesystemColumnFinder:
  public static final StringValidator IMPLICIT_FILE_FIELD_LABEL = new StringValidator("dremio.store.file.file-field-label", "$file");
  public static final StringValidator IMPLICIT_MOD_FIELD_LABEL = new StringValidator("dremio.store.file.mod-field-label", "$mtime");
  public static final BooleanValidator IMPLICIT_FILE_FIELD_ENABLE = new BooleanValidator("dremio.store.file.file-field-enabled", false);
  public static final BooleanValidator IMPLICIT_DIRS_FIELD_ENABLE = new BooleanValidator("dremio.store.file.dir-field-enabled", true);
  public static final BooleanValidator IMPLICIT_MOD_FIELD_ENABLE = new BooleanValidator("dremio.store.file.mod-field-enabled", false);
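These are Dremio support keys, so they can also be flipped at runtime with ALTER SYSTEM SET (or per session with ALTER SESSION). A minimal sketch over JDBC, with placeholder connection details; newly promoted or refreshed datasets should then expose the $file and $mtime columns:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public final class EnableImplicitFields {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
            "jdbc:dremio:direct=localhost:31010", "admin", "password");
         Statement stmt = conn.createStatement()) {
      // expose the source file name of each row as the $file column
      stmt.execute("ALTER SYSTEM SET \"dremio.store.file.file-field-enabled\" = true");
      // expose the source file modification time as the $mtime column
      stmt.execute("ALTER SYSTEM SET \"dremio.store.file.mod-field-enabled\" = true");
    }
  }
}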

Notes

The above is a brief overview of how auto-promotion handles partition fields; for the details it is worth studying the source code. I only walked through EasyFormatDatasetAccessor, and there are of course other implementations.

References

sabot/kernel/src/main/java/com/dremio/exec/catalog/DatasetManager.java
sabot/kernel/src/main/java/com/dremio/exec/store/dfs/implicit/AdditionalColumnsRecordReader.java
sabot/kernel/src/main/java/com/dremio/exec/store/easy/EasyFormatDatasetAccessor.java
sabot/kernel/src/main/java/com/dremio/exec/catalog/DatasetSaverImpl.java
sabot/kernel/src/main/java/com/dremio/exec/store/dfs/implicit/ImplicitFilesystemColumnFinder.java
sabot/kernel/src/main/java/com/dremio/exec/util/MetadataSupportsInternalSchema.java
