A brief introduction to the dremio ResultsCleanupService

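dremio supports scheduled cleanup of job results. Cleaning up job results can free a considerable amount of storage, especially when many queries are run. By default, dremio caches the result set of every executed query locally;
this cache is used to display the results, and sys.job_results.<jobid> reads the same data.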

Service creation

The service is created in the DACDaemonModule module. From the code you can see that it depends on SchedulerService, JobResultsStoreConfig, and OptionManager.

  // The ResultsCleanupService cleanup task runs on executor nodes
  if(isExecutor){
      registry.bindSelf(new ContextInformationFactory());
      taskPoolInitializer = new TaskPoolInitializer(
        registry.provider(OptionManager.class),
        config);
      registry.bindSelf(taskPoolInitializer);
      registry.bindProvider(TaskPool.class, taskPoolInitializer::getTaskPool);
 
      final WorkloadTicketDepotService workloadTicketDepotService = new WorkloadTicketDepotService(
        registry.provider(BufferAllocator.class),
        registry.provider(TaskPool.class),
        registry.provider(DremioConfig.class)
      );
      registry.bindSelf(workloadTicketDepotService);
      registry.bindProvider(WorkloadTicketDepot.class, workloadTicketDepotService::getTicketDepot);
 
      ExecToCoordTunnelCreator execToCoordTunnelCreator =
        new ExecToCoordTunnelCreator(registry.provider(FabricService.class));
 
      registry.bind(MaestroClientFactory.class,
        new MaestroSoftwareClientFactory(execToCoordTunnelCreator));
      registry.bind(JobTelemetryExecutorClientFactory.class,
        new JobTelemetrySoftwareClientFactory(execToCoordTunnelCreator));
      registry.bind(JobResultsClientFactory.class,
        new JobResultsSoftwareClientFactory(execToCoordTunnelCreator));
 
      final FragmentWorkManager fragmentWorkManager = new FragmentWorkManager(bootstrap,
        config.getSabotConfig(),
        registry.provider(NodeEndpoint.class),
        registry.provider(SabotContext.class),
        registry.provider(FabricService.class),
        registry.provider(CatalogService.class),
        registry.provider(ContextInformationFactory.class),
        registry.provider(WorkloadTicketDepot.class),
        registry.provider(TaskPool.class),
        registry.provider(MaestroClientFactory.class),
        registry.provider(JobTelemetryExecutorClientFactory.class),
        registry.provider(JobResultsClientFactory.class));
 
      registry.bindSelf(fragmentWorkManager);
 
      registry.bindProvider(WorkStats.class, fragmentWorkManager::getWorkStats);
 
      registry.bindProvider(ExecutorService.class, fragmentWorkManager::getExecutorService);
      registry.bind(ResultsCleanupService.class,
        new ResultsCleanupService(
          registry.provider(SchedulerService.class),
          jobResultsStoreConfigProvider,
          registry.provider(OptionManager.class)));
    } else {
      registry.bind(WorkStats.class, WorkStats.NO_OP);
    }

Service logic

ResultsCleanupService simply reads the system option values and uses a scheduled task to clean up job results.

  • start method
public void start() throws Exception {
    try {
      jobResultsStoreConfigProvider.get();
    } catch (Exception e) {
      logger.info("JobResultsStoreConfig is not available exiting...");
      return;
    }
    final OptionManager optionManager = optionManagerProvider.get();
    // This option controls whether cleanup is enabled; dremio.results_cleanup.enabled defaults to false, so the task is off by default
    if (!optionManager.getOption(ExecConstants.RESULTS_CLEANUP_SERVICE_ENABLED)) {
      logger.info("Results cleanup service is disabled, quitting...");
      return;
    }
 
    logger.info("Starting ResultsCleanupService..");
 
    // Maximum number of days to keep job results; the default is one day
    final long maxJobResultsAgeInDays = optionManager.getOption(ExecConstants.RESULTS_MAX_AGE_IN_DAYS);
    if (maxJobResultsAgeInDays != DISABLE_CLEANUP_VALUE) {
      // Hour of the day at which the cleanup job starts
      final long jobResultsCleanupStartHour = optionManager.getOption(ExecConstants.JOB_RESULTS_CLEANUP_START_HOUR);
      final LocalTime startTime = LocalTime.of((int) jobResultsCleanupStartHour, 0);
      // Definition of resultSchedule: once a day at startTime, in the system time zone
      final Schedule resultSchedule = Schedule.Builder.everyDays(1, startTime)
        .withTimeZone(ZoneId.systemDefault())
        .build();
      // ResultsCleanup is the cleanup task
      jobResultsCleanupTask = schedulerService.get().schedule(resultSchedule, new ResultsCleanup());
    }
}
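
The daily schedule built in start can be illustrated with plain java.time. This is only a minimal sketch of the idea (run once a day at a configured hour in the system time zone); it does not use dremio's Schedule or SchedulerService, and the class and method names DailyScheduleSketch, nextRun and initialDelay are made up for this example.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical helper, not part of dremio: computes when a "once a day at
// startHour:00" task would fire next, relative to a given point in time.
final class DailyScheduleSketch {

  // Next occurrence of startHour:00 that is strictly after `now`, in now's zone.
  static ZonedDateTime nextRun(ZonedDateTime now, int startHour) {
    ZonedDateTime candidate = now.with(LocalTime.of(startHour, 0));
    return candidate.isAfter(now) ? candidate : candidate.plusDays(1);
  }

  // Delay to wait before the first run; later runs would repeat every day.
  static Duration initialDelay(ZonedDateTime now, int startHour) {
    return Duration.between(now, nextRun(now, startHour));
  }

  public static void main(String[] args) {
    ZonedDateTime now = ZonedDateTime.now(ZoneId.systemDefault());
    int startHour = 0; // assumed value, chosen only for this example
    System.out.println("next run:      " + nextRun(now, startHour));
    System.out.println("initial delay: " + initialDelay(now, startHour));
  }
}
```

A delay computed this way could be handed to a standard ScheduledExecutorService with a one-day period; in dremio itself that part is handled by SchedulerService.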
  • ResultsCleanup handling
public void cleanup() {
      // Get the job result storage information from the configuration
      FileSystem dfs = jobResultsStoreConfigProvider.get().getFileSystem();
      Path resultsFolder = jobResultsStoreConfigProvider.get().getStoragePath();
      final OptionManager optionManager = optionManagerProvider.get();
      long maxAgeInMillis = optionManager.getOption(ExecConstants.DEBUG_RESULTS_MAX_AGE_IN_MILLISECONDS);
      long maxAgeInDays = optionManager.getOption(ExecConstants.RESULTS_MAX_AGE_IN_DAYS);
      long jobResultsMaxAgeInMillis = (maxAgeInDays * ONE_DAY_IN_MILLIS) + maxAgeInMillis;
 
      try {
        long cutOffTime = System.currentTimeMillis() - jobResultsMaxAgeInMillis;
        if (!(dfs.exists(resultsFolder))) {
          //directory does not exist so nothing to clean up
          return;
        }
        // List the results folder through the configured file system and delete entries based on their creation time attribute
        DirectoryStream<FileAttributes> listOfAttributes = jobResultsStoreConfigProvider.get().getFileSystem().list(resultsFolder);
        //iterate through the directory and cleanup files created before the cutoff time
        for (FileAttributes attr : listOfAttributes) {
          FileTime creationTime = attr.creationTime();
          if (creationTime.toMillis() < cutOffTime) {
            //cleanup
            if (!jobResultsStoreConfigProvider.get().getFileSystem().delete(attr.getPath(), true)) {
              logger.info("Failed to delete directory, {}", attr.getPath());
            }
          }
        }
 
      } catch (Exception e) {
        logger.error("An exception occured while running ResultsCleanupService", e);
      }
}
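
The cutoff logic above can be tried out on a local directory with java.nio.file. This is a rough sketch under stated assumptions, not the dremio implementation: it replaces dremio's FileSystem abstraction with the JDK file API, and the names ResultsCleanupSketch, cutOffTime, cleanup and deleteRecursively are invented for the example. As in the original code, the maximum age is the number of days plus an extra millisecond value (the debug option in dremio).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-filesystem analogue of the cleanup task.
final class ResultsCleanupSketch {
  static final long ONE_DAY_IN_MILLIS = 24L * 60 * 60 * 1000;

  // Entries created before this timestamp are considered expired.
  static long cutOffTime(long nowMillis, long maxAgeInDays, long extraMaxAgeInMillis) {
    return nowMillis - (maxAgeInDays * ONE_DAY_IN_MILLIS + extraMaxAgeInMillis);
  }

  // Deletes every direct child of resultsFolder created before cutOffMillis
  // and returns the deleted paths.
  static List<Path> cleanup(Path resultsFolder, long cutOffMillis) throws IOException {
    List<Path> expired = new ArrayList<>();
    if (!Files.exists(resultsFolder)) {
      return expired; // directory does not exist, nothing to clean up
    }
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(resultsFolder)) {
      for (Path entry : entries) {
        BasicFileAttributes attrs = Files.readAttributes(entry, BasicFileAttributes.class);
        if (attrs.creationTime().toMillis() < cutOffMillis) {
          expired.add(entry);
        }
      }
    }
    for (Path entry : expired) {
      deleteRecursively(entry);
    }
    return expired;
  }

  // Deletes children before parents by visiting paths in reverse order.
  static void deleteRecursively(Path root) throws IOException {
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(root)) {
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }
}
```

Note that on some platforms and file systems the JDK reports the last-modified time as the creation time, so the sketch is only an approximation of what a distributed file system would return.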

Notes

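The above is a brief description of the ResultsCleanupService. Once the cleanup mechanism is understood, it is easy to tune it and reduce storage usage. Note that, judging from the code, the task is not enabled by default and has to be turned on explicitly; the executor log shows this as well: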

2024-02-21 07:08:37,697 [main] INFO  c.d.e.server.ResultsCleanupService - Results cleanup service is disabled, quitting...

References

dac/backend/src/main/java/com/dremio/dac/daemon/DACDaemonModule.java
sabot/kernel/src/main/java/com/dremio/exec/server/ResultsCleanupService.java
sabot/kernel/src/main/java/com/dremio/exec/store/JobResultsStoreConfig.java
sabot/kernel/src/main/java/com/dremio/exec/ExecConstants.java
services/options/src/main/java/com/dremio/options/OptionManager.java

posted on 2024-03-12 08:00  荣锋亮