[Flink Series, Part 23] Hudi and the missing HIVE_CONF_DIR: Hive cannot find hive-site.xml

Symptom

Unable to find config file hive-site.xml
Unable to find config file hivemetastore-site.xml
Unable to find config file metastore-site.xml

This post records what causes the problem and how to provide hive-site.xml to Hive and Hudi so that it is loaded correctly.

Analysis: how HiveMetaStore locates its configuration files

Location: org.apache.hadoop.hive.metastore.conf.MetastoreConf#findConfigFile

private static URL findConfigFile(ClassLoader classLoader, String name) {
    // First, look in the classpath
    URL result = classLoader.getResource(name);
    if (result == null) {
      // Nope, so look to see if our conf dir has been explicitly set
      result = seeIfConfAtThisLocation("METASTORE_CONF_DIR", name, false);
      if (result == null) {
        // Nope, so look to see if our home dir has been explicitly set
        result = seeIfConfAtThisLocation("METASTORE_HOME", name, true);
        if (result == null) {
          // Nope, so look to see if Hive's conf dir has been explicitly set
          result = seeIfConfAtThisLocation("HIVE_CONF_DIR", name, false);
          if (result == null) {
            // Nope, so look to see if Hive's home dir has been explicitly set
            result = seeIfConfAtThisLocation("HIVE_HOME", name, true);
            if (result == null) {
              // Nope, so look to see if we can find a conf file by finding our jar, going up one
              // directory, and looking for a conf directory.
              URI jarUri = null;
              try {
                jarUri = MetastoreConf.class.getProtectionDomain().getCodeSource().getLocation().toURI();
              } catch (Throwable e) {
                LOG.warn("Cannot get jar URI", e);
              }
              result = seeIfConfAtThisLocation(new File(jarUri).getParent(), name, true);
              // At this point if we haven't found it, screw it, we don't know where it is
              if (result == null) {
                LOG.info("Unable to find config file " + name);
              }
            }
          }
        }
      }
    }
    LOG.info("Found configuration file " + result);
    return result;
  }

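The log lines therefore mean that the file was found nowhere: not on the classpath, and not under any of METASTORE_CONF_DIR, METASTORE_HOME, HIVE_CONF_DIR, or HIVE_HOME.

The last fallback failed as well: according to the comment in the code, it looks for a conf directory one level above the jar that contains the MetastoreConf class.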
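To see which of these locations is actually empty in a given JVM, the lookup order can be replayed outside Hive. The following is a minimal sketch, not Hive's implementation: the class name HiveSiteProbe and its probe method are hypothetical, the final jar-relative step is omitted, and it assumes (judging from the boolean argument in the code above) that the *_HOME variables are searched in a conf subdirectory.

```java
import java.io.File;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic helper; it only imitates the lookup order shown above.
final class HiveSiteProbe {

    /** Returns one line per probed location, in the same order MetastoreConf uses. */
    static List<String> probe(ClassLoader classLoader, Map<String, String> env, String name) {
        List<String> report = new ArrayList<>();

        // Step 1: the classpath, using the same call as findConfigFile.
        URL onClasspath = classLoader.getResource(name);
        report.add("classpath -> " + (onClasspath == null ? "not found" : onClasspath.toString()));

        // Steps 2-5: environment variables. Assumption: the *_HOME variants
        // look inside a "conf" subdirectory, the *_CONF_DIR variants do not.
        String[][] candidates = {
            {"METASTORE_CONF_DIR", ""},
            {"METASTORE_HOME", "conf"},
            {"HIVE_CONF_DIR", ""},
            {"HIVE_HOME", "conf"},
        };
        for (String[] candidate : candidates) {
            String dir = env.get(candidate[0]);
            if (dir == null) {
                report.add(candidate[0] + " -> unset");
                continue;
            }
            File base = candidate[1].isEmpty() ? new File(dir) : new File(dir, candidate[1]);
            File file = new File(base, name);
            report.add(candidate[0] + " -> " + file + (file.isFile() ? " (exists)" : " (missing)"));
        }
        return report;
    }

    public static void main(String[] args) {
        probe(Thread.currentThread().getContextClassLoader(), System.getenv(), "hive-site.xml")
            .forEach(System.out::println);
    }
}
```

Running this with the same environment and classpath as the Flink JobManager should show every entry as "not found", "unset", or "(missing)" when the three log messages above appear.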

Root cause: why hive-site.xml was not found in any of these locations

Location: org.apache.hadoop.hive.metastore.conf.MetastoreConf#newMetastoreConf

 if(hiveSiteURL == null) {
      /*
       * this 'if' is pretty lame - QTestUtil.QTestUtil() uses hiveSiteURL to load a specific
       * hive-site.xml from data/conf/<subdir> so this makes it follow the same logic - otherwise
       * HiveConf and MetastoreConf may load different hive-site.xml  ( For example,
       * HiveConf uses data/conf/spark/hive-site.xml and MetastoreConf data/conf/hive-site.xml)
       */
      hiveSiteURL = findConfigFile(classLoader, "hive-site.xml");
    }
    if (hiveSiteURL != null) {
      conf.addResource(hiveSiteURL);
    }

findConfigFile is called only when the static variable hiveSiteURL has not been set. This is the normal case.

The Flink side
Location:
org.apache.flink.table.catalog.hive.HiveCatalog.createHiveConf

        // ignore all the static conf file URLs that HiveConf may have set
        HiveConf.setHiveSiteLocation(null);

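Conclusions:

  • Flink clears this static variable, so execution always enters findConfigFile.
  • MetastoreConf does not appear to support a hive-site.xml stored on HDFS; the lookup only probes the classpath and local paths.
  • Whenever Flink creates a new HiveCatalog, this lookup is triggered.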

CLASSPATH analysis

hive-site.xml has already been provided on Flink's CLASSPATH, so why does it still fail to load?

lib/hive-site.xml

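Yet classLoader.getResource(name) still returns nothing. getResource resolves the name relative to the root of each classpath entry, so "hive-site.xml" is found only when the directory that contains it is itself a classpath entry. My guess is that only the parent of lib is effectively visible here, in which case the name would have to be "lib/hive-site.xml" to load correctly.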
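The effect of the classpath root on getResource can be reproduced with plain JDK classes. This is a small sketch for illustration only: ClasspathRootDemo, the temporary directory, and the placeholder file content are all made up, and it says nothing about how Flink actually assembles its classpath.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: shows how the classpath root decides what getResource finds.
final class ClasspathRootDemo {

    /** True if a class loader whose only entry is classpathRoot can resolve name. */
    static boolean resolves(File classpathRoot, String name) throws IOException {
        // File.toURI() ends with '/' for an existing directory, which URLClassLoader requires.
        URL[] entries = {classpathRoot.toURI().toURL()};
        // parent = null: apart from the bootstrap loader, only our single entry is searched.
        try (URLClassLoader loader = new URLClassLoader(entries, null)) {
            return loader.getResource(name) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path home = Files.createTempDirectory("flink-home");
        Path lib = Files.createDirectory(home.resolve("lib"));
        Files.write(lib.resolve("hive-site.xml"), "<configuration/>".getBytes());

        // Root is the parent of lib: the bare name is not found.
        System.out.println(resolves(home.toFile(), "hive-site.xml"));     // false
        // Same root, but with the relative path: found.
        System.out.println(resolves(home.toFile(), "lib/hive-site.xml")); // true
        // Root is lib itself: the bare name is found.
        System.out.println(resolves(lib.toFile(), "hive-site.xml"));      // true
    }
}
```

MetastoreConf always asks for the bare name, so only the third arrangement would help it.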

Conclusion:
HIVE_CONF_DIR has to be specified.

Solution

Pass HIVE_CONF_DIR to the Flink program. How exactly? Kyuubi's approach is a useful reference.

That is:

-Dcontainerized.master.env.HIVE_CONF_DIR=/etc/hive/conf
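To confirm that the variable really reached the JVM, a job could fail fast before creating the catalog. The check below is only a sketch under assumptions: HiveConfDirCheck and requireHiveSite are hypothetical names, and the environment is passed in as a map so the logic can be exercised without touching the real process environment.

```java
import java.io.File;
import java.util.Map;

// Hypothetical startup check; not part of Flink, Hive, Hudi, or Kyuubi.
final class HiveConfDirCheck {

    /** Returns the hive-site.xml under HIVE_CONF_DIR, or throws with a clear message. */
    static File requireHiveSite(Map<String, String> env) {
        String dir = env.get("HIVE_CONF_DIR");
        if (dir == null || dir.isEmpty()) {
            throw new IllegalStateException("HIVE_CONF_DIR is not set in this JVM's environment");
        }
        File hiveSite = new File(dir, "hive-site.xml");
        if (!hiveSite.isFile()) {
            throw new IllegalStateException("No hive-site.xml found at " + hiveSite);
        }
        return hiveSite;
    }

    public static void main(String[] args) {
        // In a real job this would run before the HiveCatalog is created.
        System.out.println("Using " + requireHiveSite(System.getenv()));
    }
}
```

Failing early with an explicit message is easier to diagnose than the three "Unable to find config file" log lines, which are only logged at INFO level.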
posted @ 2024-10-10 18:23  一杯半盏