Hive Metastore connection error

Background

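Our project needs to manipulate Hive metadata through some custom components, so we store the Hive metadata in remote mode and use a backend service as a gateway that controls the metadata.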

Symptom

When connecting to the Hive metastore from Windows, a NullPointerException is thrown for no apparent reason, which was very puzzling.

Analysis

Reading the code shows that, once connected, the client fetches the group information of the local user (the open method of org.apache.hadoop.hive.metastore.HiveMetaStoreClient):

          if (isConnected && !useSasl && conf.getBoolVar(ConfVars.METASTORE_EXECUTE_SET_UGI)){
            // Call set_ugi, only in unsecure mode.
            try {
              UserGroupInformation ugi = Utils.getUGI();
              client.set_ugi(ugi.getUserName(), Arrays.asList(ugi.getGroupNames()));
            } catch (LoginException e) {
              LOG.warn("Failed to do login. set_ugi() is not successful, " +
                       "Continuing without it.", e);
            } catch (IOException e) {
              LOG.warn("Failed to find ugi of client set_ugi() is not successful, " +
                  "Continuing without it.", e);
            } catch (TException e) {
              LOG.warn("set_ugi() not successful, Likely cause: new client talking to old server. "
                  + "Continuing without it.", e);
            }
          }
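
ugi.getGroupNames() runs a local command to look up the groups. On Windows it uses a tool called winutils, but a machine used only for client development normally does not have these binaries installed, so the code path fails here:
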
  /**
   * a Unix command to get a given user's groups list.
   * If the OS is not WINDOWS, the command will get the user's primary group
   * first and finally get the groups list which includes the primary group.
   * i.e. the user's primary group will be included twice.
   */
  public static String[] getGroupsForUserCommand(final String user) {
    //'groups username' command return is non-consistent across different unixes
    return (WINDOWS)? new String[] { WINUTILS, "groups", "-F", "\"" + user + "\""}
                    : new String [] {"bash", "-c", "id -gn " + user
                                     + "&& id -Gn " + user};
  }

WINUTILS is initialized by the following function, which returns null when winutils.exe cannot be found in the Hadoop binary path:

  /** a Windows utility to emulate Unix commands */
  public static final String WINUTILS = getWinUtilsPath();

  public static final String getWinUtilsPath() {
    String winUtilsPath = null;

    try {
      if (WINDOWS) {
        winUtilsPath = getQualifiedBinPath("winutils.exe");
      }
    } catch (IOException ioe) {
       LOG.error("Failed to locate the winutils binary in the hadoop binary path",
         ioe);
    }

    return winUtilsPath;
  }
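
A client can check for the binary up front instead of waiting for the failure. The sketch below is a simplified stand-in, not Hadoop's own lookup. It assumes the binary is expected at bin/winutils.exe under the Hadoop home directory, and that this directory comes from the hadoop.home.dir system property or the HADOOP_HOME environment variable. The names WinutilsCheck, locateWinutils and configuredHadoopHome are hypothetical.

```java
import java.io.File;

/** Hypothetical pre-flight check; not part of Hadoop. */
final class WinutilsCheck {

  /**
   * Returns the winutils.exe file under hadoopHome/bin, or null when
   * hadoopHome is unset or the file does not exist.
   */
  static File locateWinutils(String hadoopHome) {
    if (hadoopHome == null || hadoopHome.isEmpty()) {
      return null;
    }
    File exe = new File(new File(hadoopHome, "bin"), "winutils.exe");
    return exe.isFile() ? exe : null;
  }

  /** Assumed lookup order: system property first, then environment variable. */
  static String configuredHadoopHome() {
    String home = System.getProperty("hadoop.home.dir");
    return home != null ? home : System.getenv("HADOOP_HOME");
  }

  public static void main(String[] args) {
    File exe = locateWinutils(configuredHadoopHome());
    if (exe == null) {
      System.err.println("winutils.exe not found; group lookup on Windows would fail");
    } else {
      System.out.println("found " + exe.getAbsolutePath());
    }
  }
}
```

Running a check like this at client startup turns the later, unexplained NullPointerException into an explicit message about the missing binary.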

The start method of java.lang.ProcessBuilder contains the following check:

public Process start() throws IOException {
        // Must convert to array first -- a malicious user-supplied
        // list might try to circumvent the security check.
        String[] cmdarray = command.toArray(new String[command.size()]);
        cmdarray = cmdarray.clone();

        for (String arg : cmdarray)
            if (arg == null)
                throw new NullPointerException();
        // Throws IndexOutOfBoundsException if command is empty
        String prog = cmdarray[0];

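Because the first element of cmdarray is null, a NullPointerException is thrown immediately. None of the catch clauses in the open method shown earlier (LoginException, IOException, TException) handles it, so it propagates to the caller.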
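
The failure can be reproduced with plain JDK classes and no Hadoop on the classpath. This is a minimal sketch: MISSING_WINUTILS stands in for Shell.WINUTILS when winutils.exe was not found, and the class name, method name and the "someuser" argument are made up for illustration.

```java
import java.io.IOException;

/** Reproduces the failure mode with plain JDK classes; no Hadoop involved. */
final class NullCommandDemo {

  /** Stand-in for Shell.WINUTILS when winutils.exe was not found. */
  static final String MISSING_WINUTILS = null;

  /**
   * Starts a process whose first command element is null and reports
   * which exception type escapes.
   */
  static String startWithNullProgram() {
    String[] command = { MISSING_WINUTILS, "groups", "-F", "\"someuser\"" };
    try {
      new ProcessBuilder(command).start();
      return "started";
    } catch (IOException e) {
      // A checked exception like this one would be covered by the handlers in open.
      return "IOException";
    } catch (NullPointerException e) {
      // Thrown by the null check in ProcessBuilder.start before anything is launched.
      return "NullPointerException";
    }
  }

  public static void main(String[] args) {
    System.out.println(startWithNullProgram()); // prints "NullPointerException"
  }
}
```

The null check runs before any process is spawned, so the result is the same on every operating system.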

Null checks in toString()

Separately, in org.apache.hadoop.util.Shell:

ShellCommandExecutor

The toString method of this class does not check for null members:

    /**
     * Returns the commands of this instance.
     * Arguments with spaces in are presented with quotes round; other
     * arguments are presented raw
     *
     * @return a string representation of the object.
     */
    @Override
    public String toString() {
      StringBuilder builder = new StringBuilder();
      String[] args = getExecString();
      for (String s : args) {
        if (s.indexOf(' ') >= 0) {
          builder.append('"').append(s).append('"');
        } else {
          builder.append(s);
        }
        builder.append(' ');
      }
      return builder.toString();
    }

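That is, if any element of the command args is null, this toString also throws a NullPointerException, because it calls a method on the object (s.indexOf) without checking it first. I recall seeing this issue discussed in Effective Java. Normally it is not triggered, but a debugger calls toString on the objects in the current scope in order to display them. So whenever I stepped into the relevant code, a NullPointerException would suddenly appear out of nowhere, which took a while to figure out.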
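
A null-tolerant version of the same loop might look like the sketch below. It is not Hadoop's code: CommandFormatter and format are hypothetical names, and rendering a null element as the text "null" is an assumption made for this example.

```java
/** Hypothetical null-tolerant variant of the formatting loop; not Hadoop's code. */
final class CommandFormatter {

  /**
   * Joins the arguments with spaces, quoting any argument that contains a
   * space and rendering null elements as the text "null".
   */
  static String format(String[] args) {
    StringBuilder builder = new StringBuilder();
    if (args == null) {
      return builder.toString();
    }
    for (String s : args) {
      if (s == null) {
        // Guard first, so no method is ever called on a null element.
        builder.append("null");
      } else if (s.indexOf(' ') >= 0) {
        builder.append('"').append(s).append('"');
      } else {
        builder.append(s);
      }
      builder.append(' ');
    }
    return builder.toString();
  }

  public static void main(String[] args) {
    System.out.println(format(new String[] { null, "groups", "-F", "some user" }));
  }
}
```

With a guard like this, a debugger can display the object even when the command contains a null, and the visible "null" entry points directly at the missing binary.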

 

posted @ 2015-06-04 21:20  卖程序的小歪