Hive SQLException: Method not supported

Overview

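The project queries through the Impala/Hive engine, and ELK records roughly one or two errors a day: java.net.SocketTimeoutException: Read timed out. The likely cause is that some SQL statements are complex enough for the query to outlast the socket read timeout, so configuring a longer timeout looked like the fix.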

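I followed an earlier write-up on resolving SocketTimeoutException: Read timed out and assumed everything was settled. After the code went live, however, a different error appeared: SQLException: Method not supported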

Problem

The new error flooded the logs, with thousands of entries within half an hour: java.sql.SQLException: Method not supported.

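Two things had changed: the JDBC URL gained timeout settings, and the code now calls setQueryTimeout on the statement.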

The JDBC URL jdbc:hive2://impala.aaacorp.com:25005/edw gained timeout settings and became jdbc:hive2://impala.aaacorp.com:25005/edw?hive.metastore.client.socket.timeout=1800&hive.server.read.socket.timeout=1800&hive.server.write.socket.timeout=1800&hive.server.thrift.socket.timeout=1800&hive.client.thrift.socket.timeout=1800

statement = connection.createStatement();
// Newly added: set a query timeout on the statement
statement.setQueryTimeout(queryTimeout);
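
The URL change repeats the same value for five settings, so it can be assembled from a list of keys instead of being typed out by hand. A minimal sketch, assuming a hypothetical helper class HiveUrlSketch (not part of the project); only the keys and the base URL come from the text above:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class HiveUrlSketch {
    // The five timeout keys that were appended to the URL, in the same order.
    static final List<String> TIMEOUT_KEYS = Arrays.asList(
            "hive.metastore.client.socket.timeout",
            "hive.server.read.socket.timeout",
            "hive.server.write.socket.timeout",
            "hive.server.thrift.socket.timeout",
            "hive.client.thrift.socket.timeout");

    /** Appends every timeout key with the same value (in seconds) as a query string. */
    static String withTimeouts(String baseUrl, int seconds) {
        String query = TIMEOUT_KEYS.stream()
                .map(key -> key + "=" + seconds)
                .collect(Collectors.joining("&"));
        return baseUrl + "?" + query;
    }

    public static void main(String[] args) {
        System.out.println(withTimeouts("jdbc:hive2://impala.aaacorp.com:25005/edw", 1800));
    }
}
```

Keeping the keys in one list makes it easier to revert or adjust the whole group in a configuration center such as Apollo.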

Reverting the URL change, which lives in the Apollo configuration, did not help; the error persisted.

So the problem had to be in the code change. The driver source shows:

public class HiveStatement implements Statement {
	@Override
	public void setQueryTimeout(int seconds) throws SQLException {
		throw new SQLException("Method not supported");
	}
}

That is the root cause: this driver version does not implement setQueryTimeout and always throws.

The hive-jdbc version in use:

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>1.1.0-cdh5.7.1</version>
</dependency>

The Hadoop version is 2.6.0-cdh5.7.1.

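After checking with colleagues on the data warehouse team, I learned that the cluster runs CDH 6.2. I searched the Maven repository for a newer driver and upgraded to the following version, which resolved the error: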

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.1.1-cdh6.2.1</version>
</dependency>

The source of the new version confirms it:

public void setQueryTimeout(int seconds) throws SQLException {
    this.queryTimeout = seconds;
}
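
For code that may still run against the older driver, the call can also be guarded so that an unsupported setQueryTimeout does not fail every query. This is only a sketch under stated assumptions: QueryTimeoutGuard and fakeStatement are hypothetical names, and the fake statement is a java.lang.reflect.Proxy stand-in that imitates the old driver's behavior, not the real HiveStatement.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.sql.Statement;

final class QueryTimeoutGuard {
    /**
     * Tries to set the query timeout. Returns false when the driver reports the
     * method as unsupported; any other SQLException is rethrown unchanged.
     */
    static boolean trySetQueryTimeout(Statement statement, int seconds) throws SQLException {
        try {
            statement.setQueryTimeout(seconds);
            return true;
        } catch (SQLFeatureNotSupportedException e) {
            return false;
        } catch (SQLException e) {
            // hive-jdbc 1.1.0 throws a plain SQLException with this message.
            if ("Method not supported".equals(e.getMessage())) {
                return false;
            }
            throw e;
        }
    }

    /** Hypothetical stand-in for a driver Statement, used only to exercise the guard. */
    static Statement fakeStatement(boolean supportsTimeout) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (!supportsTimeout && method.getName().equals("setQueryTimeout")) {
                throw new SQLException("Method not supported");
            }
            return null; // sufficient for the void method called here
        };
        return (Statement) Proxy.newProxyInstance(
                QueryTimeoutGuard.class.getClassLoader(),
                new Class<?>[] {Statement.class},
                handler);
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(trySetQueryTimeout(fakeStatement(false), 1800));
        System.out.println(trySetQueryTimeout(fakeStatement(true), 1800));
    }
}
```

Matching on the message text is fragile, so upgrading the driver as described above remains the actual fix; the guard only keeps queries running while the two driver versions coexist.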

SQLException: Method not supported

A more detailed error log:

2022-05-05 18:19:18.069 [ERROR][http-nio-8081-exec-2]:c.alibaba.druid.pool.DruidPooledPreparedStatement [<init>:82] getMaxFieldSize error
java.sql.SQLException: Method not supported
    at org.apache.hive.jdbc.HiveStatement.getMaxFieldSize(HiveStatement.java:579)
    at com.alibaba.druid.pool.DruidPooledPreparedStatement.<init>(DruidPooledPreparedStatement.java:80)
    at com.alibaba.druid.pool.DruidPooledConnection.prepareStatement(DruidPooledConnection.java:425)

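The key detail in the stack trace is the getMaxFieldSize method, which Druid calls while constructing DruidPooledPreparedStatement and which the Hive driver does not support. The application uses the Druid connection pool with JDBC to obtain connections to several different data sources.
The fix is to check the data source type when obtaining a connection and to enable prepared statement pooling only for drivers other than the Hive/Impala one: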

if (!driver.equalsIgnoreCase(DbDriverTypeEnum.IMPALA.getDbDriverType())) {
	conf.put(PROP_POOLPREPAREDSTATEMENTS, "true");
	conf.put(PROP_MAXOPENPREPAREDSTATEMENTS, "20");
}

@Getter
@AllArgsConstructor
public enum DbDriverTypeEnum {
    // Driver metadata per data source; the IMPALA constant referenced above is not shown in this excerpt
    HIVE("hive", "org.apache.hive.jdbc.HiveDriver", "jdbc"),
    MYSQL("mysql", "com.mysql.cj.jdbc.Driver", "jdbc"),
    ;

    private final String name;
    private final String dbDriverType;
    private final String connectionType;
}
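
The driver check above can be sketched as a standalone method that builds the pool properties. The class PoolConfigSketch is hypothetical, and the two string keys are assumed to match the values behind Druid's PROP_POOLPREPAREDSTATEMENTS and PROP_MAXOPENPREPAREDSTATEMENTS constants; the sketch compares against the Hive driver class name directly because Impala is reached through the jdbc:hive2 URL here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PoolConfigSketch {
    // Assumed to mirror the Druid property keys used in the snippet above.
    static final String PROP_POOLPREPAREDSTATEMENTS = "poolPreparedStatements";
    static final String PROP_MAXOPENPREPAREDSTATEMENTS = "maxOpenPreparedStatements";
    static final String HIVE_DRIVER = "org.apache.hive.jdbc.HiveDriver";

    /** Returns the prepared statement pooling properties for the given driver class name. */
    static Map<String, String> poolProperties(String driver) {
        Map<String, String> conf = new LinkedHashMap<>();
        // Skip statement pooling for the Hive driver so Druid does not call
        // the unsupported getMaxFieldSize method on it.
        if (!HIVE_DRIVER.equalsIgnoreCase(driver)) {
            conf.put(PROP_POOLPREPAREDSTATEMENTS, "true");
            conf.put(PROP_MAXOPENPREPAREDSTATEMENTS, "20");
        }
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(poolProperties("com.mysql.cj.jdbc.Driver"));
        System.out.println(poolProperties(HIVE_DRIVER));
    }
}
```

Returning an empty map for the Hive driver leaves Druid's defaults in place, which is the same effect as not setting the two properties in the original snippet.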