Impala Parameter #1 is unset问题排查
概述
最近在全权负责的数据产品平台。简单来说,就是基于JDBC去执行SQL。支持各种不同的数据源如,Hive,Impala,MySQL,ClickHouse等。
问题
某次发布后,数据推送失败,报错Parameter #1 is unset
,具体的报错日志:
java.sql.SQLException: Parameter #1 is unset
at org.apache.hive.jdbc.HivePreparedStatement.updateSql(HivePreparedStatement.java:139)
at org.apache.hive.jdbc.HivePreparedStatement.executeQuery(HivePreparedStatement.java:111)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeQuery(DruidPooledPreparedStatement.java:227)
at com.xy.cloudiview.datapattern.JdbcProvider.read(JdbcProvider.java:199)
具体的功能是,从Impala查询数据,然后推送到MQ。
报错代码片段:
@Slf4j
@Data
public abstract class JdbcProvider implements DataCommon {
protected Long pageSize = 500000L;
private Long page = 0L;
private String sql;
protected ResultSet resultSet;
private boolean hasNext;
protected Connection connection;
protected Statement ps;
protected JSONObject datasourceJson;
private String writeSqlFormat = " INSERT INTO %s.%s(";
protected List<ColumnMetadata> columnMetadataList;
public JdbcProvider read(String sql) throws Exception {
this.connection = reConnection();
this.ps = null;
String tempSql = decoratorSql(sql);
if (!Strings.isNullOrEmpty(tempSql)) {
this.setSql(tempSql);
} else {
this.setSql(sql);
}
try {
PreparedStatement preparedStatement = this.connection.prepareStatement(this.getSql(), ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
preparedStatement.setFetchSize(100);
readBefore(preparedStatement);
// 具体报错代码行
this.resultSet = preparedStatement.executeQuery();
this.setPs(preparedStatement);
columnMetadataList = new ArrayList<>();
} catch (Exception e) {
if (this.ps != null) {
this.ps.close();
this.connection.close();
}
throw e;
}
return this;
}
@Override
public String decoratorSql(String sql) {
return "-- this sql requested by iview " +
"\n-- executeTime:" + System.currentTimeMillis() +
"\n-- step:" + ++step +
"\n" + sql;
}
@Override
public void readBefore(Statement ps) throws SQLException {
executeBefore(ps);
setPageSize(5000000L);
}
@Override
public void executeBefore(Statement ps) throws SQLException {
ps.executeUpdate("SET DECIMAL_V2=FALSE");
}
@Data
public static class ColumnMetadata {
private String label;
private String type;
}
}
报错代码为:this.resultSet = preparedStatement.executeQuery();
排查
Impala数据源基于Hive,也是使用hive-jdbc
:
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>2.1.1-cdh6.2.1</version>
</dependency>
HivePreparedStatement.java
源码:
public class HivePreparedStatement extends HiveStatement implements PreparedStatement {
public ResultSet executeQuery() throws SQLException {
return super.executeQuery(this.updateSql(this.sql, this.parameters));
}
private String updateSql(String sql, HashMap<Integer, String> parameters) throws SQLException {
List<String> parts = this.splitSqlStatement(sql);
StringBuilder newSql = new StringBuilder((String)parts.get(0));
for(int i = 1; i < parts.size(); ++i) {
if (!parameters.containsKey(i)) {
throw new SQLException("Parameter #" + i + " is unset");
}
newSql.append((String)parameters.get(i));
newSql.append((String)parts.get(i));
}
return newSql.toString();
}
private List<String> splitSqlStatement(String sql) {
List<String> parts = new ArrayList();
int apCount = 0;
int off = 0;
boolean skip = false;
for(int i = 0; i < sql.length(); ++i) {
char c = sql.charAt(i);
if (skip) {
skip = false;
} else {
switch(c) {
case '\'':
++apCount;
break;
case '?':
if ((apCount & 1) == 0) {
parts.add(sql.substring(off, i));
off = i + 1;
}
break;
case '\\':
skip = true;
}
}
}
parts.add(sql.substring(off, sql.length()));
return parts;
}
}
源码的重点在于HivePreparedStatement.splitSqlStatement
方法,对于以下3个特殊字段\'
、?
、\\
有特殊处理逻辑。
于是去查看用户提交的SQL,发现确实有?
符号,经过简化的SQL:
select '1' as a
union all
select '2' as a
/*
union all
-- 极速贷???
select '3' as a
*/
英文问号?
存在于注释语句中。
提交到大数据平台的SQL不支持注释语句?经过验证,无论是单行注释还是多行注释,通过hive-jdbc
提交都是没有问题的:
单行注释:
select '1' as a
-- union all
-- select '3' as a
多行注释:
select '1' as a
/*
union all
select '3' as a
*/
那与英文问号?
有关?验证下来,注释内容里面不能有英文问号?
。
这尼玛???
临时解决方案
删除注释内容里面的英文问号?
。
问题
<解决>上面的问题后,又爆出一个问题,报错信息也是Parameter #1 is unset
。
检查SQL,非注释内容中确实有?
,但是在测试环境不能重现问题:
经过反复仔细检查,发现生产环境,用户提交的SQL是双引号:
WTF??SQL的单引号和双引号还有区别???
平时写SQL时,经常写出这样的SQL:
select '1' as a
union all
select "2" as a
预览数据也没有问题:
临时解决方案
在不改Java code并发布应用的情况下的解决方法:将英文双引号换成英文单引号。
终极解决方案
不管是注释内容还是非注释内容,也不管是单引号还是双引号,SQL都应该支持?
。
TODO