Impala Parameter #1 is unset问题排查

概述

最近在全权负责的数据产品平台。简单来说,就是基于JDBC去执行SQL。支持各种不同的数据源如,Hive,Impala,MySQL,ClickHouse等。

问题

某次发布后,数据推送失败,报错Parameter #1 is unset,具体的报错日志:

java.sql.SQLException: Parameter #1 is unset
	at org.apache.hive.jdbc.HivePreparedStatement.updateSql(HivePreparedStatement.java:139)
	at org.apache.hive.jdbc.HivePreparedStatement.executeQuery(HivePreparedStatement.java:111)
	at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeQuery(DruidPooledPreparedStatement.java:227)
	at com.xy.cloudiview.datapattern.JdbcProvider.read(JdbcProvider.java:199)

具体的功能是,从Impala查询数据,然后推送到MQ。

报错代码片段:

@Slf4j
@Data
public abstract class JdbcProvider implements DataCommon {
    protected Long pageSize = 500000L;
    private Long page = 0L;
    private String sql;
    protected ResultSet resultSet;
    private boolean hasNext;
    protected Connection connection;
    protected Statement ps;
    protected JSONObject datasourceJson;
    private String writeSqlFormat = " INSERT INTO %s.%s(";
    protected List<ColumnMetadata> columnMetadataList;

	public JdbcProvider read(String sql) throws Exception {
	    this.connection = reConnection();
	    this.ps = null;
	    String tempSql = decoratorSql(sql);
	    if (!Strings.isNullOrEmpty(tempSql)) {
	        this.setSql(tempSql);
	    } else {
	        this.setSql(sql);
	    }
	    try {
	        PreparedStatement preparedStatement = this.connection.prepareStatement(this.getSql(), ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
	        preparedStatement.setFetchSize(100);
	        readBefore(preparedStatement);
	        // 具体报错代码行
	        this.resultSet = preparedStatement.executeQuery();
	        this.setPs(preparedStatement);
	        columnMetadataList = new ArrayList<>();
	    } catch (Exception e) {
	        if (this.ps != null) {
	            this.ps.close();
	            this.connection.close();
	        }
	        throw e;
	    }
	    return this;
	}
	
	@Override
	public String decoratorSql(String sql) {
	    return "-- this sql requested by iview " +
	            "\n-- executeTime:" + System.currentTimeMillis() +
	            "\n-- step:" + ++step +
	            "\n" + sql;
	}
	
	@Override
	public void readBefore(Statement ps) throws SQLException {
	    executeBefore(ps);
	    setPageSize(5000000L);
	}
	
	@Override
	public void executeBefore(Statement ps) throws SQLException {
	    ps.executeUpdate("SET DECIMAL_V2=FALSE");
	}

	@Data
	public static class ColumnMetadata {
	    private String label;
	    private String type;
	}
}

报错代码为:this.resultSet = preparedStatement.executeQuery();

排查

Impala数据源基于Hive,也是使用hive-jdbc

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.1.1-cdh6.2.1</version>
</dependency>

HivePreparedStatement.java源码:

public class HivePreparedStatement extends HiveStatement implements PreparedStatement {

    public ResultSet executeQuery() throws SQLException {
        return super.executeQuery(this.updateSql(this.sql, this.parameters));
    }

    private String updateSql(String sql, HashMap<Integer, String> parameters) throws SQLException {
        List<String> parts = this.splitSqlStatement(sql);
        StringBuilder newSql = new StringBuilder((String)parts.get(0));

        for(int i = 1; i < parts.size(); ++i) {
            if (!parameters.containsKey(i)) {
                throw new SQLException("Parameter #" + i + " is unset");
            }

            newSql.append((String)parameters.get(i));
            newSql.append((String)parts.get(i));
        }

        return newSql.toString();
    }

    private List<String> splitSqlStatement(String sql) {
        List<String> parts = new ArrayList();
        int apCount = 0;
        int off = 0;
        boolean skip = false;

        for(int i = 0; i < sql.length(); ++i) {
            char c = sql.charAt(i);
            if (skip) {
                skip = false;
            } else {
                switch(c) {
                case '\'':
                    ++apCount;
                    break;
                case '?':
                    if ((apCount & 1) == 0) {
                        parts.add(sql.substring(off, i));
                        off = i + 1;
                    }
                    break;
                case '\\':
                    skip = true;
                }
            }
        }
        parts.add(sql.substring(off, sql.length()));
        return parts;
    }

}

源码的重点在于HivePreparedStatement.splitSqlStatement方法,对于以下3个特殊字段\'?\\有特殊处理逻辑。

于是去查看用户提交的SQL,发现确实有?符号,经过简化的SQL:

select '1' as a
union all
select '2' as a
/*
union all
-- 极速贷???
select '3' as a
*/

英文问号?存在于注释语句中。

提交到大数据平台的SQL不支持注释语句?经过验证,无论是单行注释还是多行注释,通过hive-jdbc提交都是没有问题的:
单行注释:

select '1' as a
-- union all
-- select '3' as a

多行注释:

select '1' as a
/*
union all
select '3' as a
*/

在这里插入图片描述
那与英文问号?有关?验证下来,注释内容里面不能有英文问号?
在这里插入图片描述
这尼玛???

临时解决方案

删除注释内容里面的英文问号?

问题

<解决>上面的问题后,又爆出一个问题,报错信息也是Parameter #1 is unset

检查SQL,非注释内容中确实有?,但是在测试环境不能重现问题:
在这里插入图片描述
经过反复仔细检查,发现生产环境,用户提交的SQL是双引号:
在这里插入图片描述
WTF??SQL的单引号和双引号还有区别???
平时写SQL时,经常写出这样的SQL:

select '1' as a
union all
select "2" as a

预览数据也没有问题:
在这里插入图片描述

临时解决方案

在不改Java code并发布应用的情况下的解决方法:将英文双引号换成英文单引号。

终极解决方案

不管是注释内容还是非注释内容,也不管是单引号还是双引号,SQL都应该支持?

TODO

参考

posted @ 2022-08-13 13:56  johnny233  阅读(56)  评论(0编辑  收藏  举报  来源