Notes on a DATE type problem with the Dremio Hive JDBC ARP connector

A brief record of a problem I ran into.

Analysis

  • Use the arthas stack command to view the call path
    The behaviour is similar for Hive; I ran the test against MySQL
stack com.mysql.cj.jdbc.result.ResultSetImpl  getDate

Output

Affect(class count: 2 , method count: 4) cost in 329 ms, listenerId: 11
ts=2023-12-26 06:18:17;thread_name=e3 - 1a758fd6-4c6d-9baa-6d8f-31fa8220ee00:frag:0:0;id=c4;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
    @com.mysql.cj.jdbc.result.ResultSetImpl.getDate()
        at org.apache.commons.dbcp2.DelegatingResultSet.getDate(DelegatingResultSet.java:682)
        at org.apache.commons.dbcp2.DelegatingResultSet.getDate(DelegatingResultSet.java:682)
        at com.dremio.exec.store.jdbc.JdbcRecordReader$DateCopier.copy(JdbcRecordReader.java:688)
        at com.dremio.exec.store.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:291)
        at com.dremio.exec.store.CoercionReader.next(CoercionReader.java:187)
        at com.dremio.sabot.op.scan.ScanOperator.outputData(ScanOperator.java:365)
        at com.dremio.sabot.driver.SmartOp$SmartProducer.outputData(SmartOp.java:551)
        at com.dremio.sabot.driver.StraightPipe.pump(StraightPipe.java:56)
        at com.dremio.sabot.driver.Pipeline.doPump(Pipeline.java:124)
        at com.dremio.sabot.driver.Pipeline.pumpOnce(Pipeline.java:114)
        at com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run(FragmentExecutor.java:565)
        at com.dremio.sabot.exec.fragment.FragmentExecutor.run(FragmentExecutor.java:480)
        at com.dremio.sabot.exec.fragment.FragmentExecutor.access$1700(FragmentExecutor.java:109)
        at com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run(FragmentExecutor.java:1016)
        at com.dremio.sabot.task.AsyncTaskWrapper.run(AsyncTaskWrapper.java:122)
        at com.dremio.sabot.task.slicing.SlicingThread.mainExecutionLoop(SlicingThread.java:249)
        at com.dremio.sabot.task.slicing.SlicingThread.run(SlicingThread.java:171)
 
ts=2023-12-26 06:57:01;thread_name=e6 - 1a7586c2-583b-6a25-56f0-01038d824a00:frag:0:0;id=c7;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
    @com.mysql.cj.jdbc.result.ResultSetImpl.getDate()
        at org.apache.commons.dbcp2.DelegatingResultSet.getDate(DelegatingResultSet.java:682)
        at org.apache.commons.dbcp2.DelegatingResultSet.getDate(DelegatingResultSet.java:682)
        at com.dremio.exec.store.jdbc.JdbcRecordReader$DateCopier.copy(JdbcRecordReader.java:688)
        at com.dremio.exec.store.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:291)
        at com.dremio.exec.store.CoercionReader.next(CoercionReader.java:187)
        at com.dremio.sabot.op.scan.ScanOperator.outputData(ScanOperator.java:365)
        at com.dremio.sabot.driver.SmartOp$SmartProducer.outputData(SmartOp.java:551)
        at com.dremio.sabot.driver.StraightPipe.pump(StraightPipe.java:56)
        at com.dremio.sabot.driver.Pipeline.doPump(Pipeline.java:124)
        at com.dremio.sabot.driver.Pipeline.pumpOnce(Pipeline.java:114)
        at com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run(FragmentExecutor.java:565)
        at com.dremio.sabot.exec.fragment.FragmentExecutor.run(FragmentExecutor.java:480)
        at com.dremio.sabot.exec.fragment.FragmentExecutor.access$1700(FragmentExecutor.java:109)
        at com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run(FragmentExecutor.java:1016)
        at com.dremio.sabot.task.AsyncTaskWrapper.run(AsyncTaskWrapper.java:122)
        at com.dremio.sabot.task.slicing.SlicingThread.mainExecutionLoop(SlicingThread.java:249)
        at com.dremio.sabot.task.slicing.SlicingThread.run(SlicingThread.java:171)
  • Decompile the class
jad com.dremio.exec.store.jdbc.JdbcRecordReader$DateCopier

Output

ClassLoader:                                                                                                                    
+-sun.misc.Launcher$AppClassLoader@18b4aac2                                                                                     
  +-sun.misc.Launcher$ExtClassLoader@5614c340                                                                                   
 
Location:                                                                                                                       
/opt/dremio/jars/dremio-ce-jdbc-plugin-24.3.0-202312190021150029-52db2faf.jar                                                   
 
        /*
         * Decompiled with CFR.
         */
        package com.dremio.exec.store.jdbc;
 
        import com.dremio.exec.store.jdbc.JdbcRecordReader;
        import java.sql.Date;
        import java.sql.ResultSet;
        import java.sql.SQLException;
        import java.util.Calendar;
        import java.util.TimeZone;
        import org.apache.arrow.vector.DateMilliVector;
 
        private static class JdbcRecordReader.DateCopier
        extends JdbcRecordReader.Copier<DateMilliVector> {
            private final Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
 
            JdbcRecordReader.DateCopier(int columnIndex, ResultSet result, DateMilliVector vector) {
/*683*/         super(columnIndex, result, vector);
            }
 
            @Override
            void copy(int index) throws SQLException {
                // the UTC calendar is passed to the driver here
                Date date = this.getResult().getDate(this.getColumnIndex(), this.calendar);
/*689*/         if (date != null) {
/*690*/             ((DateMilliVector)this.getValueVector()).setSafe(index, date.getTime());
                }
            }
        }
  • How Hive handles getDate with a Calendar
    The HiveBaseResultSet class
    Code
public Date getDate(int columnIndex, Calendar cal) throws SQLException {
    logger.trace("{}, {}", this.traceInfo(), columnIndex);
    throw new SQLException("Method not supported");
}

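The cause is now clear: Dremio's DateCopier calls getDate(int, Calendar), but the Hive JDBC driver does not implement that overload and throws SQLException("Method not supported"), so reading DATE columns from Hive is bound to fail.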
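
To make the failure mode concrete, here is a minimal sketch that reproduces it without Dremio or Hive. It builds a fake ResultSet with java.lang.reflect.Proxy whose getDate overload with a Calendar throws the same "Method not supported" exception, then reads it the way DateCopier.copy does. The class name GetDateFailureSketch and both helper methods are hypothetical and exist only for this illustration; they are not part of Dremio or the Hive driver.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Date;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Calendar;
import java.util.TimeZone;

// Hypothetical demo class; it only imitates the behaviour described above.
class GetDateFailureSketch {

    /** Fake ResultSet: getDate(col) works, getDate(col, Calendar) is unsupported. */
    static ResultSet unsupportedCalendarResultSet(Date value) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("getDate")) {
                if (args.length == 2 && args[1] instanceof Calendar) {
                    // Same message as the driver code shown above
                    throw new SQLException("Method not supported");
                }
                return value;
            }
            throw new UnsupportedOperationException(method.getName());
        };
        return (ResultSet) Proxy.newProxyInstance(
                GetDateFailureSketch.class.getClassLoader(),
                new Class<?>[] {ResultSet.class},
                handler);
    }

    /** Reads a date column the way DateCopier.copy does: with a UTC calendar. */
    static Long readDateMillis(ResultSet rs, int columnIndex) throws SQLException {
        Calendar utc = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        Date date = rs.getDate(columnIndex, utc);
        return date == null ? null : date.getTime();
    }

    public static void main(String[] args) {
        ResultSet rs = unsupportedCalendarResultSet(Date.valueOf("2023-12-26"));
        try {
            readDateMillis(rs, 1);
        } catch (SQLException e) {
            System.out.println("read failed: " + e.getMessage());
        }
    }
}
```

The overload without a Calendar still returns the value in this sketch, which is why falling back to it is a workable fix.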

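Solutions

  • Modify Dremio
    Not recommended: the change is too invasive and may have side effects
  • Patch the Hive JDBC driver directly
    For the getDate overload that takes a Calendar, reuse the implementation of public Date getDate(int columnIndex) and ignore the Calendar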
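
The driver-side fix can be sketched as follows. This is a trimmed-down, hypothetical stand-in for a result set class (PatchedResultSetSketch is not the real HiveBaseResultSet, and its single-row storage is an assumption made to keep the example self-contained). The point is only the second overload: instead of throwing, it ignores the Calendar and delegates to getDate(int columnIndex).

```java
import java.sql.Date;
import java.sql.SQLException;
import java.util.Calendar;
import java.util.TimeZone;

// Hypothetical stand-in for the patched driver class; holds one row of values.
class PatchedResultSetSketch {
    private final Object[] row;

    PatchedResultSetSketch(Object... row) {
        this.row = row;
    }

    /** Overload without a Calendar, which the driver already supports. */
    public Date getDate(int columnIndex) throws SQLException {
        if (columnIndex < 1 || columnIndex > row.length) {
            throw new SQLException("Invalid columnIndex: " + columnIndex);
        }
        Object value = row[columnIndex - 1];
        if (value == null) {
            return null;
        }
        if (value instanceof Date) {
            return (Date) value;
        }
        if (value instanceof String) {
            try {
                return Date.valueOf((String) value);
            } catch (IllegalArgumentException e) {
                throw new SQLException("Cannot convert column " + columnIndex + " to date", e);
            }
        }
        throw new SQLException("Illegal conversion for column " + columnIndex);
    }

    /** Patched overload: ignore the Calendar and delegate instead of throwing. */
    public Date getDate(int columnIndex, Calendar cal) throws SQLException {
        return getDate(columnIndex);
    }

    public static void main(String[] args) throws SQLException {
        PatchedResultSetSketch rs = new PatchedResultSetSketch("2023-12-26", null);
        Calendar utc = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        System.out.println(rs.getDate(1, utc));
        System.out.println(rs.getDate(2, utc));
    }
}
```

Because the Calendar is ignored, the returned Date is interpreted in the JVM default time zone rather than in UTC. Whether that shifts the stored value depends on how the time zone of the Dremio JVM is configured, so it is worth checking a few known dates after applying the patch.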

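Notes

For the build, if you do not want to compile the whole driver yourself, you can decompile it and replace the class file (not recommended when the source is available; it is useful when the source is missing).
For an actual patched example, see the GitHub repository listed below.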

References

https://github.com/apache/hive/blob/master/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java#L360
https://github.com/rongfengliang/inceptor-sdk-transwarp-fix

posted on 2023-12-26 15:26  荣锋亮