Migrating Hive SQL to Spark 2.4: cannot resolve '(`ctime` >= `start_time`)' due to data type mismatch: differing types in '(`ctime` >= `start_time`)' (timestamp and bigint).; line 99 pos 10
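Hive SQL performs some type conversions implicitly by default, but Spark requires explicit casts.
Take the following fragment as an example: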
SELECT 1 as id, avid as avid, mid as mid, TRIM(LOWER(tag)) as tag, ctime as ctime
FROM archive.dws_archive_daily
WHERE log_date = '20191203'
) av3
INNER JOIN ad_tag on av3.id = ad_tag.id
WHERE av3.ctime >= ad_tag.start_time AND av3.ctime <= ad_tag.end_time
The error is as follows:
Error in query: cannot resolve '(av2.`ctime` >= ad_sub_tid.`start_time`)' due to data type mismatch: differing types in '(av2.`ctime` >= ad_sub_tid.`start_time`)' (timestamp and bigint).; line 99 pos 10;
In fact, ctime is of type timestamp, while start_time is of type bigint (long).
So we add an explicit cast to the statement.
Type conversion:
cast(av2.ctime as bigint) >= ad_sub_tid.start_time
or
av2.ctime >= cast(ad_sub_tid.start_time as timestamp)
With either form the statement runs successfully.
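The two casts above can be modelled outside Spark to see what is actually being compared. This is a minimal sketch in plain Python, not Spark code. It assumes Spark's cast semantics, where casting a timestamp to bigint yields whole seconds since the Unix epoch and casting a bigint to timestamp reads the number as seconds. The helper names and sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_to_bigint(ts: datetime) -> int:
    """Model of cast(ts as bigint): whole seconds since the epoch, rounded down."""
    return (ts - EPOCH) // timedelta(seconds=1)


def bigint_to_timestamp(seconds: int) -> datetime:
    """Model of cast(n as timestamp): n is read as seconds since the epoch."""
    return EPOCH + timedelta(seconds=seconds)


# Hypothetical sample values standing in for ctime and start_time.
ctime = datetime(2019, 12, 3, 12, 0, 0, tzinfo=timezone.utc)
start_time = timestamp_to_bigint(datetime(2019, 12, 3, tzinfo=timezone.utc))

# Form 1: cast(ctime as bigint) >= start_time
form1 = timestamp_to_bigint(ctime) >= start_time
# Form 2: ctime >= cast(start_time as timestamp)
form2 = ctime >= bigint_to_timestamp(start_time)
print(form1, form2)
```

Under this model both forms only make sense if start_time and end_time store seconds; if they held milliseconds, the casts would still run but compare the wrong magnitudes. The two forms can also disagree at a boundary when ctime carries fractional seconds, because the first form drops the fraction before comparing and the second does not.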
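During the migration we did not want to modify the SQL statements, so we modified the Spark source code instead.
The code that raises the error is in checkAnalysis.class.
From there, follow the code into checkInputDataTypes.
Add one line to the check that treats timestamp -> bigint as a convertible type pair and returns true.
Resubmit the Spark SQL job and it now runs.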
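The logic of that one-line change can be illustrated with a small model. The actual patch is not shown here, so the following is only a sketch in plain Python under stated assumptions: the function name mirrors checkInputDataTypes, but the signature, the `relaxed` flag, and the `RELAXED_PAIRS` set are hypothetical and are not Spark's API.

```python
from typing import Optional, Set, Tuple

# Hypothetical whitelist of (left, right) type pairs that the patched check
# lets through; only the pair named in the text is listed.
RELAXED_PAIRS: Set[Tuple[str, str]] = {("timestamp", "bigint")}


def check_input_data_types(left: str, right: str, relaxed: bool = False) -> Optional[str]:
    """Return None when the check passes, otherwise an error message."""
    if left == right:
        # Same type on both sides: the comparison is accepted.
        return None
    if relaxed and (left, right) in RELAXED_PAIRS:
        # The added line: accept this specific mismatched pair.
        return None
    return f"differing types ({left} and {right})."


print(check_input_data_types("timestamp", "bigint"))
print(check_input_data_types("timestamp", "bigint", relaxed=True))
```

One thing to keep in mind with this approach: in Spark, checkInputDataTypes only validates an expression and does not insert a cast. It is therefore worth checking that the patched job returns the same rows as the explicit-cast version of the query, since the comparison may no longer operate on values in the same unit.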