数仓中典型的几种不下推语句整改案例
1.数仓实践丨主动预防-DWS关键工具安装确认2.一条SQL如何被MySQL架构中的各个组件操作执行的?3.GaussDB(DWS)网络流控与管控效果4.GaussDB(DWS)字符串处理函数返回错误结果集排查5.从缓存的本质说起,说服技术大佬用Redis6.这年头怕数据泄露?全密态数据库:无所谓,我会出手7.华为云新一代分布式数据库GaussDB,给世界一个更优选择8.GaussDB技术解读丨高级压缩9.掌数科技携手华为云GaussDB,助力金融科技创新,联合打造行业标杆10.一文带你全面了解openGemini11.GaussDB(for Redis)多租户:读写权限控制和数据库隔离的完美融合12.5分钟迁移关系型数据库到图数据库13.数仓现网案例丨超大结果集接收异常14.DWS轻量化更新黑科技:宽表加工优化15.数据库行业需要什么样的人才?高校老师这样说16.数仓性能优化:倾斜优化-表达式计算倾斜的hint优化17.GaussDB技术解读系列之SQL Audit,面向应用开发的SQL审核工具18.带你认识数仓的监控系统TopSQL19.带你走进数仓大集群内幕丨详解关于作业hang及残留问题定位20.实时入库不用愁,HStore帮分忧21.openGauss数据库在CentOS上的安装实践22.揭秘华为云GaussDB(for Redis)丨大key治理23.GaussDB(DWS)函数不同写法引发的结果差异
24.数仓中典型的几种不下推语句整改案例
25.GaussDB技术解读系列之应用无损透明(ALT)26.华为云GaussDB(for Influx)单机版上线,企业降本增效利器来了27.数仓备份经验分享丨详解roach备份原理及问题处理套路28.中国云数据仓库,双第一!29.华为云GaussDB打造最可信的数据库,给世界一个更优选择30.GaussDB技术解读系列:高级压缩之OLTP表压缩31.十年磨一剑的华为云GES,高明在哪32.使用DWS集群,用户被锁定如何解锁33.GaussDB技术解读系列:高安全之密态等值34.GaussDB技术解读:应用无损透明(ALT)35.数仓资源管控理论已掌握,是时候实战了36.row_number函数的不稳定性37.GaussDB技术解读丨数据库迁移创新实践38.聊聊GaussDB AP是如何执行SQL的39.Navicat 携手华为云GaussDB,联合打造便捷高效的数据库开发和建模工具方案40.GaussDB技术解读系列丨运维自动驾驶探索41.一次性全讲透GaussDB(DWS)锁的问题42.GaussDB(DWS)性能调优:Sort+Groupagg聚集引起的性能瓶颈案例43.多主架构:VLDB技术论文《Taurus MM: bringing multi-master to the cloud》解读44.GaussDB(for Redis)游戏实践:玩家下线行为上报45.一文详解数据仓库的物理细粒度备份恢复46.华为云HBase冷热分离最佳实践47.四问复合索引,让你的数据查询速度飞起48.GaussDB(DWS)案例丨MERGE场景下语句不下推引起的性能瓶颈问题49.如何强制SQL走性能更优的hash join50.如何使用GaussDB(DWS)的本地临时表进行数据处理51.华为云GaussDB亮相金融业数据库技术大会52.2个数仓中不等值关联优化案例53.数仓实时场景下表行数估算不准确引起的的性能瓶颈问题案例54.详解GuassDB数据库权限命令:GRANT和REVOKE55.DWS临时内存不可用报错: memory temporarily unavailable56.华为云GaussDB城市沙龙活动走进安徽,助力金融行业数字化转型57.理论+应用,带你了解数据库资源池58.人人用数不用愁,动态数据脱敏为您解忧59.实例讲解数据库的数据去重60.数仓实践丨表扫描时过滤行数过多引起的性能瓶颈问题61.实例详解构建数仓中的行列转换62.Proxy下的Prepare透传,让GaussDB(for MySQL)更稳固,性能更卓越63.浅析KV存储之长尾时延解决办法64.实例讲解数据库的定义重载函数65.详解数据库SQL中的三个语句:DROP、TRUNCATE 、DELETE66.华为云GaussDB助力工商银行、华夏银行斩获“十佳卓越实践奖”67.Navicat 基于 GaussDB 主备版的快速入门68.数仓实时算子难以观测,快来试试算子级监控吧69.列举数据库缓存使用场景实例和命令速查表70.带你认识多模数据库GeminiDB架构与应用实践71.3招解决时序数据高基数难题,性能多维度提升!72.数仓性能调优:row_number() over(p)-rn=1性能瓶颈发现和改写套路73.数仓实践丨常量标量子查询做全连接导致整体慢74.细说GaussDB(DWS)的2种查询优化技术75.细说SQL与ETL之间的小秘密76.从概念到实践,带你掌握层次递归查询77.GeminiDB Cassandra接口新特性PITR发布:支持任意时间点恢复78.你的JoinHint为什么不生效79.六步走向无忧,华为云数据库高可用的秘密武器80.数仓调优实践丨SQL改写消除相关子查询81.GaussDB(for MySQL)新特性TDE发布:支持透明数据加密82.详解GaussDB(DWS)通信安全的小妙招:连接认证机制83.GaussDB(for MySQL) RegionlessDB发布:全球数据库技术84.5分钟带您了解DRS录制回放85.ICDM'23 BICE论文解读:基于双向LSTM和集成学习的模型框架86.数仓如何递归查询视图依赖87.支撑核心系统分布式改造,GaussDB为江南农商银行筑稳根基88.近6成金融机构的选择!华为云GaussDB加快金融核心系统转型89.GaussDB(for MySQL)剪枝功能,让查询性能提升70倍!90.2023年度十佳课题公布:华为云GaussDB获权威认可91.详解如何在数仓中搭建细粒度容灾应用92.对话苏光牛:国内数据库市场已进入关键转折点,2024年或是分水岭93.GaussDB通信运维:详解stream连接池设计原理94.GaussDB(for MySQL) Serverless全面商用:无感弹性,极致性价比95.华为云GaussDB支撑农行超级网银业务,性能和稳定性备受认可96.实例详解数据库的游标管理97.数仓实践丨从CU入手优化HStore表98.数仓的等待视图中,为什么会有Hashjoin-nestloop99.如何基于Sharding-JDBC实现GaussDB在客户端应用的读写分离100.如何迅速并识别处理MDL锁阻塞问题本文分享自华为云社区《GaussDB(DWS)性能调优:典型不下推语句整改案例》,作者: 譡里个檔 。
场景1:With-Recursive contains only values rte is not shippable
根因:递归语句的某个分支中没有FROM字句(只有 VALUES 或者类似 SELECT 1 这样的语句)
案例1:递归驱动分支没有FROM字句
原始语句
SELECT T.RPT_ITEM_ID, --报表项ID T.RPT_ITEM_CODE, T.USER_GROUP_CODE AS USER_GROUP_CODE --用户组 FROM BIF.BIF_RPT_ITEM_DEF_T T, (WITH recursive cte AS ( SELECT DISTINCT TRIM(SUBSTR('' :: text, INSTR('', ',', 1, 1) + 1, INSTR('', ',', 1, 2) - INSTR('', ',', 1, 1) - 1)) AS cte_RPT_ITEM_CODE, 1 AS level FROM (SELECT '') AS tb0 UNION ALL SELECT DISTINCT TRIM(SUBSTR('' :: text, INSTR('', ',', 1, cte.level + 1) + 1, INSTR('', ',', 1, cte.level + 2) - INSTR('', ',', 1, cte.level + 1) - 1)), cte.level + 1 FROM (SELECT '') AS tb0, cte WHERE cte.level + 1 <= LENGTH('') - LENGTH(REPLACE('', ',', '')) - 1 ) SELECT DISTINCT cte_RPT_ITEM_CODE AS RPT_ITEM_CODE FROM cte ) T5 WHERE NVL(INSTR(T.RPT_ITEM_FREQUENCE, 'M'), 0) > 0 AND T.RPT_ITEM_CODE = NVL(T5.RPT_ITEM_CODE, T.RPT_ITEM_CODE) AND T.RPT_ITEM_TYPE = 1 --是否是叶子报表项,1=是,0=否,基本报表项 AND T.ENABLE_FLAG = 1 AND T.VERSION = '202308' --使用快照,增加条件限制 ORDER BY T.RPT_ITEM_ID
改写语句
SELECT T.RPT_ITEM_ID, --报表项ID T.RPT_ITEM_CODE, T.USER_GROUP_CODE AS USER_GROUP_CODE --用户组 FROM BIF.BIF_RPT_ITEM_DEF_T T, (WITH recursive cte AS ( SELECT DISTINCT TRIM(SUBSTR('' :: text, INSTR('', ',', 1, 1) + 1, INSTR('', ',', 1, 2) - INSTR('', ',', 1, 1) - 1)) AS cte_RPT_ITEM_CODE, 1 AS level FROM (SELECT '') AS tb0 UNION ALL SELECT DISTINCT TRIM(SUBSTR('' :: text, INSTR('', ',', 1, cte.level + 1) + 1, INSTR('', ',', 1, cte.level + 2) - INSTR('', ',', 1, cte.level + 1) - 1)), cte.level + 1 FROM (SELECT '') AS tb0, cte WHERE cte.level + 1 <= LENGTH('') - LENGTH(REPLACE('', ',', '')) - 1 ) SELECT DISTINCT cte_RPT_ITEM_CODE AS RPT_ITEM_CODE FROM cte ) T5 WHERE NVL(INSTR(T.RPT_ITEM_FREQUENCE, 'M'), 0) > 0 AND T.RPT_ITEM_CODE = NVL(T5.RPT_ITEM_CODE, T.RPT_ITEM_CODE) AND T.RPT_ITEM_TYPE = 1 --是否是叶子报表项,1=是,0=否,基本报表项 AND T.ENABLE_FLAG = 1 AND T.VERSION = '202308' --使用快照,增加条件限制 ORDER BY T.RPT_ITEM_ID
修改点比对
案例2:递归驱动分支没有FROM字句
原始语句
SELECT A.DYNM_COMP_ID, DECODE(B.LINE_NO, 1, '202308', A.VERSION) FROM BIF.BIF_DYNM_COMP_SOU_TBL_V A, (WITH recursive cte AS ( SELECT 1 AS level UNION ALL SELECT cte.level + 1 FROM cte WHERE cte.level + 1 < 3 ) SELECT level as LINE_NO FROM cte ) B WHERE EXISTS (SELECT 1 FROM BIF.BIF_RPT_ITEM_DEF_MV RPT, BIF.BIF_PUB_SUBJECT_AREA_T SBJ, BIF.BIF_SNAPSHORT_SUBJECT_V TYP WHERE A.DYNM_COMP_ID = RPT.DYNM_COMP_ID AND RPT.VERSION = 'current' AND RPT.SUBJECT_AREA_ID = SBJ.SUBJECT_AREA_ID AND SBJ.SUBJECT_AREA_CODE =TYP.SUBJECT_CODE AND TYP.SUBJECT_TYPE ='TAX') AND A.VERSION = 'current'
改写语句
SELECT A.DYNM_COMP_ID, DECODE(B.LINE_NO, 1, '202308', A.VERSION) FROM BIF.BIF_DYNM_COMP_SOU_TBL_V A, (SELECT * FROM generate_series(1, 2) AS cte(LINE_NO) ) B WHERE EXISTS (SELECT 1 FROM BIF.BIF_RPT_ITEM_DEF_MV RPT, BIF.BIF_PUB_SUBJECT_AREA_T SBJ, BIF.BIF_SNAPSHORT_SUBJECT_V TYP WHERE A.DYNM_COMP_ID = RPT.DYNM_COMP_ID AND RPT.VERSION = 'current' AND RPT.SUBJECT_AREA_ID = SBJ.SUBJECT_AREA_ID AND SBJ.SUBJECT_AREA_CODE =TYP.SUBJECT_CODE AND TYP.SUBJECT_TYPE ='TAX') AND A.VERSION = 'current'
修改点比对
案例3:递归驱动分支是VALUES字句
原始语句
WITH RECURSIVE t(n) AS ( VALUES (1) UNION ALL SELECT n+1 FROM t WHERE n < (SELECT MAX(LENGTH(COMP_CODE)-LENGTH(REPLACE(COMP_CODE,',','')))+1 MAX_TOKENS FROM (SELECT DEPT_CODE, to_char(APPLICABLE_GEO_PC_CODE) COMP_CODE FROM SDIHR.MDM_CDM_DEPT_ACT_INFO_T_3600) ) ) SELECT n AS LVL FROM t
改写语句
WITH RECURSIVE t(n) AS ( SELECT * FROM generate_series(1, 1) UNION ALL SELECT n+1 FROM t WHERE n < (SELECT MAX(LENGTH(COMP_CODE)-LENGTH(REPLACE(COMP_CODE,',','')))+1 MAX_TOKENS FROM (SELECT DEPT_CODE, to_char(APPLICABLE_GEO_PC_CODE) COMP_CODE FROM SDIHR.MDM_CDM_DEPT_ACT_INFO_T_3600) ) ) SELECT n AS LVL FROM t
修改点比对
案例4:递归驱动分支是VALUES字句
原始语句
WITH RECURSIVE t(n) AS ( VALUES (1) UNION ALL SELECT n+1 FROM t WHERE n < (SELECT MAX(LENGTH(COMP_CODE)-LENGTH(REPLACE(COMP_CODE,',','')))+1 MAX_TOKENS FROM (SELECT DEPT_CODE, to_char(APPLICABLE_GEO_PC_CODE) COMP_CODE FROM SDIHR.MDM_CDM_DEPT_ACT_INFO_T_3600)) ) SELECT n AS LVL FROM t
改写语句
SELECT * FROM generate_series(1, (SELECT MAX(LENGTH(COMP_CODE)-LENGTH(REPLACE(COMP_CODE,',','')))+1 MAX_TOKENS FROM (SELECT DEPT_CODE, to_char(APPLICABLE_GEO_PC_CODE) COMP_CODE FROM SDIHR.MDM_CDM_DEPT_ACT_INFO_T_3600)) ) AS t(lvl)
修改点比对
场景2:With-Recursive contains system table is not shippable
根因:递归语句的某个分支中没有FROM字句中只用系统表或者系统视图(DUAL也被视为系统视图)
案例1:递归驱动分支是FROM DUAL查询
原始语句
WITH recursive cte AS ( SELECT TO_DATE(201701, 'YYYYMM') as level ,TO_DATE(20170131, 'YYYYMMDD') LASTDAY FROM dual UNION ALL SELECT ADD_MONTHS(cte.LEVEL, 1) AS PERIOD, LAST_DAY(ADD_MONTHS(cte.LEVEL, 1)) AS LASTDAY FROM cte WHERE cte.LEVEL <=SYSDATE ) SELECT TO_CHAR(cte.level,'YYYYMMDD') AS PERIOD , cte.LASTDAY FROM cte WHERE TO_CHAR(cte.level,'YYYYMMDD')<= TO_CHAR(SYSDATE,'YYYYMMDD')
改写语句
WITH recursive cte AS ( SELECT TO_DATE(201701, 'YYYYMM') as level ,TO_DATE(20170131, 'YYYYMMDD') LASTDAY FROM dual UNION ALL SELECT ADD_MONTHS(cte.LEVEL, 1) AS PERIOD, LAST_DAY(ADD_MONTHS(cte.LEVEL, 1)) AS LASTDAY FROM cte WHERE cte.LEVEL <=SYSDATE ) SELECT TO_CHAR(cte.level,'YYYYMMDD') AS PERIOD , cte.LASTDAY FROM cte WHERE TO_CHAR(cte.level,'YYYYMMDD')<= TO_CHAR(SYSDATE,'YYYYMMDD')
修改点对比
场景3:SubPlan exec on CN can't be shipped
根因:某个子查询语句只能在CN上执行,通常是这个子查询有不下推因素,比如有系统表、系统视图调用,或者存在不下推函数等
案例1:子查询中系统表/系统视图查询
原始语句
WITH error_log AS NOT MATERIALIZED ( SELECT upper(log_column_name) AS log_column_name, log_error_code, s.char_length AS data_length, s.data_type,s.nullable FROM (SELECT distinct unnest(string_to_array(bad_log_column_name,',')) AS log_column_name, unnest(string_to_array(bad_log_error_code,',')) AS log_error_code FROM stgltc.BAD_cfs_inv_invoice_ad_2500 ) T, (SELECT * FROM user_tab_columns WHERE table_name=lower('dlt_cfs_inv_invoice_ad_2500')) S WHERE upper(T.log_column_name)=upper(S.column_name) ) SELECT CASE WHEN upper('ACTIVITY_NAME') IN (SELECT log_column_name FROM error_log WHERE data_type IN ('varchar','char','character','nchar','character varying','varchar2','nvarchar2','clob','text') AND log_error_code='22001'/*字符超长*/) THEN SUBSTRB(ACTIVITY_NAME,0,(SELECT distinct DATA_LENGTH FROM error_log WHERE upper(log_column_name)=upper('ACTIVITY_NAME'))) ELSE ACTIVITY_NAME END AS ACTIVITY_NAME, CASE WHEN upper('ADJUSTMENT_ID') IN (SELECT log_column_name FROM error_log WHERE data_type IN ('varchar','char','character','nchar','character varying','varchar2','nvarchar2','clob','text') AND log_error_code='22001'/*字符超长*/) THEN SUBSTRB(ADJUSTMENT_ID,0,(SELECT distinct DATA_LENGTH FROM error_log WHERE upper(log_column_name)=upper('ADJUSTMENT_ID'))) ELSE ADJUSTMENT_ID END AS ADJUSTMENT_ID FROM stgltc.BAD_cfs_inv_invoice_ad_2500
改写语句
-- 识别不下推的子查询为WITH error_log字句中的 -- SELECT * FROM user_tab_columns WHERE table_name=lower('dlt_cfs_inv_invoice_ad_2500') -- -- 因为这部分为系统表查询,无论如何都不能下推,所以此处把这部分结果转储到一个中间表中 -- 中间表创建成行存表 CREATE TEMP TABLE s WITH(orientation=row) DISTRIBUTE BY ROUNDROBIN AS SELECT * FROM user_tab_columns WHERE table_name=lower('dlt_cfs_inv_invoice_ad_2500') -- 因为整个查询涉及到的表都是列存表,之后前面创建的临时表s为行存表 -- 所以此处加一个强制走向量化的hint WITH error_log AS NOT MATERIALIZED ( SELECT upper(log_column_name) AS log_column_name, log_error_code, s.char_length AS data_length, s.data_type,s.nullable FROM (SELECT distinct unnest(string_to_array(bad_log_column_name,',')) AS log_column_name, unnest(string_to_array(bad_log_error_code,',')) AS log_error_code FROM stgltc.bad_cfs_inv_invoice_ad_2500 ) T, pg_temp.S WHERE upper(T.log_column_name)=upper(S.column_name) ) SELECT /*+ set global(enable_force_vector_engine on)*/ CASE WHEN upper('ACTIVITY_NAME') IN (SELECT log_column_name FROM error_log WHERE data_type IN ('varchar','char','character','nchar','character varying','varchar2','nvarchar2','clob','text') AND log_error_code='22001'/*字符超长*/) THEN SUBSTRB(ACTIVITY_NAME,0,(SELECT distinct DATA_LENGTH FROM error_log WHERE upper(log_column_name)=upper('ACTIVITY_NAME'))) ELSE ACTIVITY_NAME END AS ACTIVITY_NAME, CASE WHEN upper('ADJUSTMENT_ID') IN (SELECT log_column_name FROM error_log WHERE data_type IN ('varchar','char','character','nchar','character varying','varchar2','nvarchar2','clob','text') AND log_error_code='22001'/*字符超长*/) THEN SUBSTRB(ADJUSTMENT_ID,0,(SELECT distinct DATA_LENGTH FROM error_log WHERE upper(log_column_name)=upper('ADJUSTMENT_ID'))) ELSE ADJUSTMENT_ID END AS ADJUSTMENT_ID FROM stgltc.bad_cfs_inv_invoice_ad_2500
修改点对比
场景4:Type of Record in TargetList can not be shipped
根因:输出列中存在record类型,这种类型的不下推一般是不会体现在最外层的输出列上,一般这类报错有两个场景
1.SQL书写逻辑错误,导致输出列上出现了(...)形式的输出列
2.SQL业务逻辑正确, 这种场景需要了解业务含义,把record字段强转为text类型,然后再使用record字段的地方做特殊适配
案例1:输出列书写错误,出现(...)形式的输出列
原始语句
SELECT d.id, coalesce(d.period, 'snull') AS period, (d.plan_unit_code, 'snull') AS plan_unit_code, coalesce(d.product_type_model, 'snull') AS product_type_model, coalesce(d.revision, 'snull') AS revision, d.start_date FROM (SELECT * FROM cdcscm.cdc_mp_d_forecast_t_6120 t WHERE t.cdc_timestamp > to_date('2023-07-06 00:00:00', 'yyyy-mm-dd hh24:mi:ss') - 7 AND t.cdc_timestamp < to_date('2023-08-08 00:00:00', 'yyyy-mm-dd hh24:mi:ss') ) t1, sdiscm.mp_d_forecast_t_6120 d WHERE (t1.audit_op_type = 'delete' AND t1.audit_op_option = 'before') AND d.id = t1.id
改写语句
SELECT d.id, coalesce(d.period, 'snull') AS period, coalesce(d.plan_unit_code, 'snull') AS plan_unit_code, coalesce(d.product_type_model, 'snull') AS product_type_model, coalesce(d.revision, 'snull') AS revision, d.start_date FROM (SELECT * FROM cdcscm.cdc_mp_d_forecast_t_6120 t WHERE t.cdc_timestamp > to_date('2023-07-06 00:00:00', 'yyyy-mm-dd hh24:mi:ss') - 7 AND t.cdc_timestamp < to_date('2023-08-08 00:00:00', 'yyyy-mm-dd hh24:mi:ss') ) t1, sdiscm.mp_d_forecast_t_6120 d WHERE (t1.audit_op_type = 'delete' AND t1.audit_op_option = 'before') AND d.id = t1.id
修改点对比
备注:改写前后语句不等价,不等价的原因是因为原始SQL书写有问题,正确的写法是就是coalesce(d.plan_unit_code, 'snull') AS plan_unit_code。
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】凌霞软件回馈社区,博客园 & 1Panel & Halo 联合会员上线
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】博客园社区专享云产品让利特惠,阿里云新客6.5折上折
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 【.NET】调用本地 Deepseek 模型
· CSnakes vs Python.NET:高效嵌入与灵活互通的跨语言方案对比
· DeepSeek “源神”启动!「GitHub 热点速览」
· Plotly.NET 一个为 .NET 打造的强大开源交互式图表库
· 我与微信审核的“相爱相杀”看个人小程序副业
2020-08-14 【华为云推官招募】加入云推官,月入8万的兼职不是梦
2020-08-14 JavaScript中的正则表达式详解
2020-08-14 一瓶可乐的自动售货机指令“旅程”
2020-08-14 年近而立,Java何去何从?