Oracle导致Redo日志暴增的SQL语句排查

一、概述

最近数据库频繁不定时的报出一些耗时长的SQL,甚至SQL执行时间过长,导致连接断开现象。下面是一些排查思路。

二、查询日志的大小,日志组情况

SELECT L.GROUP#,
       LF.MEMBER,
       L.ARCHIVED,
       L.BYTES / 1024 / 1024 "SIZE(M)",
       L.MEMBERS
FROM V$LOG L,
     V$LOGFILE LF
WHERE L.GROUP# = LF.GROUP#;

查询结果:

 

 从上图可以看出目前共分为10个日志组,每个日志组2个文件,每个文件大小为3G。

三、查询Oracle最近几天每小时归档日志产生数量

SELECT SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'), 1, 5)                             Day,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '00', 1, 0)) H00,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '01', 1, 0)) H01,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '02', 1, 0)) H02,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '03', 1, 0)) H03,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '04', 1, 0)) H04,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '05', 1, 0)) H05,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '06', 1, 0)) H06,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '07', 1, 0)) H07,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '08', 1, 0)) H08,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '09', 1, 0)) H09,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '10', 1, 0)) H10,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '11', 1, 0)) H11,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '12', 1, 0)) H12,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '13', 1, 0)) H13,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '14', 1, 0)) H14,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '15', 1, 0)) H15,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '16', 1, 0)) H16,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '17', 1, 0)) H17,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '18', 1, 0)) H18,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '19', 1, 0)) H19,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '20', 1, 0)) H20,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '21', 1, 0)) H21,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '22', 1, 0)) H22,
       SUM(DECODE(SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH24:MI:SS'), 10, 2), '23', 1, 0)) H23,
       COUNT(*)                                                                           TOTAL
FROM v$log_history a
WHERE first_time >= to_char(sysdate - 10)
GROUP BY SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'), 1, 5)
ORDER BY SUBSTR(TO_CHAR(first_time, 'MM/DD/RR HH:MI:SS'), 1, 5) DESC;

查询结果

 

从上图可以看出业务高峰期每小时产生40个日志文件左右(目前设定的每个日志文件大小为3G),平均1.5分钟产生一个3G的日志文件。而oracle官方建议Redo日志平均30分钟切换一次最好

四、查看最近2小时"块改变"最多的segment

redo大量产生必然是由于大量产生"块改变"。从awr视图中找到"块改变"最多的segment。

这是查询最近2小时(120分钟)的,begin_interval_time> sysdate - 120/1440,大家也可以自定义修改查询最近多少分钟的。

 1 select *
 2 from (SELECT to_char(begin_interval_time, 'YYYY_MM_DD HH24:MI') snap_time,
 3              dhsso.object_name,
 4              SUM(db_block_changes_delta)
 5       FROM dba_hist_seg_stat dhss,
 6            dba_hist_seg_stat_obj dhsso,
 7            dba_hist_snapshot dhs
 8       WHERE dhs.snap_id = dhss.snap_id
 9         AND dhs.instance_number = dhss.instance_number
10         AND dhss.obj# = dhsso.obj#
11         AND dhss.dataobj# = dhsso.dataobj#
12         AND begin_interval_time > sysdate - 120 / 1440
13       GROUP BY to_char(begin_interval_time, 'YYYY_MM_DD HH24:MI'), dhsso.object_name
14       order by 3 desc)
15 where rownum <= 5;

查询结果:

 

五、从awr视图中找出步骤四中排序靠前的对象涉及的SQL

说明:LIKE '%MON_MODS$%'中MON_MODS是步骤1中查询出来的OBJECT_NAME

 1 SELECT to_char(begin_interval_time, 'YYYY_MM_DD HH24:MI'),
 2        dbms_lob.substr(sql_text, 4000, 1),
 3        dhss.instance_number,
 4        dhss.sql_id,
 5        executions_delta,
 6        rows_processed_delta
 7 FROM dba_hist_sqlstat dhss,
 8      dba_hist_snapshot dhs,
 9      dba_hist_sqltext dhst
10 WHERE UPPER(dhst.sql_text) LIKE '%MON_MODS$%'
11   AND dhss.snap_id = dhs.snap_id
12   AND dhss.instance_Number = dhs.instance_number
13   AND dhss.sql_id = dhst.sql_id;

查询结果

 

六、从ASH相关视图找到执行这些SQL的session、module和machine

1 select * from dba_hist_active_sess_history WHERE sql_id = 'c9n8kv7afchtd';
2 select * from v$active_session_history where sql_Id = 'c9n8kv7afchtd';
c9n8kv7afchtd是SQL_ID,替换第二步查询的结果SQL_ID列

 七、排查问题SQL

通过第四步,我们确定了导致产生大量redo日志主要涉及三张表,再通过第五步确定了每张表排名前五的SQL。针对这些产生大量Redo日志的SQL,就是需要做优化的地方。

 

posted @ 2021-11-24 17:21  shileishmily  阅读(1470)  评论(0编辑  收藏  举报