如何诊断oracle数据库运行缓慢或hang住的问题

如何诊断oracle数据库运行缓慢或hang住的问题

Oracle 作者:yangi1314 时间:2016-09-05 21:55:39 2296 0

为了诊断oracle运行缓慢的问题首先要决定收集哪些诊断信息,可以采取下面的诊断方法:
1.数据库运行缓慢这个问题是常见还是在特定时间出现
如果数据库运行缓慢是一个常见的问题那么可以在问题出现的时候收集这个时期的awr或者statspack报告(通常收集时间间隔是一个小时).生成awr报告的方法如下:
awr是通过sys用户来收集持久系统性能统计信息并且这些信息保存在sysaux表空间.缺省情况下快照是一个小时生成一次并且保留7天.awr报告输出了基于指定快照之间的一系列的统计信息用于性能分析和调查其它问题.
运行基本的报告
可以执行下面的脚本来生成一个awr报告:
$ORACLE_HOME/rdbms/admin/awrrpt.sql

可以根据自己收集awr报告的原因来决定生成一个快照的时间间隔也可以指定生成awr报告的格式(text或html).

生成各种类型的awr报告
可以根据各种要求来运行各种sql脚本来生成各种类型的awr报告.每一种报告都有两种格式(txt或html):
awrrpt.sql
显示指定快照范围内的各种统计信息

awrrpti.sql
显示一个特定数据库和实例中指定快照范围内的各种统计信息

awrsqrpt.sql
显示一个指定快照范围内的一个特定的sql语句的统计信息.运行这个报告是为了检查或调查一个特定sql语句的性能

awrsqrpi.sql
显示一个特定sql在指定快照范围内的的统计信息.

awrddrpt.sql
比较在两个选择的时间间隔期间内详细的性能数据和配置情况

awrddrpi.sql
在一个特定的数据库和平共处实例中比较在两个选择的时间间隔期间内详细的性能数和配置情况

各种awr相关的操作
怎样修改awr快照的设置:

BEGIN
  DBMS_WORKLOAD_REPOSITORY.modify_snapshot_settings(
    retention => 43200,        -- Minutes (43200 = 30 Days).
                               -- Current value retained if NULL.
    interval  => 30);          -- Minutes. Current value retained if NULL.
END;
/

创建一个awr基线:

BEGIN
  DBMS_WORKLOAD_REPOSITORY.create_baseline (
    start_snap_id => 10,
    end_snap_id   => 100,
    baseline_name => 'AWR First baseline');
END;
/

在oracle11G中引入了一个新的dbms_workload_repository.create_baseline_template过程来创建一个awr基线模板

BEGIN
DBMS_WORKLOAD_REPOSITORY.CREATE_BASELINE_TEMPLATE (
start_time => to_date('&start_date_time','&start_date_time_format'),
end_time => to_date('&end_date_time','&end_date_time_format'),
baseline_name => 'MORNING',
template_name => 'MORNING',
expiration => NULL ) ;
END;
/

"expiration=>NULL"这意味着这个基线将永远保持有效.

删除一个awr基线

BEGIN
    DBMS_WORKLOAD_REPOSITORY.DROP_BASELINE (
    baseline_name => 'AWR First baseline');
END;
/

也能从一个老的数据库中删除一个awr基线:

BEGIN
DBMS_WORKLOAD_REPOSITORY.DROP_BASELINE (baseline_name => 'peak baseline',
cascade => FALSE, dbid => 3310949047);
END;
/

删除awr快照:

BEGIN
  DBMS_WORKLOAD_REPOSITORY.drop_snapshot_range(
(low_snap_id=>40,
High_snap_id=>80);
END;
/

也可能基于报告时间期间对创建和删除的awr基线指定一个模板:

BEGIN
DBMS_WORKLOAD_REPOSITORY.CREATE_BASELINE_TEMPLATE (
day_of_week => 'MONDAY',
hour_in_day => 9,
duration => 3,
start_time => to_date('&start_date_time','&start_date_time_format'),
end_time => to_date('&end_date_time','&end_date_time_format'),
baseline_name_prefix => 'MONDAY_MORNING'
template_name => 'MONDAY_MORNING',
expiration => 30 );
END;
/

将会在'&start_date_time'到'&end_date_time'期间的每一个星期一都会生成基线

手动生成的一个awr快照:

BEGIN
  DBMS_WORKLOAD_REPOSITORY.create_snapshot();
END;
/

工作负载资料档案库视图:
V$ACTIVE_SESSION_HISTORY - 显示历史活动会话信息每秒抽样一样
V$METRIC - 显示度量标准信息
V$METRICNAME - 显示与每个度量标准组相关的度量标准
V$METRIC_HISTORY - 显示历史度量标准
V$METRICGROUP - 显示所有的度量标准组
DBA_HIST_ACTIVE_SESS_HISTORY - 显示历史活动会话的详细信息
DBA_HIST_BASELINE - 显示基线信息
DBA_HIST_DATABASE_INSTANCE - 显示数据库环境信息
DBA_HIST_SNAPSHOT - 显示快照信息
DBA_HIST_SQL_PLAN - 显示sql执行计划
DBA_HIST_WR_CONTROL - 显示awr设置情况

如果数据库运行缓慢在特定时间出现那么可以当问题存在时生成一个awr或statspack报告,报告的时间间隔包含了问题出现的时间.另外为了比较可以收集没有出现问题而时间间隔相同的数据库正常运行的报告这样可以对报告进行比较.

2.数据库缓慢它影响的是一个会话,几个会话还是所有会话
如果数据库缓慢它影响的是一个会话或几个会话可以对这个会话或几个会话进行10046跟踪
如果数据库缓慢它影响的是所有会话可以收集awr或statspack报告

执行10046跟踪的方法如下:
收集10046跟踪文件
10046事件是一种标准的方法用来对oracle会话收集扩展的sql_trace信息
对于查询性能问题来说通常要求记录查询的等待和绑定变量信息.这可以使用级别为12的10046跟踪来完成.下面的例子说明了在各种情况下设置10046事件.

跟踪文件的位置
在oracle11g及以上版本中引入了新的诊断架构,跟踪和核心文件存储的位置由diagnostic_dest初始化参数来控制.可以使用下面的命令来显示:

sys@JINGYONG> show parameter diagnostic_dest

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
diagnostic_dest                      string      /u01/app/oracle

注意:在有些例子中可能设置了'tracefile_identifier'来帮助找到输出的跟踪文件

会话跟踪
可以在用户会话执行sql语句之前对会话启用跟踪,在会话级别收集10046跟踪

sys@JINGYONG> alter session set timed_statistics=true;

会话已更改。

sys@JINGYONG> alter session set statistics_level=all;

会话已更改。

sys@JINGYONG> alter session set max_dump_file_size=unlimited;

会话已更改。

sys@JINGYONG> alter session set events '10046 trace name context forever,level 1
2';

会话已更改。

sys@JINGYONG> select * from dual;

D
-
X

sys@JINGYONG>exit

如果会话没有退出可以执行以下语句来禁用10046跟踪

sys@JINGYONG> alter session set events '10046 trace name context off';

会话已更改。


sys@JINGYONG> select * from v$diag_info ;

   INST_ID NAME
---------- ----------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
----------------------------------------
         1 Diag Enabled
TRUE

         1 ADR Base
/u01/app/oracle

         1 ADR Home
/u01/app/oracle/diag/rdbms/jingyong/jingyong

         1 Diag Trace
/u01/app/oracle/diag/rdbms/jingyong/jingyong/trace

         1 Diag Alert
/u01/app/oracle/diag/rdbms/jingyong/jingyong/alert

         1 Diag Incident
/u01/app/oracle/diag/rdbms/jingyong/jingyong/incident

         1 Diag Cdump
/u01/app/oracle/diag/rdbms/jingyong/jingyong/cdump

         1 Health Monitor
/u01/app/oracle/diag/rdbms/jingyong/jingyong/hm

         1 Default Trace File
/u01/app/oracle/diag/rdbms/jingyong/jingyong/trace/jingyong_ora_2572_10046.trc

         1 Active Problem Count
0

         1 Active Incident Count
0


已选择11行。

sys@JINGYONG> select * from v$diag_info where name='Default Trace File';

   INST_ID NAME
---------- ----------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
----------------------------------------
         1 Default Trace File
/u01/app/oracle/diag/rdbms/jingyong/jingyong/trace/jingyong_ora_2572_10046.trc

注意:如果会话不是彻底的关闭和禁用跟踪那么重要的跟踪信息可能会从跟踪文件中丢失.

注意:这里statistics_level=all因此它会在这种情况下收集一定程度的统计信息.这个参数有三个参数值.all,typical,basic.为了诊断性能问题会要求获得一定程度的统计信息.设置为all可能是不必要的但可以使用typical以此来获得全面的诊断信息.

跟踪一个已经启动的进程
如果要跟踪一个已经存在的会话可以使用oradebug来连接到会话初始化10046跟踪
1.通过某种方法来确定要被跟踪的会话
例如在sql*plus中启动一个会话然后找到这个会话的操作系统进行id(spid):
select p.PID,p.SPID,s.SID
from v$process p,v$session s
where s.paddr = p.addr
and s.sid = &SESSION_ID
/

SPID是操作系统进程标识符
PID是oracle进程标识符
如果你不知道要不得被跟踪会话的sid可以使用类似于下面的查询来帮助你识别这个会话:

column line format a79
set heading off
select 'ospid: ' || p.spid ||'  pid: '||p.pid || ' # ''' ||s.sid||','||s.serial#||''' '||
  s.osuser || ' ' ||s.machine ||' '||s.username ||' '||s.program line
from v$session s , v$process p
where p.addr = s.paddr
and s.username <> ' ';
执行结果如下:
sys@JINGYONG> column line format a79
sys@JINGYONG> set heading off
sys@JINGYONG> select 'ospid: ' || p.spid || ' # ''' ||s.sid||','||s.serial#||'''
 '||
  2    s.osuser || ' ' ||s.machine ||' '||s.username ||' '||s.program line
  3  from v$session s , v$process p
  4  where p.addr = s.paddr
  5  and s.username <> ' ';

ospid: 2529 # '30,32' Administrator WORKGROUP\JINGYONG SYS sqlplus.exe

注意:在oracle12c中对于多线程进程,在v$process视图中加入了新的列stid来找到特定的线程.因为oracle会组合多个进程到一个单独的ospid中.为了找到这个特定的线程使用下面的语法:
oradebug setospid

2.当确定进程的操作系统进程ID后然后可以使用下面的语句来初始化跟踪:
假设要被跟踪进程的操作系统进程ID是2529

SQL>connect / as sysdba
sys@JINGYONG> oradebug setospid 2529
Oracle pid: 21, Unix process pid: 2529, image: oracle@jingyong
sys@JINGYONG> oradebug unlimit
已处理的语句
sys@JINGYONG> oradebug event 10046 trace name context forever,level 12
已处理的语句
sys@JINGYONG> select * from dual;

X

sys@JINGYONG> oradebug event 10046 trace name context off
已处理的语句
sys@JINGYONG> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/jingyong/jingyong/trace/jingyong_ora_2529.trc

注意:连接到一个会话也可以使用oradebug setorapid
在这种情况下PID(oracle进程标识符)将被使用(而不是使用SPID):
sys@JINGYONG> oradebug setorapid 21
Oracle pid: 21, Unix process pid: 2529, image: oracle@jingyong
从显示的信息可知道使用oradebug setorapid 21与oradebug set0spid 2529是一样的
sys@JINGYONG> oradebug unlimit
已处理的语句
sys@JINGYONG> oradebug event 10046 trace name context forever,level 12
已处理的语句
sys@JINGYONG> select sysdate from dual;

11-11月-13

sys@JINGYONG> oradebug event 10046 trace name context off
已处理的语句
sys@JINGYONG> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/jingyong/jingyong/trace/jingyong_ora_2529.trc

注意:在oracle12c中对于多线程进程,在v$process视图中加入了新的列stid来找到特定的线程.因为oracle会组合多个进程到一个单独的ospid中.为了找到这个特定的线程使用下面的语法:
oradebug setospid

跟踪产生的跟踪文件名称类似于_.trc

实例级别的跟踪
注意:在实例级别启用跟踪因为每一个会话都会被跟踪这样对性能是有影响的
在设置这个跟踪参数后产生的每一个会话都会被跟踪断开的会话将不会被跟踪
设置系统级别的10046跟踪是用于当出现了一个问题会话但不能提前识别这个会话的情况下.在这种情况下跟踪可以被短时间地启用,这个问题可能会记录到跟踪文件中然后禁用跟踪在生成的跟踪文件中找到这个问题的原因

启用系统级别的10046跟踪:
alter system set events '10046 trace name context forever,level 12';
对所有会话禁有系统级别的10046跟踪:
alter system set events '10046 trace name context off';

初始化参数的设置:
当实例重新启动后对每一个会话启用10046跟踪.
event="10046 trace name context forever,level 12"
要禁用实例级别的10046跟踪可以删除这个初始化参数然后重启实例或者使用alter system语句
alter system set events '10046 trace name context off';

编写登录触发器
在有些情况下可能要跟踪特定用户的会话活动在这种情况下可以编写一个登录触发器来实现例如:

CREATE OR REPLACE TRIGGER SYS.set_trace
AFTER LOGON ON DATABASE
WHEN (USER like '&USERNAME')
DECLARE
lcommand varchar(200);
BEGIN
EXECUTE IMMEDIATE 'alter session set tracefile_identifier=''From_Trigger''';
EXECUTE IMMEDIATE 'alter session set statistics_level=ALL';
EXECUTE IMMEDIATE 'alter session set max_dump_file_size=UNLIMITED';
EXECUTE IMMEDIATE 'alter session set events ''10046 trace name context forever, level 12''';
END set_trace;
/

注意:为了能跟踪会话用户执行触发器需要显式的被授予'alter session'权限:
grant alter session to username;

使用SQLT来收集跟踪信息
什么是SQLTXPLAIN(SQLT)
SQLTXPLAIN也叫作SQLT,它是由专业的oracle服务技术中心提供了一个工具.SQLT输入一个SQL语句后它会输出一组诊断文件.这些诊断文件会被用来诊断性能低下的sql语句.SQLT连接到数据库并收集执行,基于成本优化的统计信息,方案对象元数据,性能统计,配置参数和类似影响SQL性能的元素.

使用SQLTXPLAIN的Xecute选项可以生成10046跟踪作为SQLT输出的一部分.

使用dbms_monitor包来进行跟踪
dbms_monitor是一个新的跟踪包.跟踪基于特定的客户端标识符或者服务名,模块名和操作名的组合形式来启用诊断和工作负载管理.在有些情况下可能会生成多个跟踪文件(例如对于一个模块启用服务级别的跟踪)使用新的trcsess工具来扫描所有的跟踪文件并将它们合成一个跟踪文件.在合并这一组跟踪文件后可以使用标准跟踪文件分析方法进行分析

查看启用的跟踪
可以查询dba_enabled_traces来检测什么跟踪被启用了.
例如:

sys@JINGYONG>select trace_type, primary_id, QUALIFIER_ID1, waits, binds
             from DBA_ENABLED_TRACES;

TRACE_TYPE                   PRIMARY_ID  QUALIFIER_ID1           WAITS        BINDS
---------------------- ---------------   ------------------      --------    -------
SERVICE_MODULE         SYS$USERS        SQL*Plus                 TRUE        FALSE
CLIENT_ID              HUGO                                      TRUE        FALSE
SERVICE                v101_DGB                                  TRUE        FALSE

在这个数据库中已经启用了三个不同的跟踪状态
1.第一行记录显示将会对在SQL*Plus中执行的所有sql语句进行跟踪
2.第二行记录显示将会对带有客户端标识符"HUGO'的所有会话进行跟踪
3.第三行记录显示将会对使用服务"v101_DGB'连接到数据库的所有程序进行跟踪

session_trace_enable函数
可以使用session_trace_enable过程来对本地实例的一个指定的数据库会话启用sql跟踪.
语法如下:
启用sql跟踪
dbms_monitor.session_trace_enable(session_id => x, serial_num => y,
waits=>(TRUE|FALSE),binds=>(TRUE|FALSE) );

禁止sql跟踪
dbms_monitor.session_trace_disable(session_id => x, serial_num => y);
其中waits的缺省值是true,binds的缺省值是false.

可以从v$session视图中查询会话id和序列号

SQL> select serial#, sid , username from v$session;

SERIAL#             SID  USERNAME
-------           -----  --------------
  1                 131
 18                 139
  3                 140
 11                 143     SCOTT

然后可以使用下面的命令来对指定的会话启用跟踪
SQL> execute dbms_monitor.session_trace_enable(143,11);
跟踪状态在数据库重启后就会被删除可以查询dba_enabled_traces视图看到没有记录
sys@JINGYONG> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/jingyong/jingyong/trace/jingyong_ora_2529.trc
sys@JINGYONG> select trace_type,primary_id,qualifier_id1,waits,binds
2 from dba_enabled_traces;

未选定行

当会话断开或者使用下面的命令可以禁止跟踪
SQL> execute dbms_monitor.session_trace_disable(143,11);

client_id_trace_enable函数
在多层架构环境中,一个请求从一个终端客户端通过中间层分发到不同的数据库会话.这意味着终端客户端与数据库会话的联系不是静态的.在oracle10g之前没有方法可以对一个客户端跨不同数据库会话进行跟踪.端到端的跟踪可以通过一个新的属性client_identifier来标识它是唯一标识一个特定的终端客户端.这个客户端标识符对应于v$session视图中的client_identifier列.通过系统上下文也可以查看.
语法如下:
启用跟踪
execute dbms_monitor.client_id_trace_enable ( client_id =>'client x',
waits => (TRUE|FALSE), binds => (TRUE|FALSE) );

禁止跟踪
execute dbms_monitor.client_id_trace_disable ( client_id =>'client x');
其中waits的缺省值是true,binds的缺省值是false.

例如:
可以使用dbms_session.set_identifier函数来设置client_identifier

sys@JINGYONG> exec dbms_session.set_identifier('JY');

PL/SQL 过程已成功完成。

sys@JINGYONG> select sys_context('USERENV','CLIENT_IDENTIFIER') client_id from dual;

JY


sys@JINGYONG> select client_identifier client_id from v$session where sid=30;

JY

sys@JINGYONG> exec dbms_monitor.client_id_trace_enable('JY');

PL/SQL 过程已成功完成。

使用查询来检查跟踪是否已经启用

sys@JINGYONG> select primary_id,qualifier_id1,waits,binds
  2  from dba_enabled_traces where trace_type='CLIENT_ID';
PRIMARY_ID         QUALIFIER_ID1         WAITS    BINDS
----------------   --------------        -------- --------
JY                                       TRUE     FALSE

这个跟踪在数据库重启之后还是有效的你得调用函数来禁用.
sys@JINGYONG> exec dbms_monitor.client_id_trace_disable('JY');

PL/SQL 过程已成功完成。
检查生成的跟文件

Trace file /u01/app/oracle/diag/rdbms/jingyong/jingyong/trace/jingyong_ora_2529.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db
System name:	Linux
Node name:	jingyong
Release:	2.6.18-164.el5
Version:	#1 SMP Tue Aug 18 15:51:54 EDT 2009
Machine:	i686
Instance name: jingyong
Redo thread mounted by this instance: 1
Oracle process number: 21
Unix process pid: 2529, image: oracle@jingyong


*** 2013-11-11 11:31:56.737
*** SESSION ID:(30.32) 2013-11-11 11:31:56.737
*** CLIENT ID:() 2013-11-11 11:31:56.737
*** SERVICE NAME:(jingyong) 2013-11-11 11:31:56.737
*** MODULE NAME:(sqlplus.exe) 2013-11-11 11:31:56.737
*** ACTION NAME:() 2013-11-11 11:31:56.737

PARSING IN CURSOR #8 len=96 dep=0 uid=0 oct=3 lid=0 tim=1384150635839986 hv=3018843459 ad='275fa5ec' sqlid='3gg23wktyzta3'
select primary_id,qualifier_id1,waits,binds
from dba_enabled_traces where trace_type='CLIENT_ID'
END OF STMT

在启用跟踪后执行的语句被记录到了跟踪文件中.

sys@JINGYONG> select primary_id,qualifier_id1,waits,binds
  2  from dba_enabled_traces where trace_type='CLIENT_ID';

未选定行

当你使用MTS时有时将会生成多个跟踪文件,不同的共享服务器进程能执行sql语句这就将会生成多个跟踪文件.对于RAC
环境也是一样.

serv_mod_act_trace_enable函数
端到端跟踪对于使用MODULE,ACTION,SERVICES标识的应用程序能够进行有效地管理和计算其工作量.service名,module和
action名提供了一种方法来识别一个应用程序中重要的事务.
你可以使用serv_act_trace_enable过程来对由一组service,module和action名指定的全局会话启用sql跟踪,除非指定了特定
的实例名.对于一个会话的service名,module名与v$session视图中的service_name和module列相对应.
语句如下:
启用跟踪
execute dbms_monitor.serv_mod_act_trace_enable('Service S', 'Module M', 'Action A',
waits => (TRUE|FALSE), binds => (TRUE|FALSE), instance_name => 'ORCL' );

禁止跟踪
execute dbms_monitor.serv_mod_act_trace_disable('Service S', 'Module M', 'Action A');
其中waits的缺省值是true,binds的缺省值是false,instance_name的缺省值是null.

例如想要对在数据库服务器使用SQL*Plus执行的所有sql语句进行跟踪可以执行以下命令:

sys@JINGYONG> select module,service_name from v$session where sid=25;
MODULE                                      SERVICE_NAME
-----------------------------               ---------------------
sqlplus@jingyong (TNS V1-V3)                SYS$USERS

sys@JINGYONG> exec dbms_monitor.serv_mod_act_trace_enable('SYS$USERS','sqlplus@j
ingyong (TNS V1-V3)');

PL/SQL 过程已成功完成。

sys@JINGYONG> select primary_id,qualifier_id1,waits,binds
  2  from dba_enabled_traces
  3  where trace_type='SERVICE_MODULE';
PRIMARY_ID       QUALIFIER_ID1                WAITS    BINDS
---------------  -------------------          -------- --------
SYS$USERS        sqlplus@jingyong (TNS V1-V3) TRUE     FALSE


启用跟踪后我们执行一个测试语句
SQL> select 'x' from dual;

'
-
x
检查生成的跟踪文件名
SQL> select * from v$diag_info where name='Default Trace File';

   INST_ID NAME
---------- ----------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
         1 Default Trace File
/u01/app/oracle/diag/rdbms/jingyong/jingyong/trace/jingyong_ora_4411.trc

查看跟踪内容如下

trace file /u01/app/oracle/diag/rdbms/jingyong/jingyong/trace/jingyong_ora_4411.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db
System name:	Linux
Node name:	jingyong
Release:	2.6.18-164.el5
Version:	#1 SMP Tue Aug 18 15:51:54 EDT 2009
Machine:	i686
Instance name: jingyong
Redo thread mounted by this instance: 1
Oracle process number: 24
Unix process pid: 4411, image: oracle@jingyong (TNS V1-V3)


*** 2013-11-11 14:34:00.971
*** SESSION ID:(25.412) 2013-11-11 14:34:00.972
*** CLIENT ID:() 2013-11-11 14:34:00.972
*** SERVICE NAME:(SYS$USERS) 2013-11-11 14:34:00.972
*** MODULE NAME:(sqlplus@jingyong (TNS V1-V3)) 2013-11-11 14:34:00.972
*** ACTION NAME:() 2013-11-11 14:34:00.972

WAIT #1: nam='SQL*Net message from client' ela= 152965072 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1384151640937525
CLOSE #1:c=1000,e=521,dep=0,type=0,tim=1384151640973430
=====================
PARSING IN CURSOR #1 len=20 dep=0 uid=0 oct=3 lid=0 tim=1384151640977682 hv=2740543121 ad='275fa9e4' sqlid='04vfkrajpkrnj'
select 'x' from dual

我们执行的测试语句被记录了在跟踪文件中.

sys@JINGYONG> exec dbms_monitor.serv_mod_act_trace_disable('SYS$USERS','sqlplus@
jingyong (TNS V1-V3)');

PL/SQL 过程已成功完成。

sys@JINGYONG> select primary_id,qualifier_id1,waits,binds
  2  from dba_enabled_traces
  3  where trace_type='SERVICE_MODULE';

未选定行

使用trcsess来合并跟踪文件
从某些跟踪操作中会得到多个跟踪文件.在oracle10g之前的版本中你得手动将这些跟踪文件合并到一起.现在可以使用trcsess工具来帮你合并这些跟踪文件.
语句如下:
trcsess [output=] [session=] [clientid=] [service=] [action=] [module=]

output= output destination default being standard output.
session= session to be traced.
Session id is a combination of session Index & session serial number e.g. 8.13.


clientid= clientid to be traced.
service= service to be traced.
action= action to be traced.
module= module to be traced.

<trace_file_names> Space separated list of trace files with wild card '*' suppor
ted.

[oracle@jingyong trace]$ trcsess output=jingyong_ora_88888888.trc service=jingyong jingyong_ora_2529.trc jingyong_ora_4411.trc
[oracle@jingyong trace]$ ls -lrt jingyong_ora_88888888.trc
-rw-r--r-- 1 oracle oinstall 16219 Nov 11 14:59 jingyong_ora_88888888.trc

dbms_application_info
可以在过程开始一个事务之前使用dbms_application_info.set*过程来注册一个事务名/客户端信息/模块名为以后检查性能来使用.你应该对以后可能消耗你最多系统资源的活动事务进行指定.
dbms_application_info包有以下过程
SET_CLIENT_INFO ( client_info IN VARCHAR2 );
SET_ACTION ( action_name IN VARCHAR2 );
SET_MODULE ( module_name IN VARCHAR2, action_name IN VARCHAR2 );

例如
sys@JINGYONG> create table emp as select * from scott.emp where 1=0;

表已创建。

sys@JINGYONG> exec dbms_application_info.set_module(module_name=>'add_emp',actio
n_name=>'insert into emp');

PL/SQL 过程已成功完成。

sys@JINGYONG> insert into emp select * from scott.emp;

已创建14行。

sys@JINGYONG> commit;

提交完成。

sys@JINGYONG> exec dbms_application_info.set_module(null,null);

PL/SQL 过程已成功完成。
下面查询v$sqlarea视图使用module和action列进行查询
sys@JINGYONG> select sql_text from v$sqlarea where module='add_emp';

insert into emp select * from scott.emp

sys@JINGYONG> select sql_text from v$sqlarea where action='insert into emp';

insert into emp select * from scott.emp

declare
l_client varchar2(100);
l_mod_name varchar2(100);
l_act_name varchar2(100);
begin
dbms_application_info.read_client_info(l_client);
dbms_application_info.read_module(l_mod_name,l_act_name);
dbms_output.put_line(l_client);
dbms_output.put_line(l_mod_name);
end;

dbms_session包:只能跟踪当前会话,不能指定会话。
跟踪当前会话:
SQL> exec dbms_session.set_sql_trace(true);
SQL> 执行sql
SQL> exec dbms_session.set_sql_trace(false);
dbms_session.set_sql_trace相当于alter session set sql_trace,从生成的trace文件可以明确地看alter session set sql_trace语句。
使 用dbms_session.session_trace_enable过程,不仅可以看到等待事件信息还可以看到绑定变量信息,相当于alter session set events '10046 trace name context forever, level 12';语句,从生成的trace文件可以确认。
SQL> exec dbms_session.session_trace_enable(waits=>true,binds=>true);
SQL> 执行sql
SQL> exec dbms_session.session_trace_enable(); --This procedure resets the session-level SQL trace for the session from which it was called.

dbms_support包:不应该使用这种方法,非官方支持。
系统默认没有安装这个包,可以手动执行$ORACLE_HOME/rdbms/admin/bmssupp.sql脚本来创建该包。
SQL> desc dbms_support
FUNCTION MYSID RETURNS NUMBER
FUNCTION PACKAGE_VERSION RETURNS VARCHAR2
PROCEDURE START_TRACE
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
WAITS BOOLEAN IN DEFAULT
BINDS BOOLEAN IN DEFAULT
PROCEDURE START_TRACE_IN_SESSION
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
SID NUMBER IN
SERIAL NUMBER IN
WAITS BOOLEAN IN DEFAULT
BINDS BOOLEAN IN DEFAULT
PROCEDURE STOP_TRACE
PROCEDURE STOP_TRACE_IN_SESSION
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
SID NUMBER IN
SERIAL NUMBER IN
SQL> select dbms_support.package_version from dual;
PACKAGE_VERSION
--------------------------------------------------------------------------------
DBMS_SUPPORT Version 1.0 (17-Aug-1998) - Requires Oracle 7.2 - 8.0.5
SQL> select dbms_support.mysid from dual;
MYSID
----------
292
SQL> select * from v$mystat where rownum=1;
SID STATISTIC# VALUE
---------- ---------- ----------
292 0 1
跟踪当前会话:
SQL> exec dbms_support.start_trace
SQL> 执行sql
SQL> exec dbms_support.stop_trace
跟踪其他会话:等待事件+绑定变量,相当于level 12的10046事件。
SQL> select sid,serial#,username from v$session where ...;
SQL> exec dbms_support.start_trace_in_session(sid=>sid,serial=>serial#,waits=>true,binds=>true);
SQL> exec dbms_support.stop_trace_in_session(sid=>sid,serial=>serial#);

dbms_system包:9i时使用
跟踪其他会话:
SQL> select sid,serial#,username from v$session where ...;
SQL> exec dbms_system.set_sql_trace_in_session(sid,serial#,true);
可以等候片刻,跟踪session执行任务,捕获sql操作…
SQL> exec dbms_system.set_sql_trace_in_session(sid,serial#,false);
ps:dbms_system这个包在10gR2官方文档上面没有找到这个包的说明,但数据库中有。
SQL> exec sys.dbms_system.SET_BOOL_PARAM_IN_SESSION(sid, serial#, 'sql_trace', TRUE);
SQL> exec sys.dbms_system.SET_BOOL_PARAM_IN_SESSION(sid, serial#, 'sql_trace', FALSE);
使用dbms_system.set_ev设置10046事件
SQL> select sid,serial#,username from v$session where ...;
SQL> exec dbms_system.set_ev(sid,serial#,10046,12,'');
SQL> exec dbms_system.set_ev(sid,serial#,10046,0,'');
最后一个参数只有为''时,才会生成trace文件,否则不报错,但没有trace文件生成。

3.数据库hang住是一个特定会话出现hang住还是几个会话出现hang住还是所有的会话都出现hang住
如果数据库是一个会话或几个会话出现hang住可以对这个会话执行10046跟踪,可以对这个会话收集一些errorstacks信息,也可以当问题出现时生成一个awr或statspack报告

生成转储和errorstack信息的方法如下:
为了转储跟踪和errorstacks信息,可以使用操作系统进程ID或者oracle进程ID.比如可以通过oracle的sid来查询到操作系统进ID:
SELECT p.pid, p.SPID,s.SID
FROM v$process p, v$session s
WHERE s.paddr = p.addr
AND s.SID = &SID;

SPID是操作系统标识符
SID是oracle会话标识符
PID是oracle进程标识符

比如一个SPID是1254,pid是56如果使用SPID来生成转储和errorstacks信息可以执行下面的语句:
connect / as sysdba
ALTER SESSION SET tracefile_identifier = 'STACK_10046';
oradebug setospid 1254
oradebug unlimit

oradebug event 10046 trace name context forever,level 12

oradebug dump errorstack 3
oradebug dump errorstack 3
oradebug dump errorstack 3
oradebug tracefile_name
oradebug event 10046 trace name context off

如果使用PID来生成转储和errorstacks信息可以执行下面的语句:
connect / as sysdba
ALTER SESSION SET tracefile_identifier = 'STACK_10046';
oradebug setpid 56
oradebug unlimit

oradebug event 10046 trace name context forever,level 12

oradebug dump errorstack 3
oradebug dump errorstack 3
oradebug dump errorstack 3
oradebug tracefile_name
oradebug event 10046 trace name context off

其中oradebug tracefile_name命令会显示跟踪文件的名字和位置,在生成的跟踪文件名字会包含STACK_10046字符

如果要对当前会话收集errorstacks信息首先要找出当前会话的SPID或PID可以执行如下语句来获得:
SELECT p.pid, p.SPID,s.SID
FROM v$process p, v$session s
WHERE s.paddr = p.addr
AND s.audsid = userenv('SESSIONID') ;

或者

SELECT p.pid, p.SPID,s.SID
FROM v$process p,v$session s
WHERE s.paddr = p.addr
AND s.SID =
(SELECT DISTINCT SID
FROM V$MYSTAT);

如果数据库是所有会话出现hang也就是整个数据库出现hang住了诊断hang住的方法如下:
当一个数据库出现Hang的问题时从数据库中收集信息来诊断挂志的根本原因是非常有用的.数据库Hang的原因往往是孤立的可以使用收集来的诊断信息来解决.另外如果不能解决可以用获得的信息来避免这个问题的再次重现.
解决方法
诊断数据库Hang需要什么信息
数据库Hang的特点是一些进程正在等待另一些进程的完成.通常有一个或多个阻塞进程被困或者正在努力工作但不是迅速的释放资源.为了诊断需要以下信息:
1.Hanganalyze and Systemstate Dumps
2.数据库性能的awr/statspack快照
3.及时的RDA
Hanganalyze and Systemstate Dumps
Hang分析和系统状态转储提供了在一个特定时间点的数据库中的进程信息.Hang分析提供了在Hang链表中所有进程的信息,系统状态提供了数据库中所有进程的信息.当查看一个潜在的Hang情况时你需要判断是否一个进程被因或动行缓慢.通过在两个连续的时间间隔内收集这些转储信息如果进程被困这些跟踪信息可以用于将来的诊断可能帮助你提供一些解决方法.Hang分析用来总结和确认数据库是真的Hang还是只是缓慢并提供了一致性快照,系统状态转储显示了数据库中每一个进程正在做什么
收集Hang分析和系统状态转储信息
登录系统
使用sql*plus以sysdba身份来登录
sqlplus '/ as sysdba'
如果连接时出现问题在oracle10gr2中可以使用sqlplus的"preliminary connection'
sqlplus -prelim '/ as sysdba'
注意:从oracle 11.2.0.2开始Hang分析在sqlplus的'preliminary connection'连接下将不会生成输出因为它要会请求一个进程的状态对象和一个会话状态对象.如果正试图分析跟踪会输出:
HANG ANALYSIS:
ERROR: Can not perform hang analysis dump without a process state object and a session state object.
( process=(nil), sess=(nil) )
非rac环境收集Hang分析和系统状态的收集命令
有些时候数据库可能只是非常的慢而不是真正的Hang.因此建议收集级别为2的Hang分析和系统状态转储来判断这些进程是正在执行还是已经停止执行
持起分析

sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimit
oradebug hanganalyze 3
-- Wait one minute before getting the second hanganalyze
oradebug hanganalyze 3
oradebug tracefile_name
exit

系统转储

sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 266
oradebug dump systemstate 266
oradebug tracefile_name
exit

rac环境收集Hang分析和系统状态的收集命令
如果在你的系统中没有应用相关的补丁程序使用级别为266或267的系统状态转储会有2个bug.因此在没有应用这些补丁收集这些级别的转储是不明智的选择
补丁信息如下:
Document 11800959.8 Bug 11800959 - A SYSTEMSTATE dump with level >= 10 in RAC dumps huge BUSY GLOBAL CACHE ELEMENTS - can hang/crash instances
Document 11827088.8 Bug 11827088 - Latch 'gc element' contention, LMHB terminates the instance
在修正bug 11800959和bug 11827088的情况下对于rac环境惧订Hang分析和系统状态的收集命令如下:

sqlplus '/ as sysdba'
oradebug setorapname reco
oradebug  unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 266
oradebug -g all dump systemstate 266
exit

在没有修正bug 11800959和bug 11827088的情况下对于rac环境惧订Hang分析和系统状态的收集命令如下:

sqlplus '/ as sysdba'
oradebug setorapname reco
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 258
oradebug -g all dump systemstate 258
exit

在rac环境中会在每一个实例的跟踪文件中创建所有实例的转储信息
对Hang分析和系统状态转储的级别说明
Hang分析级别
level 3(级别3):在oracle11g之前level 3对Hang链表中的相关进程也会收集一个简短的堆栈信息
系统状态转储级别
level 258(级别258)是一个快速的选择但是会丢失一些锁的元数据信息
level 267(级别267)它包含了理解成本所需要的额外的缓冲区缓存/锁元数据信息
其它的方法
如果不能连接到系统时如何收集系统状态转储信息
通常有两种方法来在系统Hang不能连接时来生成系统状态转储信息
1.alter session set events 'immediate trace name SYSTEMSTATE level 10';
2.$ sqlplus
connect sys/passwd as sysdba
oradebug setospid oradebug unlimit
oradebug dump systemstate 10
(注意:在oradebug中不能使用任何半冒号,如果你的数据库是比oracle9i还老的版本你将需要使用svrmgrl来连接到内部)
当你使用这两种方法中的一种时,要确保在两次转储时内部连接断开.这种方法生成的转储将在你的user_dump_dest目录中是分开的ora_.trc文件
在非常严重的情况下不能使用svrmgrl或sqlplus进行连接执行这些必要的命令.在这种情况下仍然有一个后门方法使用调试器比如你的系统有dbx的话可以用dbx来生成系统状态转储信息.被连接到的转储核心进程可能会被杀死所以不能连接到一个oracle后台进程.dbx的语法如下:
dbx -a PID (where PID = any oracle shadow process)
dbx() print ksudss(10)
...return value printed here
dbx() detach
首先你需要找到一个影子进程

(jy) % ps -ef |grep sqlplus
osupport  78526 154096   0 12:11:05  pts/1  0:00 sqlplus scott/tiger
osupport  94130  84332   1 12:11:20  pts/3  0:00 grep sqlplus
(jy) % ps -ef |grep 78526
osupport  28348  78526   0 12:11:05      -  0:00 oracles734 (DESCRIPTION=(LOCAL
osupport  78526 154096   0 12:11:05  pts/1  0:00 sqlplus scott/tiger
osupport  94132  84332   1 12:11:38  pts/3  0:00 grep 78526

这样将会连接到影子进程PID 28348上.当返回提示符时输入ksudss(10)命令和detach:

(jy) % dbx -a 28348
Waiting to attach to process 28348 ...
Successfully attached to oracle.
warning: Directory containing oracle could not be determined.
Apply 'use' command to initialize source path.

Type 'help' for help.
reading symbolic information ...
stopped in read at 0xd016fdf0
0xd016fdf0 (read+0x114) 80410014        lwz   r2,0x14(r1)
(dbx) print ksudss(10)
2
(dbx) detach

在user_dump_dest目录中你将会通过跟踪的pid找到一个系统状态转储文件

(jy) % ls -lrt *28348*
-rw-r-----   1 osupport dba        46922 Oct 10 12:12 ora_28348.trc

core_28348:
total 72
-rw-r--r--   1 osupport dba        16567 Oct 10 12:12 core
drwxr-xr-x   7 osupport dba        12288 Oct 10 12:12 ../
drwxr-x---   2 osupport dba          512 Oct 10 12:12 ./

在跟踪文件中你将会找到常用的头信息.在oracle7.3.4并行操作系统中在这后面紧跟的是锁信息然后才是系统转储信息.
在oracle8并行操作系统中和非并行操作系统和oracle7.3.4非并行操作系统的系统状态信息是紧跟头信息.
在转储文件中头信息如下:

Dump file /oracle/mpp/734/rdbms/log/ora_28348.trc
Oracle7 Server Release 7.3.4.4.1 - Production
With the distributed, replication, parallel query, Parallel Server
and Spatial Data options
PL/SQL Release 2.3.4.4.1 - Production
ORACLE_HOME = /oracle/mpp/734
System name:    AIX
Node name:      saki
Release:        3
Version:        4
Machine:        000089914C00
Instance name: s734
Redo thread mounted by this instance: 2
Oracle process number: 0
Unix process pid: 28348, image:

ksinfy: nfytype = 0x5
ksinfy: calling scggra(&se)
scggra: SCG_PROCESS_LOCKING not defined
scggra: calling lk_group_attach()
ksinfy: returning
*** SESSION ID:(12.15) 2000.10.10.12.11.06.000
ksqcmi: get or convert
ksqcmi: get or convert
*** 2000.10.10.12.12.08.000
===================================================
SYSTEM STATE

.....

确保在这个文件中有一个end of system state.可以对它使用grep或在vi中搜索.如果没有那么这个跟踪文件是不过完整.
可能是因为init.ora文件中的max_dump_file的大小太小了.
对于oracle10g及以后的版本:
在有些情况下不连接到实例是允许的(在有些ora-20的情况下,对于oracle10.1.x,对于sqlplus有一个新选项来允许访问实例来生成跟踪文件)sqlplus -prelim / as sysdba
例如

export ORACLE_SID=PROD                                 ## Replace PROD with the SID you want to trace
sqlplus -prelim / as sysdba
oradebug setmypid
oradebug unlimit;
oradebug dump systemstate 10

在rac系统中,Hang分析,系统转储和其它一些rac信息可以使用racdiag.sql脚本来进行收集:

-- NAME: RACDIAG.SQL
-- SYS OR INTERNAL USER, CATPARR.SQL ALREADY RUN, PARALLEL QUERY OPTION ON
-- ------------------------------------------------------------------------
-- AUTHOR:
-- Michael Polaski - Oracle Support Services
-- Copyright 2002, Oracle Corporation
-- ------------------------------------------------------------------------
-- PURPOSE:
-- This script is intended to provide a user friendly guide to troubleshoot
-- RAC hung sessions or slow performance scenerios. The script includes
-- information to gather a variety of important debug information to determine
-- the cause of a RAC session level hang. The script will create a file
-- called racdiag_.out in your local directory while dumping hang analyze
-- dumps in the user_dump_dest(s) and background_dump_dest(s) on all nodes.
--
-- ------------------------------------------------------------------------
-- DISCLAIMER:
-- This script is provided for educational purposes only. It is NOT
-- supported by Oracle World Wide Technical Support.
-- The script has been tested and appears to work as intended.
-- You should always run new scripts on a test instance initially.
-- ------------------------------------------------------------------------
-- Script output is as follows:

set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,'Mondd_hhmi') timecol,
'.out' spool_extension from sys.dual;
column output new_value dbname
select value || '_' output
from v$parameter where name = 'db_name';
spool racdiag_&&dbname&&timestamp&&suffix
set lines 200
set pagesize 35
set trim on
set trims on
alter session set nls_date_format = 'MON-DD-YYYY HH24:MI:SS';
alter session set timed_statistics = true;
set feedback on
select to_char(sysdate) time from dual;

set numwidth 5
column host_name format a20 tru
select inst_id, instance_name, host_name, version, status, startup_time
from gv$instance
order by inst_id;

set echo on

-- WAIT CHAINS
-- 11.x+ Only (This will not work in < v11
-- See Note 1428210.1 for instructions on interpreting.
set pages 1000
set lines 120
set heading off
column w_proc format a50 tru
column instance format a20 tru
column inst format a28 tru
column wait_event format a50 tru
column p1 format a16 tru
column p2 format a16 tru
column p3 format a15 tru
column Seconds format a50 tru
column sincelw format a50 tru
column blocker_proc format a50 tru
column waiters format a50 tru
column chain_signature format a100 wra
column blocker_chain format a100 wra
SELECT *
FROM (SELECT 'Current Process: '||osid W_PROC, 'SID '||i.instance_name INSTANCE,
'INST #: '||instance INST,'Blocking Process: '||decode(blocker_osid,null,'',blocker_osid)||
' from Instance '||blocker_instance BLOCKER_PROC,'Number of waiters: '||num_waiters waiters,
'Wait Event: ' ||wait_event_text wait_event, 'P1: '||p1 p1, 'P2: '||p2 p2, 'P3: '||p3 p3,
'Seconds in Wait: '||in_wait_secs Seconds, 'Seconds Since Last Wait: '||time_since_last_wait_secs sincelw,
'Wait Chain: '||chain_id ||': '||chain_signature chain_signature,'Blocking Wait Chain: '||decode(blocker_chain_id,null,
'',blocker_chain_id) blocker_chain
FROM v$wait_chains wc,
v$instance i
WHERE wc.instance = i.instance_number (+)
AND ( num_waiters > 0
OR ( blocker_osid IS NOT NULL
AND in_wait_secs > 10 ) )
ORDER BY chain_id,
num_waiters DESC)
WHERE ROWNUM < 101; -- Taking Hang Analyze dumps  -- This may take a little while...  oradebug setmypid  oradebug unlimit  oradebug -g all hanganalyze 3  -- This part may take the longest, you can monitor bdump or udump to see if  -- the file is being generated.  oradebug -g all dump systemstate 258  -- WAITING SESSIONS:  -- The entries that are shown at the top are the sessions that have  -- waited the longest amount of time that are waiting for non-idle wait  -- events (event column). You can research and find out what the wait  -- event indicates (along with its parameters) by checking the Oracle  -- Server Reference Manual or look for any known issues or documentation  -- by searching Metalink for the event name in the search bar. Example  -- (include single quotes): [ 'buffer busy due to global cache' ].  -- Metalink and/or the Server Reference Manual should return some useful  -- information on each type of wait event. The inst_id column shows the  -- instance where the session resides and the SID is the unique identifier  -- for the session (gv$session). The p1, p2, and p3 columns will show  -- event specific information that may be important to debug the problem.  -- To find out what the p1, p2, and p3 indicates see the next section.  -- Items with wait_time of anything other than 0 indicate we do not know  -- how long these sessions have been waiting.  --  set numwidth 15 set heading on column state format a7 tru  column event format a25 tru  column last_sql format a40 tru  select sw.inst_id, sw.sid, sw.state, sw.event, sw.seconds_in_wait seconds,  sw.p1, sw.p2, sw.p3, sa.sql_text last_sql  from gv$session_wait sw, gv$session s, gv$sqlarea sa  where sw.event not in  ('rdbms ipc message','smon timer','pmon timer',  'SQL*Net message from client','lock manager wait for remote message',  'ges remote message', 'gcs remote message', 'gcs for action', 'client message',  'pipe get', 'null event', 'PX Idle Wait', 'single-task message',  'PX Deq: Execution Msg', 'KXFQ: kxfqdeq - normal deqeue',  'listen endpoint status','slave wait','wakeup time manager')  and sw.seconds_in_wait > 0
and (sw.inst_id = s.inst_id and sw.sid = s.sid)
and (s.inst_id = sa.inst_id and s.sql_address = sa.address)
order by seconds desc;

-- EVENT PARAMETER LOOKUP:
-- This section will give a description of the parameter names of the
-- events seen in the last section. p1test is the parameter value for
-- p1 in the WAITING SESSIONS section while p2text is the parameter
-- value for p3 and p3 text is the parameter value for p3. The
-- parameter values in the first section can be helpful for debugging
-- the wait event.
--
column event format a30 tru
column p1text format a25 tru
column p2text format a25 tru
column p3text format a25 tru
select distinct event, p1text, p2text, p3text
from gv$session_wait sw
where sw.event not in ('rdbms ipc message','smon timer','pmon timer',
'SQL*Net message from client','lock manager wait for remote message',
'ges remote message', 'gcs remote message', 'gcs for action', 'client message',
'pipe get', 'null event', 'PX Idle Wait', 'single-task message',
'PX Deq: Execution Msg', 'KXFQ: kxfqdeq - normal deqeue',
'listen endpoint status','slave wait','wakeup time manager')
and seconds_in_wait > 0
order by event;

-- GES LOCK BLOCKERS:
-- This section will show us any sessions that are holding locks that
-- are blocking other users. The inst_id will show us the instance that
-- the session resides on while the sid will be a unique identifier for
-- the session. The grant_level will show us how the GES lock is granted to
-- the user. The request_level will show us what status we are trying to
-- obtain.  The lockstate column will show us what status the lock is in.
-- The last column shows how long this session has been waiting.
--
set numwidth 5
column state format a16 tru;
column event format a30 tru;
select dl.inst_id, s.sid, p.spid, dl.resource_name1,
decode(substr(dl.grant_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)',
'KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)',
'KJUSEREX','Exclusive',request_level) as grant_level,
decode(substr(dl.request_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)',
'KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)',
'KJUSEREX','Exclusive',request_level) as request_level,
decode(substr(dl.state,1,8),'KJUSERGR','Granted','KJUSEROP','Opening',
'KJUSERCA','Canceling','KJUSERCV','Converting') as state,
s.sid, sw.event, sw.seconds_in_wait sec
from gv$ges_enqueue dl, gv$process p, gv$session s, gv$session_wait sw
where blocker = 1
and (dl.inst_id = p.inst_id and dl.pid = p.spid)
and (p.inst_id = s.inst_id and p.addr = s.paddr)
and (s.inst_id = sw.inst_id and s.sid = sw.sid)
order by sw.seconds_in_wait desc;

-- GES LOCK WAITERS:
-- This section will show us any sessions that are waiting for locks that
-- are blocked by other users. The inst_id will show us the instance that
-- the session resides on while the sid will be a unique identifier for
-- the session. The grant_level will show us how the GES lock is granted to
-- the user. The request_level will show us what status we are trying to
-- obtain.  The lockstate column will show us what status the lock is in.
-- The last column shows how long this session has been waiting.
--
set numwidth 5
column state format a16 tru;
column event format a30 tru;
select dl.inst_id, s.sid, p.spid, dl.resource_name1,
decode(substr(dl.grant_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)',
'KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)',
'KJUSEREX','Exclusive',request_level) as grant_level,
decode(substr(dl.request_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)',
'KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)',
'KJUSEREX','Exclusive',request_level) as request_level,
decode(substr(dl.state,1,8),'KJUSERGR','Granted','KJUSEROP','Opening',
'KJUSERCA','Cancelling','KJUSERCV','Converting') as state,
s.sid, sw.event, sw.seconds_in_wait sec
from gv$ges_enqueue dl, gv$process p, gv$session s, gv$session_wait sw
where blocked = 1
and (dl.inst_id = p.inst_id and dl.pid = p.spid)
and (p.inst_id = s.inst_id and p.addr = s.paddr)
and (s.inst_id = sw.inst_id and s.sid = sw.sid)
order by sw.seconds_in_wait desc;

-- LOCAL ENQUEUES:
-- This section will show us if there are any local enqueues. The inst_id will
-- show us the instance that the session resides on while the sid will be a
-- unique identifier for. The addr column will show the lock address. The type
-- will show the lock type. The id1 and id2 columns will show specific
-- parameters for the lock type.
--
set numwidth 12
column event format a12 tru
select l.inst_id, l.sid, l.addr, l.type, l.id1, l.id2,
decode(l.block,0,'blocked',1,'blocking',2,'global') block,
sw.event, sw.seconds_in_wait sec
from gv$lock l, gv$session_wait sw
where (l.sid = sw.sid and l.inst_id = sw.inst_id)
and l.block in (0,1)
order by l.type, l.inst_id, l.sid;

-- LATCH HOLDERS:
-- If there is latch contention or 'latch free' wait events in the WAITING
-- SESSIONS section we will need to find out which proceseses are holding
-- latches. The inst_id will show us the instance that the session resides
-- on while the sid will be a unique identifier for. The username column
-- will show the session's username. The os_user column will show the os
-- user that the user logged in as. The name column will show us the type
-- of latch being waited on. You can search Metalink for the latch name in
-- the search bar. Example (include single quotes):
-- [ 'library cache' latch ]. Metalink should return some useful information
-- on the type of latch.
--
set numwidth 5
select distinct lh.inst_id, s.sid, s.username, p.username os_user, lh.name
from gv$latchholder lh, gv$session s, gv$process p
where (lh.sid = s.sid and lh.inst_id = s.inst_id)
and (s.inst_id = p.inst_id and s.paddr = p.addr)
order by lh.inst_id, s.sid;

-- LATCH STATS:
-- This view will show us latches with less than optimal hit ratios
-- The inst_id will show us the instance for the particular latch. The
-- latch_name column will show us the type of latch. You can search Metalink
-- for the latch name in the search bar. Example (include single quotes):
-- [ 'library cache' latch ]. Metalink should return some useful information
-- on the type of latch. The hit_ratio shows the percentage of time we
-- successfully acquired the latch.
--
column latch_name format a30 tru
select inst_id, name latch_name,
round((gets-misses)/decode(gets,0,1,gets),3) hit_ratio,
round(sleeps/decode(misses,0,1,misses),3) "SLEEPS/MISS"
from gv$latch
where round((gets-misses)/decode(gets,0,1,gets),3) < .99
and gets != 0
order by round((gets-misses)/decode(gets,0,1,gets),3);

-- No Wait Latches:
--
select inst_id, name latch_name,
round((immediate_gets/(immediate_gets+immediate_misses)), 3) hit_ratio,
round(sleeps/decode(immediate_misses,0,1,immediate_misses),3) "SLEEPS/MISS"
from gv$latch
where round((immediate_gets/(immediate_gets+immediate_misses)), 3) < .99  and immediate_gets + immediate_misses > 0
order by round((immediate_gets/(immediate_gets+immediate_misses)), 3);

-- GLOBAL CACHE CR PERFORMANCE
-- This shows the average latency of a consistent block request.
-- AVG CR BLOCK RECEIVE TIME should typically be about 15 milliseconds
-- depending on your system configuration and volume, is the average
-- latency of a consistent-read request round-trip from the requesting
-- instance to the holding instance and back to the requesting instance. If
-- your CPU has limited idle time and your system typically processes
-- long-running queries, then the latency may be higher. However, it is
-- possible to have an average latency of less than one millisecond with
-- User-mode IPC. Latency can be influenced by a high value for the
-- DB_MULTI_BLOCK_READ_COUNT parameter. This is because a requesting process
-- can issue more than one request for a block depending on the setting of
-- this parameter. Correspondingly, the requesting process may wait longer.
-- Also check interconnect badwidth, OS tcp settings, and OS udp settings if
-- AVG CR BLOCK RECEIVE TIME is high.
--
set numwidth 20
column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9
select b1.inst_id, b2.value "GCS CR BLOCKS RECEIVED",
b1.value "GCS CR BLOCK RECEIVE TIME",
((b1.value / b2.value) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"
from gv$sysstat b1, gv$sysstat b2
where b1.name = 'global cache cr block receive time' and
b2.name = 'global cache cr blocks received' and b1.inst_id = b2.inst_id
or b1.name = 'gc cr block receive time' and
b2.name = 'gc cr blocks received' and b1.inst_id = b2.inst_id ;

-- GLOBAL CACHE LOCK PERFORMANCE
-- This shows the average global enqueue get time.
-- Typically AVG GLOBAL LOCK GET TIME should be 20-30 milliseconds. the
-- elapsed time for a get includes the allocation and initialization of a
-- new global enqueue. If the average global enqueue get (global cache
-- get time) or average global enqueue conversion times are excessive,
-- then your system may be experiencing timeouts. See the 'WAITING SESSIONS',
-- 'GES LOCK BLOCKERS', GES LOCK WAITERS', and 'TOP 10 WAIT EVENTS ON SYSTEM'
-- sections if the AVG GLOBAL LOCK GET TIME is high.
--
set numwidth 20
column "AVG GLOBAL LOCK GET TIME (ms)" format 9999999.9
select b1.inst_id, (b1.value + b2.value) "GLOBAL LOCK GETS",
b3.value "GLOBAL LOCK GET TIME",
(b3.value / (b1.value + b2.value) * 10) "AVG GLOBAL LOCK GET TIME (ms)"
from gv$sysstat b1, gv$sysstat b2, gv$sysstat b3
where b1.name = 'global lock sync gets' and
b2.name = 'global lock async gets' and b3.name = 'global lock get time'
and b1.inst_id = b2.inst_id and b2.inst_id = b3.inst_id
or b1.name = 'global enqueue gets sync' and
b2.name = 'global enqueue gets async' and b3.name = 'global enqueue get time'
and b1.inst_id = b2.inst_id and b2.inst_id = b3.inst_id;

-- RESOURCE USAGE
-- This section will show how much of our resources we have used.
--
set numwidth 8
select inst_id, resource_name, current_utilization, max_utilization,
initial_allocation
from gv$resource_limit
where max_utilization > 0
order by inst_id, resource_name;

-- DLM TRAFFIC INFORMATION
-- This section shows how many tickets are available in the DLM. If the
-- TCKT_WAIT columns says "YES" then we have run out of DLM tickets which
-- could cause a DLM hang. Make sure that you also have enough TCKT_AVAIL.
--
set numwidth 10
select * from gv$dlm_traffic_controller
order by TCKT_AVAIL;

-- DLM MISC
--
set numwidth 10
select * from gv$dlm_misc;

-- LOCK CONVERSION DETAIL:
-- This view shows the types of lock conversion being done on each instance.
--
select * from gv$lock_activity;

-- INITIALIZATION PARAMETERS:
-- Non-default init parameters for each node.
--
set numwidth 5
column name format a30 tru
column value format a50 wra
column description format a60 tru
select inst_id, name, value, description
from gv$parameter
where isdefault = 'FALSE'
order by inst_id, name;

-- TOP 10 WAIT EVENTS ON SYSTEM
-- This view will provide a summary of the top wait events in the db.
--
set numwidth 10
column event format a25 tru
select inst_id, event, time_waited, total_waits, total_timeouts
from (select inst_id, event, time_waited, total_waits, total_timeouts
from gv$system_event where event not in ('rdbms ipc message','smon timer',
'pmon timer', 'SQL*Net message from client','lock manager wait for remote message',
'ges remote message', 'gcs remote message', 'gcs for action', 'client message',
'pipe get', 'null event', 'PX Idle Wait', 'single-task message',
'PX Deq: Execution Msg', 'KXFQ: kxfqdeq - normal deqeue',
'listen endpoint status','slave wait','wakeup time manager')
order by time_waited desc)
where rownum < 11  order by time_waited desc;  -- SESSION/PROCESS REFERENCE:  -- This section is very important for most of the above sections to find out  -- which user/os_user/process is identified to which session/process.  --  set numwidth 7  column event format a30 tru  column program format a25 tru  column username format a15 tru  select p.inst_id, s.sid, s.serial#, p.pid, p.spid, p.program, s.username,  p.username os_user, sw.event, sw.seconds_in_wait sec  from gv$process p, gv$session s, gv$session_wait sw  where (p.inst_id = s.inst_id and p.addr = s.paddr)  and (s.inst_id = sw.inst_id and s.sid = sw.sid)  order by p.inst_id, s.sid;  -- SYSTEM STATISTICS:  -- All System Stats with values of > 0. These can be referenced in the
-- Server Reference Manual
--
set numwidth 5
column name format a60 tru
column value format 9999999999999999999999999
select inst_id, name, value
from gv$sysstat
where value > 0
order by inst_id, name;

-- CURRENT SQL FOR WAITING SESSIONS:
-- Current SQL for any session in the WAITING SESSIONS list
--
set numwidth 5
column sql format a80 wra
select sw.inst_id, sw.sid, sw.seconds_in_wait sec, sa.sql_text sql
from gv$session_wait sw, gv$session s, gv$sqlarea sa
where sw.sid = s.sid (+)
and sw.inst_id = s.inst_id (+)
and s.sql_address = sa.address
and sw.event not in ('rdbms ipc message','smon timer','pmon timer',
'SQL*Net message from client','lock manager wait for remote message',
'ges remote message', 'gcs remote message', 'gcs for action', 'client message',
'pipe get', 'null event', 'PX Idle Wait', 'single-task message',
'PX Deq: Execution Msg', 'KXFQ: kxfqdeq - normal deqeue',
'listen endpoint status','slave wait','wakeup time manager')
and sw.seconds_in_wait > 0
order by sw.seconds_in_wait desc;

-- WAIT CHAINS
-- 11.x+ Only (This will not work in < v11
-- See Note 1428210.1 for instructions on interpreting.
set pages 1000
set lines 120
set heading off
column w_proc format a50 tru
column instance format a20 tru
column inst format a28 tru
column wait_event format a50 tru
column p1 format a16 tru
column p2 format a16 tru
column p3 format a15 tru
column seconds format a50 tru
column sincelw format a50 tru
column blocker_proc format a50 tru
column waiters format a50 tru
column chain_signature format a100 wra
column blocker_chain format a100 wra
SELECT *
FROM (SELECT 'Current Process: '||osid W_PROC, 'SID '||i.instance_name INSTANCE,
'INST #: '||instance INST,'Blocking Process: '||decode(blocker_osid,null,'',blocker_osid)||
' from Instance '||blocker_instance BLOCKER_PROC,'Number of waiters: '||num_waiters waiters,
'Wait Event: ' ||wait_event_text wait_event, 'P1: '||p1 p1, 'P2: '||p2 p2, 'P3: '||p3 p3,
'Seconds in Wait: '||in_wait_secs Seconds, 'Seconds Since Last Wait: '||time_since_last_wait_secs sincelw,
'Wait Chain: '||chain_id ||': '||chain_signature chain_signature,'Blocking Wait Chain: '||decode(blocker_chain_id,null,
'',blocker_chain_id) blocker_chain
FROM v$wait_chains wc,
v$instance i
WHERE wc.instance = i.instance_number (+)
AND ( num_waiters > 0
OR ( blocker_osid IS NOT NULL
AND in_wait_secs > 10 ) )
ORDER BY chain_id,
num_waiters DESC)
WHERE ROWNUM < 101;

-- Taking Hang Analyze dumps
-- This may take a little while...
oradebug setmypid
oradebug unlimit
oradebug -g all hanganalyze 3
-- This part may take the longest, you can monitor bdump or udump to see
-- if the file is being generated.
oradebug -g all dump systemstate 258

set echo off

select to_char(sysdate) time from dual;

spool off

-- ---------------------------------------------------------------------------
Prompt;
Prompt racdiag output files have been written to:;
Prompt;
host pwd
Prompt alert log and trace files are located in:;
column host_name format a12 tru
column name format a20 tru
column value format a60 tru
select distinct i.host_name, p.name, p.value
from gv$instance i, gv$parameter p
where p.inst_id = i.inst_id (+)
and p.name like '%_dump_dest'
and p.name != 'core_dump_dest';
v$wait_chains

从oracle11gr1开始,dia0后台进程开始收集Hang分析信息并存储在内存中的"hang analysis cache"中.它会每3秒钟收集一次本地的Hang分析和第10秒钟收集一次全局(rac)Hang分析信息.这些信息在出现Hang时提供快速查看Hang链表的方法.
存储在"hang analysiz cache"中的数据对于诊断数据库竞争和Hang是非常有效的
有许多数据库功能可以利用Hang分析缓存:Hang Management, Resource Manager Idle Blocker Kill,
SQL Tune Hang Avoidance和PMON清除以及外部工具象Procwatcher
下面是oracle11gr2中v$wait_chains视图的描述:

SQL> desc v$wait_chains
  Name                                      Null     Type
  ----------------------------------------- -------- ----------------------
  CHAIN_ID                                           NUMBER
  CHAIN_IS_CYCLE                                     VARCHAR2(5)
  CHAIN_SIGNATURE                                    VARCHAR2(801)
  CHAIN_SIGNATURE_HASH                               NUMBER
  INSTANCE                                           NUMBER
  OSID                                               VARCHAR2(25)
  PID                                                NUMBER
  SID                                                NUMBER
  SESS_SERIAL#                                       NUMBER
  BLOCKER_IS_VALID                                   VARCHAR2(5)
  BLOCKER_INSTANCE                                   NUMBER
  BLOCKER_OSID                                       VARCHAR2(25)
  BLOCKER_PID                                        NUMBER
  BLOCKER_SID                                        NUMBER
  BLOCKER_SESS_SERIAL#                               NUMBER
  BLOCKER_CHAIN_ID                                   NUMBER
  IN_WAIT                                            VARCHAR2(5)
  TIME_SINCE_LAST_WAIT_SECS                          NUMBER
  WAIT_ID                                            NUMBER
  WAIT_EVENT                                         NUMBER
  WAIT_EVENT_TEXT                                    VARCHAR2(64)
  P1                                                 NUMBER
  P1_TEXT                                            VARCHAR2(64)
  P2                                                 NUMBER
  P2_TEXT                                            VARCHAR2(64)
  P3                                                 NUMBER
  P3_TEXT                                            VARCHAR2(64)
  IN_WAIT_SECS                                       NUMBER
  TIME_REMAINING_SECS                                NUMBER
  NUM_WAITERS                                        NUMBER
  ROW_WAIT_OBJ#                                      NUMBER
  ROW_WAIT_FILE#                                     NUMBER
  ROW_WAIT_BLOCK#                                    NUMBER
  ROW_WAIT_ROW#                                      NUMBER

注意:v$wait_chains等价于gv$视图可能在rac环境中报告多个实例
使用sql来查询基本信息

SQL> SELECT chain_id, num_waiters, in_wait_secs, osid, blocker_osid, substr(wait_event_text,1,30)
 FROM v$wait_chains; 2

 CHAIN_ID   NUM_WAITERS IN_WAIT_SECS OSID           BLOCKER_OSID         SUBSTR(WAIT_EVENT_TEXT,1,30)
 ---------- ----------- ------------ -------------- ------------------------- -----------------------------
1          0           10198        21045          21044                      enq: TX - row lock contention
 1          1           10214        21044                                    SQL*Net message from client

查询top 100 wait chain processs

 set pages 1000
 set lines 120
 set heading off
 column w_proc format a50 tru
 column instance format a20 tru
 column inst format a28 tru
 column wait_event format a50 tru
 column p1 format a16 tru
 column p2 format a16 tru
 column p3 format a15 tru
 column Seconds format a50 tru
 column sincelw format a50 tru
 column blocker_proc format a50 tru
 column waiters format a50 tru
 column chain_signature format a100 wra
 column blocker_chain format a100 wra

 SELECT *
 FROM (SELECT 'Current Process: '||osid W_PROC, 'SID '||i.instance_name INSTANCE,
 'INST #: '||instance INST,'Blocking Process: '||decode(blocker_osid,null,'',blocker_osid)||
 ' from Instance '||blocker_instance BLOCKER_PROC,'Number of waiters: '||num_waiters waiters,
 'Wait Event: ' ||wait_event_text wait_event, 'P1: '||p1 p1, 'P2: '||p2 p2, 'P3: '||p3 p3,
 'Seconds in Wait: '||in_wait_secs Seconds, 'Seconds Since Last Wait: '||time_since_last_wait_secs sincelw,
 'Wait Chain: '||chain_id ||': '||chain_signature chain_signature,'Blocking Wait Chain: '||decode(blocker_chain_id,null,
 '',blocker_chain_id) blocker_chain
 FROM v$wait_chains wc,
 v$instance i
 WHERE wc.instance = i.instance_number (+)
 AND ( num_waiters > 0
 OR ( blocker_osid IS NOT NULL
 AND in_wait_secs > 10 ) )
 ORDER BY chain_id,
 num_waiters DESC)
 WHERE ROWNUM < 101;


Current Process:21549                                   SID RAC1                 INST #: 1
Blocking Process: from Instance                   Number of waiters:1
Wait Event:SQL*Net message from client                  P1: 1650815232  P2: 1     P3:0
Seconds in Wait:36                                      Seconds Since Last Wait:
Wait Chaing:1 : 'SQL*Net message from client '< ='enq: TX - row lock contention'
Blocking Wait Chain:

Current Process:25627                                   SID RAC1                 INST #: 1
Blocking Process:21549 from Instance 1                  Number of waiters:0
Wait Event:enq: TX - row lock contention                P1:1415053318 P2: 524316 P3:50784
Seconds in Wait:22                                      Seconds Since Last Wait:
Wait Chain:1 : 'SQL*Net message from client '< ='enq: TX - row lock contention'
Blocking Wait Chain:

ospid 25627正等待一个TX lock正被ospid 21549所阻塞
ospid 21549正空闲等待'SQL*Net message from client'
在oracle11gr2中的最终阻塞会话
在oracle11gr2中可能将v$session.final_blocking_session看作是最终的阻塞者.最终的阻会话/进程在top等待链表上.
这些会话/进程可能是造成问题的原因.

set pages 1000
set lines 120
set heading off
column w_proc format a50 tru
column instance format a20 tru
column inst format a28 tru
column wait_event format a50 tru
column p1 format a16 tru
column p2 format a16 tru
column p3 format a15 tru
column Seconds format a50 tru
column sincelw format a50 tru
column blocker_proc format a50 tru
column fblocker_proc format a50 tru
column waiters format a50 tru
column chain_signature format a100 wra
column blocker_chain format a100 wra

SELECT *
FROM (SELECT 'Current Process: '||osid W_PROC, 'SID '||i.instance_name INSTANCE,
 'INST #: '||instance INST,'Blocking Process: '||decode(blocker_osid,null,'',blocker_osid)||
 ' from Instance '||blocker_instance BLOCKER_PROC,
 'Number of waiters: '||num_waiters waiters,
 'Final Blocking Process: '||decode(p.spid,null,'',
 p.spid)||' from Instance '||s.final_blocking_instance FBLOCKER_PROC,
 'Program: '||p.program image,
 'Wait Event: ' ||wait_event_text wait_event, 'P1: '||wc.p1 p1, 'P2: '||wc.p2 p2, 'P3: '||wc.p3 p3,
 'Seconds in Wait: '||in_wait_secs Seconds, 'Seconds Since Last Wait: '||time_since_last_wait_secs sincelw,
 'Wait Chain: '||chain_id ||': '||chain_signature chain_signature,'Blocking Wait Chain: '||decode(blocker_chain_id,null,
 '',blocker_chain_id) blocker_chain
FROM v$wait_chains wc,
 gv$session s,
 gv$session bs,
 gv$instance i,
 gv$process p
WHERE wc.instance = i.instance_number (+)
 AND (wc.instance = s.inst_id (+) and wc.sid = s.sid (+)
 and wc.sess_serial# = s.serial# (+))
 AND (s.final_blocking_instance = bs.inst_id (+) and s.final_blocking_session = bs.sid (+))
 AND (bs.inst_id = p.inst_id (+) and bs.paddr = p.addr (+))
 AND ( num_waiters > 0
 OR ( blocker_osid IS NOT NULL
 AND in_wait_secs > 10 ) )
ORDER BY chain_id,
 num_waiters DESC)
WHERE ROWNUM < 101;



Current Process:2309                                    SID RAC1                 INST #: 1
Blocking Process: from Instance                   Number of waiters:2
Wait Event:SQL*Net message from client                  P1: 1650815232  P2: 1     P3:0
Seconds in Wait:157                                     Seconds Since Last Wait:
Wait Chaing:1 : 'SQL*Net message from client '< ='enq: TM - contention'<='enq: TM - contention'
Blocking Wait Chain:

Current Process:2395                                    SID RAC1                 INST #: 1
Blocking Process:2309 from Instance 1                   Number of waiters:0
Final Block Process:2309 from Instance 1                Program: oracle@racdbe1.us.oracle.com (TNS V1-V3)
Wait Event:enq: TX - contention                         P1:1415053318 P2: 524316 P3:50784
Seconds in Wait:139                                      Seconds Since Last Wait:
Wait Chain:1 : 'SQL*Net message from client '< ='enq: TM - contention'<='enq: TM - contention'
Blocking Wait Chain:

B.对数据库性能生成一个awr/statspack快照
C.收集最新的RDA
最新的RDA提供了大量额外关于数据库配置和性能度量的信息可以用来检测可能影响性能的热点的后台进程问题
有时数据库不是真正的被hang住可是只是'spinning' cpu.可以使用以下方法来检查服务器是hang还是spin如果一个操作执行的时间比期待的时间长或者这个操作损害了其它操作的性能时那么最好是检查v$session_wait视图.这个视图显示了在系统中会话当前正在等待的信息.可以使用下面的脚本来操作.

column sid format 990
column seq# format 99990
column wait_time heading 'WTime' format 99990
column event format a30
column p1 format 9999999990
column p2 format 9999999990
column p3 format 9990
select sid,event,seq#,p1,p2,p3,wait_time
from V$session_wait
order by sid
/

上面的查询最少应该执行三次并比较其它查询结果
列意思
sid-- 会话的系统标识符
seq#--序列号.当一个特定会话的等待一个新的事件时这个数字会增加.它能告诉你一个会话是否正在执行
evnet--会话正在等待的或最后等待的操作
p1,p2,p3--它们代表不同的等待值
wait_time--0指示这个会话正在等待的事件.非0指示这个会话最后等待的事件和会话正使用CPU
例如:

 SID EVENT                            SEQ#          P1          P2    P3  WTime
---- ------------------------------ ------ ----------- ----------- ----- ------
   1 pmon timer                        335         300           0     0      0
   2 rdbms ipc message                 779         300           0     0      0
   6 smon timer                         74         300           0     0      0
   9 Null event                        347           0         300     0      0
  16 SQL*Net message from client      1064  1650815315           1     0     -1

如果脚本查询的结果显示正在等待一个enqueue等待事件那么你将需要检查与你hang会话相关的锁信息
column sid format 990
column type format a2
column id1 format 9999999990
column id2 format 9999999990
column Lmode format 990
column request format 990
select * from v$lock
/
Spinning
在spin的情况下事件通常来说是静态的且会话不会是正在等待一个事件--而是在等待cpu(注意在极少数情况下,这个事件依赖于执行spin的代码也可能不会静态的.如果会自豪感是spin它将严重使用cpu和内存资源.
对于一个spin的情况重要的是要检测会话正处于spinning的代码.从事件的一些迹象说明通常需要对一个进程生成几次的错误堆栈信息用来分析:
connect sys/sys as sysdba
oradebug setospid
oradebug unlimit
oradebug dump errorstack 3
oradebug dump errorstack 3
oradebug dump errorstack 3
这里的spid是操作系统标识符可以从v$process视图是得到.
Hanging
在正常的情况下在v$session_wait视图中的值应该是用每个会话执行的不同操作来替换.
在hang住的情况下对于一个或一组特定会话的所有系统事件将会是保持静态状态且进程不会消耗任何cpu和内存资源.鉴于会话现在没有请求锁定任何资源这就叫hang
在这种情况下可对实例转储系统状态来获得一些更详细更有用的信息.
ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME SYSTEMSTATE LEVEL XX';
在oralce9.2.0.6或oracle10.1.0.4或在oracle10g中最高的版本的中这里的xx是266.执行上面的命令在你的user_dump_dest目录中会生成系统状态跟踪文件.
通过下面的查询可以得到问题进程的进程ID
SELECT pid FROM v$process
WHERE addr =
(SELECT paddr FROM v$session
WHERE sid = sid_of_problem_session);
系统状态转储文件包含了每一个进程的信息.可以通过搜索'PROCESS '来找到每一个进程的详细信息.通过搜索'waiting for'来找到当前正在等待的事件.

posted @ 2020-04-30 14:13  耀阳居士  阅读(616)  评论(0编辑  收藏  举报