20、解析2_2(等待、awr、ora-04031)
解析
在oracle 11g里面,对于shared pool,都有哪些常见的等待事件(也就是异常的现象)
1、链的争用等待事件:
library cache里面
在oracle 11g中:chain链争用,主要表现为library cache中的链的争用:library cache mutex X的争用,经常伴随着硬解析
在oracle 10g中:chain链争用,主要表现为library cache中的链的争用:library cache latch的争用
free cache里面
latch的争用(这个10g和11g里面都一样):在shared pool的free空间里的latch争用:一个用户以S的方式在找一个free chunk,另一个用户以S的方式要从链上摘一个free chunk,这时候,在free空间里就产生了latch的争用,往往也伴随着硬解析
row cache里面
也会出现链的争用,表现为row cache objects的latch争用,往往也伴随着硬解析
2、cursor的争用等待事件:
cursor: pin S wait on X等待事件
A session waits for this event when it is requesting a shared mutex pin and another session is holding an exclusive mutex pin on the same cursor object.
就是一个server process1以X的方式(也就是要进行解析)持有一个链,这个链上面挂着一些cursor,然后找到要修改的cursor,pin住这个cursor,然后释放链;
然后server process2以S的方式持有着同一个链,因为server process1已经把链释放了,然后要读取同一个cursor,因为server process1已经pin住了,所以就会产生等待,要等待server process1解析完这个cursor,server process2才能进行读取
cursor: pin X等待事件
A session waits on this event when it is requesting an exclusive mutex pin for a cursor object and it must wait because the resource is busy. The mutex pin for a cursor object can be busy either because a session is already holding it exclusive, or there are one or more sessions which are holding shared mutex pin(s). The exclusive waiter must wait until all holders of the pin for that cursor object have released it, before it can be granted.
是一个server process1以X或者S的方式持有着一个链,这个链上面挂着一些cursor,然后找到cursor,pin住这个cursor,然后释放链;
然后server process2以X的方式持有着同一个链,因为server process1已经把链释放了,然后要对同一个cursor进行解析修改,因为server process1已经pin住了,所以就会产生等待,要等待server process1解析完或者读完这个cursor,server process2才能进行解析修改
模拟硬解析,看硬解析往往伴随着什么?
SQL> alter user scott identified by tiger account unlock; -- 解锁Scott用户,且密码为tiget
User altered.
SQL> grant dba to scott; -- 授dba权限
Grant succeeded.
SQL> connect scott/tiger -- 连接Scott用户
Connected.
使用Scott用户建二十张表:
create table testlib1 (id number) ;
create table testlib2 (id number) ;
create table testlib3 (id number) ;
create table testlib4 (id number) ;
create table testlib5 (id number) ;
create table testlib6 (id number) ;
create table testlib7 (id number) ;
create table testlib8 (id number) ;
create table testlib9 (id number) ;
create table testlib10 (id number) ;
create table testlib11 (id number) ;
create table testlib12 (id number) ;
create table testlib13 (id number) ;
create table testlib14 (id number) ;
create table testlib15 (id number) ;
create table testlib16 (id number) ;
create table testlib17 (id number) ;
create table testlib18 (id number) ;
create table testlib19 (id number) ;
create table testlib20 (id number) ;
SQL> exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT (); --生成一个快照
编辑生成一个脚本(为了模拟大量的硬解析):
[oracle@db11g ~]$ vim test.sh
#!/bin/ksh
i="$1"
while true
do
echo $i
sqlplus scott/tiger << EOF
select * from testlib$i where id = $RANDOM; -- 这是一个常量SQL,会不停的访问testlib$i这个表,一直访问
exit
EOF
done
[oracle@rac11g ~]$ chmod a+x test.sh
执行脚本:
[oracle@db11g ~]$ nohup sh test.sh 1 &
nohup sh test.sh 2 &
nohup sh test.sh 3 &
nohup sh test.sh 4 &
nohup sh test.sh 5 &
nohup sh test.sh 6 &
nohup sh test.sh 7 &
nohup sh test.sh 8 &
nohup sh test.sh 9 &
nohup sh test.sh 10 &
nohup sh test.sh 11 &
nohup sh test.sh 12 &
nohup sh test.sh 13 &
nohup sh test.sh 14 &
nohup sh test.sh 15 &
nohup sh test.sh 16 &
nohup sh test.sh 17 &
nohup sh test.sh 18 &
nohup sh test.sh 19 &
nohup sh test.sh 20 &
[oracle@db11g ~]$ ps
PID TTY TIME CMD
1113 pts/0 00:00:00 sqlplus
1115 pts/0 00:00:00 sqlplus
1116 pts/0 00:00:00 sqlplus
1117 pts/0 00:00:00 sqlplus
1118 pts/0 00:00:00 sqlplus
1119 pts/0 00:00:00 sqlplus
1120 pts/0 00:00:00 sqlplus
1121 pts/0 00:00:00 sqlplus
1122 pts/0 00:00:00 sqlplus
1123 pts/0 00:00:00 sqlplus
1125 pts/0 00:00:00 sqlplus
1126 pts/0 00:00:00 sqlplus
1127 pts/0 00:00:00 sqlplus
1128 pts/0 00:00:00 sqlplus
1129 pts/0 00:00:00 sqlplus
1130 pts/0 00:00:00 sqlplus
1132 pts/0 00:00:00 sqlplus
1134 pts/0 00:00:00 sqlplus
1150 pts/0 00:00:00 sqlplus
1152 pts/0 00:00:00 sqlplus
1153 pts/0 00:00:00 sqlplus
1155 pts/0 00:00:00 ps
6431 pts/0 00:00:01 sh
7862 pts/0 00:00:01 sh
31873 pts/0 00:00:00 bash
31915 pts/0 00:00:01 sh
31916 pts/0 00:00:01 sh
31917 pts/0 00:00:01 sh
31918 pts/0 00:00:01 sh
31919 pts/0 00:00:01 sh
31920 pts/0 00:00:01 sh
31921 pts/0 00:00:01 sh
31922 pts/0 00:00:01 sh
31923 pts/0 00:00:01 sh
31924 pts/0 00:00:01 sh
31925 pts/0 00:00:01 sh
31926 pts/0 00:00:01 sh
31927 pts/0 00:00:01 sh
31928 pts/0 00:00:01 sh
31929 pts/0 00:00:01 sh
31930 pts/0 00:00:01 sh
31931 pts/0 00:00:01 sh
31932 pts/0 00:00:01 sh
31933 pts/0 00:00:01 sh
查找常量的SQL:
SQL> SET pagesize 0
SQL> SET linesize 250
SQL> column FORCE_MATCHING_SIGNATURE format 99999999999999999999999
WITH c AS
(SELECT FORCE_MATCHING_SIGNATURE,
COUNT(*) cnt
FROM v$sqlarea
WHERE FORCE_MATCHING_SIGNATURE!=0
GROUP BY FORCE_MATCHING_SIGNATURE
HAVING COUNT(*) > 20
)
,
sq AS
(SELECT sql_text,
FORCE_MATCHING_SIGNATURE,
row_number() over (partition BY FORCE_MATCHING_SIGNATURE ORDER BY sql_id DESC) p
FROM v$sqlarea s
WHERE FORCE_MATCHING_SIGNATURE IN
(SELECT FORCE_MATCHING_SIGNATURE
FROM c
)
)
SELECT sq.sql_text ,
sq.FORCE_MATCHING_SIGNATURE,
c.cnt "unshared count" -- 没有共享的SQL的数量
FROM c,
sq
WHERE sq.FORCE_MATCHING_SIGNATURE=c.FORCE_MATCHING_SIGNATURE
AND sq.p =1
ORDER BY c.cnt DESC;
再次生成一个快照:
SQL> exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();
PL/SQL procedure successfully completed.
生成一个报告:
SQL> @?/rdbms/admin/awrrpt.sql
Current Instance
~~~~~~~~~~~~~~~~
1458193369 ORCL 1 orcl
Specify the Report Type
~~~~~~~~~~~~~~~~~~~~~~~
Would you like an HTML report, or a plain text report?
Enter 'html' for an HTML report, or 'text' for plain text
Defaults to 'html'
Enter value for report_type: text
Type Specified: text
Instances in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DB Id Inst Num DB Name Instance Host
------------ -------- ------------ ------------ ------------
* 1458193369 1 ORCL orcl db11g
Using 1458193369 for database Id
Using 1 for instance number
Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed. Pressing <return> without
specifying a number lists all completed snapshots.
Enter value for num_days: 1
Listing the last day's Completed Snapshots
SnapInstance DB Name Snap Id Snap Started Level
------------ ------------ --------- ------------------ -----
orcl ORCL 8 13 Dec 2016 00:00 1
9 13 Dec 2016 01:00 1
10 13 Dec 2016 01:49 1 -- 执行SQL之前生成的快照
11 13 Dec 2016 03:00 1 -- 执行SQL之后生成的快照
Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap: 10
Begin Snapshot Id specified: 10
Enter value for end_snap: 11
End Snapshot Id specified: 11
Specify the Report Name
~~~~~~~~~~~~~~~~~~~~~~~
The default report file name is awrrpt_1_10_11.txt. To use this name,
press <return> to continue, otherwise enter an alternative.
Enter value for report_name:
Using the report name awrrpt_1_10_11.txt -- 报告的名字
查看报告:
查询常量SQL:
WITH c AS
(SELECT FORCE_MATCHING_SIGNATURE,
COUNT(*) cnt
FROM v$sqlarea
WHERE FORCE_MATCHING_SIGNATURE!=0
GROUP BY FORCE_MATCHING_SIGNATURE
HAVING COUNT(*) > 20
)
,
sq AS
(SELECT sql_text ,
FORCE_MATCHING_SIGNATURE,
row_number() over (partition BY FORCE_MATCHING_SIGNATURE ORDER BY sql_id DESC) p
FROM v$sqlarea s
WHERE FORCE_MATCHING_SIGNATURE IN
(SELECT FORCE_MATCHING_SIGNATURE
FROM c
)
)
SELECT sq.sql_text ,
sq.FORCE_MATCHING_SIGNATURE,
c.cnt "unshared count" -- 没有共享的SQL的数量
FROM c
,
sq
WHERE sq.FORCE_MATCHING_SIGNATURE=c.FORCE_MATCHING_SIGNATURE
AND sq.p =1
ORDER BY c.cnt DESC;
查询结果:
这里有一个参数:CURSOR_SHARING
SQL> show parameter CURSOR_SHARING;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cursor_sharing string EXACT
如果将CURSOR_SHARING这个参数改为:false之后,oracle收到一个常量SQL之后,会将字面常量改为绑定变量之后再进行解析,执行,获取(比如:将select * from testlib17 where id = 4019 改为 select * from testlib17 where id = :1,然后求一下select * from testlib17 where id = :1 这个SQL的hash值,这个hash值就等于EXACT_MATCHING_SIGNATURE的:6.69228038869743E18),后面的没有共享的数量 38,说明的是SQL是一样的,都是select * from testlib17 where id = :1这个SQL的hash值6.69228038869743E18,只是最后的常量不同,有38个不同的常量
然后再生成一个快照:
SQL> exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();
PL/SQL procedure successfully completed.
再执行脚本,生成报告:
SQL> @?/rdbms/admin/awrrpt.sql
查看解析花费的时间(也在报告里):
查询latch:
select * from v$latch_children order by misses desc;
查询mutex:
select * from v$mutex_sleep order by sleeps;
INVALIDATIONS:
就是说:select一个表,这个表被做了DDL操作,这个select所对应的cursor就变为了INVALIDATIONS(无效的),主要是这种情况;还有就是这个表的统计信息被重新收集了,统计信息和DDL操作是导致cursor变为无效很主要的两个原因;
所以INVALIDATIONS多,就说明这个表被做了DDL或者重新收集了统计信息
RELOADS:
RELOADS往往跟硬解析有关,就是一个父游标然后下面有子游标,子游标下面是执行计划,shared pool里,假设free空间比较小的时候,或者找不到free空间的时候,就会找到library cache,然后把执行计划扔掉,释放掉,就可以用这个free空间,然后另外一个sp想要来执行的时候,发现执行计划没有了,就要重新解析一下,重新生成执行计划,这就叫做:RELOADS
还有一种情况就是子游标过多(child Number 之cursors),会产生大量的latch争用:
就是这两个latch的争用:
1、11g:chain争用,library cache中的chain的争用
library cache: mutex X,硬解析
2、10g:chain争用,library cache中的chain的争用
latch:library cache
查询(关于INVALIDATIONS的问题):
SELECT SUBSTR(sql_text, 1, 40) "SQL",
invalidations
FROM v$sqlarea
where invalidations >0
ORDER BY invalidations DESC;
查询(针对子游标过多的情况):
SELECT address, hash_value,
version_count ,
users_opening ,
users_executing,
substr(sql_text,1,40) "SQL"
FROM v$sqlarea
WHERE version_count > 10
ORA-04031错误:
它往往会对应一个trace文件,然后对比着trace文件,看mos,mos里有许多关于ORA-04031错误的文档
查看哪些SQL占用了大量的空间:
SELECT substr(sql_text,1,40) "Stmt", count(*),
sum(sharable_mem) "Mem",
sum(users_opening) "Open",
sum(executions) "Exec"
FROM v$sql
GROUP BY substr(sql_text,1,40)
HAVING sum(sharable_mem) > &MEMSIZE;
占空间超过10k的SQL: