[20250215]21c library cache mutex的深入探究15(_mutex_spin_count).txt

[20250215]21c library cache mutex的深入探究15(_mutex_spin_count).txt

--//探究_mutex_spin_count隐含参数在_mutex_wait_scheme=2的情况下。

1.环境:
SYS@book> @ver2
==============================
PORT_STRING                   : x86_64/Linux 2.4.xx
VERSION                       : 21.0.0.0.0
BANNER                        : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
BANNER_FULL                   : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.3.0.0.0
BANNER_LEGACY                 : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
CON_ID                        : 0
PL/SQL procedure successfully completed.

SYS@book> @ hidez ^_mutex
 NUM N_HEX NAME               DESCRIPTION       DEFAULT_VALUE SESSION_VALUE SYSTEM_VALUE ISSES ISSYS_MOD
---- ----- ------------------ ----------------- ------------- ------------- ------------ ----- ---------
3553   DE1 _mutex_wait_time   Mutex wait time   TRUE          1             1            FALSE IMMEDIATE
3554   DE2 _mutex_spin_count  Mutex spin count  TRUE          255           255          FALSE IMMEDIATE
3555   DE3 _mutex_wait_scheme Mutex wait scheme TRUE          2             2            FALSE IMMEDIATE
--//缺省_mutex_wait_time=1,时间单位与_mutex_wait_scheme相关,_mutex_wait_scheme=2时时间单位是厘秒,而
--//_mutex_wait_scheme=0,1时,单位时毫秒。
--//_mutex_wait_scheme =2时,_mutex_wait_time>1时sleeps的时间会出现指数回退.

--//缺省_mutex_wait_scheme =2.

--//网上找了一段资料:
* _mutex_spin_count (Integer)
- This sets the number of times to spin before yielding/waiting.

* _mutex_wait_scheme (Integer)
- In 11.2 this controls which wait scheme to use. It can be set to one of the three wait schemes described above thus:
_mutex_wait_scheme = 0                        – Always YIELD
_mutex_wait_scheme = 1 & _mutex_wait_time = t – Always SLEEP for t milli-seconds
_mutex_wait_scheme = 2 & _mutex_wait_time = t – EXP BACKOFF with maximum sleep (default)

2.测试前准备:
SYS@book> @ hide _mutex_spin_count
SYS@book> @ pr
==============================
NAME                          : _mutex_spin_count
DESCRIPTION                   : Mutex spin count
DEFAULT_VALUE                 : TRUE
SESSION_VALUE                 : 255
SYSTEM_VALUE                  : 255
ISSES_MODIFIABLE              : FALSE
ISSYS_MODIFIABLE              : IMMEDIATE
PL/SQL procedure successfully completed.

SYS@book> oradebug setmypid
Statement processed.
SYS@book> oradebug dump library_cache 4
Statement processed.

$ grep "^Bucket:" /u01/app/oracle/diag/rdbms/book/book/trace/book_ora_10770.trc | head -4
Bucket: #=0 Mutex=0x6cfa1400(1125281431552, 9, 0, 6)
Bucket: #=5 Mutex=0x6cfa14f0(1125281431552, 4, 0, 6)
Bucket: #=17 Mutex=0x6cfa1730(1125281431552, 8, 0, 6)
Bucket: #=39 Mutex=0x6cfa1b50(1125281431552, 4, 0, 6)
--//以上是前面测试的结果,直接找Bucket: #=0 Mutex=0x6cfa1400的语句测试。
--//该sql语句select /*+ 9 */ count(*) from dept where deptno = 93834;的bucket=0.
--//Bucket: #=0 Mutex=0x6cfa1400
--//0x6cfa1400-0x10 = 0x6cfa13f0

3.测试:
--//session 1:
SCOTT@book01p> @ spid
==============================
SID                           : 282
SERIAL#                       : 27527
PROCESS                       : 3400
SERVER                        : DEDICATED
SPID                          : 3402
PID                           : 42
P_SERIAL#                     : 5
KILL_COMMAND                  : alter system kill session '282,27527' immediate;
PL/SQL procedure successfully completed.

SCOTT@book01p> alter session set session_cached_cursors=0 ;
Session altered.

SCOTT@book01p> select /*+ 9 */ count(*) from dept where deptno = 93834;
  COUNT(*)
----------
         0
--//执行以上语句多次。

--//session 2:
SYS@book> @opeek 0x6cfa13f0 48 0
[06CFA13F0, 06CFA1420) = 66381710 00000000 66381710 00000000 00000000 00000000 00000004 00000000 00000000 00000000 643D2260 00000000

--//session 3:
SYS@book> @ spid
==============================
SID                           : 24
SERIAL#                       : 30581
PROCESS                       : 3417
SERVER                        : DEDICATED
SPID                          : 3418
PID                           : 88
P_SERIAL#                     : 2
KILL_COMMAND                  : alter system kill session '24,30581' immediate;
PL/SQL procedure successfully completed.

--//window 1,跟踪session 1:
$ rlgdb -f -p 3402
(gdb) b semtimedop
Breakpoint 1 at 0x7f841a6d7fc0: file ../sysdeps/unix/syscall-template.S, line 81.

--//window 2,跟踪session 3:
$ rlgdb -f -p 3418

(gdb) x /8wx 0x6cfa1400
0x6cfa1400:     0x00000000      0x00000000      0x00000006      0x00000000
0x6cfa1410:     0x00000000      0x00000000      0x643d2260      0x00000000
(gdb) set *(int *)0x6cfa1404=0x7777

(gdb) x /8wx 0x6cfa1400
0x6cfa1400:     0x00000000      0x00007777      0x00000006      0x00000000
0x6cfa1410:     0x00000000      0x00000000      0x643d2260      0x00000000

--//session 1:
SCOTT@book01p> select /*+ 9 */ count(*) from dept where deptno = 93834;
--//挂起!!

--//window 1:
(gdb) c
Continuing.

Breakpoint 1, semtimedop () at ../sysdeps/unix/syscall-template.S:81
/usr/src/debug/glibc-2.17-c758a686/sysdeps/unix/syscall-template.S:81:3374:beg:0x7f841a6d7fc0
(gdb) bt
#0  semtimedop () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000000015339805 in sskgpwwait ()
#2  0x0000000015334595 in skgpwwait ()
#3  0x000000001498b09e in ksliwat ()
#4  0x000000001498a4ba in kslwaitctx ()
#5  0x0000000005abd22c in kgxWait ()
#6  0x0000000015398630 in kgxExclusive ()
#7  0x0000000015371361 in kglGetMutex ()
#8  0x0000000015365edb in kglhdgn ()
#9  0x0000000015363756 in kglLock ()
#10 0x000000001535e178 in kglget ()
#11 0x0000000014fc6882 in kkspsc0 ()
#12 0x0000000014fc45f2 in kksParseCursor ()
#13 0x00000000151c963f in opiosq0 ()
#14 0x0000000014f6f166 in kpooprx ()
#15 0x0000000014f6ca1b in kpoal8 ()
#16 0x0000000014eefdf8 in opiodr ()
#17 0x0000000015294359 in ttcpip ()
#18 0x00000000030025c2 in opitsk ()
#19 0x0000000003007920 in opiino ()
#20 0x0000000014eefdf8 in opiodr ()
#21 0x0000000002ffe2ab in opidrv ()
#22 0x0000000003e8bd15 in sou2o ()
#23 0x0000000000e91ae0 in opimai_real ()
#24 0x0000000003e9915c in ssthrdmain ()
#25 0x0000000000e91924 in main ()
--//mutex_spin_count会在那个函数里面呢?猜测一下在kgxExclusive。
--//反汇编kgxExclusive。

(gdb) disassemble kgxExclusive
....
   0x00000000153985fd <+653>:   mov    %edx,%ecx
   0x00000000153985ff <+655>:   inc    %edx
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   0x0000000015398601 <+657>:   movzwl %dx,%edx
   0x0000000015398604 <+660>:   cmp    %r15d,%ecx
   0x0000000015398607 <+663>:   je     0x15398610 <kgxExclusive+672>
   0x0000000015398609 <+665>:   pause
   0x000000001539860b <+667>:   jmpq   0x153983b6 <kgxExclusive+70>
   0x0000000015398610 <+672>:   mov    %r12,%rdi
   0x0000000015398613 <+675>:   mov    %r14,%rsi
   0x0000000015398616 <+678>:   xor    %r9d,%r9d
   0x0000000015398619 <+681>:   lea    -0xb8(%rbp),%rdx
   0x0000000015398620 <+688>:   lea    -0x240(%rbp),%rcx
   0x0000000015398627 <+695>:   lea    -0x30(%rbp),%r8
   0x000000001539862b <+699>:   callq  0x5abcd20 <kgxWait>
   0x0000000015398630 <+704>:   mov    %eax,%r11d
   0x0000000015398633 <+707>:   incw   0x12(%r14)
   0x0000000015398638 <+712>:   xor    %edx,%edx
   0x000000001539863a <+714>:   jmp    0x15398609 <kgxExclusive+665>
   0x000000001539863c <+716>:   nopl   0x0(%rax)
End of assembler dump.

--//首先说明,不熟悉汇编,不过可以猜测,注意看下划线存在inc +1调用。后面出现cmp的比较,je相等跳到<kgxExclusive+672>。
--//jmpq 跳到<kgxExclusive+70>,猜测%r15d保存的就是_mutex_spin_count计数。
--//注:实际上我直接上kimi,贴上整个反汇编代码,提出需求要找出循环部分.
--//结论部分给出的答案:
--//循环的入口是 0x0000000015398609 <+665>,循环体从这里开始,通过 jmpq 指令回到循环的开始位置,直到满足退出条件(cmp
--//%r15d,%ecx 和 je 0x15398610)。循环体中调用了 kgxWait 函数,并对某些内存位置和寄存器进行操作。
--//现在的人工智能真的有点恐怖.也尝试deepseek,有点卡,也找到了,但是给出的提示不是太准确。

--//建立如下脚本:
$ cat spin_mutex.gdb
break *0x00000000153985ff
  commands
    silent
    printf " mutex spin count loop: %d %d \n", $r15,$rdx
    c
  end

--//window 1:
$ rlgdb -f -p 3402 -x spin_mutex.gdb
Breakpoint 1 at 0x153985ff
(gdb) c
Continuing.
 mutex spin count loop: 255 0
 mutex spin count loop: 255 1
 mutex spin count loop: 255 2
 mutex spin count loop: 255 3
 mutex spin count loop: 255 4
 mutex spin count loop: 255 5
 mutex spin count loop: 255 6
 mutex spin count loop: 255 7
 mutex spin count loop: 255 8
 mutex spin count loop: 255 9
 mutex spin count loop: 255 10
...
--//说明基本跟踪正确。

--//session 2:
SYS@book> alter system set "_mutex_spin_count"=200 scope=memory;
System altered.

--//原来的会话要退出才能生效,重复前面的测试,发现:
Breakpoint 1 at 0x153985ff
(gdb) c
Continuing.
 mutex spin count loop: 200 0
 mutex spin count loop: 200 1
 mutex spin count loop: 200 2
 mutex spin count loop: 200 3
 mutex spin count loop: 200 4
 mutex spin count loop: 200 5
 mutex spin count loop: 200 6
 mutex spin count loop: 200 7
 mutex spin count loop: 200 8
 mutex spin count loop: 200 9
 mutex spin count loop: 200 10
..
--//说明自己的判断完全正确。

4.编写gdb脚本:

$ cat spin_mutex.gdb
set pagination off
#set print repeats 0
#set print elements 0
set logging file /tmp/spin_mutex.log
set logging overwrite on
set logging on
set $sleep_count  = 0

break semtimedop
  commands
    silent
        shell echo -n $( date +"%Y/%m/%d %T.%N : ")
    printf "semtimedop %d\n", ++$sleep_count
    c
end

break sched_yield
  commands
    silent
        shell echo -n $( date +"%Y/%m/%d %T.%N : ")
    printf "sched_yield %d\n", ++$sleep_count
    c
end

break select
  commands
    silent
        shell echo -n $( date +"%Y/%m/%d %T.%N : ")
    printf "select %d\n", ++$sleep_count
    c
end

#break kgxExclusive
#  commands
#    silent
#        shell echo -n $( date +"%Y/%m/%d %T.%N : ")
#    printf "kgxExclusive \n"
#    c
#end

#break *0x00000000153985ff
#  commands
#    silent
#    printf " mutex spin count loop: %d %d \n", $r15d,$edx
#    c
#  end

--//使用它验证sleep计数是否正确。
(gdb) x /8wx 0x6cfa1400
0x6cfa1400:     0x00000000      0x00000000      0x0000000f      0x00000d72
0x6cfa1410:     0x00000000      0x00000000      0x643d2260      0x00000000

(gdb) set *(int *)0x6cfa1404=0x0888
(gdb) x /8wx 0x6cfa1400
0x6cfa1400:     0x00000000      0x00000888      0x0000000f      0x00000d72
0x6cfa1410:     0x00000000      0x00000000      0x643d2260      0x00000000

(gdb) set *(int *)0x6cfa1404=0x0
(gdb) x /8wx 0x6cfa1400
0x6cfa1400:     0x00000000      0x00000000      0x00000010      0x00000ea3
0x6cfa1410:     0x00000000      0x00000000      0x643d2260      0x00000000

$ head -10 /tmp/spin_mutex.log; tail -2  /tmp/spin_mutex.log
Breakpoint 1 at 0x7f2241b1ffc0: file ../sysdeps/unix/syscall-template.S, line 81.
Breakpoint 2 at 0x7f2241b032e0: file ../sysdeps/unix/syscall-template.S, line 81.
Breakpoint 3 at 0x7f2241b15b70: file ../sysdeps/unix/syscall-template.S, line 81.
Continuing.
 sched_yield 1
 sched_yield 2
 semtimedop 3
 semtimedop 4
 semtimedop 5
 semtimedop 6
 semtimedop 304
 semtimedop 305

--//0x00000ea3-0x00000d72 = 0x131 = 305

5.补充测试,取消脚本全部注解:

--//仅仅贴出测试结果:
$  awk '{print $1,$2}' /tmp/spin_mutex.log| egrep "semtimedop|getrusage|sched_yield|select|kgxExclusive|mutex spin" | uniq -c |head -10
      3 kgxExclusive
    201 mutex spin
      1 sched_yield 1
    201 mutex spin
      1 sched_yield 2
    201 mutex spin
      1 semtimedop 3
    201 mutex spin
      1 semtimedop 4
    201 mutex spin

$  awk '{print $1,$2}' /tmp/spin_mutex.log| egrep "semtimedop|getrusage|sched_yield|select|kgxExclusive|mutex spin" | uniq -c |tail
    201 mutex spin
      1 semtimedop 55
    201 mutex spin
      1 semtimedop 56
    201 mutex spin
      1 semtimedop 57
    201 mutex spin
      1 semtimedop 58
     97 mutex spin
      2 kgxExclusive

--//每次sleep前都要spin尝试。可以想象如何设置很大_mutex_spin_count会消耗CPU资源的。

--//贴一个以前在11g下的测试结果:
-------------------------------------------------------------------------------------------------------
_mutex_wait_scheme     CPU使用率      描述
------------------------------------------------------------------------------------------------------
2                      .3             2秒时间内,semtimedop 182次 getrusage 2次 _mutex_spin_count=255
                       12             2秒时间内,semtimedop 167次 getrusage 2次 _mutex_spin_count=65535
1                      3              2秒时间内, select 16XX次(每次调用0.001秒),getrusage 2次
0                      39.9           调用99次sched_yield,然后1次seelct(每次调用0.001秒).
------------------------------------------------------------------------------------------------------

--//注:21c下,2秒时间内,getrusage 1次.

posted @   lfree  阅读(4)  评论(0编辑  收藏  举报
相关博文:
阅读排行:
· 一个费力不讨好的项目,让我损失了近一半的绩效!
· 清华大学推出第四讲使用 DeepSeek + DeepResearch 让科研像聊天一样简单!
· 实操Deepseek接入个人知识库
· CSnakes vs Python.NET:高效嵌入与灵活互通的跨语言方案对比
· Plotly.NET 一个为 .NET 打造的强大开源交互式图表库
历史上的今天:
2024-02-19 [20240219]建立完善sql_idx.sh脚本.txt
2019-02-19 [20190219]那个更快(11g).txt
点击右上角即可分享
微信分享提示