[20240920]跟踪library cache lock library cache pin使用gdb.txt

[20240920]跟踪library cache lock library cache pin使用gdb.txt

–//前一阵子,写的使用gdb跟踪library cache lock library cache pin的脚本有一个小问题,无法获得lock address以及pin address

–//地址,有一点点小缺陷,尝试修改完善看看。

–//按照https://nenadnoveljic.com/blog/tracing-library-cache-locks/介绍:

In order to close this gap I, first, examined the following two Oracle C functions on the release 19.6.0.0.200114:

kgllkal and kglGetSO.

kgllkal allocates a library cache lock. It receives the following arguments:

rdx: handle address

rcx: lock mode

Further, it calls kglGetSO to allocate the library cache state object. kglGetSO returns the lock address.

–//按照介绍调用kglGetSO返回lock address.我猜测 library cache pind的pin地址也是类似的情况。

$ ./lookup.awk kglGetSO

kglGetSO : kernel generic library cache management get state object

–//如何使用gdb实现,自己尝试看看。

–//仔细看链接https://nenadnoveljic.com/blog/tracing-library-cache-locks/

pid$target:oracle:kglGetSO:return

/ self->kglGetSO /

{

  printf(“\n”);

  printf(“KGLLKHDL KGLLKMOD KGLLKADR\n”);

  printf(“——– ——– ——–\n”);

  printf(“%X %8d %X\n” , self->kgllkhdl, self->kgllkmod, arg1 );

  printf(“\n”);

  printf(“==========================\n”);

  printf(“\n”);

  self->kglGetSO = 0;

}

–//dtrace可以设置在kglGetSO return获取arg1就是KGLLKADR,gdb如何实现呢?想起测试学习使用的方法,建立一个过程lcp加入sleep 3600,

–//先执行它,然后打开另外一个会话编译它,就会出现library cache pin,再打开另外的会话编译它,就会出现library cache lock,使用它

–//测试看看。

1.环境:

SCOTT@book01p> @ver2

==============================

PORT_STRING                   : x86_64/Linux 2.4.xx

VERSION                       : 21.0.0.0.0

BANNER                        : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 – Production

BANNER_FULL                   : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 – Production

Version 21.3.0.0.0

BANNER_LEGACY                 : Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 – Production

CON_ID                        : 0

PL/SQL procedure successfully completed.

SYS@book> @ sed “library cache pin”

SYS@book> @ pr

==============================

SED_EVENT#                    : 375

SED_NAME                      : library cache pin

SED_WAIT_CLASS                : Concurrency

SED_P1                        : handle address

SED_P2                        : pin address

SED_P3                        : 100*mode+namespace

SED_EQ_NAME                   :

SED_REQ_REASON                :

SED_REQ_DESCRIPTION           :

PL/SQL procedure successfully completed.

–//library cache lock类似

–//grant execute on sys.dbms_lock to scott;

create procedure lcp

is

begin

sys.dbms_lock.sleep(3600);

end;

/

2.测试1:

–//session 1:

SCOTT@book01p(412,5770)> exec lcp

–//session 2:

SCOTT@book01p(148,4168)> set timing on

SCOTT@book01p(148,4168)> alter procedure lcp compile;

–//挂起!!

SYS@book> @ ashtop event,p1hex,p2hex,p2,p3hex  1=1 &10s

    Total                                                                                                                                                                       Distinct Distinct    Distinct

  Seconds     AAS %This   EVENT                                      P1HEX             P2HEX                     P2 P3HEX             FIRST_SEEN          LAST_SEEN           Execs Seen  Tstamps Execs Seen1

——— ——- ——- —————————————— —————– —————– ———- —————– ——————- ——————- ———- ——– ———–

       10     1.0   83% | library cache pin                           00000000635F4DA0  000000006C0DCDD8 1812843992  00012A0300010003 2024-09-20 16:59:23 2024-09-20 16:59:32          1       10           1

        2      .2   17% |                                                                                         0                   2024-09-20 16:59:23 2024-09-20 16:59:31          1        2           2

–//出现library cache pin等待事件,P1hex=00000000635F4DA0就是对象句柄。

$ source ext_kglobj.sh 00000000635F4DA0

(gdb) 0x635f4f68:       “LCPSCOTTBOOK01P”

(gdb) quit

–//使用我前面写的抽取对象脚本ext_kglobj.sh,可以发现就是对象lcp。中断上面的执行。

3.编写脚本如下:

$ cat lkpn.gdb2

set pagination off

set logging file /tmp/lkpn.log

set logging overwrite on

set logging on

set $lk  = 0

set $pn  = 0

set $lock  = 0

#break kglpnal if $rcx==3

break kglpnal if ($rcx=3 && $rdx=0x00000000635F4DA0)

commands

 silent

 printf “kglpnal count %02d — handle address: %016x, mode: %d “, ++$pn ,$rdx ,$rcx

 echo kglnaobj address:

 x/s $rdx+0x1c8

 c

 end

break kglGetSO

commands

 silent

 finish

end

–//注:设置断点在句柄$rdx=0x00000000635F4DA0,kglGetSO执行最后是finish,这样停止在该函数kglGetSO的返回。

4.继续测试:

–//session 2:

SCOTT@book01p(148,4168)> @ spid

       SID    SERIAL# PROCESS                  SERVER    SPID                               PID  P_SERIAL# C50

———- ———- ———————— ——— —————————— ——- ———- ————————————————–

       148       4168 5283                     DEDICATED 5285                                69         14 alter system kill session ‘148,4168’ immediate;

–//获得session 2的spid=5285。

–//window 1:

$ rlgdb -f -p 5285 -x lkpn.gdb2



/usr/src/debug/glibc-2.17-c758a686/sysdeps/unix/syscall-template.S:81:3374:beg:0x7f35cf359480

Breakpoint 1 at 0x1536c020

Breakpoint 2 at 0x1536f9c0

–//重复前面的测试:

–//window 1,不停按c继续,直到出现:

(gdb) c

Continuing.

0x0000000015367fbe in kgllkal ()

(gdb) c

Continuing.

0x0000000015367fbe in kgllkal ()

(gdb) c

Continuing.

0x000000001536c16a in kglpnal ()

(gdb) c

….

(gdb) c

Continuing.

kglpnal count 01 — handle address: 00000000635f4da0, mode: 3 kglnaobj address:0x635f4f68:      “LCPSCOTTBOOK01P”

0x000000001536c16a in kglpnal ()

–//session 3:

SYS@book> @ ashtop event,p1hex,p2hex,p2,p3hex  1=1 &10s

    Total                                                                                                                                                                       Distinct Distinct    Distinct

  Seconds     AAS %This   EVENT                                      P1HEX             P2HEX                     P2 P3HEX             FIRST_SEEN          LAST_SEEN           Execs Seen  Tstamps Execs Seen1

——— ——- ——- —————————————— —————– —————– ———- —————– ——————- ——————- ———- ——– ———–

       10     1.0   77% |                                                                                         1                   2024-09-20 18:00:02 2024-09-20 18:00:11          1       10           1

        3      .3   23% |                                                                                         0                   2024-09-20 18:00:03 2024-09-20 18:00:08          1        2           2

–//没有获得相应等待事件。

–//window 1:

(gdb) c

Continuing.

kglpnal count 01 — handle address: 00000000635f4da0, mode: 3 kglnaobj address:0x635f4f68:      “LCPSCOTTBOOK01P”

0x000000001536c16a in kglpnal ()

–//现在停在函数kglGetSO的返回处。记住停止在地址:0x000000001536c16a

(gdb) info regi

rax            0x6c09abb8       1812573112

rbx            0x0      0

rcx            0x71fa9128       1912246568

rdx            0x6c09abb8       1812573112

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

rsi            0x71fbca40       1912326720

rdi            0x3      3

rbp            0x7ffd091a1540   0x7ffd091a1540

rsp            0x7ffd091a1120   0x7ffd091a1120

r8             0x7f35d2e33570   139869148099952

r9             0x6fb37c20       1874033696

r10            0x76c84268       1992835688

r11            0x3      3

r12            0x6c297cc8       1814658248

r13            0x7f35d2dee060   139869147816032

r14            0x635f4da0       1667190176

r15            0x3      3

rip            0x1536c16a       0x1536c16a <kglpnal+330>

eflags         0x246    [ PF ZF IF ]

cs             0x33     51

ss             0x2b     43

ds             0x0      0

es             0x0      0

fs             0x0      0

gs             0x0      0

(gdb) c

Continuing.

Program received signal SIGUSR2, User defined signal 2.

0x00007f35cee68fca in semtimedop () at ../sysdeps/unix/syscall-template.S:81

/usr/src/debug/glibc-2.17-c758a686/sysdeps/unix/syscall-template.S:81:3374:beg:0x7f35cee68fca

–//session 3:

SYS@book> @ ashtop event,p1hex,p2hex,p2,p3hex  1=1 &10s

    Total                                                                                                                                                                       Distinct Distinct    Distinct

  Seconds     AAS %This   EVENT                                      P1HEX             P2HEX                     P2 P3HEX             FIRST_SEEN          LAST_SEEN           Execs Seen  Tstamps Execs Seen1

——— ——- ——- —————————————— —————– —————– ———- —————– ——————- ——————- ———- ——– ———–

        9      .9   64% |                                                                                         1                   2024-09-20 18:02:13 2024-09-20 18:02:22          2        9           2

        2      .2   14% | library cache pin                           00000000635F4DA0  000000006C09ABB8 1812573112  00012A0300010003 2024-09-20 18:02:21 2024-09-20 18:02:22          1        2           1

        2      .2   14% |                                                                                         0                   2024-09-20 18:02:14 2024-09-20 18:02:17          1        2           2

–//P2hex=000000006C09ABB8,正好是前面rdx=0x6c09abb8对上,实际上这个就是pin address地址。

–//p3hex=00012A0300010003,100*mode+namespace,这里稍微有点不同,前面8位,00012A03 = 76291对应lcp的object_id.

–//mode=0003,namespace=0001。

SCOTT@book01p> @ o2 lcp

owner object_name object_type SEG_PART_NAME status      OID D_OID CREATED             LAST_DDL_TIME

—– ———– ———– ————- ——— —– —– ——————- ——————-

SCOTT LCP         PROCEDURE                 VALID     76291       2024-08-18 11:51:35 2024-09-21 12:07:05

SYS@book> @nmsp  procedure -1

@ nmsp table -1

@ nmsp ” 74  or @ nmsp ” 0x4a|x4a

KGLSTDSC        KGLSTIDN KGLSTIDN_HEX

————— ——– ————

TABLE/PROCEDURE        1 1

–//有了前面的信息,改写脚本如下:

$ cat  lkpn.gdb2

set pagination off

set logging file /tmp/lkpn.log

set logging overwrite on

set logging on

set $lk  = 0

set $pn  = 0

set $lock  = 0

#break kgllkal if $rcx==3

#break kgllkal if ( $rcx==3 && $rdx==0x00000000670C9E58 )

#break kgllkal if $rdx==0x00000000670C9E58

break kgllkal

condition 1

 commands

 silent

 printf “kgllkal count %02d — handle address: %016x, mode: %d “, ++$lk ,$rdx ,$rcx

 echo kglnaobj address:

 x/s $rdx+0x1c8

 c

 end

#break kglpnal if $rcx==3

#break kglpnal if $rdx==0x00000000635F4DA0

break kglpnal

commands

 silent

 printf “kglpnal count %02d — handle address: %016x, mode: %d “, ++$pn ,$rdx ,$rcx

 echo kglnaobj address:

 x/s $rdx+0x1c8

 c

 end

#break kglGetSO

#commands

# silent

# finish

#end

break *0x0000000015367fbe

commands

 silent

 printf “kglGetS0 return lock address : %016x\n”, $rdx

 c

end

break *0x1536c16a

commands

 silent

 printf “kglGetS0 return pin address : %016x\n”, $rdx

 c

end

–//注:调用kgllkal在调用kglGetS0的返回停止在0x15367fbe地址时,对应寄存器rdx的就是lock地址(猜测)。

4.继续测试,验证lock address是否正确。

–//session 3,再打开新的会话:

SCOTT@book01p(272,12390)> @spid

       SID    SERIAL# PROCESS                  SERVER    SPID                               PID  P_SERIAL# C50

———- ———- ———————— ——— —————————— ——- ———- ————————————————–

       272      12390 6427                     DEDICATED 6429                                70         20 alter system kill session ‘272,12390’ immediate;

–//window 1:

$ rlgdb -f -p 6429 -x lkpn.gdb2



/usr/src/debug/glibc-2.17-c758a686/sysdeps/unix/syscall-template.S:81:3374:beg:0x7f931703e480

Breakpoint 1 at 0x15367e90

Breakpoint 2 at 0x1536c020

Breakpoint 3 at 0x15367fbe

Breakpoint 4 at 0x1536c16a

(gdb) c

–//重复前面的测试,只不过这次是3个会话。

–//session 1:

SCOTT@book01p(412,5770)> exec lcp

–//session 2:

SCOTT@book01p(148,4168)> set timing on

SCOTT@book01p(148,4168)> alter procedure lcp compile;

–//挂起!!

–//session 4:

SCOTT@book01p(272,12390)> set timing on

SCOTT@book01p(272,12390)> alter procedure lcp compile;

–//挂起!!

–//window 1:

(gdb) c

Continuing.

kgllkal count 01 — handle address: 0000000063cb7b20, mode: 1 kglnaobj address:0x63cb7ce8:      “alter procedure lcp compile”

kglGetS0 lock address : 0000000063634090

kgllkal count 02 — handle address: 000000006134f840, mode: 1 kglnaobj address:0x6134fa08:      “”

kglGetS0 lock address : 00000000610c6b90

kglpnal count 01 — handle address: 000000006134f840, mode: 3 kglnaobj address:0x6134fa08:      “”

kglGetS0 pin address : 00000000610c6d70

kgllkal count 03 — handle address: 00000000702d5908, mode: 2 kglnaobj address:0x702d5ad0:      “bookSYSCDB$ROOT”

kglGetS0 lock address : 00000000610c6c80

kgllkal count 04 — handle address: 000000006b97c9f8, mode: 2 kglnaobj address:0x6b97cbc0:      “1073777561SYSCDB$ROOT”

kglGetS0 lock address : 0000000063633eb0

kgllkal count 05 — handle address: 0000000069c6b610, mode: 1 kglnaobj address:0x69c6b7d8:      “e985fc239b919877f25909e8f398a456Child:2BOOK01P”

kglGetS0 lock address : 0000000063633dc0

kglpnal count 02 — handle address: 0000000069c6b610, mode: 3 kglnaobj address:0x69c6b7d8:      “e985fc239b919877f25909e8f398a456Child:2BOOK01P”

kglGetS0 pin address : 00000000610c6aa0

kgllkal count 06 — handle address: 0000000060d45130, mode: 1 kglnaobj address:0x60d452f8:      “SCOTTBOOK01P”

kglGetS0 lock address : 0000000063633dc0

kgllkal count 07 — handle address: 00000000702d5908, mode: 2 kglnaobj address:0x702d5ad0:      “bookSYSCDB$ROOT”

kglGetS0 lock address : 00000000610c6d70

kgllkal count 08 — handle address: 000000006b97c9f8, mode: 2 kglnaobj address:0x6b97cbc0:      “1073777561SYSCDB$ROOT”

kglGetS0 lock address : 00000000610c6c80

kglpnal count 03 — handle address: 00000000697c1220, mode: 2 kglnaobj address:0x697c13e8:      “DATABASESYSBOOK01P”

kglGetS0 pin address : 00000000610c6d70

kglpnal count 04 — handle address: 00000000697c1220, mode: 2 kglnaobj address:0x697c13e8:      “DATABASESYSBOOK01P”

kglGetS0 pin address : 00000000610c6d70

kgllkal count 09 — handle address: 0000000067570378, mode: 1 kglnaobj address:0x67570540:      “LOGMNRGGC_TRIGGERSYSBOOK01P”

kglGetS0 lock address : 00000000610c6c80

kglpnal count 05 — handle address: 0000000067570378, mode: 2 kglnaobj address:0x67570540:      “LOGMNRGGC_TRIGGERSYSBOOK01P”

kglGetS0 pin address : 0000000063633eb0

kgllkal count 10 — handle address: 00000000674b5f50, mode: 1 kglnaobj address:0x674b6118:      “NO_VM_DDLWMSYSBOOK01P”

kglGetS0 lock address : 00000000610c6c80

kglpnal count 06 — handle address: 00000000674b5f50, mode: 2 kglnaobj address:0x674b6118:      “NO_VM_DDLWMSYSBOOK01P”

kglGetS0 pin address : 0000000063633eb0

kgllkal count 11 — handle address: 0000000066fe8790, mode: 1 kglnaobj address:0x66fe8958:      “LBAC$BEFORE_ALTERLBACSYSBOOK01P”

kglGetS0 lock address : 00000000610c6c80

kglpnal count 07 — handle address: 0000000066fe8790, mode: 2 kglnaobj address:0x66fe8958:      “LBAC$BEFORE_ALTERLBACSYSBOOK01P”

kglGetS0 pin address : 0000000063633eb0

kgllkal count 12 — handle address: 00000000635f4da0, mode: 1 kglnaobj address:0x635f4f68:      “LCPSCOTTBOOK01P”

kglGetS0 lock address : 00000000610c6d70

kgllkal count 13 — handle address: 00000000635f4da0, mode: 3 kglnaobj address:0x635f4f68:      “LCPSCOTTBOOK01P”

kglGetS0 lock address : 00000000610c6d70

–//对象LCPSCOTTBOOK01P ,session 4先调用kgllkal mode=1,因为session 2出现library cache pin的mode=3,无法获取,出现

–//library cache lock等待事件,mode=3.

Program received signal SIGUSR2, User defined signal 2.

0x00007f9316b4dfca in semtimedop () at ../sysdeps/unix/syscall-template.S:81

/usr/src/debug/glibc-2.17-c758a686/sysdeps/unix/syscall-template.S:81:3374:beg:0x7f9316b4dfca

–//进入休眠,每3秒探测1次。

SYS@book> @ ashtop event,p1hex,p2hex,p2,p3hex  1=1 &10s

    Total                                                                                                                                                                       Distinct Distinct    Distinct

  Seconds     AAS %This   EVENT                                      P1HEX             P2HEX                     P2 P3HEX             FIRST_SEEN          LAST_SEEN           Execs Seen  Tstamps Execs Seen1

——— ——- ——- —————————————— —————– —————– ———- —————– ——————- ——————- ———- ——– ———–

       10     1.0   48% | library cache lock                          00000000635F4DA0  00000000610C6D70 1628204400  00012A0300010003 2024-09-20 18:22:19 2024-09-20 18:22:28          1       10           1

       10     1.0   48% | library cache pin                           00000000635F4DA0  000000006198F7B0 1637414832  00012A0300010003 2024-09-20 18:22:19 2024-09-20 18:22:28          1       10           1

        1      .1    5% |                                                                                         0                   2024-09-20 18:22:20 2024-09-20 18:22:20          1        1           1

–//可以发现等待事件library cache lock的P2的信息完全可以对上。

5.简单小结:

–//通过调用kglGetSO的返回获得library cache lock/library cache pin的lock pin address。

–//gdb不知道如何写在kglGetSO的返回处获得相应地址,只能使用硬代码获得相应寄存器信息,那位知道gdb如何写。

© 版权声明
THE END
支持一下吧
点赞8 分享
评论 抢沙发
头像
请文明发言!
提交
头像

昵称

取消
昵称表情代码快捷回复

    暂无评论内容