Gieno  DEBUGGING : Windbg Training, Episode 1

Basic Configuration
1 Copy sos.dll from the framework directory to the folder where you installed windbg. Make sure you copy it from the same Framework version as the one you wish to investigate. If you'll be working with both 1.1 and 2.0 you can rename the SOS.dlls to SOS11.dll and SOS20.dll or put thein separate folders. 
2 Create a folder where you want to cache all the symbol files. For example: "C:\Symbols"
3 Start windbg and open the dialogue to configure symbol path by clicking File -> Symbol File Path. 
4 Enter the path, as well as the address from which you'll want to download missing symbols using the following syntax:
   srv*[cache path]*[symbols path]
   I'd recommend the following path:
   srv*c:\symbols\public*http://msdl.microsoft.com/download/symbols
You should now be set to go. You're now ready to open up a saved dump, or attach to a process

Useful commands
I'll be using a dump from an IIS6-server to demonstrate some useful commands.
The first thing you'll want to do is load SOS. You'll do this using the .load command. The syntax is simple. .load [extension filename]. So if you want to load sos and haven't renamed the .dll you'd simply write:
.load sos
You'll now have all the cool commands from the SOS-extension at your disposal as well as the default windbg ones. Commands from extensions are always preceded by a "!", so if you want to run the help-command for sos you'd write:
!help
If you should happen to have two extensions with an identically named command you can always separate them by typing ![extension name].[command] Example:
!sos.help
Okay, now that we know how to run the commands from the extension, try running !help. It should give you the following result.
0:000> !help
-------------------------------------------------------------------------------
SOS is a debugger extension DLL designed to aid 
in the debugging of managed
programs. Functions are listed by category, then roughly 
in order of
importance. Shortcut names 
for popular functions are listed in parenthesis.
Type 
"!help " for detailed info on that function

Object Inspection                  Examining code and stacks
-----------------------------      -----------------------------
DumpObj (
do)                       Threads
DumpArray (da)                     CLRStack
DumpStackObjects (dso)             IP2MD
DumpHeap                           U
DumpVC                             DumpStack
GCRoot                             EEStack
ObjSize                            GCInfo
FinalizeQueue                      EHInfo
PrintException (pe)                COMState
TraverseHeap                       BPMD 

Examining CLR data structures      Diagnostic Utilities
-----------------------------      -----------------------------
DumpDomain                         VerifyHeap
EEHeap                             DumpLog
Name2EE                            FindAppDomain
SyncBlk                            SaveModule
DumpMT                             GCHandles
DumpClass                          GCHandleLeaks
DumpMD                             VMMap
Token2EE                           VMStat
EEVersion                          ProcInfo 
DumpModule                         StopOnException (soe)
ThreadPool                         MinidumpMode 
DumpAssembly                       
DumpMethodSig                      Other
DumpRuntimeTypes                   
-----------------------------
DumpSig                            FAQ
RCWCleanupList
DumpIL
For more documentation on a specific command, type !help [name of command]
.time
This is not an SOS-command, which is evident by the command not beginning with a "!". Running the .Time command will show you relevant info about the time, as well as system uptime, process uptime and the amount of time spent in kernel & user mode.
0:000> .time
Debug session time: Tue Oct 
23 08:38:35.000 2007 (GMT+1)
System Uptime: 
4 days 17:48:01.906
Process Uptime: 0 days 0:24:37.000
  Kernel time: 
0 days 0:04:23.000
  User time: 
0 days 0:03:28.000
As you can see the system has been up for over 4 days. The process has been running for 24½ minutes and has an accumulated CPU-time of about 8 minutes total. This would give us an average CPU-usage for the process of around 32,5%
!threadpool
We can then use the !Threadpool-command to find out exactly what the CPU-usage was at the time the dump was taken. We'll also get some useful information like the number of work requests in the queue, completion port threads and timers.
0:000> !threadpool
CPU utilization 
100%
Worker Thread: Total: 
5 Running: 4 Idle: 1 MaxLimit: 200 MinLimit: 2
Work Request 
in Queue: 16
Unknown 
Function: 6a2d945d  Context: 023ede30
Unknown 
Function: 6a2d945d  Context: 023ee1e8
AsyncTimerCallbackCompletion TimerInfo
@11b53760
Unknown 
Function: 6a2d945d  Context: 023ee3a8
Unknown 
Function: 6a2d945d  Context: 023e3040
Unknown 
Function: 6a2d945d  Context: 023ee178
Unknown 
Function: 6a2d945d  Context: 023edfb0
AsyncTimerCallbackCompletion TimerInfo
@11b36428
AsyncTimerCallbackCompletion TimerInfo
@11b53868
Unknown 
Function: 6a2d945d  Context: 023ee060
Unknown 
Function: 6a2d945d  Context: 023ee290
Unknown 
Function: 6a2d945d  Context: 023eded0
Unknown 
Function: 6a2d945d  Context: 023edd88
Unknown 
Function: 6a2d945d  Context: 023ede98
Unknown 
Function: 6a2d945d  Context: 023ee258
Unknown 
Function: 6a2d945d  Context: 023edfe8
--------------------------------------
Number of Timers: 
9
--------------------------------------
Completion Port Thread:Total: 
3 Free: 3 MaxFree: 4 CurrentLimit: 2 MaxLimit: 200 MinLimit: 2
So we can see that currently we're using 100% of the CPU, which leads us to the next command.
!runaway
This is a nice command that will list all running threads and their CPU-usage. It's your best friend when troubleshooting a high CPU hang issue.
0:000> !runaway
User Mode Time
  Thread       Time
  
25:1a94      0 days 0:00:39.937
  
16:1bc0      0 days 0:00:38.390
  
50:1e8c      0 days 0:00:08.859
  
52:1e40      0 days 0:00:08.687
  
20:1c2c      0 days 0:00:08.234
  
51:1340      0 days 0:00:08.171
  
21:1bcc      0 days 0:00:06.953
  
26:13ec      0 days 0:00:06.671
  
44:131c      0 days 0:00:03.906
  
22:d8c       0 days 0:00:03.375
  
33:78c       0 days 0:00:02.656
  
34:1a8c      0 days 0:00:00.906
  
29:1f5c      0 days 0:00:00.828
   
6:e28       0 days 0:00:00.625
   
5:1c78      0 days 0:00:00.546
  
23:14a4      0 days 0:00:00.484
   
4:5ac       0 days 0:00:00.437
  
45:5dc       0 days 0:00:00.421
   
3:13b4      0 days 0:00:00.421
  
47:19c8      0 days 0:00:00.375
  
28:1b6c      0 days 0:00:00.250
  
46:1dac      0 days 0:00:00.156
   
7:1dd8      0 days 0:00:00.109
  
48:cdc       0 days 0:00:00.093
  
49:1eac      0 days 0:00:00.062
  
15:1a64      0 days 0:00:00.062
   
0:1804      0 days 0:00:00.046
  
36:4a4       0 days 0:00:00.031
  
11:1eb4      0 days 0:00:00.031
   
1:10b4      0 days 0:00:00.031
  
31:16ac      0 days 0:00:00.015
  
14:4ac       0 days 0:00:00.015
   
2:186c      0 days 0:00:00.015
  
59:590       0 days 0:00:00.000
  
58:294       0 days 0:00:00.000
  
57:16d0      0 days 0:00:00.000
  
56:1578      0 days 0:00:00.000
  
55:1428      0 days 0:00:00.000
  
54:16d8      0 days 0:00:00.000
  
53:fd8       0 days 0:00:00.000
  
43:1b8c      0 days 0:00:00.000
  
42:1c24      0 days 0:00:00.000
  
41:1e2c      0 days 0:00:00.000
  
40:11b0      0 days 0:00:00.000
  
39:edc       0 days 0:00:00.000
  
38:1a08      0 days 0:00:00.000
  
37:171c      0 days 0:00:00.000
  
35:1254      0 days 0:00:00.000
  
32:1f9c      0 days 0:00:00.000
  
30:1ae8      0 days 0:00:00.000
  
27:190c      0 days 0:00:00.000
  
24:1d2c      0 days 0:00:00.000
  
19:1e38      0 days 0:00:00.000
  
18:ee4       0 days 0:00:00.000
  
17:fb8       0 days 0:00:00.000
  
13:1b54      0 days 0:00:00.000
  
12:1a48      0 days 0:00:00.000
  
10:f64       0 days 0:00:00.000
   
9:1024      0 days 0:00:00.000
   
8:1b78      0 days 0:00:00.000
As you can see the total amount of time does not match the total CPU utilization time that we got from the .time command. That's simply because threads get reused and recycled. This means that the total amount of CPU time used by a thread may have been split up over several page requests.
!threads
To get more information about the running threads we can run the !Threads-command. This will list all managed threads in the application, what application domain the thread is currently executing under, etc. The output will look like this:
0:000> !threads
ThreadCount: 
48
UnstartedThread: 
0
BackgroundThread: 
29
PendingThread: 
0
DeadThread: 
19
Hosted Runtime: no
                                      PreEmptive   GC Alloc           Lock
       ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
  
16    1 1bc0 001fccd0   1808220 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  
22    2  d8c 002016f0      b220 Enabled  00000000:00000000 0019daf0     0 MTA (Finalizer)
  
14    4  4ac 00242e58   880a220 Enabled  00000000:00000000 0019daf0     0 MTA (Threadpool Completion Port)
  
23    5 14a4 11b39f18    80a220 Enabled  00000000:00000000 0019daf0     0 MTA (Threadpool Completion Port)
  
24    6 1d2c 11b41ad8      1220 Enabled  00000000:00000000 0019daf0     0 Ukn
  
25    7 1a94 11b46c70   180b220 Enabled  27240c98:27241fd8 11b42540     1 MTA (Threadpool Worker)
  
26    9 13ec 12ce2888   200b220 Enabled  2a9f1434:2a9f33c0 11b42540     0 MTA
  
27    a 190c 12d85eb8   200b220 Enabled  00000000:00000000 11b42540     0 MTA
  
29    b 1f5c 13df6a50   200b220 Enabled  2ab1da6c:2ab1f1c0 11b42540     0 MTA
  
30    c 1ae8 12d44a58      b220 Enabled  00000000:00000000 11b42540     0 MTA
  
31    d 16ac 12e2e008   200b220 Enabled  2a81348c:2a8153c0 11b42540     1 MTA
   
5    e 1c78 12da2160       220 Enabled  00000000:00000000 0019daf0     0 Ukn
  
33    8  78c 11b674c8   200b220 Enabled  2707b818:2707c1d8 11b42540     0 MTA
  
34   12 1a8c 13f163c8       220 Enabled  00000000:00000000 0019daf0     0 Ukn
  
36   13  4a4 13eef718   200b220 Enabled  2a7db4a4:2a7dd3c0 11b42540     0 MTA
   
4   14  5ac 13ef2008       220 Enabled  00000000:00000000 0019daf0     0 Ukn
  
42   10 1c24 13f0e950   880b220 Enabled  00000000:00000000 0019daf0     0 MTA (Threadpool Completion Port)
   
6   11  e28 13f16008       220 Enabled  00000000:00000000 0019daf0     0 Ukn
   
3    f 13b4 13eba008       220 Enabled  00000000:00000000 0019daf0     0 Ukn
  
43   15 1b8c 140db008   880b220 Enabled  00000000:00000000 0019daf0     0 MTA (Threadpool Completion Port)
  
44   17 131c 140ceb28   200b220 Enabled  272288c8:27229fd8 11b42540     0 MTA
  
45   1d  5dc 140cd0a0       220 Enabled  00000000:00000000 0019daf0     0 Ukn
  
47   20 19c8 1651a008       220 Enabled  00000000:00000000 0019daf0     0 Ukn
XXXX   
24    0 16468880   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  
46   1f 1dac 1650ab48       220 Enabled  00000000:00000000 0019daf0     0 Ukn
XXXX   1a    
0 140d5008   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   
16    0 140c5008   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  
50    3 1e8c 14064420   180b220 Enabled  27246f54:27247fd8 11b42540     1 MTA (Threadpool Worker)
XXXX   
35    0 1406e800   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  
51   36 1340 140df008   180b220 Enabled  2adec9cc:2aded1c0 11b42540     1 MTA (Threadpool Worker)
XXXX   
37    0 16566868   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  
48   38  cdc 16578840       220 Enabled  00000000:00000000 0019daf0     0 Ukn
XXXX   
39    0 16566c28   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   3b    
0 1646b8b0   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   3c    
0 16674008   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   3d    
0 16676418   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   3e    
0 16676fb8   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   3f    
0 16674d48   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   
40    0 1667de10   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   
41    0 16680050   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   
42    0 166812e8   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   
43    0 16683e60   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  
52   44 1e40 165259e8   180b220 Enabled  2adf126c:2adf31c0 11b42540     1 MTA (Threadpool Worker)
XXXX   
45    0 165b7c08   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   
46    0 165aa3d8   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   
47    0 165242c8   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   
48    0 165e9500   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  
49   3a 1eac 165676f0       220 Enabled  00000000:00000000 0019daf0     0 Ukn
The threads with an ID of XXXX have ended and are waiting to be recycled. We can also see that the finalizer thread has an ID of 22. So if we'd seen an unhealthy amount of activity on thread 22 when we ran the !runaway-command we would now have known that we had a finalizer-issue on our hands.

Switching to a specific thread
To go to a specific thread we use the ~-command. The syntax is as follows: ~[number of thread]s. So to switch to thread 50 we would type the following:
0:000> ~50s
We have then switched to thread 50, and can use a lot of other useful commands.
!clrstack
This great command will list the callstack for the current thread. If you want additional information you can add the "-p" switch which will show you parameters and local variables as well.
Below is a sample listing of the clrstack for thread 50.
0:050> !clrstack
OS Thread Id: 
0x1e8c (50)
ESP       EIP     
17a9e750 7d61c828 [NDirectMethodFrameSlim: 17a9e750] System.DirectoryServices.Protocols.Wldap32.ldap_bind_s(IntPtr, System.String, System.DirectoryServices.Protocols.SEC_WINNT_AUTH_IDENTITY_EX, System.DirectoryServices.Protocols.BindMethod)
17a9e768 14df70f9 System.DirectoryServices.Protocols.LdapConnection.BindHelper(System.Net.NetworkCredential, Boolean)
17a9e794 14df6de0 System.DirectoryServices.Protocols.LdapConnection.Bind()
17a9e79c 14df59e9 System.DirectoryServices.Protocols.LdapConnection.SendRequestHelper(System.DirectoryServices.Protocols.DirectoryRequest, Int32 ByRef)
17a9e8b8 14df56e8 System.DirectoryServices.Protocols.LdapConnection.SendRequest(System.DirectoryServices.Protocols.DirectoryRequest, System.TimeSpan)
17a9e8bc 14df5657 [InlinedCallFrame: 17a9e8bc]
So, reading from the bottom up we can see that an LdapConnection called the SendRequest method, which in turn called the SendRequestHelper method, which called the Bind method, and so on.
If we run !clrstack -p we get the following:
0:050> !clrstack -p
OS Thread Id: 
0x1e8c (50)
ESP       EIP     
17a9e750 7d61c828 [NDirectMethodFrameSlim: 17a9e750] System.DirectoryServices.Protocols.Wldap32.ldap_bind_s(IntPtr, System.String, System.DirectoryServices.Protocols.SEC_WINNT_AUTH_IDENTITY_EX, System.DirectoryServices.Protocols.BindMethod)
17a9e768 14df70f9 System.DirectoryServices.Protocols.LdapConnection.BindHelper(System.Net.NetworkCredential, Boolean)
    PARAMETERS:
        this 
= 0x271fdfe0
        newCredential 
= 
        needSetCredential 
= 

17a9e794 14df6de0 System.DirectoryServices.Protocols.LdapConnection.Bind()
    PARAMETERS:
        this 
= 

17a9e79c 14df59e9 System.DirectoryServices.Protocols.LdapConnection.SendRequestHelper(System.DirectoryServices.Protocols.DirectoryRequest, Int32 ByRef)
    PARAMETERS:
        this 
= 0x271fdfe0
        request 
= 0x27246e38
        messageID 
= 0x17a9e8ec

17a9e8b8 14df56e8 System.DirectoryServices.Protocols.LdapConnection.SendRequest(System.DirectoryServices.Protocols.DirectoryRequest, System.TimeSpan)
    PARAMETERS:
        this 
= 0x271fdfe0
        request 
= 0x27246e38
        requestTimeout 
= 

17a9e8bc 14df5657 [InlinedCallFrame: 17a9e8bc]
We can now look at the parameters, like the DirectoryRequest that was sent to the SendRequest and SendRequestHelper methods. To do this we simply copy the address of the request, (0x27246e38) and use it as an argument for our next command.
!dumpobject (!do)
This is another crucial command. Dumpobject will dump the object at the specified address, so if we send the address of the request as a parameter we will get the request dumped to screen.:
0:050> !do 0x27246e38
Name: System.DirectoryServices.Protocols.SearchRequest
MethodTable: 14b394c4
EEClass: 14d97ce0
Size: 
52(0x34) bytes
GC Generation: 
0
(C:\WINDOWS\assembly\GAC_MSIL\System.DirectoryServices.Protocols\
2.0.0.0__b03f5f7f11d50a3a\System.DirectoryServices.Protocols.dll)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
02c39310  
4000102        4        System.String  0 instance 00000000 directoryRequestID
14b398bc  
4000103        8 ControlCollection  0 instance 27246e90 directoryControlCollection
02c39310  
4000111        c        System.String  0 instance 27246d00 dn
12579f5c  
4000112       10 .StringCollection  0 instance 27246eb4 directoryAttributes
02c36ca0  
4000113       14        System.Object  0 instance 27246ddc directoryFilter
14b39344  
4000114       18         System.Int32  1 instance        1 directoryScope
14b393fc  
4000115       1c         System.Int32  1 instance        0 directoryRefAlias
0fd3da00  
4000116       20         System.Int32  1 instance        0 directorySizeLimit
1202af88  
4000117       28      System.TimeSpan  1 instance 27246e60 directoryTimeLimit
120261c8  
4000118       24       System.Boolean  1 instance        0 directoryTypesOnly
Okay, so what's this? Well, it's a System.DirectoryServices.Protocols.SearchRequest object. This means that it has various properties defined by the System.DirectoryServices.Protocols.SearchRequest class. If you want to know more about these properties I suggest you look up the SearchRequest class in msdn. We have the RequestId, the Scope, the DistinguishedName, etc.
So, let's say we want to know what the distinguished name is for this particular request. We boldly assume that the dn-property we see in the listing above is what is called DistinguishedName by MSDN. Simply copy the address of the dn-property (27246d00) and use !dumpobject again. We can see that the property is a System.String, so the output should be pretty clear.
0:050> !do 27246d00 
Name: System.String
MethodTable: 02c39310
EEClass: 0fb610ac
Size: 
112(0x70) bytes
GC Generation: 
0
(C:\WINDOWS\assembly\GAC_32\mscorlib\
2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: 
CN=Dummy,CN=Accounts,CN=useradm,DC=dummy,DC=net
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
0fd3da00  
4000096        4         System.Int32  1 instance       48 m_arrayLength
0fd3da00  
4000097        8         System.Int32  1 instance       47 m_stringLength
0fb80010  
4000098        c          System.Char  1 instance       43 m_firstChar
02c39310  
4000099       10        System.String  0   shared   static Empty
    
>> Domain:Value  0019daf0:03380310 11b42540:03380310 <<
0fb86d44  400009a       
14        System.Char[]  0   shared   static WhitespaceChars
    
>> Domain:Value  0019daf0:03380324 11b42540:033855bc <<
Apparently, the distinguished name used was "CN=Dummy,CN=Accounts,CN=useradm,DC=dummy,DC=net". If we want to find out more we simply continue using !dumpobject to examine the objects and their respective values.
 posted on 2009-08-07 14:11  Gieno  阅读(539)  评论(0编辑  收藏  举报