Gieno  DEBUGGING : Windbg Training, Episode 4

Prerequisites
This post will require some basic knowledge of windbg and the sos extension. For this I recommend looking at the following posts:

Introduction
I thought it was time to write another post on how to use windbg for troubleshooting. A lot of my time is spent locating exceptions in various web applications, so I thought this might be a good topic to cover. I've previously written a post specifically targeting OutOfMemoryExceptions, but I thought I should broaden the terms and make it a bit more general. There are two scenarios that are exceptionally common in my line of work:

  1. Clients are reporting 2nd chance exceptions displayed on screen with the classic "Server Error" page.
  2. Performance is generally bad, and when we investigate it turns out that there are tons of exceptions being thrown every second.

In this post I'll cover how to investigate what exceptions have been thrown by an application, as well as how to use windbg and adplus to automatically gather specific information for us.

Where to start
Okay, so you have a web application that you've been monitoring and you believe it is throwing a lot of exceptions. You've taken a dump of the process and you're ready to begin the investigation. Where do you start?
!dumpallexceptions (!dae)
If your application is running under the .NET Framework 1.1 you can use the !dumpallexceptions command (!dae) to get a list of all the exceptions still on the heap. Remember that an exception is a managed object, so it will eventually be garbage collected just like everything else. This means that when looking at the heap you will only see the exceptions still in memory, not every exception thrown by the application since startup.
Anyway, if you run !dae you'll get a list of exceptions that looks like this:

[Screenshot: sample !dae output]

As you can see the !dae command lists all exception types found on the heap. If possible it also gives us a callstack for each exception. Please note, however, that this doesn't mean the callstack is the same for every exception of a given type. In the sample above you might have ~20 different callstacks leading to the 136 NullReferenceExceptions you see.
Unfortunately the !dae command is not available in version 2.0 of sos.dll. Still, it's quite easy to get (more or less) the same result by using the !dumpheap command. If we just type "!dumpheap -type Exception" we'll get a list of all objects with the string "Exception" in their class name. This is almost as good.

0:000> !dumpheap -type Exception -stat
------------------------------
Heap 0
total 79 objects
------------------------------
Heap 1
total 76 objects
------------------------------
Heap 2
total 91 objects
------------------------------
Heap 3
total 92 objects
------------------------------
total 338 objects
Statistics:
      MT    Count    TotalSize Class Name
790ff624        3           36 System.Text.DecoderExceptionFallback
790ff5d8        3           36 System.Text.EncoderExceptionFallback
790f9ad4        1           72 System.ExecutionEngineException
790f9a30        1           72 System.StackOverflowException
790f998c        1           72 System.OutOfMemoryException
653c8d04        1           76 System.Data.SqlClient.SqlException
790f984c        3          216 System.Exception
66414de0       18          216 System.Web.HttpApplication+CancelModuleException
7911bc7c       11          352 System.UnhandledExceptionEventHandler
7911a3b0        8          608 System.ArgumentNullException
790f9b78       36         2592 System.Threading.ThreadAbortException
663d9268      116         9744 System.Web.HttpUnhandledException
7915cf40      136         9792 System.NullReferenceException
Total 338 objects

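As you can see, this gives us almost the same information except for the callstacks. Since -type matches on a substring of the class name, the list also includes a few types that aren't exceptions at all, such as System.Text.DecoderExceptionFallback and System.UnhandledExceptionEventHandler.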
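
Once the output has been saved to a text file, tallying it is easy to script. Here is a minimal Python sketch of one way to do it. It is not part of sos: the function name parse_dumpheap_stat is made up for illustration, and it assumes each statistics row carries the four columns MT, Count, TotalSize and Class Name on a single line.

```python
import re

# One statistics row: MT (8 hex digits), Count, TotalSize, Class Name.
# Assumption: output from a 32-bit process, so the MT is 8 hex digits.
ROW = re.compile(r"^([0-9a-fA-F]{8})\s+(\d+)\s+(\d+)\s+(\S.*)$")


def parse_dumpheap_stat(text):
    """Return {class name: (count, total size)} from '!dumpheap -stat' text."""
    stats = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            _mt, count, size, name = match.groups()
            stats[name.strip()] = (int(count), int(size))
    return stats


# A few rows copied from the statistics in this post.
SAMPLE = """\
Statistics:
      MT    Count    TotalSize Class Name
790f998c        1           72 System.OutOfMemoryException
790f9b78       36         2592 System.Threading.ThreadAbortException
7915cf40      136         9792 System.NullReferenceException
"""

if __name__ == "__main__":
    rows = parse_dumpheap_stat(SAMPLE)
    # Most frequent type first.
    for name, (count, size) in sorted(rows.items(), key=lambda kv: -kv[1][0]):
        print(f"{count:6d} {size:10d} {name}")
```

Sorting by count puts the noisiest exception type at the top, which is usually where you want to start looking.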

Knowing what to ignore
When analyzing data it is always good to know how to filter the information.
The ever-present exceptions
There are three exceptions that are created as soon as the worker process starts. This means that you will always see them on the heap even if they haven't been thrown at all:

  • System.ExecutionEngineException
  • System.StackOverflowException
  • System.OutOfMemoryException

So why are they created if we haven't thrown them? - Any guesses?
The answer is quite simple: If you run into a situation where you need to throw any of these exceptions you will probably be in a state where you can't create them. For example, you've run out of memory and are no longer able to allocate even the tiniest string. How would you then be able to allocate enough memory to create a new exception?

So, provided that there's still only one of each on the heap, you can most likely ignore these three exceptions. When it comes to ExecutionEngineExceptions and OutOfMemoryExceptions you will probably have a pretty good idea that this is what you're looking for, and finding a StackOverflowException isn't that hard. If you run !clrstack and find a callstack of 200+ lines you can be more or less certain that this is your problem.
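
The "only one of each" rule is simple enough to automate. The sketch below is an illustration only: interesting_exceptions is a hypothetical helper, and it expects a plain dictionary of class name to object count that you have built yourself from the !dumpheap statistics.

```python
# The three exception types listed above as created at worker process startup.
PREALLOCATED = {
    "System.ExecutionEngineException",
    "System.StackOverflowException",
    "System.OutOfMemoryException",
}


def interesting_exceptions(counts):
    """Drop startup exceptions that appear exactly once; keep everything else."""
    return {
        name: count
        for name, count in counts.items()
        if not (name in PREALLOCATED and count == 1)
    }


if __name__ == "__main__":
    # Counts copied from the statistics in this post.
    counts = {
        "System.ExecutionEngineException": 1,
        "System.StackOverflowException": 1,
        "System.OutOfMemoryException": 1,
        "System.Data.SqlClient.SqlException": 1,
        "System.NullReferenceException": 136,
    }
    for name, count in interesting_exceptions(counts).items():
        print(count, name)
```

If a second OutOfMemoryException ever shows up on the heap, the count is no longer 1 and the type stays in the result, which is exactly the case you do want to investigate.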
System.Threading.ThreadAbortException
Usually when you see a ThreadAbortException it is because you've called Response.Redirect.
Response.Redirect calls Response.End, which aborts the current thread to stop the request from executing any further. The abort surfaces as a System.Threading.ThreadAbortException. See the callstack below for an example.

    SP       IP       Function
    1ED6F37C 793D74D0 mscorlib_ni!System.Threading.Thread.Abort(System.Object)+0x2c
    1ED6F390 6600CA8C System_Web_ni!System.Web.HttpResponse.End()+0x5c
    1ED6F3A4 6600B8C3 System_Web_ni!System.Web.HttpResponse.Redirect(System.String, Boolean)+0x1f3
    1ED6F3B8 6600B6B7 System_Web_ni!System.Web.HttpResponse.Redirect(System.String)+0x7
    1ED6F3BC 1DDD2E1D Company_Web!Company.Web.UI.Page.RedirectToPreviousPage()+0x125

Obviously I'm not saying you should discard all System.Threading.ThreadAbortExceptions as irrelevant. Even if you have no reason to believe that ThreadAbortExceptions are a major concern it's always a good idea to investigate a few of them. Take a minute or two to confirm that there is an underlying call to Response.End caused by a Response.Redirect. Once you think that you have enough statistical data to imply that the ThreadAbortExceptions are caused by Redirects you can move on.
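
If you have saved a handful of these callstacks to text, the check can be scripted. This is a rough Python sketch under my own assumptions: looks_like_redirect_abort is a made-up name, and the test is nothing more than "Thread.Abort, then HttpResponse.End, then HttpResponse.Redirect appear in that order from the top of the stack".

```python
# Frames we expect to see, top of the stack first.
EXPECTED_ORDER = (
    "System.Threading.Thread.Abort(",
    "System.Web.HttpResponse.End(",
    "System.Web.HttpResponse.Redirect(",
)


def looks_like_redirect_abort(stack_text):
    """True if the stack contains Abort, then End, then Redirect, top-down."""
    position = 0
    for line in stack_text.splitlines():
        if position < len(EXPECTED_ORDER) and EXPECTED_ORDER[position] in line:
            position += 1
    return position == len(EXPECTED_ORDER)


# The callstack from this post.
SAMPLE = """\
    SP       IP       Function
    1ED6F37C 793D74D0 mscorlib_ni!System.Threading.Thread.Abort(System.Object)+0x2c
    1ED6F390 6600CA8C System_Web_ni!System.Web.HttpResponse.End()+0x5c
    1ED6F3A4 6600B8C3 System_Web_ni!System.Web.HttpResponse.Redirect(System.String, Boolean)+0x1f3
    1ED6F3B8 6600B6B7 System_Web_ni!System.Web.HttpResponse.Redirect(System.String)+0x7
    1ED6F3BC 1DDD2E1D Company_Web!Company.Web.UI.Page.RedirectToPreviousPage()+0x125
"""

if __name__ == "__main__":
    print(looks_like_redirect_abort(SAMPLE))
```

A stack that fails this check is one of the ThreadAbortExceptions worth a closer look by hand.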

Examining the Exceptions
Okay, so say we want to look at the callstack for the System.Data.SqlClient.SqlException. First of all we need its address. As you might remember, this is easily obtained by using !dumpheap without the -stat option.

0:000> !dumpheap -type System.Data.SqlClient.SqlException
------------------------------
Heap 0
Address       MT     Size
total 0 objects
------------------------------
Heap 1
Address       MT     Size
total 0 objects
------------------------------
Heap 2
Address       MT     Size
0b62cf38 653c8d04       76
total 1 objects
------------------------------
Heap 3
Address       MT     Size
total 0 objects
------------------------------
total 1 objects
Statistics:
      MT    Count    TotalSize Class Name
653c8d04        1           76 System.Data.SqlClient.SqlException
Total 1 objects

Now we have the address for the exception. In order to investigate the exception we could use !dumpobject, but there is another command I want to use first.
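
With one object the address is easy to spot, but with a few hundred you may want to pull the addresses out of a saved log. A minimal Python sketch, assuming a 32-bit dump (8 hex digit addresses) and rows of the form Address, MT, Size; the name object_addresses is mine, not something sos provides.

```python
import re

# An object row: Address (hex), MT (hex), Size (decimal), nothing else.
# Assumption: 32-bit process, so Address and MT are 8 hex digits each.
OBJECT_ROW = re.compile(r"^([0-9a-fA-F]{8})\s+([0-9a-fA-F]{8})\s+(\d+)\s*$")


def object_addresses(text):
    """Return the object addresses listed in '!dumpheap -type ...' output."""
    addresses = []
    for line in text.splitlines():
        match = OBJECT_ROW.match(line.strip())
        if match:
            addresses.append(match.group(1))
    return addresses


# Excerpt of the output in this post.
SAMPLE = """\
Heap 2
Address       MT     Size
0b62cf38 653c8d04       76
total 1 objects
Statistics:
      MT    Count    TotalSize Class Name
653c8d04        1           76 System.Data.SqlClient.SqlException
Total 1 objects
"""

if __name__ == "__main__":
    print(object_addresses(SAMPLE))
```

The statistics row is skipped because it has a fourth column (the class name) and its second column is a count, not an 8 digit method table.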
!printexception (!pe)
Running the !printexception command on the address of an exception will give us some neat information on the exception in question. Here's the result of running !printexception on the SqlException:

0:000> !pe 0b62cf38
Exception object: 0b62cf38
Exception type: System.Data.SqlClient.SqlException
Message: Transaction (Process ID 96) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
InnerException: <none>
StackTrace (generated):
    SP       IP       Function
    1DACECA8 1D0A4D4C Company_Database_1d3c0000!Company.Database.AuditTrail.Write(System.String, System.Guid, System.String)+0x194
    1DACED6C 1D0A4B98 Company_Database_1d3c0000!Company.Database.AuditTrail.AddEntry(EntryType, System.Guid, System.String)+0x48
    1DACED90 1D0A4B24 Company_Database_1d3c0000!Company.Database.DataFunctions.SaveRow(System.String, System.Guid, Boolean)+0x5e4
    1DACEE20 1DDD7A06 FooBase_1d560000!FooBase.PersonApplicationOtherQualificationBase.PersonApplicationOtherQualificationBaseManager.BaseSave(System.Guid, Boolean)+0x2e
    1DACEE34 1DDD79BF Foo_1cfd0000!Foo.PersonApplicationOtherQualification.PersonApplicationOtherQualificationManager.Save(System.Guid, Boolean)+0x27
    1DACEE48 1DDD77F5 Foo_1cfd0000!UserControls.PersonApplicationOtherQualificationDetail.OnSave()+0x65
    1DACEE70 1DDD2701 Company_Web_1d090000!Company.Web.UI.Page.InvokeUsercontrolTransaction(System.Web.UI.Control, Company.Web.TransactionType)+0x1d1

StackTraceString: <none>
HResult: 80131904
The current thread is unmanaged

This is good stuff. The command was even able to generate a callstack for us. (This may not always be the case, since the callstack may very well have gone out of scope.)
!dumpobject (!do) still has its uses
I wouldn't say that !printexception is a complete replacement for !dumpobject when it comes to examining exceptions. !printexception fits the exception into a standard template, and since some exceptions carry more data than others we sometimes want to use !dumpobject as well. The SqlException has a field called _errors that holds a System.Data.SqlClient.SqlErrorCollection that we might want to look at. This is not in the listing above, so we need to use !dumpobject to look at it.

0:000> !do 0b62cf38
Name: System.Data.SqlClient.SqlException
MethodTable: 653c8d04
EEClass: 6540a0d0
Size: 76(0x4c) bytes
 (C:\WINDOWS\assembly\GAC_32\System.Data\2.0.0.0__b77a5c561934e089\System.Data.dll)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
790f9244  40000b5        4        System.String  0 instance 00000000 _className
79107d4c  40000b6        8 ection.MethodBase  0 instance 00000000 _exceptionMethod
790f9244  40000b7        c        System.String  0 instance 00000000 _exceptionMethodString
790f9244  40000b8       10        System.String  0 instance 0b62cdfc _message
79112734  40000b9       14 tions.IDictionary  0 instance 0b62cf84 _data
790f984c  40000ba       18     System.Exception  0 instance 00000000 _innerException
790f9244  40000bb       1c        System.String  0 instance 00000000 _helpURL
790f8a7c  40000bc       20        System.Object  0 instance 0b62d030 _stackTrace
790f9244  40000bd       24        System.String  0 instance 00000000 _stackTraceString
790f9244  40000be       28        System.String  0 instance 00000000 _remoteStackTraceString
790fdb60  40000bf       34         System.Int32  1 instance        0 _remoteStackIndex
790f8a7c  40000c0       2c        System.Object  0 instance 00000000 _dynamicMethods
790fdb60  40000c1       38         System.Int32  1 instance -2146232060 _HResult
790f9244  40000c2       30        System.String  0 instance 00000000 _source
790fcfa4  40000c3       3c        System.IntPtr  1 instance        0 _xptrs
790fdb60  40000c4       40         System.Int32  1 instance -532459699 _xcode
653c8b28  40017e0       44 qlErrorCollection  0 instance 0b62cd90 _errors

There we have it. Now we can continue using !dumpobject to investigate it even further if we wish.
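
One thing that can be confusing when switching between the two commands: !dumpobject prints _HResult as a signed decimal (-2146232060), while !printexception shows the same value in hex (80131904). The conversion is a simple 32-bit mask. The helper name to_unsigned_hex below is just something I made up for this small Python sketch.

```python
def to_unsigned_hex(value):
    """Render a signed 32-bit integer as 8 hex digits, like an HRESULT."""
    return format(value & 0xFFFFFFFF, "08x")


if __name__ == "__main__":
    print(to_unsigned_hex(-2146232060))  # _HResult -> 80131904
    print(to_unsigned_hex(-532459699))   # _xcode   -> e0434f4d
```

The second value, e0434f4d, is the code this version of the CLR uses for the native exception it raises when managed code throws.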
Inner exceptions
If we take a look at one of the HttpUnhandledExceptions we find that it has an inner exception. It is even nice enough to let us know how to find out more about it.

0:000> !pe 10544f64
Exception object: 10544f64
Exception type: System.Web.HttpUnhandledException
Message: <none>
InnerException: System.NullReferenceException, use !PrintException 10544df8 to see more
StackTrace (generated):
    SP       IP       Function
    1E3BE1D8 6614FDB2 System_Web_ni!System.Web.UI.Page.HandleError(System.Exception)+0x3e6
    1E3BE220 6615681A System_Web_ni!System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)+0x1b3a
    1E3BF190 66154A8A System_Web_ni!System.Web.UI.Page.ProcessRequest(Boolean, Boolean)+0xd6
    1E3BF1C8 66154967 System_Web_ni!System.Web.UI.Page.ProcessRequest()+0x57
    1E3BF200 66154887 System_Web_ni!System.Web.UI.Page.ProcessRequestWithNoAssert(System.Web.HttpContext)+0x13
    1E3BF208 6615481A System_Web_ni!System.Web.UI.Page.ProcessRequest(System.Web.HttpContext)+0x32
    1E3BF21C 1C741EAE App_Web__ekpvebx!ASP.sys_pages_application_application_aspx.ProcessRequest(System.Web.HttpContext)+0x1e
    1E3BF228 65FF27D4 System_Web_ni!System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()+0x130
    1E3BF25C 65FC15B5 System_Web_ni!System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)+0x41

This means that the System.NullReferenceException mentioned led to the HttpUnhandledException we're currently investigating. So if we want to find the root cause we'll need to investigate the inner exception as well.
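
If you are working from a saved log, the inner exception's address can be picked out of the !pe text with a regular expression. This is only a sketch: inner_exception is a hypothetical helper, and it assumes the hint has the form "InnerException: <type>, use !PrintException <address> to see more".

```python
import re

# Matches the hint that !pe prints when the exception has an inner exception.
INNER = re.compile(
    r"InnerException:\s*(\S+),\s*use\s*!PrintException\s+([0-9a-fA-F]+)",
    re.IGNORECASE,
)


def inner_exception(pe_output):
    """Return (type name, address) of the inner exception, or None."""
    match = INNER.search(pe_output)
    return (match.group(1), match.group(2)) if match else None


# First lines of the !pe output in this post.
SAMPLE = """\
Exception object: 10544f64
Exception type: System.Web.HttpUnhandledException
InnerException: System.NullReferenceException, use !PrintException 10544df8 to see more
StackTrace (generated):
"""

if __name__ == "__main__":
    print(inner_exception(SAMPLE))
```

Feeding the returned address back into !pe, and repeating until the function returns None, walks the chain down to the root cause.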

Extra credit
If you've looked at my 3rd post for debugging a while back you saw some examples of the .foreach command. This is a great command to use, for example if you want to see the callstack for all System.ArgumentNullExceptions. Instead of manually iterating through all the exceptions we can now dump them all at once, check their callstacks, etc.

0:000> .foreach(myVariable {!dumpheap -type System.ArgumentNullException -short}){!pe myVariable;.echo *************}
Exception object: 03c6fbb8
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException: <none>
StackTrace (generated):
    SP       IP       Function
    1F15E3E0 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1F15E43C 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString: <none>
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 08378e24
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException: <none>
StackTrace (generated):
    SP       IP       Function
    1CE6EC60 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1CE6ECBC 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString: <none>
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 084c0b30
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException: <none>
StackTrace (generated):
    SP       IP       Function
    1E3BEEE0 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1E3BEF3C 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString: <none>
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 08522f84
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException: <none>
StackTrace (generated):
    SP       IP       Function
    1E3BEEE0 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1E3BEF3C 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString: <none>
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 0c036bf8
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException: <none>
StackTrace (generated):
    SP       IP       Function
    1DC7EB60 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1DC7EBBC 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString: <none>
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 105d7f60
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException: <none>
StackTrace (generated):
    SP       IP       Function
    1F39F360 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1F39F3BC 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString: <none>
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 106206fc
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException: <none>
StackTrace (generated):
    SP       IP       Function
    1F39F360 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1F39F3BC 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString: <none>
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 1077a864
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException: <none>
StackTrace (generated):
    SP       IP       Function
    1E3BEEE0 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1E3BEF3C 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString: <none>
HResult: 80004003
The current thread is unmanaged
*************
Unknown option: ------------------------------
*************
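
In the output above all eight exceptions come from the same place, Page_Load calling the Guid constructor with a null string. With a longer list it helps to group the exceptions by callstack instead of reading them one by one. A minimal Python sketch of that idea follows; it assumes the log was saved with the same row of asterisks as separator and that frames have the form "SP IP module!function" with 8 hex digit SP and IP values. The name count_distinct_stacks is made up.

```python
import re
from collections import Counter

# The separator echoed between exceptions by the .foreach command above.
SEPARATOR = "*************"

# A frame row: SP, IP, then module!function. Only the function part is kept,
# since SP and IP differ from thread to thread.
FRAME = re.compile(r"^\s*[0-9A-Fa-f]{8}\s+[0-9A-Fa-f]{8}\s+\S+!(.+)$")


def count_distinct_stacks(foreach_output):
    """Count how many exceptions share each managed callstack."""
    stacks = Counter()
    for block in foreach_output.split(SEPARATOR):
        frames = tuple(
            match.group(1).strip()
            for match in map(FRAME.match, block.splitlines())
            if match
        )
        if frames:
            stacks[frames] += 1
    return stacks


# The first two exceptions from the output in this post, shortened.
SAMPLE = """\
Exception object: 03c6fbb8
StackTrace (generated):
    SP       IP       Function
    1F15E3E0 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1F15E43C 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77
*************
Exception object: 08378e24
StackTrace (generated):
    SP       IP       Function
    1CE6EC60 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1CE6ECBC 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77
*************
Unknown option: ------------------------------
*************
"""

if __name__ == "__main__":
    for frames, count in count_distinct_stacks(SAMPLE).most_common():
        print(count, "x", " <- ".join(frames))
```

Blocks without any frames, such as the trailing "Unknown option" line, are simply skipped.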

Well this post was even longer than usual.
That's all.

posted on 2009-08-13 13:52 by Gieno