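An ASP.NET OOM Problem
(Buddy, next time you capture a dump, don't use DebugDiag's memory leak option; just use adplus -hang. Once you have it, send me a copy.)
A friend emailed me today: he suspects his system has a memory leak. I downloaded the dump from his link and started analyzing.
For memory problems I strongly recommend starting with DebugDiag; for someone like me who only half understands this stuff, it is just the right tool. Experts like eparg can get it done with WinDbg, or simply by thinking it through. So, open DebugDiag (search for "debugdiag download"; the first hit is the MSDN download page), choose the memory analysis mode, and add the dump. After a long wait, it produced this report: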
Warning | webengine.dll is responsible for 92.64 MBytes worth of outstanding allocations. The following are the top 2 memory consuming functions:
Warning | WARNING - DebugDiag was not able to locate debug symbols for oracommon9.dll, so the reported function name(s) may not be accurate.
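One caveat first: the numbers DebugDiag reports are not all that accurate (so I was told by experts such as Chen and Xiong; I don't know why either).
From the report above, webengine is suspected of leaking 90-odd MB and the Oracle module 60-odd MB. Let's look at the details below: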
Function |
Allocation type | Heap allocation(s)
Heap handle | 0x1bc80000
Allocation Count | 3056 allocation(s)
Allocation Size | 92.54 MBytes
Leak Probability | 41%
webengine's GetBuffer method has a 41% probability of leaking. The call stack looks something like this:
Function | Destination
webengine!MemAlloc+22 | ntdll!RtlAllocateHeap
webengine!DupStr+32 | webengine!MemAlloc
webengine!DirMonCompletion::Init+39 | webengine!DupStr
webengine!DirMonOpen+52 | webengine!DirMonCompletion::Init
0x210BC00 |
System.Collections.Hashtable.ContainsKey(System.Object) |
System.Web.Hosting.VirtualPathProvider.GetCacheDependency(System.Web.VirtualPath, System.Collections.IEnumerable, System.DateTime) |
System.Web.VirtualPath.GetCacheDependency(System.Collections.IEnumerable, System.DateTime) | 0x664F077C
System.Web.Compilation.MemoryBuildResultCache.CacheBuildResult(System.String, System.Web.Compilation.BuildResult, Int64, System.DateTime) | System.Web.VirtualPath.GetCacheDependency(System.Collections.IEnumerable, System.DateTime)
System.Web.Compilation.BuildResultCache.CacheBuildResult(System.String, System.Web.Compilation.BuildResult, System.DateTime) |
System.Web.Compilation.BuildManager.GetBuildResultFromCacheInternal(System.String, Boolean, System.Web.VirtualPath, Int64) | System.Web.Compilation.BuildResultCache.CacheBuildResult(System.String, System.Web.Compilation.BuildResult, System.DateTime)
System.Web.Compilation.BuildManager.GetVPathBuildResultFromCacheInternal(System.Web.VirtualPath) | System.Web.Compilation.BuildManager.GetBuildResultFromCacheInternal(System.String, Boolean, System.Web.VirtualPath, Int64)
System.Web.Compilation.BuildManager.GetVPathBuildResultInternal(System.Web.VirtualPath, Boolean, Boolean, Boolean) | System.Web.Compilation.BuildManager.GetVPathBuildResultFromCacheInternal(System.Web.VirtualPath)
System.Collections.Hashtable.KeyEquals(System.Object, System.Object) |
System.Collections.Hashtable.get_Item(System.Object) |
System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object) |
0x79F047FD |
System.Threading._TimerCallback.TimerCallback_Context(System.Object) |
webengine!HttpCompletion::ProcessRequestInManagedCode+1cb |
webengine!HttpCompletion::ProcessRequestInManagedCode+1cb |
webengine!HttpCompletion::ProcessCompletion+48 | webengine!HttpCompletion::ProcessRequestInManagedCode
webengine!CorThreadPoolWorkitemCallback+1a |
0x79F024CF |
0x79F0202A |
0x79FC9840 |
kernel32!BaseThreadStart+34 |
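Frustratingly, none of our own code shows up in it. Nothing for it but to fall back on WinDbg and grind through it. First the dump size: the file is 452 MB. Then the managed heap size; !eeheap -gc gives: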
Number of GC Heaps: 4
------------------------------
Heap 0 (000df710)
generation 0 starts at 0x03625d34
...
Large object heap starts at 0x12980038
segment begin allocated size reserved
12980000 12980038 1308d530 0x0070d4f8(7,394,552) 018f2000
Heap Size 0x1a261b4(27,419,060)
------------------------------
Heap 1 (000e07c8)
generation 0 starts at 0x07588e34
...
Large object heap starts at 0x14980038
segment begin allocated size reserved
14980000 14980038 14da4f50 0x00424f18(4,345,624) 01bdb000
Heap Size 0x1549d40(22,322,496)
------------------------------
Heap 2 (000e1c48)
...
Large object heap starts at 0x16980038
segment begin allocated size reserved
16980000 16980038 17040628 0x006c05f0(7,079,408) 0193f000
Heap Size 0x1a88720(27,821,856)
------------------------------
Heap 3 (000e3138)
...
Large object heap starts at 0x18980038
segment begin allocated size reserved
18980000 18980038 18ea0548 0x00520510(5,375,248) 01adf000
Heap Size 0x15f4a2c(23,022,124)
------------------------------
GC Heap Size 0x5fed040(100,585,536)
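Not large: only about 100 MB in total, while the whole dump is 452 MB. So we have reason to suspect that unmanaged code is using the memory, or that memory is fragmented. If it is fragmentation, we can check whether the compilation debug=true switch is the cause by looking at !eeheap -loader. Fortunately, it shows only a few dozen modules.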
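As a sanity check on the figures above, the per-heap sizes printed by !eeheap -gc can be added up and compared with the dump file size. A minimal Python sketch; the only assumption is that the "452M" file size means 452 MiB, since the exact byte count is not given:

```python
# Per-heap sizes copied from the !eeheap -gc output (hex values as printed).
heap_sizes = [0x1a261b4, 0x1549d40, 0x1a88720, 0x15f4a2c]
gc_heap_total = sum(heap_sizes)

# Assumption: the "452M" dump size is 452 MiB; the exact byte count is unknown.
dump_bytes = 452 * 1024 * 1024
managed_share = gc_heap_total / dump_bytes

print(f"GC heap total: {gc_heap_total:,} bytes ({hex(gc_heap_total)})")
print(f"Managed share of the dump: {managed_share:.1%}")
```

The sum agrees with the "GC Heap Size" line, and the managed heap comes to roughly a fifth of the dump, which is why the remainder deserves attention.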
Let's see what is on the GC heap. !dumpheap -stat gives:
0:023> !dumpheap -stat
Using our cache to search the heap.
0x7a763d20 1 12 System.Diagnostics.TraceListenerCollection
0x7a762f14 1 12 System.Diagnostics.TraceOptions
... (lines omitted) ...
0x6639cb2c 36,789 1,030,092 System.Web.FileMonitorTarget
0x000deee0 224 1,109,336 Free
0x7912dd40 13,390 1,131,392 System.Char[]
0x7912d9bc 4,367 1,974,312 System.Collections.Hashtable+bucket[]
0x7912d8f8 46,674 3,357,592 System.Object[]
0x79131b20 2,489 5,127,340 System.Decimal[]
0x7912dae8 39,144 33,776,232 System.Byte[]
0x790fd8c4 110,981 41,137,936 System.String
Total 558,319 objects, Total size: 100,552,792
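Interestingly, of the 100 MB managed heap, byte[] plus string alone account for about 75 MB (33.8 MB + 41.1 MB), which is rather a lot. So we need to look at what these byte arrays and strings are. Both are numerous (about 39,000 and 111,000 objects respectively), so we have to be selective, for example by only looking at objects of 20 KB or more.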
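The combined footprint of the two types can be checked directly from the TotalSize column of the !dumpheap -stat output. A small Python sketch using only the three numbers printed there:

```python
# TotalSize values copied from the !dumpheap -stat output.
byte_array_bytes = 33_776_232    # System.Byte[]
string_bytes = 41_137_936        # System.String
total_heap_bytes = 100_552_792   # "Total size" line

combined = byte_array_bytes + string_bytes
share = combined / total_heap_bytes

print(f"byte[] + string: {combined:,} bytes")
print(f"share of managed heap: {share:.1%}")
```

Together the two types make up roughly three quarters of the managed heap, so they are the right place to keep digging.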
!dumpheap -mt 0x7912dae8 -min 20000 produces a lot of output; here is an abridged view:
0:000> !dumpheap -mt 0x7912dae8 -min 20000
------------------------------
Heap 0
Address MT Size
02a8ce9c 7912dae8 22952
02a9618c 7912dae8 22952
... (some lines removed) ...
0360c990 7912dae8 23008
total 207 objects
------------------------------
Heap 1
Address MT Size
06aa8b34 7912dae8 22952
06abd458 7912dae8 23356
... (some lines removed) ...
07551d7c 7912dae8 23000
14a64bf8 7912dae8 131088
total 200 objects
------------------------------
Heap 2
Address MT Size
0aa7fabc 7912dae8 32780
0aae77ac 7912dae8 23008
... (some lines removed) ...
0b64a5f4 7912dae8 24064
0b6a93ac 7912dae8 24064
total 225 objects
------------------------------
Heap 3
Address MT Size
0ea819f0 7912dae8 65548
... (some lines removed) ...
0f73160c 7912dae8 22896
0f825cd8 7912dae8 22968
0f9389c8 7912dae8 20044
total 223 objects
------------------------------
total 855 objects
Statistics:
MT Count TotalSize Class Name
7912dae8 855 20442324 System.Byte[]
Total 855 objects
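To put the filtered listing in perspective, the statistics above can be compared with the overall System.Byte[] total from !dumpheap -stat (33,776,232 bytes). A short Python sketch using only numbers printed by the debugger:

```python
# Per-heap counts and the summary line from "!dumpheap -mt 0x7912dae8 -min 20000".
per_heap_counts = [207, 200, 225, 223]
large_count = 855
large_total_bytes = 20_442_324

# Overall System.Byte[] TotalSize from !dumpheap -stat.
all_byte_array_bytes = 33_776_232

average_size = large_total_bytes / large_count
share_of_byte_arrays = large_total_bytes / all_byte_array_bytes

print(f"per-heap counts add up to {sum(per_heap_counts)}")
print(f"average size of a >=20000-byte array: {average_size:,.0f} bytes")
print(f"share of all byte[] memory: {share_of_byte_arrays:.1%}")
```

So a few hundred arrays of roughly 23 KB each carry well over half of all byte[] memory, which matches the many near-identical sizes visible in the listing.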
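Going through the complete list carefully, we see many objects of identical size, such as the two 24064-byte objects in Heap 2 above (marked in red in the original post). Since they are byte arrays, !do is not meaningful, so let's look with !da (the output runs to index 24051):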
[24044] 0b6af1a0
[24045] 0b6af1a1
[24046] 0b6af1a2
[24047] 0b6af1a3
[24048] 0b6af1a4
[24049] 0b6af1a5
[24050] 0b6af1a6
[24051] 0b6af1a7
All the values are consecutive and look like addresses. So let's simply use the dc command to see what this array really is:
0:000> dc 0b6a93ac
0b6a93ac 7912dae8 00005df4 683c0a0d 206c6d74 ...y.]....<html
0b6a93bc 6e6c6d78 68223d73 3a707474 77772f2f xmlns="http://ww
0b6a93cc 33772e77 67726f2e 3939312f 68782f39 w.w3.org/1999/xh
0b6a93dc 226c6d74 3c0a0d3e 64616568 3d646920 tml">..<head id=
0b6a93ec 61654822 3e223164 7469743c 0d3e656c "Head1"><title>.
0b6a93fc 96e6090a 96b8e4b0 e9aabae7 94e58094 ................
0b6a940c 91bde7ae 2f3c0a0d 6c746974 6d3c3e65 ......</title><m
0b6a941c 20617465 656d616e 6261223d 61727473 eta name="abstra
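The dc output prints little-endian DWORDs, so the text column can be reproduced by hand. The Python sketch below decodes the first two rows of the dump above: the first DWORD is the method table, the second the array length, and the rest is the payload. The 12-byte difference between the length and the 24064-byte object size is an inference from these numbers, not something the debugger states:

```python
import struct

# First two rows of the "dc 0b6a93ac" output, as printed (little-endian DWORDs).
dwords = [0x7912dae8, 0x00005df4, 0x683c0a0d, 0x206c6d74,
          0x6e6c6d78, 0x68223d73, 0x3a707474, 0x77772f2f]

# Re-serialize the DWORDs into the byte order they have in memory.
raw = b"".join(struct.pack("<I", d) for d in dwords)

# Object layout as seen here: method table pointer, element count, then data.
method_table, length = struct.unpack_from("<II", raw, 0)
payload = raw[8:]

print(f"method table: {method_table:08x}")
print(f"array length: {length} elements")
print(f"object size minus length: {24064 - length} bytes")
print(repr(payload.decode("ascii")))
```

The method table matches System.Byte[] (7912dae8), the length agrees with the last !da index (24051), and the payload is the start of an HTML document.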
Hmm, interesting: it is just an HTML page. Continuing with dc shows the following:
0b6a942c 20227463 746e6f63 3d746e65 8094e922 ct" content="...
0b6a943c e2ae94e5 80e29480 b096e694 e796b8e4 ................
0b6a944c 95e5aaba a18ae586 20423242 e491bde7 ........B2B ....
0b6a945c bae48ab8 9398e6a4 e582b8e5 2022ba9c .............."
0b6a946c 6d3c3e2f 20617465 656d616e 6f63223d /><meta name="co
0b6a947c 69727970 22746867 6e6f6320 746e6574 pyright" content
0b6a948c 6126223d 633b706d 2079706f 34303032 ="&copy 2004
0b6a949c 3030322d 31202037 ffffffff ffffffff -2007 juqiang.
0:000> dc
0b6a94ac 726f4320 61726f70 6e6f6974 646e6120 Corporation and
0b6a94bc 73746920 63696c20 6f736e65 202e7372 its licensors.
0b6a94cc 206c6c41 68676972 72207374 72657365 All rights reser
0b6a94dc 2e646576 3e2f2022 74656d3c 64692061 ved." /><meta id
0b6a94ec 6564223d 69726373 6f697470 6e20226e ="description" n
0b6a94fc 3d656d61 73656422 70697263 6e6f6974 ame="description
0b6a950c 3e2f2022 74656d3c 64692061 656b223d " /><meta id="ke
0b6a951c 726f7779 20227364 656d616e 656b223d ywords" name="ke
0:000> dc
0b6a952c 726f7779 20227364 2f3c3e2f 64616568 ywords" /></head
0b6a953c 200a0d3e 3c202020 6b6e696c 65726820 >.. <link hre
0b6a954c 68223d66 3a707474 74732f2f 2e656c79 f="http://style.
0b6a955c ffffffff ffffffff 687a2f6d 2f6e632d juqiang./zh-cn/
0b6a956c 5f707041 2f626557 61666544 2e746c75 App_Web/Default.
0b6a957c 22737363 70797420 74223d65 2f747865 css" type="text/
0b6a958c 22737363 6c657220 7453223d 73656c79 css" rel="Styles
0b6a959c 74656568 3e2f2022 20200a0d 6c3c2020 heet" />.. <l
0:000> dc
0b6a95ac 206b6e69 66657268 7468223d 2f3a7074 ink href="http:/
0b6a95bc 7974732f 312e656c ffffffff ffffffff /style.juqiang.
0b6a95cc 2d687a2f 412f6e63 575f7070 772f6265 /zh-cn/App_Web/w
0b6a95dc 65686265 632e6461 20227373 65707974 ebhead.css" type
0b6a95ec 6574223d 632f7478 20227373 3d6c6572 ="text/css" rel=
0b6a95fc 79745322 6873656c 22746565 0d3e2f20 "Stylesheet" />.
0b6a960c 2020200a 696c3c20 68206b6e 3d666572 . <link href=
0b6a961c 74746822 2f2f3a70 6c797473 3631ffff "http://style.ju
0:000> dc
0b6a962c ffffffff 2f6d6f63 632d687a 70412f6e qiangom/zh-cn/Ap
0b6a963c 65575f70 53552f62 4f435245 4f52544e p_Web/USERCONTRO
0b6a964c 49445f4c 70412f56 62655770 6e61425f L_DIV/AppWeb_Ban
0b6a965c 2e72656e 22737363 6c657220 7453223d ner.css" rel="St
0b6a966c 73656c79 74656568 79742022 223d6570 ylesheet" type="
0b6a967c 74786574 7373632f 3e2f2022 20200a0d text/css" />..
0b6a968c 6c3c2020 206b6e69 66657268 7468223d <link href="ht
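(The parts marked in red in the original post are customer data from the dump, which I replaced with arbitrary values.)
The other byte[] objects look similar. OK, let's move on to what the strings contain: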
0x790fd8c4 110,981 41,137,936 System.String
In this line from !dumpheap -stat, strings total 41 MB. Let's take a look:
!dumpheap -mt 0x790fd8c4 -min 100000
0:000>
------------------------------
Heap 0
Address MT Size
129921f8 790fd8c4 131096
129f1e40 790fd8c4 262168
... (some lines removed) ...
130725f0 790fd8c4 110384
total 38 objects
------------------------------
Heap 1
Address MT Size
14980048 790fd8c4 131096
149a0070 790fd8c4 131096
... (some lines removed) ...
14d44f00 790fd8c4 131096
14d64f28 790fd8c4 262168
total 28 objects
------------------------------
Heap 2
Address MT Size
16980048 790fd8c4 262168
169c0070 790fd8c4 131096
... (some lines removed) ...
16fa05c0 790fd8c4 262168
17020610 790fd8c4 131096
total 37 objects
------------------------------
Heap 3
Address MT Size
18980048 790fd8c4 131096
189a0070 790fd8c4 262168
... (some lines removed) ...
18e404f8 790fd8c4 262168
18e80520 790fd8c4 131096
total 32 objects
------------------------------
total 135 objects
Statistics:
MT Count TotalSize Class Name
790fd8c4 135 22903416 System.String
Total 135 objects
Looking closely at these strings, a great many have identical sizes. Picking one at random gives something like this:
0:000> !do -nofields 12c91f18
Name: System.String
MethodTable: 790fd8c4
EEClass: 790fd824
Size: 131090(0x20012) bytes
(C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: <P><SPAN style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体;">适用于各种大型作业、施工现场和建筑物立面等场所的大范围照明,也可满足工程机械、设施以及城市景观等照明需要。... (rest omitted)
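Before reading the content, a quick look at the sizes themselves. The recurring values 131096 and 262168 in the listing, and the 131090 (0x20012) reported by !do, line up neatly if one assumes an 18-byte fixed overhead per string, 2 bytes per character, and rounding to 8 bytes in the listing. Those three figures are assumptions chosen because they reproduce the numbers, not facts read from the dump:

```python
# Size reported by "!do" for the sample string.
reported_size = 0x20012  # 131090 bytes

# Assumptions (not read from the dump): fixed per-string overhead,
# UTF-16 characters, and 8-byte rounding of sizes in the heap listing.
FIXED_OVERHEAD = 18
BYTES_PER_CHAR = 2
ALIGNMENT = 8

def round_up(n, align=ALIGNMENT):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align

char_slots = (reported_size - FIXED_OVERHEAD) // BYTES_PER_CHAR
listed_small = round_up(reported_size)
listed_large = round_up(FIXED_OVERHEAD + BYTES_PER_CHAR * 2 * char_slots)

print(f"character slots in the sample string: {char_slots}")
print(f"expected listing sizes: {listed_small} and {listed_large}")
```

Under these assumptions the two common sizes correspond to buffers of 65,536 and 131,072 characters. Power-of-two capacities would be consistent with buffers that grew by doubling, although the dump excerpt alone does not show where the strings came from.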
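Clearly this is HTML page content too. HTML pages are generated on the server by the ASP.NET engine, so if a page, or some part of it, is put into the session and the session is never cleared, those pages will affect the system's memory usage. In-process session data is kept in the ASP.NET cache (System.Web.Caching.Cache), so we need to see what System.Web.Caching.Cache is holding. In the !dumpheap -stat output above we can find this entry: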
6639d878 1 12 System.Web.Caching.Cache
OK, let's take a look with !dumpheap -mt 6639d878:
0:000> !dumpheap -mt 6639d878
------------------------------
Heap 0
Address MT Size
total 0 objects
------------------------------
Heap 1
Address MT Size
total 0 objects
------------------------------
Heap 2
Address MT Size
0a99a408 6639d878 12
total 1 objects
------------------------------
Heap 3
Address MT Size
total 0 objects
------------------------------
total 1 objects
Statistics:
MT Count TotalSize Class Name
6639d878 1 12 System.Web.Caching.Cache
Total 1 objects
There is only one object. First, its size:
0:000> !objsize 0a99a408
sizeof(0a99a408) = 32764788 ( 0x1f3f374) bytes (System.Web.Caching.Cache)
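Whoa! About 32 MB is reachable from the cache, while the whole GC heap is only 100 MB. One can imagine that as load increases and the GC heap grows, the cache will become very large.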
0:023> !do 0a99a408
Name: System.Web.Caching.Cache
MethodTable: 6639d878
EEClass: 6639d808
Size: 12(0xc) bytes
(C:\WINDOWS\assembly\GAC_32\System.Web\2.0.0.0__b03f5f7f11d50a3a\System.Web.dll)
Fields:
MT Field Offset Type VT Attr Value Name
6639e2c8 4001391 4 ...ing.CacheInternal 0 instance 0a99a4f0 _cacheInternal
7910c878 400138f 1b8 System.DateTime 1 shared static NoAbsoluteExpiration
>> Domain:Value 000c8c48:NotInit 000ff288:0a99e8b4 <<
7911228c 4001390 1bc System.TimeSpan 1 shared static NoSlidingExpiration
>> Domain:Value 000c8c48:NotInit 000ff288:0a99e8c4 <<
The most important member inside Cache is _cacheInternal. Let's continue:
0:023> !do 0a99a4f0
Name: System.Web.Caching.CacheMultiple
MethodTable: 6639dc9c
EEClass: 6639dc24
Size: 24(0x18) bytes
(C:\WINDOWS\assembly\GAC_32\System.Web\2.0.0.0__b03f5f7f11d50a3a\System.Web.dll)
Fields:
MT Field Offset Type VT Attr Value Name
6639d7a4 40013bc 4 ...ching.CacheCommon 0 instance 0a99a328 _cacheCommon
79102290 40013d3 c System.Int32 1 instance 0 _disposed
7912d8f8 40013d4 8 System.Object[] 0 instance 0a99a508 _caches
79102290 40013d5 10 System.Int32 1 instance 3 _cacheIndexMask
Ah, _caches is an array, so let's see what is in it:
0:023> !da -details 0a99a508
Name: System.Web.Caching.CacheSingle[]
MethodTable: 7912d8f8
EEClass: 7912de6c
Size: 32(0x20) bytes
Array: Rank 1, Number of elements 4, Type CLASS
Element Methodtable: 6639dd2c
[0] 0a99a528
Name: System.Web.Caching.CacheSingle
MethodTable: 6639dd2c
EEClass: 6644183c
Size: 76(0x4c) bytes
(C:\WINDOWS\assembly\GAC_32\System.Web\2.0.0.0__b03f5f7f11d50a3a\System.Web.dll)
Fields:
MT Field Offset Type VT Attr Value Name
6639d7a4 40013bc 4 ...ching.CacheCommon 0 instance 0a99a328 _cacheCommon
79101fe4 40013c3 8 ...ections.Hashtable 0 instance 0a99a574 _entries
6639df10 40013c4 c ...hing.CacheExpires 0 instance 0a99a5b8 _expires
6639e100 40013c5 10 ...aching.CacheUsage 0 instance 0a99b218 _usage
... (omitted) ...
>> Domain:Value 000c8c48:NotInit 000ff288:NotInit <<
[3] 0a99d0f0
Name: System.Web.Caching.CacheSingle
MethodTable: 6639dd2c
EEClass: 6644183c
Size: 76(0x4c) bytes
(C:\WINDOWS\assembly\GAC_32\System.Web\2.0.0.0__b03f5f7f11d50a3a\System.Web.dll)
Fields:
MT Field Offset Type VT Attr Value Name
6639d7a4 40013bc 4 ...ching.CacheCommon 0 instance 0a99a328 _cacheCommon
79101fe4 40013c3 8 ...ections.Hashtable 0 instance 0a99d13c _entries
6639df10 40013c4 c ...hing.CacheExpires 0 instance 0a99d174 _expires
6639e100 40013c5 10 ...aching.CacheUsage 0 instance 0a99ddd4 _usage
790fd0f0 40013c6 14 System.Object 0 instance 0a99df4c _lock
79102290 40013c7 20 System.Int32 1 instance 0 _disposed
... (omitted) ...
>> Domain:Value 000c8c48:NotInit 000ff288:NotInit <<
Let's pick one CacheSingle, say the last one, and look at its Hashtable (_entries) with !do 0a99d13c:
0:023> !do 0a99d13c
Name: System.Collections.Hashtable
MethodTable: 79101fe4
EEClass: 79101f74
Size: 56(0x38) bytes
(C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
Fields:
MT Field Offset Type VT Attr Value Name
7912d9bc 400092b 4 ...ashtable+bucket[] 0 instance 0b04f858 buckets
79102290 400092c 1c System.Int32 1 instance 596 count
79102290 400092d 20 System.Int32 1 instance 517 occupancy
79102290 400092e 24 System.Int32 1 instance 794 loadsize
7910790c 400092f 28 System.Single 1 instance 0.720000 loadFactor
79102290 4000930 2c System.Int32 1 instance 4601 version
7910be50 4000931 30 System.Boolean 1 instance 0 isWriterInProgress
79107ef8 4000932 8 ...tions.ICollection 0 instance 00000000 keys
79107ef8 4000933 c ...tions.ICollection 0 instance 00000000 values
79116ef8 4000934 10 ...IEqualityComparer 0 instance 0a99a5ac _keycomparer
790fd0f0 4000935 14 System.Object 0 instance 00000000 _syncRoot
79111df0 4000936 18 ...SerializationInfo 0 instance 00000000 m_siInfo
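The count field tells us the table holds 596 entries. Annoyingly, keys and values are null (I did not know why at the time; in Hashtable these two fields are lazily created collection views that stay null until the Keys or Values property is first read). Luckily there is still buckets. Decompiling the Hashtable implementation with Reflector, we find this: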
// Nested Types
[StructLayout(LayoutKind.Sequential)]
private struct bucket
{
    public object key;
    public object val;
    public int hash_coll;
}
OK, so it is a struct. Dumping the contents of buckets confirms this:
0:023> !da -details 0b04f858
Name: System.Collections.Hashtable+bucket[]
MethodTable: 7912d9bc
EEClass: 7912da74
Size: 13248(0x33c0) bytes
Array: Rank 1, Number of elements 1103, Type VALUETYPE
Element Methodtable: 791021d8
[0] 0b04f860
Name: System.Collections.Hashtable+bucket
MethodTable 791021d8
EEClass: 79102154
Size: 20(0x14) bytes
(C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
Fields:
MT Field Offset Type VT Attr Value Name
790fd0f0 4000937 0 System.Object 0 instance 02e8370c key
790fd0f0 4000938 4 System.Object 0 instance 02e8370c val
79102290 4000939 8 System.Int32 1 instance -618753223 hash_coll
... (omitted) ...
[1101] 0b052bfc
Name: System.Collections.Hashtable+bucket
MethodTable 791021d8
EEClass: 79102154
Size: 20(0x14) bytes
(C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
Fields:
MT Field Offset Type VT Attr Value Name
790fd0f0 4000937 0 System.Object 0 instance 07332234 key
790fd0f0 4000938 4 System.Object 0 instance 07332234 val
79102290 4000939 8 System.Int32 1 instance 426306189 hash_coll
[1102] 0b052c08
Name: System.Collections.Hashtable+bucket
MethodTable 791021d8
EEClass: 79102154
Size: 20(0x14) bytes
(C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
Fields:
MT Field Offset Type VT Attr Value Name
790fd0f0 4000937 0 System.Object 0 instance 02f3a36c key
790fd0f0 4000938 4 System.Object 0 instance 02f3a36c val
79102290 4000939 8 System.Int32 1 instance -1416280683 hash_coll
This lists the key and value of every bucket.
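The numbers in the Hashtable and bucket[] dumps above are mutually consistent, which is a useful check that we are reading the right objects. The Python sketch below assumes that loadsize is the truncated product of loadFactor and the bucket count, that a bucket occupies 12 bytes (three 4-byte fields at offsets 0, 4 and 8, as the dump shows), and that the array carries 12 bytes of header; the relation and the header size are assumptions that happen to reproduce the printed values:

```python
# Values copied from the !do / !da output.
bucket_count = 1103      # "Number of elements" of the bucket[] array
load_factor = 0.72       # loadFactor field
entry_count = 596        # count field
reported_load_size = 794 # loadsize field
reported_array_size = 13248

# Assumptions: loadsize = int(loadFactor * buckets); 12 bytes per bucket
# (key ref + val ref + hash_coll on a 32-bit process); 12-byte array header.
BUCKET_BYTES = 12
ARRAY_HEADER_BYTES = 12

load_size = int(load_factor * bucket_count)
array_bytes = ARRAY_HEADER_BYTES + bucket_count * BUCKET_BYTES
fill_ratio = entry_count / bucket_count

print(f"computed loadsize: {load_size} (reported {reported_load_size})")
print(f"computed array size: {array_bytes} (reported {reported_array_size})")
print(f"fill ratio: {fill_ratio:.0%}")
```

With 596 entries against a threshold of 794, this particular table is a little over half full, so the 1103 buckets dumped above include many empty slots.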
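At this point we can pretty much guess: the cache holds too much, most likely because the code only ever calls Add and never Remove, or because there is no expiration.
Let's look at the stack again with kb: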
0:023> kb 200
ChildEBP RetAddr Args to Child
1bf99e0c 7c957d0b 71a819d6 00000634 00000001 ntdll!KiFastSystemCallRet
1bf99e10 71a819d6 00000634 00000001 1bf99e38 ntdll!NtWaitForSingleObject+0xc
1bf99e4c 71a8c517 00000634 00000c0c 00000002 mswsock!SockWaitForSingleObject+0x3a
1bf99ec4 71b694e5 00000c0c 1bf99f24 00000001 mswsock!WSPRecv+0x203
1bf99f00 619156b4 00000c0c 1bf99f24 00000001 ws2_32!WSARecv+0x77
WARNING: Stack unwind information not available. Following frames may be wrong.
1bf99f34 61912fc1 00000c0c 1bb1136e 00000810 orantcp9!nttini+0x4694
1bf9ebfc 1bc40802 00000000 00000000 00000000 System_Data_OracleClient!System.Data.OracleClient.OracleCommand.Execute(System.Data.OracleClient.OciStatementHandle, System.Data.CommandBehavior, Boolean, System.Data.OracleClient.OciRowidDescriptor ByRef, System.Collections.ArrayList ByRef)+0x3e5
App_Code_nz6nfqpz!EC168CH.OracleDAL.DataPaging.GetRecordsByPaging_v6_2(System.String, System.String, System.String, System.Decimal, System.Decimal, System.Decimal, System.Decimal, System.Decimal)+0x663
1bf9ee98 1b00bf8b 00000000 00000000 00000000 App_Code_nz6nfqpz!EC168CH.OracleDAL.DataPaging.GetRecordsByPaging_v6(System.String, System.String, System.String, System.Decimal, System.Decimal, System.Decimal, System.Decimal ByRef, System.Decimal ByRef)+0x266
1bf9eee0 1b00bf0c 00000000 00000000 00000000 App_Code_nz6nfqpz!EC168CH.OracleDAL.DataPaging.GetRecordsByPaging(System.String, System.String, System.String, System.Decimal, System.Decimal, System.Decimal, System.Decimal ByRef, System.Decimal ByRef)+0x63
1bf9ef2c 1b00b993 00000000 00000000 00000000 App_Code_nz6nfqpz!DataPaging.GetRecordsByPaging(System.String, System.String, System.String, System.Decimal, System.Decimal, System.Decimal, System.Decimal ByRef, System.Decimal ByRef)+0x74
029efddc 1b00abd2 00000000 00000000 00000000 App_Web_ur5qxxzw!App_web_Buy_BuyList.BindData(System.Decimal)+0x1bb
1bf9f14c 66f12980 00000000 00000000 00000000 App_Web_ur5qxxzw!App_web_Buy_BuyList.Page_Load(System.Object, System.EventArgs)+0x92a
1bf9f38c 6628efd2 00000000 00000000 00000000
000000 System_Web_ni!System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)+0x59d
1bf9f3c4 6614d80f 00000000 00000000 00000000 System_Web_ni!System.Web.UI.Page.ProcessRequest(Boolean, Boolean)+0x67
1bf9f400 6614d72f 00000000 00000000 00000000
mscorwks!ThreadpoolMgr::ExecuteWorkRequest+0xaf
1bf9fd14 79f95a2e 00000000 1bf9fd28 00000001 mscorwks!ThreadpoolMgr::WorkerThreadStart+0x223
1bf9ffb8 7c824829 1b1969c8 00000000 00000000 mscorwks!Thread::intermediateThreadProc+0x49
1bf9ffec 00000000 79f959e8 1b1969c8 00000000 kernel32!BaseThreadStart+0x34
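I have trimmed the call stack above heavily (to save space and to avoid looking like I'm padding the post), but we can see that in the part related to our own code, App_Web_ur5qxxzw!App_web_Buy_BuyList.Page_Load calls BindData, which calls GetRecordsByPaging, which finally executes some Oracle code.
So what exactly is Page_Load doing? See the following: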
0:023> !name2ee App_Web_ur5qxxzw!App_web_Buy_BuyList.Page_Load
Module: 1b0185c4 (App_Web_ur5qxxzw.dll)
Token: 0x06000019
MethodDesc: 1b019108
Name: App_web_Buy_BuyList.Page_Load(System.Object, System.EventArgs)
JITTED Code Address: 1b00a2a8
OK, we have the module address, so we can save the module to disk:
!savemodule 1b0185c4 c:\1.dll
Then open 1.dll in Reflector and find Page_Load (in this case it is implemented in a base class). We find the following code:
protected void Page_Load(object sender, EventArgs e)
{
    // rest omitted
    this.CurrentPageNum = Convert.ToInt32(this.Pg);
    if ((this.BizPageCount > 0M) && (this.dtBiz.Rows.Count < 1))
    {
        this.BindData(decimal.op_Decrement(this.CurrentPageNum));
    }
    this.BindJuqiang();
}
BindData ends up following the call stack above; nothing suspicious there. Looking at the BindJuqiang method, it eventually calls code like this:
public static DataTable readJuqiang()
{
    // omitted here
    if (HttpRuntime.Cache[key] == null)
    {
        keyVal = dal.readProvince();
        common.AddCache(key, keyVal, null, CacheTimeOut);
        return keyVal;
    }
    keyVal = (DataTable) HttpRuntime.Cache[key];
    if (keyVal == null)
    {
        keyVal = new DataTable();
    }
    return keyVal;
}
Note the common.AddCache call above: its last argument is CacheTimeOut (leave its value aside for now). Looking at AddCache first, it eventually calls:
HttpRuntime.Cache.Add(Key, KeyVal, dependencies, DateTime.Now.AddSeconds((double) CacheDuration), Cache.NoSlidingExpiration, CacheItemPriority.High, null);
Here CacheDuration is the CacheTimeOut from above. So how is CacheTimeOut defined? The class contains the following:
public class News
{
    // Fields
    private static readonly int CacheTimeOut;
    private static readonly INews dal;
In Reflector, locate CacheTimeOut, choose Analyze and look at "Used By"; that turns up the following:
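Searching the !dumpheap -stat output for AppSettings finds two instances:
0x648f0fdc 2 136 System.Configuration.AppSettingsSection
Details:
0:000> !dumpheap -short -mt 0x648f0fdc
0x0e9d4fa4
0x0e9d5220
Running !gcroot 0x0e9d4fa4 produces a very long output. (To be continued. The key question is what the value of CacheTimeOut is, and I don't yet know how to find it.)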
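The conclusion is more or less there: the code stuffs a lot of data into the cache, but the expiration time may be very long or the entries may effectively never be removed, so memory keeps growing...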
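To illustrate why the expiration setting matters so much, here is a toy model in Python. It is not ASP.NET's cache: ToyCache, simulate, the one-new-key-per-request pattern, and the 24,000-byte payload (loosely based on the byte[] sizes seen earlier) are all hypothetical and chosen only to show the shape of the problem:

```python
class ToyCache:
    """Hypothetical stand-in for an absolute-expiration cache (not ASP.NET's)."""

    def __init__(self):
        self._entries = {}  # key -> (value, expires_at or None for "never")

    def add(self, key, value, now, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else now + ttl_seconds
        self._entries[key] = (value, expires_at)

    def purge_expired(self, now):
        # Drop every entry whose absolute expiration time has passed.
        expired = [k for k, (_, exp) in self._entries.items()
                   if exp is not None and exp <= now]
        for k in expired:
            del self._entries[k]

    def total_bytes(self):
        return sum(len(value) for value, _ in self._entries.values())


def simulate(ttl_seconds, requests=1000, page_bytes=24_000):
    """One request per second, each caching a page under a new key (assumption)."""
    cache = ToyCache()
    for t in range(requests):
        cache.purge_expired(now=t)
        cache.add(f"page-{t}", b"x" * page_bytes, now=t, ttl_seconds=ttl_seconds)
    return cache.total_bytes()


bounded = simulate(ttl_seconds=60)      # entries live for 60 seconds
unbounded = simulate(ttl_seconds=None)  # entries never expire

print(f"with a 60 s timeout: {bounded:,} bytes cached")
print(f"with no expiration:  {unbounded:,} bytes cached")
```

With a timeout, the cached volume levels off at about one timeout window's worth of entries; without one, it grows with every request, which is the pattern suspected in this dump.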