Advanced Application Debugging: Analyzing a Hung Request

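During day-to-day development we ran into a gateway service whose requests hung. This post uses that incident as a starting point to briefly introduce the WinDbg debugger and show how to analyze such a problem with it.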

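I. Background

1. Business background

We have recently been building a new business system on a microservice framework, with the front end and back end separated: the back end exposes SG services and the front end builds its pages with Vue. The back-end SG services are written in C# and use Redis and SQL Server as data stores. During development, debugging the back-end services in VS and running unit tests all went smoothly, and after deployment to the development integration environment, front-end/back-end integration testing was also progressing in an orderly way.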

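2. Technical architecture

Below is a simple diagram of the current microservice architecture.

[Figure: microservice technical architecture diagram]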

 

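3. The problem

Then one day, during integration testing in the development environment, every call to the back-end SG service timed out, with response times above 10 s. The service threw no exception, development efficiency suffered badly, and restarting the SG server did not fix it.

So we decided to capture a dump and analyze the cause of the timeouts. The rest of this post first introduces the dump analysis tool and the overall approach, and then walks through how the anomaly was located in the dump file.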

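II. Introduction to WinDbg

1. WinDbg is a very powerful debugger with an extremely rich feature set that supports many kinds of debugging. The table below compares several commonly used scenarios.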

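| Debugging type | Description | Typical scenarios |
| --- | --- | --- |
| User-mode debugging | Debugging by attaching to a process. The debugger establishes a connection with the target program and can pause and resume its execution. | Similar to stepping through code in VS: set breakpoints and single-step. |
| Kernel-mode debugging | Debugs the kernel on a local or remote computer. | 1. Debugging processes in the early stage of system startup or the late stage of shutdown, when no interactive console exists; 2. Analyzing inter-process communication problems. |
| Dump file debugging | A dump file is a snapshot that shows the process that was executing and the modules loaded for the application at a point in time. A dump with heap information also includes a snapshot of the application's memory at that point. | 1. Performance analysis, memory leaks, blocked threads; 2. Troubleshooting faults and exceptions; 3. Process crash analysis. |
| Remote debugging | Remote debugging through the debugging server DbgSrv. | 1. The program has to run full screen; 2. The program crashes on a customer's machine. |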

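2. WinDbg is a typical windowed (GUI) program, but most of its debugging functions are still driven by typed commands. Commands are generally not case-sensitive.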

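| Command category | Purpose | Count | Examples | Notes |
| --- | --- | --- | --- | --- |
| Standard commands | The most basic debugging functions, applicable to every kind of debug target: inspect, quit, help, and so on. | 20+ | k (show the stack); ~ (show threads); \| (vertical bar, show processes); q (end debugging); ? (list the standard commands) | 1. Usually one or two characters or symbols; the exception is version. 2. Some commands stand for a family of two-character commands that start with the same character. |
| Meta-commands | Debugging functions that the standard commands do not provide. | 140+ | .load and .loadby (load an extension module); .chain (list the loaded extension modules) | Meta-commands are built into the debugger engine or the debugger program files and can be used directly. They all start with a dot (.), so they are also called dot commands. |
| Extension commands | 1. Extend debugging in one particular area. 2. Users can also write their own extension modules and commands. | Too many to count | !threads (list threads); !do (show object information) | Extension commands start with !. Full form: ![module.]command [arguments]. The module name can be omitted. |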

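3. Ways to load an extension:

       1) Use the .load command with the full path of the extension module.

       2) Use the .loadby command with the extension name and the name of a module that is already loaded; the debugger loads the extension from that module's directory (for example, .loadby sos clr loads sos.dll from the directory of clr.dll).

       3) Use the form !module.command; the debugger searches for the named extension module and loads it automatically.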

4. Default symbol path setting:

       srv*c:\symcache*http://msdl.microsoft.com/download/symbols
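The symbol path above has three parts separated by asterisks: the `srv` keyword, a local cache directory, and the symbol server URL. As a minimal sketch (the helper name `build_symbol_path` is hypothetical and not part of WinDbg), the string can be assembled like this, which makes it harder to accidentally fuse an extra path onto the end of the URL:

```python
# URL taken from the symbol path quoted in the text.
MS_SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def build_symbol_path(cache_dir: str, server_url: str = MS_SYMBOL_SERVER) -> str:
    """Return a symbol path of the form srv*<local cache>*<symbol server>.

    Hypothetical helper for illustration; WinDbg only needs the resulting string.
    """
    return f"srv*{cache_dir}*{server_url}"


print(build_symbol_path(r"c:\symcache"))
```

The resulting string can be passed to the `.sympath` command inside WinDbg or stored in the `_NT_SYMBOL_PATH` environment variable before the debugger starts.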

       

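III. Analyzing the hung gateway service requests

To state the root cause up front: Redis hung, so connections from the service timed out. Restarting Redis normally fixes it; to keep the problem from recurring, we also changed the default setting that caused it.

Detailed analysis steps:

1. Open the dump file and load the debugger extensions (SOS and MEX): .loadby sos clr ; .load c:\symcache\mex.dll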

0:000> .loadby sos clr
0:000> .load c:\symcache\mex.dll
Mex External 3.0.0.7172 Loaded!

2. List all managed threads with !threads. Thread 46 stands out because it holds locks (Lock Count 3).

0:000> !threads
PDB symbol for clr.dll not loaded
ThreadCount:      40
UnstartedThread:  0
BackgroundThread: 27
PendingThread:    0
DeadThread:       11
Hosted Runtime:   no
                                                                                                        Lock  
       ID OSID ThreadOBJ           State GC Mode     GC Alloc Context                  Domain           Count Apt Exception
  14    1 121e4 000001d447d948e0    28220 Preemptive  000001D5C85CC5A8:000001D5C85CE008 000001d447d8a0f0 0     Ukn 
  32    2 107d4 000001d447efa1e0    2b220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     MTA (Finalizer) 
  34    3 11318 000001d8d3ace3e0  102a220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     MTA (Threadpool Worker) 
  35    4 134dc 000001d8d3adf920    21220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
  37    6 101d8 000001d8d3bc89b0  1020220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn (Threadpool Worker) 
  39    7 b808 000001d8d3c68870  202b220 Preemptive  000001D6C81F6560:000001D6C81F7FD0 000001d8d3ade8e0 1     MTA 
  40    8 102c0 000001d8d3c67840  8029220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     MTA (Threadpool Completion Port) 
  41   10 10994 000001d8d3c70ae0  8029220 Preemptive  000001D7481920A0:000001D748193FD0 000001d447d8a0f0 0     MTA (Threadpool Completion Port) 
  22    9 145a8 000001d8d3c70310  8028220 Preemptive  000001D4486DA060:000001D4486DBFD0 000001d447d8a0f0 0     MTA (Threadpool Completion Port) 
  42   11 1425c 000001d8d3c712b0  202b220 Preemptive  000001D5C86C91B0:000001D5C86C9FD0 000001d8d3ade8e0 0     MTA 
  44   12 11a5c 000001d8d4ec22c0  3029220 Preemptive  000001D548294B00:000001D548295FD0 000001d8d3ade8e0 0     MTA (Threadpool Worker) 
  45   13 7ca0 000001d8d4ec2a90  3029220 Preemptive  000001D6484E75D0:000001D6484E8FD0 000001d8d3ade8e0 0     MTA (Threadpool Worker) 
  46   14 13320 000001d8d4ec6ca0  3029220 Preemptive  000001D7C8454AE8:000001D7C8455FD0 000001d8d3ade8e0 3     MTA (Threadpool Worker) 
  47   15 11098 000001d8d4ec7470  1029220 Preemptive  000001D6C8201300:000001D6C8201FD0 000001d447d8a0f0 0     MTA (Threadpool Worker) 
  48   16 a670 000001d8d4eca730  202b220 Preemptive  000001D4C8365CF8:000001D4C8365FD0 000001d8d3ade8e0 0     MTA 
  49   17 12b2c 000001d8d4ed4ca0  202b220 Preemptive  000001D6484E9330:000001D6484EAFD0 000001d8d3ade8e0 0     MTA 
  50   18 11dc8 000001d8d4eec8e0  202b220 Preemptive  000001D4486D01C8:000001D4486D1FD0 000001d8d3ade8e0 0     MTA 
  51   19 13da0 000001d8d3e58f60    2b020 Preemptive  000001D6C81FC508:000001D6C81FDFD0 000001d8d3ade8e0 1     MTA 
  52   20 cd10 000001d8d3e5ed20  202b220 Preemptive  0000000000000000:0000000000000000 000001d8d3ade8e0 0     MTA 
  53   21 101e8 000001d8d3e59730  202b220 Preemptive  000001D4C85C4CB8:000001D4C85C5FD0 000001d8d3ade8e0 0     MTA 
  54   22 2d90 000001d8d3e5fcc0  202b220 Preemptive  000001D4486E6178:000001D4486E7FD0 000001d8d3ade8e0 0     MTA 
  55   23 13a74 000001d8d4f16d90  202b220 Preemptive  000001D7C8413638:000001D7C8413FD0 000001d8d3ade8e0 0     MTA 
  56   24 1364c 000001d8d4f15df0    2b020 Preemptive  000001D6484EF488:000001D6484F0FD0 000001d8d3ade8e0 1     MTA 
  57   25 13890 000001d8d4f165c0  202b220 Preemptive  000001D6C81FE0A0:000001D6C81FFFD0 000001d8d3ade8e0 0     MTA 
  58   26 119d8 000001d8d4f18500  1029220 Preemptive  000001D4486E9C58:000001D4486E9FD0 000001d447d8a0f0 0     MTA (Threadpool Worker) 
  59   27 14614 000001d4475e59f0  1029220 Preemptive  000001D5485ACAD8:000001D5485ADFD0 000001d447d8a0f0 0     MTA (Threadpool Worker) 
XXXX   28    0 000001d4475e61c0   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
XXXX   29    0 000001d4475e2b10   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
XXXX   30    0 000001d4475e1b70   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
XXXX   31    0 000001d4475e6990   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
XXXX   32    0 000001d4475e32e0   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
XXXX   33    0 000001d4475e13a0   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
XXXX   34    0 000001d4475e3ab0   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
XXXX   35    0 000001d4475e2340   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
XXXX   36    0 000001d4475e5220   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
XXXX   37    0 000001d4475e4280   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
  60   38 46f4 000001d4475e4a50  1029220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     MTA (Threadpool Worker) 
  61   39 13d24 000001d4475e8100  1029220 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     MTA (Threadpool Worker) 
XXXX    5    0 000001d8d4f93b70   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn 
  62   40 10980 000001d8d4f9a8d0    2b220 Preemptive  000001D5485D63B8:000001D5485D7FD0 000001d8d3ade8e0 0     MTA 
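Scanning the Lock Count column by eye works for 40 threads but gets tedious on larger dumps. The following is a rough sketch, assuming the `!threads` output has been saved as text and keeps the column layout shown above; the function name `threads_holding_locks` is made up for this example and is not a debugger command.

```python
from typing import List, Tuple

GC_MODES = {"Preemptive", "Cooperative"}


def threads_holding_locks(threads_output: str) -> List[Tuple[str, int]]:
    """Return (debugger thread id, lock count) for rows with Lock Count > 0."""
    found = []
    for line in threads_output.splitlines():
        fields = line.split()
        # Data rows: DbgID ManagedID OSID ThreadOBJ State GCMode AllocCtx Domain LockCount Apt ...
        # Header and summary lines do not have a GC mode in the sixth field.
        if len(fields) < 9 or fields[5] not in GC_MODES or not fields[8].isdigit():
            continue
        lock_count = int(fields[8])
        if lock_count > 0:
            found.append((fields[0], lock_count))
    # Highest lock count first.
    return sorted(found, key=lambda item: item[1], reverse=True)


# A few rows copied from the !threads output above.
SAMPLE = """\
  39    7 b808 000001d8d3c68870  202b220 Preemptive  000001D6C81F6560:000001D6C81F7FD0 000001d8d3ade8e0 1     MTA
  45   13 7ca0 000001d8d4ec2a90  3029220 Preemptive  000001D6484E75D0:000001D6484E8FD0 000001d8d3ade8e0 0     MTA (Threadpool Worker)
  46   14 13320 000001d8d4ec6ca0  3029220 Preemptive  000001D7C8454AE8:000001D7C8455FD0 000001d8d3ade8e0 3     MTA (Threadpool Worker)
XXXX   28    0 000001d4475e61c0   839820 Preemptive  0000000000000000:0000000000000000 000001d447d8a0f0 0     Ukn
"""

print(threads_holding_locks(SAMPLE))
```

On the sample rows this reports thread 46 first with a lock count of 3, followed by thread 39 with 1, matching what the table shows.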

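3. Check each thread's user-mode execution time with !runaway. Thread 46 is near the top here as well (second, behind thread 14).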

0:000> !runaway
 User Mode Time
  Thread       Time
   14:121e4     0 days 0:00:00.625
   46:13320     0 days 0:00:00.156
   42:1425c     0 days 0:00:00.062
   53:101e8     0 days 0:00:00.046
   48:a670     0 days 0:00:00.031
   55:13a74     0 days 0:00:00.015
   32:107d4     0 days 0:00:00.015
    0:116dc     0 days 0:00:00.015
   62:10980     0 days 0:00:00.000
   61:13d24     0 days 0:00:00.000
   60:46f4     0 days 0:00:00.000
   59:14614     0 days 0:00:00.000
   58:119d8     0 days 0:00:00.000
   57:13890     0 days 0:00:00.000
   56:1364c     0 days 0:00:00.000
   54:2d90     0 days 0:00:00.000
   52:cd10     0 days 0:00:00.000
   51:13da0     0 days 0:00:00.000
   50:11dc8     0 days 0:00:00.000
   49:12b2c     0 days 0:00:00.000
   47:11098     0 days 0:00:00.000
   45:7ca0     0 days 0:00:00.000
   44:11a5c     0 days 0:00:00.000
   43:14518     0 days 0:00:00.000
   41:10994     0 days 0:00:00.000
   40:102c0     0 days 0:00:00.000
   39:b808     0 days 0:00:00.000
   38:fee4     0 days 0:00:00.000
   37:101d8     0 days 0:00:00.000
   36:334      0 days 0:00:00.000
   35:134dc     0 days 0:00:00.000
   34:11318     0 days 0:00:00.000
   33:eaf0     0 days 0:00:00.000
   31:1858     0 days 0:00:00.000
   30:11a98     0 days 0:00:00.000
   29:e30      0 days 0:00:00.000
   28:10840     0 days 0:00:00.000
   27:117c0     0 days 0:00:00.000
   26:b0f8     0 days 0:00:00.000
   25:baf0     0 days 0:00:00.000
   24:ffb8     0 days 0:00:00.000
   23:11c14     0 days 0:00:00.000
   22:145a8     0 days 0:00:00.000
   21:145d4     0 days 0:00:00.000
   20:12660     0 days 0:00:00.000
   19:d7f0     0 days 0:00:00.000
   18:230c     0 days 0:00:00.000
   17:12378     0 days 0:00:00.000
   16:8134     0 days 0:00:00.000
   15:12914     0 days 0:00:00.000
   13:11fd8     0 days 0:00:00.000
   12:13044     0 days 0:00:00.000
   11:3ef8     0 days 0:00:00.000
   10:1160c     0 days 0:00:00.000
    9:13a5c     0 days 0:00:00.000
    8:11924     0 days 0:00:00.000
    7:12068     0 days 0:00:00.000
    6:1386c     0 days 0:00:00.000
    5:e73c     0 days 0:00:00.000
    4:13b34     0 days 0:00:00.000
    3:f4a0     0 days 0:00:00.000
    2:10cdc     0 days 0:00:00.000
    1:14688     0 days 0:00:00.000
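Most rows in the `!runaway` listing are zero, so only the first few matter. As a small sketch, assuming the output is available as text in the format shown above (`busiest_threads` is a hypothetical helper name), the busiest threads can be extracted like this:

```python
import re
from typing import List, Tuple

# Matches rows such as "   46:13320     0 days 0:00:00.156".
ROW = re.compile(
    r"^\s*(\d+):([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d{2}):(\d{2})\.(\d{3})\s*$"
)


def busiest_threads(runaway_output: str, top: int = 3) -> List[Tuple[int, float]]:
    """Return (debugger thread id, user-mode seconds), largest time first."""
    rows = []
    for line in runaway_output.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # skip the "User Mode Time" and "Thread Time" header lines
        thread_id = int(match.group(1))
        days, hours, minutes, seconds, millis = (int(match.group(i)) for i in range(3, 8))
        total = days * 86400 + hours * 3600 + minutes * 60 + seconds + millis / 1000
        rows.append((thread_id, total))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top]


# The first rows of the !runaway output above.
RUNAWAY_SAMPLE = """\
 User Mode Time
  Thread       Time
   14:121e4     0 days 0:00:00.625
   46:13320     0 days 0:00:00.156
   42:1425c     0 days 0:00:00.062
   62:10980     0 days 0:00:00.000
"""

for thread_id, seconds in busiest_threads(RUNAWAY_SAMPLE):
    print(thread_id, seconds)
```

Combining this ranking with the lock counts from `!threads` is what points at thread 46: it is the only thread that appears high on both lists.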

4. Switch to thread 46: ~46s

0:000> ~46s
ntdll!NtWaitForMultipleObjects+0x14:
00007ff8`d8ac67c4 c3              ret

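5. View the thread's managed call stack with !clrstack. The thread is fetching configuration from Redis, and the top frames (TryConnectIfNeeded, Connect) show that it is trying to reconnect.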

0:046> !clrstack
PDB symbol for clr.dll not loaded
OS Thread Id: 0x13320 (46)
        Child SP               IP Call Site
0000002f001fb3a8 00007ff8d8ac67c4 [HelperMethodFrame_1OBJ: 0000002f001fb3a8] System.Threading.Thread.JoinInternal(Int32)
0000002f001fb4b0 00007ff86780b33a ServiceStack.Redis.RedisNativeClient.Connect()
0000002f001fb580 00007ff86780b1d4 ServiceStack.Redis.RedisNativeClient.TryConnectIfNeeded()
0000002f001fb5c0 00007ff86781590a ServiceStack.Redis.RedisNativeClient.SendReceive[[System.__Canon, mscorlib]](Byte[][], System.Func`1<System.__Canon>, System.Action`1<System.Func`1<System.__Canon>>, Boolean)
0000002f001fb620 00007ff867815cb4 ServiceStack.Redis.RedisNativeClient.SendExpectData(Byte[][])
0000002f001fb690 00007ff867817a22 ServiceStack.Redis.RedisNativeClient.get_Info()
0000002f001fb700 00007ff86780ae9e ServiceStack.Redis.RedisClient.GetServerRole()
0000002f001fb730 00007ff867809ec3 ServiceStack.Redis.RedisResolver.CreateRedisClient(ServiceStack.Redis.RedisEndpoint, Boolean)
0000002f001fb7e0 00007ff867809bed ServiceStack.Redis.PooledRedisClientManager.CreateRedisClient()
0000002f001fb850 00007ff867805d2c ServiceStack.Redis.PooledRedisClientManager.GetClient()
0000002f001fb8f0 00007ff867663e75 ****.****.****.RedisPoolManager.GetClient(System.String)
0000002f001fb9a0 00007ff867663bc4 ****.****.****.CacheService.GetClient()
0000002f001fb9e0 00007ff867662ccd ****.****.****.ServiceConfigCacheService.GetAvailableConfigByCache(Boolean)
0000002f001fbc30 00007ff8676619a1 ****.****.****.ServiceDAC.GetAvailableConfig()
0000002f001fbc70 00007ff867661646 ****.****.****.ServiceConfigCache.GetConfigFromCache()
0000002f001fbcb0 00007ff86766157c ****.****.****.ServiceConfigCache..ctor()
0000002f001fbd10 00007ff86765e778 ****.****.****.ServiceConfigCache.get_Current()
0000002f001fbd80 00007ff86765fadb ****.****.****.TCPRounter.Load()
0000002f001fbee0 00007ff86765ebce ****.****.****.TCPRounter..ctor()
0000002f001fbfc0 00007ff86765d7dd ****.****.****.RounterService.GetService(****.****.****.RounterContext, ****.****.****.ProxyModel, System.Collections.Generic.List`1<System.String>)
0000002f001fc090 00007ff86765de1b ****.****.****.TCPProxy.GetService[[System.__Canon, mscorlib]](System.Collections.Generic.List`1<****.****.****.SPI.ServiceConfig>, Boolean, System.Collections.Generic.List`1<System.String>)
…………

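6. Use !dso to list the managed objects on the thread's stack. It shows Redis connection exceptions (RedisRetryableException) alongside the RedisClient objects involved in the connection attempt.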


0:046> !dso
OS Thread Id: 0x13320 (46)
RSP/REG          Object           Name
0000002F001FAF50 000001d5482fd000 System.Web.Http.Filters.IExceptionFilter[]
0000002F001FB0D8 000001d5482a32c0 System.Web.AspNetSynchronizationContext
0000002F001FB1F8 000001d7c84548e0 System.Threading.ThreadStart
0000002F001FB248 000001d7c8454858 System.Threading.Thread
0000002F001FB290 000001d5482fd000 System.Web.Http.Filters.IExceptionFilter[]
0000002F001FB2C0 000001d7c8454858 System.Threading.Thread
0000002F001FB348 000001d7c8454818 System.Threading.ThreadStart
0000002F001FB368 000001d7c8454858 System.Threading.Thread
0000002F001FB3D8 000001d5482aa550 System.Security.Principal.GenericPrincipal
0000002F001FB3F0 000001d7c8454818 System.Threading.ThreadStart
0000002F001FB410 000001d5482fd000 System.Web.Http.Filters.IExceptionFilter[]
0000002F001FB420 000001d5482fd2f0 System.Web.Http.Controllers.ExceptionFilterResult
0000002F001FB428 000001d7c8452838 System.Random
0000002F001FB430 000001d54834dd30 System.Threading.ExecutionContext
0000002F001FB440 000001d7c8454858 System.Threading.Thread
0000002F001FB460 000001d5482fd2f0 System.Web.Http.Controllers.ExceptionFilterResult
0000002F001FB488 000001d5482aa550 System.Security.Principal.GenericPrincipal
0000002F001FB490 000001d5482fd2f0 System.Web.Http.Controllers.ExceptionFilterResult
0000002F001FB498 000001d7c8452838 System.Random
0000002F001FB4B0 000001d7c8453d48 System.Func`1[[System.Byte[], mscorlib]]
0000002F001FB4C0 000001d7c8454818 System.Threading.ThreadStart
0000002F001FB4C8 000001d7c8454858 System.Threading.Thread
0000002F001FB550 000001d7c8453d48 System.Func`1[[System.Byte[], mscorlib]]
0000002F001FB558 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB560 000001d7c8453cd8 System.Byte[][]
0000002F001FB568 000001d7c8452838 System.Random
0000002F001FB580 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB5A8 000001d7c8454618 ServiceStack.Redis.RedisRetryableException
0000002F001FB5B0 000001d7c8453cd8 System.Byte[][]
0000002F001FB5D0 000001d7c8454618 ServiceStack.Redis.RedisRetryableException
0000002F001FB5D8 000001d7c8453cd8 System.Byte[][]
0000002F001FB5E8 000001d7c8454148 ServiceStack.Redis.RedisRetryableException
0000002F001FB600 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB608 000001d7c8453cd8 System.Byte[][]
0000002F001FB620 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB628 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB630 000001d7c8453cd8 System.Byte[][]
0000002F001FB638 000001d7c8453d48 System.Func`1[[System.Byte[], mscorlib]]
0000002F001FB650 000001d7c8452838 System.Random
0000002F001FB658 000001d7c8453cf8 System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]]
0000002F001FB668 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB670 000001d7c8453cd8 System.Byte[][]
0000002F001FB678 000001d7c8452838 System.Random
0000002F001FB680 000001d5482fd2f0 System.Web.Http.Controllers.ExceptionFilterResult
0000002F001FB690 000001d7c8453cd8 System.Byte[][]
0000002F001FB6A0 000001d5485ae110 System.Byte[]
0000002F001FB6B0 000001d7c8452838 System.Random
0000002F001FB6B8 000001d54858ec78 ServiceStack.Redis.RedisResolver
0000002F001FB6C8 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB6E0 000001d5482fd000 System.Web.Http.Filters.IExceptionFilter[]
0000002F001FB6E8 000001d7c8452838 System.Random
0000002F001FB6F0 000001d5482fd2f0 System.Web.Http.Controllers.ExceptionFilterResult
0000002F001FB700 000001d5485a6c78 ServiceStack.Redis.RedisEndpoint
0000002F001FB718 000001d5485a6c78 ServiceStack.Redis.RedisEndpoint
0000002F001FB720 000001d5485a6c78 ServiceStack.Redis.RedisEndpoint
0000002F001FB788 000001d7c8453140 ServiceStack.Redis.RedisClient
0000002F001FB790 000001d7c8452950 ServiceStack.Redis.RedisClient
0000002F001FB7B0 000001d54858ec78 ServiceStack.Redis.RedisResolver
0000002F001FB7B8 000001d54858e7d8 ServiceStack.Redis.PooledRedisClientManager
0000002F001FB7C0 000001d7c8452818 ServiceStack.Redis.PooledRedisClientManager+<>c__DisplayClass77_0
0000002F001FB7C8 000001d7c8452838 System.Random
0000002F001FB7E0 000001d54858ec78 ServiceStack.Redis.RedisResolver
0000002F001FB810 000001d54858ebe8 System.Object
0000002F001FB828 000001d54858ebc8 System.Collections.Concurrent.ConcurrentStack`1[[ServiceStack.Redis.RedisClient, ServiceStack.Redis]]
0000002F001FB838 000001d5482fd1d0 System.Web.Http.ExceptionHandling.CompositeExceptionLogger
0000002F001FB868 000001d548334ad8 System.Object[]    (System.Object[])
0000002F001FB870 000001d7c84527f0 System.Diagnostics.Stopwatch
0000002F001FB8C8 000001d548334ad8 System.Object[]    (System.Object[])
0000002F001FB8D0 000001d7c84527f0 System.Diagnostics.Stopwatch
………………

7. Use !mex.do2 to inspect the Redis client object. It is connecting to the Redis instance on this same server (127.0.0.1:6379).

0:046> !mex.do2 000001d7c8452950
0x000001d7c8452950 ServiceStack.Redis.RedisClient
[statics]
  0000  endData                            : 000001d7c8452ad0 (System.Byte[]) [Length: 2]
  0008  lastCommand                        : NULL
  0010  lastSocketException                : NULL
  0018  socket                             : NULL
  0020  Bstream                            : NULL
  0028  sslStream                          : NULL
  0030  transaction                        : 000001d7c8452988 (System.Void)
  0038  pipeline                           : NULL
  0040  <ClientManager>k__BackingField     : NULL
  0048  <Host>k__BackingField              : 000001d5485a6c20  "127.0.0.1" [9] (System.String)
  0050  <NamespacePrefix>k__BackingField   : NULL
  0058  <Password>k__BackingField          : 000001d5485a6b60  "***********" [16] (System.String)
  0060  <Client>k__BackingField            : NULL
  0068  <ConnectionFilter>k__BackingField  : NULL
  0070  <SendCmdFilter>k__BackingField     : NULL
  0078  cmdBuffer                          : 000001d7c8452af0 (System.Collections.Generic.List<System.ArraySegment<System.Byte>>) [Length: 0]
  0080  currentBuffer                      : 000001d7c8452b18 (System.Byte[]) [Length: 1450]
  0088  <OnBeforeFlush>k__BackingField     : NULL
  0090  deactivatedAtTicks                 : 0 (System.Int64)
  0098  LastConnectedAtTimestamp           : 0 (System.Int64)
  00a0  <Id>k__BackingField                : 0 (System.Int64)
  00a8  db                                 : 0 (System.Int64)
  00b0  clientPort                         : 0 (System.Int32)
  00b4  active                             : 0 (System.Int32)
  00b8  <Port>k__BackingField              : 6379 (System.Int32)
  00bc  <ConnectTimeout>k__BackingField    : -1 (System.Int32)
  00c0  <RetryCount>k__BackingField        : 0 (System.Int32)
  00c4  <SendTimeout>k__BackingField       : -1 (System.Int32)
  00c8  <ReceiveTimeout>k__BackingField    : -1 (System.Int32)
  00cc  <IdleTimeOutSecs>k__BackingField   : 240 (System.Int32)
  00d0  currentBufferIndex                 : 0 (System.Int32)
  00d4  <Ssl>k__BackingField               : False (System.Boolean)
  00d5  <IsDisposed>k__BackingField        : False (System.Boolean)
  00d8  <SslProtocols>k__BackingField      : 000001d7c8452a30 (System.Nullable<System.Security.Authentication.SslProtocols>)
  00e0  retryTimeout                       : 000001d7c8452a38 00:00:03 (System.TimeSpan)
  00e8  <Name>k__BackingField              : NULL
  00f0  OnDispose                          : NULL
  00f8  registeredTypeIdsWithinPipelineMap : 000001d7c8452a80 (System.Collections.Generic.Dictionary<System.String,System.Collections.Generic.HashSet<System.String>>)
  0100  <Hashes>k__BackingField            : 000001d7c8453128 (ServiceStack.Redis.RedisClient+RedisClientHashes)
  0108  <Lists>k__BackingField             : 000001d7c84530e0 (ServiceStack.Redis.RedisClient+RedisClientLists)
  0110  <Sets>k__BackingField              : 000001d7c84530f8 (ServiceStack.Redis.RedisClient+RedisClientSets)
  0118  <SortedSets>k__BackingField        : 000001d7c8453110 (ServiceStack.Redis.RedisClient+RedisClientSortedSets)

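8. Use !mex.do2 to inspect the exception thrown while connecting to Redis. Its message reports a connection failure ("Socket is not connected").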

0:046> !mex.do2 000001d7c8454618 
0x000001d7c8454618 ServiceStack.Redis.RedisRetryableException
  0000  _className                : NULL
  0008  _exceptionMethod          : NULL
  0010  _exceptionMethodString    : NULL
  0018  _message                  : 000001d5485afca8  "Socket is not connected" [23] (System.String)
  0020  _data                     : NULL
  0028  _innerException           : NULL
  0030  _helpURL                  : NULL
  0038  _stackTrace               : 000001d7c84546f8 (System.SByte[]) [Length: 48]
  0040  _watsonBuckets            : NULL
  0048  _stackTraceString         : NULL
  0050  _remoteStackTraceString   : NULL
  0058  _dynamicMethods           : NULL
  0060  _source                   : NULL
  0068  _safeSerializationManager : 000001d7c84546c0 (System.Runtime.Serialization.SafeSerializationManager)
  0070  _xptrs                    : 0000000000000000 (System.IntPtr)
  0078  _ipForWatsonBuckets       : 00007ff867815a33 (System.UIntPtr)
  0080  _remoteStackIndex         : 0 (System.Int32)
  0084  _HResult                  : -2146233088 (System.Int32)
  0088  _xcode                    : -532462766 (System.Int32)
  0090  <Code>k__BackingField     : NULL
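The `_HResult` and `_xcode` fields are printed as signed 32-bit integers, which makes them hard to look up. A quick sketch of converting them to the usual hexadecimal form (the helper name `as_hex32` is just for illustration):

```python
def as_hex32(value: int) -> str:
    """Render a signed 32-bit integer as an unsigned 8-digit hex string."""
    return f"0x{value & 0xFFFFFFFF:08X}"


print(as_hex32(-2146233088))  # _HResult from the dump
print(as_hex32(-532462766))   # _xcode from the dump
```

The two values come out as 0x80131500 and 0xE0434352. As far as I know, the first is COR_E_EXCEPTION, the generic HRESULT for a managed exception, and the second is the exception code the CLR uses when raising managed exceptions, so neither narrows the problem down further. The useful detail in this object is the `_message` text.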

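9. Root cause: checking the Redis console window showed that Redis had not started properly. After Redis was restarted, service access returned to normal.

When we checked the Redis window, Redis was in a hung state:

[Figure: Redis console window in the hung state]

After restarting Redis, it ran normally:

[Figure: Redis console window running normally]

Root cause confirmed: when Quick Edit Mode is checked in the properties of the Redis console window, clicking anywhere inside the window during startup suspends the startup, and Redis is left unavailable.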
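A hang like this could be spotted earlier with a small liveness probe, before anyone needs to capture a dump. The sketch below is an assumption-laden illustration rather than what the team actually deployed: it opens a TCP connection to the host and port seen in the dump (127.0.0.1:6379), sends an inline `PING`, and waits a bounded time for any reply. The function name `redis_responds` and the 2-second default are made up for this example. Waiting for a reply matters because a successful connect alone may not prove that the server process is actually servicing requests.

```python
import socket


def redis_responds(host: str = "127.0.0.1", port: int = 6379, timeout: float = 2.0) -> bool:
    """Return True only if the server answers an inline PING within the timeout."""
    try:
        # The timeout applies to the connect and to the later send/recv calls.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:  # refused, unreachable, or timed out
        return False
    # "+PONG" means healthy. An error reply (for example when a password is
    # required) still shows that the process is alive and answering.
    return reply.startswith((b"+PONG", b"-"))


if __name__ == "__main__":
    print("redis responds:", redis_responds())
```

An error reply is treated as "responsive" because the instance in the dump has a password configured, so an unauthenticated `PING` would be expected to get an error rather than `+PONG`.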

 

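IV. Summary of the analysis method:

       1. Look for factors related to the problem (locks, execution time, and so on).

       2. Identify the suspicious threads and examine their call stacks.

       3. Inspect the managed objects on the stack to see the detailed exception information.

       4. Pin down the root cause.

       5. Fix and verify.

That wraps up the hung-request problem in the gateway service. With the WinDbg debugger we analyzed the dump file, looked at what the business logic was actually doing internally, and found the root cause. I am sharing the tool and the analysis process in the hope that they are useful to others.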

 

张宁涛

Updated 2022/05/05

Posted on 2021-12-28 11:02 by 张宁涛
