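Notes on analyzing a core file with gdb
The twemproxy process in our test environment suddenly crashed and dumped core. These notes record how the core file was analyzed with gdb.
Analyzing the coredump file
bt -- print the call stack at the time of the crash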
# gdb /root/proxy/bin/twemproxy /tmp/cordump.file
(gdb) bt
#0  0x00007f9f3b0d4337 in ssignal () from /lib64/libc.so.6
#1  0x00007f9f3b0d5a28 in abort () from /lib64/libc.so.6
#2  0x000000000041e27b in nc_assert (cond=cond@entry=0x44ea8a "pr->type == MSG_REQ_REDIS_DEL", file=file@entry=0x44ea18 "nc_redis.c", line=line@entry=2387, panic=panic@entry=1) at nc_util.c:353
#3  0x0000000000428065 in redis_pre_coalesce (r=0x22aaf30) at nc_redis.c:2387
#4  0x000000000041443b in rsp_forward (msg=0x22aaf30, s_conn=0x2b36690, ctx=0x1c33570) at nc_response.c:289
#5  rsp_recv_done (ctx=0x1c33570, conn=0x2b36690, msg=0x22aaf30, nmsg=<optimized out>) at nc_response.c:345
#6  0x0000000000410926 in msg_parsed (msg=0x22aaf30, conn=0x2b36690, ctx=0x1c33570) at nc_message.c:803
#7  msg_parse (msg=0x22aaf30, conn=0x2b36690, ctx=0x1c33570) at nc_message.c:841
#8  msg_recv_chain (msg=0x22aaf30, conn=0x2b36690, ctx=0x1c33570) at nc_message.c:896
#9  msg_recv (ctx=0x1c33570, conn=0x2b36690) at nc_message.c:973
#10 0x000000000040ba44 in core_recv (conn=0x2b36690, ctx=0x1c33570) at nc_core.c:247
#11 core_core (ctx=0x1c33570, conn=0x2b36690, events=1) at nc_core.c:385
#12 0x000000000040bcbc in core_loop (ctx=ctx@entry=0x1c33570) at nc_core.c:446
#13 0x0000000000402b48 in nc_run (nci=0x65a0e0 <nci>) at nc.c:709
#14 main (argc=<optimized out>, argv=0x7fff5f1ee568) at nc.c:763
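The failure is raised in frame 2 (nc_assert). Switch to the frame above it, frame 3, and print the variable pr that the failed assertion in frame 2 refers to.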
(gdb) frame 3
#3  0x0000000000428065 in redis_pre_coalesce (r=0x22aaf30) at nc_redis.c:2387
2387    nc_redis.c: No such file or directory.
(gdb) p *pr
$1 = {c_tqe = {tqe_next = 0x1fb14a0, tqe_prev = 0x25dde00, trace = {lastfile = 0x4472d0 "nc_request.c", lastline = 307, prevfile = 0x4472d0 "nc_request.c", prevline = 329}},
  s_tqe = {tqe_next = 0x0, tqe_prev = 0x0, trace = {lastfile = 0x4472d0 "nc_request.c", lastline = 340, prevfile = 0x4472d0 "nc_request.c", prevline = 316}},
  m_tqe = {tqe_next = 0x0, tqe_prev = 0x0, trace = {lastfile = 0x4463e8 "nc_message.c", lastline = 1065, prevfile = 0x4463e8 "nc_message.c", prevline = 1012}},
  id = 2984568133, peer = 0x22aaf30, owner = 0x28cb430,
  tmo_rbe = {left = 0x0, right = 0x0, parent = 0x0, key = 0, data = 0x0, color = 0 '\000'},
  mhdr = {stqh_first = 0x0, stqh_last = 0x28f3868}, mlen = 4357, state = 0, depth = 0, pos = 0x0, token = 0x0,
  parser = 0x423030 <redis_parse_req>, result = MSG_PARSE_OK, fragment = 0x428270 <redis_fragment>,
  add_auth = 0x4284c0 <redis_add_auth>, failure = 0x427f20 <redis_failure>, reply = 0x428890 <redis_reply>,
  pre_coalesce = 0x427f60 <redis_pre_coalesce>, post_coalesce = 0x428340 <redis_post_coalesce>, mbuf_get = 0x40d960 <mbuf_get>,
  type = MSG_REQ_REDIS_MSET, keys = 0x1f9f450, vlen = 0, end = 0x0, narg_start = 0x0, narg_end = 0x0, narg = 12, rnarg = 0, rnargs = {0, 0, 0}, rlen = 0, integer = 0,
  frag_owner = 0x2834e10, nfrag = 0, frag_id = 134161, nfrag_done = 0, frag_seq = 0x0,
  err = 0, error = 0, ferror = 0, request = 1, quit = 0, noreply = 0, noforward = 0, done = 1, fdone = 0, swallow = 0, redis = 1, mseterr = 0, ticket = 0, concurrent = 0, param_err = 0,
  start_tv = {tv_sec = 0, tv_usec = 0}, send_server_tv = {tv_sec = 1599718440, tv_usec = 23785}, recv_server_tv = {tv_sec = 0, tv_usec = 0},
  dump_len = 0, dump_data = "DEL spring:session:expirations:1599718440000 w_limit:loanDistribution call('pexpire', KEYS[1], ARGV[2]); else return nil; end",
  forward_server = {len = 19, data = 0x1f46ce0 "127.0.0.1:10021:1"}}
The dump_data field shows the command that was being parsed at the time. Only 128 bytes of it are retained.
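For readers unfamiliar with this kind of debug field, here is a minimal sketch of how a bounded snapshot like dump_data could be filled. It is an assumption for illustration only, not the code of this twemproxy build: the struct dump_snapshot, the function dump_snapshot_save and the constant DUMP_DATA_MAX are hypothetical names, and the sketch keeps 127 bytes plus a terminating NUL so that gdb can print the field as a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed capacity, taken from the "128 bytes" remark in the notes. */
#define DUMP_DATA_MAX 128

/* Hypothetical holder for the snapshot; field names follow the gdb output. */
struct dump_snapshot {
    size_t dump_len;
    char   dump_data[DUMP_DATA_MAX];
};

/*
 * Keep at most DUMP_DATA_MAX - 1 bytes of the raw request and NUL-terminate
 * the copy. Anything beyond that is dropped, which is why a long command
 * shows up truncated in the core file. Returns the number of bytes kept.
 */
static size_t dump_snapshot_save(struct dump_snapshot *d, const char *buf, size_t len)
{
    size_t n = len < DUMP_DATA_MAX - 1 ? len : DUMP_DATA_MAX - 1;

    assert(d != NULL && buf != NULL);
    memcpy(d->dump_data, buf, n);
    d->dump_data[n] = '\0';
    d->dump_len = n;
    return n;
}
```

With a field like this, the snapshot is only a prefix of the request, so it identifies the command and the first keys but cannot be relied on for the full argument list.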
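Next, read this output alongside the source of the function.
The printed pr->type is MSG_REQ_REDIS_MSET, yet execution ended up in the pre-coalesce handling meant for MGET and DEL type commands.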
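The mismatch can be pictured with a small sketch. It is a simplified model, not twemproxy's redis_pre_coalesce: the enums and both functions are hypothetical stand-ins, and the only facts it relies on are the Redis reply types (DEL replies with an integer, MGET with an array, MSET with a status string) and the assertion text seen in frame 2. Under that model, one way the assertion can fire is an integer reply being paired with an MSET request.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the message types; not twemproxy's definitions. */
enum req_type { REQ_DEL, REQ_MGET, REQ_MSET };
enum rsp_type { RSP_INTEGER, RSP_MULTIBULK, RSP_STATUS };

/* True when the reply type is one the request type can legitimately produce. */
static bool rsp_matches_req(enum rsp_type rsp, enum req_type req)
{
    switch (rsp) {
    case RSP_INTEGER:   return req == REQ_DEL;  /* DEL replies with a count */
    case RSP_MULTIBULK: return req == REQ_MGET; /* MGET replies with an array */
    case RSP_STATUS:    return req == REQ_MSET; /* MSET replies with a status */
    }
    return false;
}

/*
 * Mirrors the shape of the failing check: when the reply does not fit the
 * peer request, assert() calls abort(), which matches frames #0 to #2 of
 * the backtrace.
 */
static void pre_coalesce_sketch(enum rsp_type rsp, enum req_type req)
{
    assert(rsp_matches_req(rsp, req));
}
```

In the core file the same question can be asked directly: in frame 3, r is the response and pr is its peer request, so comparing r->type with pr->type shows which branch was taken.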
The next step is to read the code and work out why the calling logic reached this branch.