Debugging a production issue (combined use of cat, grep, tr, awk, sort, uniq, comm and similar tools)
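Problem: the quantity on a production arrival order was lower than the quantity that actually arrived. The suspicion was that some hidden condition filtered out part of the unique codes, which reduced the count.
The investigation went as follows: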
1. Find the unique code of one of the line items and collect the log lines that mention it
grep 6180e-4b09f pms.log > tmp1
2. Narrow that down to the log output of the method that misbehaves
grep purchaseConfirm tmp1 > tmp2
The content looks like this:
2017-02-28 16:14:25.040 [DubboServerHandler-10.26.235.193:20885-thread-100] INFO com.ejlerp.dal.framework.service.advice.AvoidRepeatInvokeAdvice.aroundAdvice - 拦截幂等性方法:purchaseConfirm,参数列表:[CallerInfo{tenantId=100033, operatorId=100163, timestamp=1488269665036, remark='null'}, 100114, [id=null,pmsArrivalRecordId=null,scanCode=D-A237-C78-157X,skuNo=null,skuId=0,uniqueCode=null,vendorId=0,purchaserId=0,arrivalNum=1,creator=100163,createdAt=Tue Feb 28 16:14:25 CST 2017,lastUpdater=null,lastUpdated=null,tenantId=100033,isUsable=null, id=null,pmsArrivalRecordId=null,scanCode=62567-52f59,skuNo=null,skuId=0,uniqueCode=null,vendorId=0,purchaserId=0,arrivalNum=1,creator=100163,createdAt=Tue Feb 28 16:14:25 CST 2017,lastUpdater=null,lastUpdated=null,tenantId=100033,isUsable=null, id=null,pmsArrivalRecordId=null,scanCode=64065-4c942,skuNo=null,skuId=0,uniqueCode=null,vendorId=0,purchaserId=0,arrivalNum=1,creator=100163,createdAt=Tue Feb 28 16:14:25 CST 2017,lastUpdater=null,lastUpdated=null,tenantId=100033,isUsable=null, id=null,pmsArrivalRecordId=null,scanCode=62928-4ce7e,skuNo=null,skuId=0,uniqueCode=null,vendorId=0,purchaserId=0,arrivalNum=1,creator=100163,createdAt=Tue Feb 28 16:14:25 CST 2017,lastUpdater=null,lastUpdated=null,tenantId=100033,isUsable=null, id=null,pmsArrivalRecordId=null,scanCode=64594-4c667,skuNo=null,skuId=0,uniqueCode=null,vendorId=0,purchaserId=0,arrivalNum=1,creator=100163,createdAt=Tue Feb 28 16:14:25 CST 2017,lastUpdater=null,lastUpdated=null,tenantId=100033,isUsable=null, id=null,pmsArrivalRecordId=null,scanCode=6238f-4b71b,skuNo=null,skuId=0,uniqueCode=null,vendorId=0,purchaserId=0,arrivalNum=1,creator=100163,createdAt=Tue Feb 28 16:14:25 CST 2017,lastUpdater=null,lastUpdated=null,tenantId=100033,isUsable=null, id=null,pmsArrivalRecordId=null,scanCode=6217b-55c88,skuNo=null,skuId=0,uniqueCode=null,vendorId=0,purchaserId=0,arrivalNum=1,creator=100163,createdAt=Tue Feb 28 16:14:25 CST 
2017,lastUpdater=null,lastUpdated=null,tenantId=100033,isUsable=null, id=null,pmsArrivalRecordId=null,scanCode=62853-51e41,skuNo=null,skuId=0,uniqueCode=null,vendorId=0,purchaserId=0,arrivalNum=1,creator=100163,createdAt=Tue Feb 28 16:14:25 CST 2017,lastUpdater=null,lastUpdated=null,tenantId=100033,isUsable=null, id=null,pmsArrivalRecordId=null,scanCode=629e0-4b6f4,skuNo=null,skuId=0,uniqueCode=null,vendorId=0,purchaserId=0,arrivalNum=1,creator=100163,createdAt=Tue Feb 28 16:14:25 CST 2017,lastUpdater=null,lastUpdated=null,tenantId=100033,isUsable=null, id=null,pmsArrivalRecordId=null,scanCode=628a4-49850,skuNo=null,
This entry contains every unique code that was sent in the request.
3. Extract the unique code field (scanCode)
cat tmp2 | tr ',' '\n' | grep scanCode | awk '{gsub("scanCode=",""); print $0}' > part1
part1 now holds the most complete set of unique codes, as recorded in the log.
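Before running the extraction against the real log, it can be worth trying it on a short sample. The sketch below is only an illustration: the helper name `extract_scan_codes` and the shortened sample line are assumptions made for the example (the two scan codes are borrowed from the log entry above).

```shell
# Hypothetical helper: read log text on stdin, print one scanCode value per line.
extract_scan_codes() {
    tr ',' '\n' |                                   # put each key=value field on its own line
        grep 'scanCode=' |                          # keep only the scanCode fields
        awk '{ sub(/^.*scanCode=/, ""); print }'    # strip everything up to and including the key
}

# Shortened sample in the same shape as the purchaseConfirm entry (illustrative only).
sample='[id=null,pmsArrivalRecordId=null,scanCode=D-A237-C78-157X,skuNo=null, id=null,pmsArrivalRecordId=null,scanCode=62567-52f59,skuNo=null]'

printf '%s\n' "$sample" | extract_scan_codes
```

Against the real data the same helper would be used as `extract_scan_codes < tmp2 > part1`.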
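4. Query the database for the unique codes that were actually inserted and save them to the file part2
5. Sort and de-duplicate part1 and part2
cat part1 | sort | uniq > med1
cat part2 | sort | uniq > med2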
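A note on the sorting step: `comm` expects both inputs to be sorted in the same collation order, so pinning the locale while sorting avoids surprises when the two files were produced in different environments. A minimal sketch, assuming a POSIX `sort`; the temporary directory and the file contents are made up for the example (codes borrowed from the log above).

```shell
workdir=$(mktemp -d)

# Made-up input with one duplicate, standing in for part1.
printf '%s\n' 62567-52f59 D-A237-C78-157X 62567-52f59 64065-4c942 > "$workdir/part1"

# sort -u does the same job as sort | uniq; LC_ALL=C forces plain byte ordering.
LC_ALL=C sort -u "$workdir/part1" > "$workdir/med1"

cat "$workdir/med1"
```

If the files are sorted with `LC_ALL=C`, the later comparison should be run with `LC_ALL=C` as well.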
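6. The left set (med1) is the larger one and most entries are identical, so list the codes that exist on the left but not on the right
comm -23 med1 med2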
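For reference, `comm` prints three columns: lines found only in the first file, lines found only in the second file, and lines common to both. The options `-1`, `-2` and `-3` each hide the corresponding column. The sketch below uses made-up file contents (codes borrowed from the log above) to show how "in the log but not in the database" can be computed; the directory and variable names are assumptions for the example.

```shell
workdir=$(mktemp -d)

# Made-up sample data; both files are already sorted.
printf '%s\n' 62567-52f59 62928-4ce7e 64065-4c942 > "$workdir/med1"   # codes seen in the log
printf '%s\n' 62567-52f59 64065-4c942 > "$workdir/med2"               # codes found in the database

# -2 hides lines unique to med2, -3 hides lines common to both,
# leaving only the codes that are in the log but missing from the database.
missing=$(comm -23 "$workdir/med1" "$workdir/med2")

printf '%s\n' "$missing"
```

With `-3` alone, codes that exist only in the second file would also be printed (indented by a tab), which is harmless when the database set is a strict subset of the log set but misleading otherwise.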
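7. This yields 3 missing unique codes
8. After verification, the causes were the platform swapping the SKU of a product, and order processing deleting a product, both of which invalidated the unique code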
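Summary:
Many production issues can be pinned down with SQL alone,
but some only flash by in the debug logs. It therefore pays to have more weapons at hand (cat, grep, tr, awk, sort, uniq, comm and so on) to track down the wide variety of problems that come up at work.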