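2017-4-21 Processing packet-capture text with Shell + Python
For my graduation project these past few days I needed to convert Modbus packets into hexadecimal form, but Wireshark was not much help here, or maybe I just have not found the trick yet. The text processing gave me a hard time: I ran into problems I had never thought about before, and it really drove home the saying that you only regret not having studied more when you finally need the knowledge. I am writing these notes down for future use.
I. Directory layout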
[ root@ssd #] ls /tmp
1.txt 10_BCD.sh 7.sh get_final.py README
(1)[ root@ssd #] cat 1.txt ## 1.txt is the raw packet-capture file
No. Time Source Destination Protocol Length Info
246 166.994531 192.168.1.100 192.168.1.101 Modbus/TCP 66 Query: Trans: 0; Unit: 1, Func: 3: Read Holding Registers
Frame 246: 66 bytes on wire (528 bits), 66 bytes captured (528 bits) on interface 0
Ethernet II, Src: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39), Dst: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e)
Destination: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e)
Address: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39)
Address: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 192.168.1.100, Dst: 192.168.1.101
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
Total Length: 52
Identification: 0x6971 (26993)
Flags: 0x02 (Don't Fragment)
Fragment offset: 0
Time to live: 128
Protocol: TCP (6)
Header checksum: 0x0d39 [validation disabled]
[Header checksum status: Unverified]
Source: 192.168.1.100
Destination: 192.168.1.101
[Source GeoIP: Unknown]
[Destination GeoIP: Unknown]
Transmission Control Protocol, Src Port: 58708, Dst Port: 502, Seq: 1, Ack: 1, Len: 12
Source Port: 58708
Destination Port: 502
[Stream index: 1]
[TCP Segment Len: 12]
Sequence number: 1 (relative sequence number)
[Next sequence number: 13 (relative sequence number)]
Acknowledgment number: 1 (relative ack number)
Header Length: 20 bytes
Flags: 0x018 (PSH, ACK)
Window size value: 16425
[Calculated window size: 65700]
[Window size scaling factor: 4]
Checksum: 0xb0f0 [unverified]
[Checksum Status: Unverified]
Urgent pointer: 0
[SEQ/ACK analysis]
[PDU Size: 12]
Modbus/TCP
Transaction Identifier: 0
Protocol Identifier: 0
Length: 6
Unit Identifier: 1
Modbus
.000 0011 = Function Code: Read Holding Registers (3)
Reference Number: 0
Word Count: 10
No. Time Source Destination Protocol Length Info
247 167.015547 192.168.1.101 192.168.1.100 Modbus/TCP 83 Response: Trans: 0; Unit: 1, Func: 3: Read Holding Registers
Frame 247: 83 bytes on wire (664 bits), 83 bytes captured (664 bits) on interface 0
Ethernet II, Src: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e), Dst: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39)
Destination: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39)
Address: HonHaiPr_65:5d:39 (1c:3e:84:65:5d:39)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Source: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e)
Address: AskeyCom_1c:52:1e (e0:ca:94:1c:52:1e)
.... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
.... ...0 .... .... .... .... = IG bit: Individual address (unicast)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 192.168.1.101, Dst: 192.168.1.100
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
Total Length: 69
Identification: 0x1d8e (7566)
Flags: 0x02 (Don't Fragment)
Fragment offset: 0
Time to live: 64
Protocol: TCP (6)
Header checksum: 0x990b [validation disabled]
[Header checksum status: Unverified]
Source: 192.168.1.101
Destination: 192.168.1.100
[Source GeoIP: Unknown]
[Destination GeoIP: Unknown]
Transmission Control Protocol, Src Port: 502, Dst Port: 58708, Seq: 1, Ack: 13, Len: 29
Source Port: 502
Destination Port: 58708
[Stream index: 1]
[TCP Segment Len: 29]
Sequence number: 1 (relative sequence number)
[Next sequence number: 30 (relative sequence number)]
Acknowledgment number: 13 (relative ack number)
Header Length: 20 bytes
Flags: 0x018 (PSH, ACK)
Window size value: 256
[Calculated window size: 65536]
[Window size scaling factor: 256]
Checksum: 0xdaf5 [unverified]
[Checksum Status: Unverified]
Urgent pointer: 0
[SEQ/ACK analysis]
[PDU Size: 29]
Modbus/TCP
Transaction Identifier: 0
Protocol Identifier: 0
Length: 23
Unit Identifier: 1
Modbus
.000 0011 = Function Code: Read Holding Registers (3)
[Request Frame: 246]
Byte Count: 20
Register 0 (UINT16): 0
Register 1 (UINT16): 0
Register 2 (UINT16): 0
Register 3 (UINT16): 1
Register 4 (UINT16): 0
Register 5 (UINT16): 0
Register 6 (UINT16): 0
Register 7 (UINT16): 0
Register 8 (UINT16): 0
Register 9 (UINT16): 0
(2)[ root@ssd #] cat 10_BCD.sh
#!/bin/bash
if [ ! -d test ];then
mkdir test
fi
grep -iA57 "Modbus/TCP 66 " *.txt |grep -iA8 "^Modbus/TCP" >test/b.txt
cd test
yum install dos2unix -y --quiet ## files created on Windows carry a ^M (carriage return) at the end of each line under Linux; dos2unix removes it
dos2unix b.txt
cat b.txt |grep "Transaction" |awk -F ":" '{print $2}'|sed 's/^[ \t]*//g'> 111
cat b.txt |grep "Prot" |awk -F ":" '{print $2}'|sed 's/^[ \t]*//g'> 222
cat b.txt |grep "Leng" |awk -F ":" '{print $2}'|sed 's/^[ \t]*//g'> 333
cat b.txt |grep "Unit Identifier" |awk -F ":" '{print $2}'|sed 's/^[ \t]*//g'> 444
cat b.txt |grep "Function"|grep "Register" |awk -F ":" '{print $2}'|awk -F "(" '{print $2}'|awk -F ")" '{print $1}'> 555
cat b.txt |grep "Refe" |awk -F ":" '{print $2}'|sed 's/^[ \t]*//g'> 666
cat b.txt |grep "Word"|awk -F ":" '{print $2}'|sed 's/^[ \t]*//g'> 777
if [ $? -eq 0 ];then
paste -d "," 111 222 333 444 555 666 777 > c.txt
sed -i '/,,/d' c.txt
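line_number=`awk -F "," '{if ($NF=="") print NR}' c.txt` ## line numbers of the rows whose last field is empty
arr=($line_number) ## turn the string into an array; a bare $arr means arr[0], the first element
if [ -n "$arr" ];then sed -i "${arr},\$d" c.txt; fi ## delete from the first such row to the end of the file; skipped when there is none. Driving sed from shell variables is awkward, and this command cost me a lot of time
cd ..
echo "==== The decimal results are in c.txt under the test directory ====!"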
fi
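One thing to keep in mind with the script above: paste joins the seven temporary files line by line, so if any packet lacks one of the fields, the later rows can shift against each other. As a hedged alternative, here is a minimal Python sketch that extracts the same seven fields packet by packet. The names `PATTERNS`, `extract_rows` and `SAMPLE` are my own and not part of the original scripts, and the sketch assumes the export looks like 1.txt above (each packet starts with a "No. Time Source ..." line and contains a "Modbus/TCP" section header).

```python
import re

# One pattern per output column, in the column order used by c.txt.
PATTERNS = [
    r"^\s*Transaction Identifier:\s*(\d+)",
    r"^\s*Protocol Identifier:\s*(\d+)",
    r"^\s*Length:\s*(\d+)",
    r"^\s*Unit Identifier:\s*(\d+)",
    r"Function Code:.*\((\d+)\)",
    r"^\s*Reference Number:\s*(\d+)",
    r"^\s*Word Count:\s*(\d+)",
]


def extract_rows(text):
    """Return one list of seven ints for each packet that has all seven fields."""
    rows = []
    # Every packet in the export starts with a "No. Time Source ..." header line.
    for block in re.split(r"(?m)^No\.\s+Time\b.*$", text):
        # Only look below the "Modbus/TCP" section header, so the IP
        # "Total Length" and TCP "Header Length" lines are never picked up.
        header = re.search(r"(?m)^\s*Modbus/TCP\s*$", block)
        if header is None:
            continue
        section = block[header.end():]
        matches = [re.search(p, section, re.M) for p in PATTERNS]
        # Responses have no Reference Number / Word Count, so they are skipped.
        if all(matches):
            rows.append([int(m.group(1)) for m in matches])
    return rows


# A shortened, hypothetical excerpt in the same shape as 1.txt.
SAMPLE = """No. Time Source Destination Protocol Length Info
246 166.994531 192.168.1.100 192.168.1.101 Modbus/TCP 66 Query: Trans: 0; Unit: 1, Func: 3: Read Holding Registers
Total Length: 52
Modbus/TCP
Transaction Identifier: 0
Protocol Identifier: 0
Length: 6
Unit Identifier: 1
Modbus
.000 0011 = Function Code: Read Holding Registers (3)
Reference Number: 0
Word Count: 10
No. Time Source Destination Protocol Length Info
247 167.015547 192.168.1.101 192.168.1.100 Modbus/TCP 83 Response: Trans: 0; Unit: 1, Func: 3: Read Holding Registers
Modbus/TCP
Transaction Identifier: 0
Protocol Identifier: 0
Length: 23
Unit Identifier: 1
Modbus
.000 0011 = Function Code: Read Holding Registers (3)
Byte Count: 20
"""

for row in extract_rows(SAMPLE):
    print(",".join(str(v) for v in row))
```

Because every row is built from a single packet, a packet with a missing field is dropped as a whole instead of shifting the columns of the rows after it.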
(3)[ root@ssd # ] cat get_final.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import subprocess  ## replaces the Python 2-only commands module

def run_quiet(script):  ## run a helper shell script and discard its output
    with open(os.devnull, 'w') as devnull:
        subprocess.call(['/bin/bash', script], stdout=devnull, stderr=devnull)

run_quiet('10_BCD.sh')

def num_bcd(num):  ## decimal to hexadecimal, four digits as two bytes, e.g. 25 -> "00,19,"
    a = '%04x' % num  ## zero-padded, so 10 -> "000a" and 16 -> "0010"
    return a[:2] + ',' + a[2:] + ','

def fun2(num):  ## two hex digits (one byte), e.g. 10 -> "0a," rather than "00,0a,"
    return '%02x' % num + ','

contents = []
f = open('test/c.txt')
for line in f.readlines():
    b = line.strip().split(",")  ## the line (a string) becomes a list
    for i in range(len(b)):
        if b[i].strip() == "":  ## an empty field means the data frame is incomplete
            break
        b[i] = int(b[i])
        if i == 3 or i == 4:  ## the 4th and 5th numbers of a frame keep only 2 digits
            contents.append(fun2(b[i]))
        else:
            contents.append(num_bcd(b[i]))
f.close()
filename = 'new.ini'
fobj = open(filename, 'w')
fobj.writelines(['%s%s' % (eachline, os.linesep) for eachline in contents])  ## the converted fields, one per line
fobj.close()
run_quiet('7.sh')
print("The result is in the final_Result file!")
(4)[ root@ssd # ] cat 7.sh
#!/bin/bash
cat new.ini | awk -F "," '{if (NR%7!=0)ORS=" ";else ORS="\n";print}' >final_Result
if [ -f new.ini ];then
rm -f new.ini
fi
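For comparison, the regrouping that 7.sh does with awk can also be done without the intermediate new.ini file. This is only a sketch: `group_fields` and the sample `tokens` list are hypothetical names of mine, and the tokens are assumed to be the already converted hexadecimal fields, seven per frame.

```python
def group_fields(tokens, per_frame=7):
    """Join every `per_frame` tokens into one space-separated line."""
    lines = []
    for start in range(0, len(tokens), per_frame):
        lines.append(" ".join(tokens[start:start + per_frame]))
    return lines


# The seven converted fields of one frame, in the order they were produced.
tokens = ["00,20,", "00,00,", "00,06,", "01,", "03,", "00,00,", "00,0a,"]
for line in group_fields(tokens):
    print(line)
```

If the number of tokens is not a multiple of seven, the last line is simply shorter, which makes an incomplete frame easy to spot.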
(5)[ root@ssd # ] cat README
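=================== Usage guide ============================
All .txt files are raw packet-capture files!
Note: just run python get_final.py; the resulting data frames are saved in the final_Result file
Process:
1. Running python get_final.py first calls 10_BCD.sh, which converts the raw capture files into decimal values: seven small files are written in the test directory and finally merged into c.txt
2. The Python body converts decimal to hexadecimal, but at this point the seven hexadecimal values of each frame are still spread over separate lines
3. Finally 7.sh is called to put each group of seven hexadecimal values on one line, which gives the final result, final_Result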
II. Results
[root@ssd modbus]# cat test/c.txt ## the data starts out in this format
32,0,6,1,3,0,10
32,0,23,1,3,0,10
33,0,6,1,3,0,10
33,0,23,1,3,0,10
34,0,6,1,3,0,10
35,0,6,1,3,0,10
36,0,6,1,3,0,10
37,0,6,1,3,0,10
34,0,23,1,3,0,10
38,0,6,1,3,0,10
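#32,0,6,1,3,0,, # at first I could not get rid of rows like this one, with two commas and no number between them
#42,0,6,1,3,0,, # in the shell, awk finds the matching line number, arr turns it into an array, and sed then deletes from that line to the end of the file: sed -i $arr',$d' c.txt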
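The cleanup described in the two notes above can be sketched in Python as well. Note the difference in behavior: the sed command deletes everything from the first incomplete row to the end of the file, while this sketch drops only the incomplete rows themselves. `complete_rows` and the sample `rows` list are hypothetical names used for illustration.

```python
def complete_rows(lines, n_fields=7):
    """Keep only rows that have exactly n_fields non-empty, numeric fields."""
    kept = []
    for line in lines:
        fields = [f.strip() for f in line.strip().split(",")]
        if len(fields) == n_fields and all(f.isdigit() for f in fields):
            kept.append(",".join(fields))
    return kept


rows = [
    "32,0,6,1,3,0,10",
    "32,0,6,1,3,0,,",   # two trailing commas: too many (empty) fields
    "42,0,6,1,3,0,",    # last field is empty
    "33,0,23,1,3,0,10",
]
print(complete_rows(rows))
```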
[root@ssd modbus]# cat final_Result ## the result has to be in this hexadecimal form
00,20, 00,00, 00,06, 01, 03, 00,00, 00,0a,
00,20, 00,00, 00,17, 01, 03, 00,00, 00,0a,
00,21, 00,00, 00,06, 01, 03, 00,00, 00,0a,
00,21, 00,00, 00,17, 01, 03, 00,00, 00,0a,
00,22, 00,00, 00,06, 01, 03, 00,00, 00,0a,
00,23, 00,00, 00,06, 01, 03, 00,00, 00,0a,
00,24, 00,00, 00,06, 01, 03, 00,00, 00,0a,
00,25, 00,00, 00,06, 01, 03, 00,00, 00,0a,
00,22, 00,00, 00,17, 01, 03, 00,00, 00,0a,
00,26, 00,00, 00,06, 01, 03, 00,00, 00,0a,
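As a cross-check of the first row above, the same seven decimal fields can be packed into bytes with Python's struct module. This is a sketch rather than part of the original scripts: `frame_hex` is a hypothetical helper, and the layout assumed is the one visible in the capture, namely 16-bit big-endian transaction identifier, protocol identifier and length, one byte each for the unit identifier and function code, then a 16-bit reference number and word count.

```python
import binascii
import struct


def frame_hex(trans_id, proto_id, length, unit_id, func, ref, count):
    """Pack one Modbus/TCP read request and return it as a hex string."""
    # ">" selects big-endian; H is an unsigned 16-bit value, B an unsigned byte.
    raw = struct.pack(">HHHBBHH", trans_id, proto_id, length, unit_id, func, ref, count)
    return binascii.hexlify(raw).decode("ascii")


# First row of c.txt: 32,0,6,1,3,0,10
print(frame_hex(32, 0, 6, 1, 3, 0, 10))  # prints 00200000000601030000000a
```

The packed frame is 12 bytes long, which agrees with the "Len: 12" reported for the query in the TCP layer of the capture, and its bytes match the first line of final_Result once the commas and spaces are removed.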