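Learning pyd Reverse Engineering from CISCN

Generating a pyd file

A compiled Python extension module usually takes the form of a .so file on Linux and a .pyd file on Windows.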
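To check which suffixes the local interpreter accepts for extension modules, the standard library exposes the list directly. This is a minimal sketch; the helper name `extension_suffixes` is my own and not part of the original post.

```python
import importlib.machinery


def extension_suffixes():
    """Return the file suffixes this interpreter accepts for extension modules."""
    return list(importlib.machinery.EXTENSION_SUFFIXES)


if __name__ == "__main__":
    # The last entry is typically ".pyd" on Windows and ".so" on Linux.
    print(extension_suffixes())
```

The longer, version-tagged suffixes in the list also hint at which Python version a given pyd was built for.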

pip install cython

Write a test file:

vim test_for_pyd.py

from ctypes import *
from struct import pack, unpack

def encrypt(v, key):
    v0, v1 = c_uint32(v[0]), c_uint32(v[1])
    delta = 0x54646454
    total = c_uint32(0)
    for i in range(64):
        v0.value += (((v1.value << 3) ^ (v1.value >> 6)) + v1.value) ^ (total.value + key[total.value & 3])
        # print(hex(v1.value << 3),hex(v1.value >> 6),hex((((v1.value << 3) ^ (v1.value >> 6)) + v1.value) ^ (total.value + key[total.value & 3])))
        total.value += delta
        v1.value += (((v0.value << 3) ^ (v0.value >> 6)) + v0.value) ^ (total.value + key[(total.value >> 11) & 3])
        # print(hex(v0.value << 3),hex(v0.value >> 6),hex((((v0.value << 3) ^ (v0.value >> 6)) + v0.value) ^ (total.value + key[total.value & 3])))
    print(hex(v0.value), hex(v1.value))
    return v0.value, v1.value

def decrypt(v, key):
    v0, v1 = c_uint32(v[0]), c_uint32(v[1])
    delta = 0x54646454
    total = c_uint32(delta * 64)
    for i in range(64):
        v1.value -= (((v0.value << 3) ^ (v0.value >> 6)) + v0.value) ^ (total.value + key[(total.value >> 11) & 3])
        total.value -= delta
        v0.value -= (((v1.value << 3) ^ (v1.value >> 6)) + v1.value) ^ (total.value + key[total.value & 3])
    return v0.value, v1.value

def xtea(inp: bytes, key: bytes):
    from struct import pack, unpack
    k = unpack("<4I", key)
    inp_len = len(inp) // 4
    value = unpack(f"<{inp_len}I", inp)
    res = b""
    for i in range(0, inp_len, 2):
        v = [value[i], value[i+1]]
        # x = encrypt(v,k)
        x = decrypt(v, k)
        res += pack("<2I", *x)
    return res

l2b = lambda lst: b''.join(i.to_bytes(4, 'little') for i in lst)

cip_li = [0x481F56C9, 0xc7EE5EF4, 0xa00A5B72, 0x7648F086, 0x307F948, 0x29B379B0]
cip = l2b(cip_li)
key = b"f\x00\x00\x00l\x00\x00\x00a\x00\x00\x00g\x00\x00\x00"
ret = xtea(cip, key)

flag = b""
for i in range(0, len(ret), 4):
    x = (ret[i:i+4])[::-1]
    flag += x
print(flag)

Write the setup.py file:

from setuptools import setup
from Cython.Build import cythonize

setup(
    name="test_for_pyd",
    ext_modules=cythonize('test_for_pyd.py')
)

python setup.py build_ext --inplace

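Running this command compiles the module and produces the pyd file.

By default, a pyd built this way has no symbols on Windows, whereas on Linux the resulting .so keeps its symbols.

However, we can explicitly ask for a pyd file with symbols: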

python setup.py build_ext --inplace --debug

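This requires installing the extra component shown in the screenshot below:

[Image: image-20241218100645851]

Otherwise the build fails because python312_d.lib is missing.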

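pyd reverse-engineering techniques

Trying to recover symbols

You can compile a demo with the matching Python version and then use BinDiff to recover some of the symbols.

That said, unless the pyd file is very large, I don't think recovering symbols with BinDiff is usually necessary.

Printing all attributes of the pyd module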

import rand0m
x = dir(rand0m)
print(x)

[Image: image-20241218101617939]

This gives a first overview of what the module contains.
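Beyond plain `dir()`, it can help to see what kind of object each name refers to. Below is a minimal sketch; `describe_module` is a hypothetical helper of mine, and the standard `math` module stands in for the challenge module so the snippet runs anywhere.

```python
import math


def describe_module(mod):
    """Map each public attribute name of a module to the name of its type."""
    return {
        name: type(getattr(mod, name)).__name__
        for name in dir(mod)
        if not name.startswith("_")
    }


if __name__ == "__main__":
    # Replace `math` with the imported pyd module (e.g. rand0m) in practice.
    for name, kind in describe_module(math).items():
        print(f"{name:12} {kind}")
```

Names whose type is a function or method type are the ones worth calling or locating in the disassembly.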

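Calling the pyd's functions directly

From the output above we can see that the rand0m module exposes functions such as check and rand0m, so we can call them directly, for example rand0m.rand0m().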

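Locating the key functions

[Image: image-20241218101935007]

Browse the .data section directly: typically the "check" string comes first, and the pointer to the corresponding function sits right below it.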
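The "name string followed by function pointer" pattern matches the layout of CPython's `PyMethodDef` entries. That the table seen in .data is a `PyMethodDef` array is my assumption, since the post does not name the structure. A small ctypes mirror makes the field order and offsets explicit:

```python
import ctypes


class PyMethodDef(ctypes.Structure):
    """Mirror of CPython's PyMethodDef, one entry of a method table."""
    _fields_ = [
        ("ml_name", ctypes.c_char_p),   # pointer to the name string, e.g. "check"
        ("ml_meth", ctypes.c_void_p),   # pointer to the C implementation
        ("ml_flags", ctypes.c_int),     # calling convention (METH_* flags)
        ("ml_doc", ctypes.c_char_p),    # pointer to the docstring, may be NULL
    ]


if __name__ == "__main__":
    print("entry size    :", ctypes.sizeof(PyMethodDef))
    print("ml_meth offset:", PyMethodDef.ml_meth.offset)
```

On a 64-bit build, the function pointer therefore sits 8 bytes after the pointer to the name string.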

Dynamic debugging

You can write your own Python script that calls the functions in rand0m, pause it with input(), and then attach IDA to the process.
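The pause-and-attach idea can be sketched as follows. `pause_then_call` and its `prompt` parameter are hypothetical names of mine, and the argument passed to `rand0m.check` in the comment is only a placeholder, since the post does not show its signature.

```python
import os


def pause_then_call(func, *args, prompt=input):
    """Show the PID, wait so a debugger can attach, then call func(*args)."""
    prompt(f"PID {os.getpid()}: attach the debugger, then press Enter... ")
    return func(*args)


if __name__ == "__main__":
    # Real usage (hypothetical argument):
    #   import rand0m
    #   pause_then_call(rand0m.check, "your_input_here")
    # The demo below passes prompt=print so it does not block.
    print(pause_then_call(len, "demo", prompt=print))
```

Printing the PID makes it easy to pick the right python.exe in the debugger's attach dialog.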

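Hooking the relevant library functions with frida

The idea is to hook the functions that implement the arithmetic and logic operations, as in the script below. The drawback is that you have to parse the PyLongObject structure yourself, because the layout of the Long type differs between Python versions. This approach also cannot catch every operation: operations such as Lshift, Rshift and And may not be hookable, because the pyd may implement them inline instead of going through the functions exported by Python.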

var hook_list = [
    "PyNumber_Add",
    "PyNumber_And",
    "PyNumber_Rshift",
    "PyNumber_Lshift",
    "PyNumber_Xor",
    "PyNumber_InPlaceRshift",
    "PyNumber_InPlaceAdd",
    "PyNumber_Multiply",
    "PyNumber_Power",
    "PyNumber_Index",
    "PyObject_RichCompare",
    "PyNumber_Remainder"
]

/*
=========> python12 PyLongObject structure <========

#define _PyLong_NON_SIZE_BITS 3

The number of digits (ndigits) is stored in the high bits of
the lv_tag field (lv_tag >> _PyLong_NON_SIZE_BITS).
The sign of the value is stored in the lower 2 bits of lv_tag.
  - 0: Positive
  - 1: Zero
  - 2: Negative
The third lowest bit of lv_tag is reserved for an immortality flag,
but is not currently used.

struct _object {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
};

typedef struct _PyLongValue {
    uintptr_t lv_tag;   // Number of digits, sign and flags
    digit ob_digit[1];
} _PyLongValue;

typedef struct _object PyObject;
#define PyObject_HEAD PyObject ob_base;

struct _longobject {
    PyObject_HEAD
    _PyLongValue long_value;
};
typedef struct _longobject PyLongObject;
*/

// Python 3.12 layout: sign and digit count are packed into lv_tag.
function parseInt_python12(addr){
    var ob_refcnt = addr.readU64()
    var ob_type = addr.add(0x8).readU64()
    var lv_tag = addr.add(0x10).readU64().toNumber()
    var sign = lv_tag & 3
    var numdigits = lv_tag >> 3
    let val = 0
    for(var i=0;i<numdigits;i++){
        val += addr.add(0x18 + 4*i).readU32() * (2 ** (30*i)) // plain Number, loses precision for big values
    }
    return (1-sign) * val
}

// Layout used up to Python 3.11 (signed ob_size). In the original script this
// function was also named parseInt_python12 and silently overrode the one above,
// so it is renamed here and kept only for reference.
function parseInt_python11(addr){
    var ob_refcnt = addr.readU64()
    var ob_type = addr.add(0x8).readU64()
    var ob_size = addr.add(0x10).readS64().toNumber() // signed, not unsigned
    var numdigits = ob_size
    if(ob_size < 0){
        numdigits = -numdigits;
    }
    let val = 0
    if (numdigits > 0x10000){ // too big, maybe not a PyLong
        val = addr.add(0x18).readU32()
        console.log("unexpected data , " + "0x" + val.toString(16))
    }else{
        for(var i=0;i<numdigits;i++){
            val += addr.add(0x18 + 4*i).readU32() * (2 ** (30*i)) // plain Number, loses precision for big values
        }
    }
    if(ob_size == 0){
        return 0
    }else if(ob_size > 0){
        return val
    }else if(ob_size < 0){
        return -val
    }
}

for(let i=0;i<hook_list.length;i++){
    var funcAddress = Module.findExportByName('python312.dll', hook_list[i]);
    if(funcAddress !== null){
        const op = hook_list[i]
        Interceptor.attach(funcAddress,{
            onEnter: function(args){
                this.arg1 = parseInt_python12(args[0])
                this.arg2 = parseInt_python12(args[1])
            },
            onLeave: function(retval){
                var ret = parseInt_python12(retval)
                var pri = "0x" + this.arg1.toString(16).padEnd(20) + op.padEnd(25) + "0x" + this.arg2.toString(16).padEnd(20)
                console.log(pri + "==>\t" + "0x" + ret.toString(16))
            },
        })
    }else{
        console.log("fail find export fun");
    }
}
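The same digit-walking logic can be tried out inside a running interpreter, which is a convenient way to sanity-check the offsets before writing the hook. This is a sketch under assumptions: a standard (non-free-threaded) CPython build where `id()` returns the object's address, with the header sizes taken from `object.__basicsize__` and `int.__basicsize__`. `read_pylong` is a name of mine.

```python
import ctypes
import sys


def read_pylong(addr):
    """Decode the int object at address addr by walking its raw digits."""
    bits = sys.int_info.bits_per_digit       # 30 on typical 64-bit builds
    digit_size = sys.int_info.sizeof_digit   # 4 on typical 64-bit builds
    digit_t = ctypes.c_uint32 if digit_size == 4 else ctypes.c_uint16
    tag_off = object.__basicsize__           # skips ob_refcnt + ob_type
    digits_off = int.__basicsize__           # start of ob_digit[]

    if sys.version_info >= (3, 12):
        # 3.12+: lv_tag packs the digit count (>> 3) and the sign (& 3)
        lv_tag = ctypes.c_size_t.from_address(addr + tag_off).value
        ndigits = lv_tag >> 3
        sign = 1 - (lv_tag & 3)              # 0 -> +1, 1 -> 0, 2 -> -1
    else:
        # <= 3.11: signed ob_size, negative for negative numbers
        ob_size = ctypes.c_ssize_t.from_address(addr + tag_off).value
        ndigits = abs(ob_size)
        sign = (ob_size > 0) - (ob_size < 0)

    value = 0
    for i in range(ndigits):
        d = digit_t.from_address(addr + digits_off + digit_size * i).value
        value |= d << (bits * i)
    return sign * value


if __name__ == "__main__":
    n = -12345678901234567890
    print(read_pylong(id(n)) == n)
```

Unlike the JavaScript version, Python integers are arbitrary precision, so values above 2**53 are reconstructed exactly here.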

In my opinion, frida combined with dynamic debugging is a killer combination.

Related CTF challenges

rand0m

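First, hook the key functions with frida. The script is the same Python 3.12 hook script shown in the frida section above.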


After that, step through the logic in the debugger. In the end, 6 bits per block have to be brute-forced.

exp:

# enc: the check logic as reconstructed from the hook output (not called below)
def enc():
    flag = "1122334455667777abcdefffaa44ff55"
    # cmp_list1 = [0x112287f38,0x10a30f74d,0x1023a1268,0x208108807]
    cmp1 = [0x12287F38, 0x4a30f74d, 0x23a1268, 0x88108807]
    cmp2 = [0x98d24b3a, 0xe0f1db77, 0xadf38403, 0xd8499bb6]
    for i in range(4):
        part = int(flag[8*i+0:8*i+8], 16)
        x = (part >> 28)
        y = (part << 4)
        y &= 0xfa3affff
        y += x
        assert y == cmp1[i]
        p1 = (part ^ 0x9e3779b9) >> 0xb
        res = (p1 ** 0x10001) % 0xfffffffd
        assert res == cmp2[i]

# dec
import gmpy2

def set_bit(idx: int, val: int, sel: int):
    if sel == 0:
        return val & ~(1 << idx)
    else:
        return val | (1 << idx)

# 0xfa3affff ==> 1111 1010 0011 1010 1111 1111 1111 1111
# brute-force the six bits cleared by the mask --> 2**6 candidates per block
bin_idx_list = [26, 24, 23, 22, 18, 16]
cip1 = [0x12287F38, 0x4a30f74d, 0x23a1268, 0x88108807]
cip2 = [0x98d24b3a, 0xe0f1db77, 0xadf38403, 0xd8499bb6]
flag = []
for i in range(len(cip1)):
    val = cip1[i]
    x = val & 0xf
    y = val - x
    # blast y ==> 2 ** 6
    for sel1 in range(2):
        y = set_bit(bin_idx_list[0], y, sel1)
        for sel2 in range(2):
            y = set_bit(bin_idx_list[1], y, sel2)
            for sel3 in range(2):
                y = set_bit(bin_idx_list[2], y, sel3)
                for sel4 in range(2):
                    y = set_bit(bin_idx_list[3], y, sel4)
                    for sel5 in range(2):
                        y = set_bit(bin_idx_list[4], y, sel5)
                        for sel6 in range(2):
                            y = set_bit(bin_idx_list[5], y, sel6)
                            part = ((x << 28) | (y >> 4)) & 0xffffffff
                            p1 = (part ^ 0x9e3779b9) >> 0xb
                            res = gmpy2.powmod(p1, 0x10001, 0xfffffffd)
                            if res == cip2[i]:
                                flag.append(part)
                                print(f"{i+1} ==> {hex(part)}")

for i in flag:
    t = f"{i:08x}"  # zero-padded so a leading 0 nibble is not dropped
    print(t, end=" ")
# 813a97f3d4b34f74802ba12678950880
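The six nested loops in the exp above can also be written more compactly by deriving the unknown bit positions from the mask and enumerating them with `itertools.product`. This is a sketch of the same idea, not the author's script: the function names are mine, and the built-in three-argument `pow` replaces `gmpy2.powmod`.

```python
from itertools import product

MASK = 0xFA3AFFFF


def cleared_bits(mask, width=32):
    """Indices of the bits that the mask forces to zero."""
    return [i for i in range(width) if not (mask >> i) & 1]


def stage1_candidates(c1, mask=MASK):
    """Yield every 32-bit p with ((p << 4) & mask) + (p >> 28) == c1."""
    x = c1 & 0xF                      # top nibble of p lands in the low nibble
    base = (c1 - x) & mask
    unknown = cleared_bits(mask)
    for choice in product((0, 1), repeat=len(unknown)):
        y = base
        for idx, bit in zip(unknown, choice):
            y |= bit << idx           # restore one guess for the masked bits
        yield ((x << 28) | (y >> 4)) & 0xFFFFFFFF


def solve_block(c1, c2):
    """Keep only the candidates that also pass the modular-power check."""
    return [
        p for p in stage1_candidates(c1)
        if pow((p ^ 0x9E3779B9) >> 0xB, 0x10001, 0xFFFFFFFD) == c2
    ]


if __name__ == "__main__":
    cip1 = [0x12287F38, 0x4A30F74D, 0x023A1268, 0x88108807]
    cip2 = [0x98D24B3A, 0xE0F1DB77, 0xADF38403, 0xD8499BB6]
    for c1, c2 in zip(cip1, cip2):
        print([f"{p:08x}" for p in solve_block(c1, c2)])
```

Deriving the positions from the mask avoids hard-coding `[26, 24, 23, 22, 18, 16]` and keeps the search at 2**6 candidates per block.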

cython

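Hook it with frida in the same way. Note that this challenge uses Python 3.11, whose PyLongObject layout is slightly different.

Then debug dynamically. In fact debugging is not even needed, since the logic is recognizable at a glance as a modified XTEA.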

const hook_list = [
    "PyNumber_Add",
    "PyNumber_And",
    "PyNumber_Rshift",
    "PyNumber_Lshift",
    "PyNumber_Xor",
    "PyNumber_InPlaceRshift",
    "PyNumber_InPlaceAdd",
    "PyNumber_Multiply",
    "PyNumber_Power",
    "PyNumber_Index",
    "PyObject_RichCompare",
    "PyNumber_Remainder"
]

/*
=========> python11 PyLongObject structure <========

===> Long integer representation.
The absolute value of a number is equal to
     SUM(for i=0 through abs(ob_size)-1) ob_digit[i] * 2**(SHIFT*i)
Negative numbers are represented with ob_size < 0;
zero is represented by ob_size == 0.
In a normalized number, ob_digit[abs(ob_size)-1] (the most significant
digit) is never zero. Also, in all cases, for all valid i,
     0 <= ob_digit[i] <= MASK.
The allocation function takes care of allocating extra memory
so that ob_digit[0] ... ob_digit[abs(ob_size)-1] are actually available.
We always allocate memory for at least one digit, so accessing ob_digit[0]
is always safe. However, in the case ob_size == 0, the contents of
ob_digit[0] may be undefined.

CAUTION: Generic code manipulating subtypes of PyVarObject has to be
aware that ints abuse ob_size's sign bit.

typedef __int64 Py_ssize_t;

struct _object {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
};
typedef struct _object PyObject;

typedef intptr_t Py_ssize_t;

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size; // Number of items in variable part
} PyVarObject;

#define PyObject_VAR_HEAD PyVarObject ob_base;

struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};
typedef struct _longobject PyLongObject;
*/

function deepClone(obj) {
    return JSON.parse(JSON.stringify(obj));
}

function parseInt_python11(addr){
    var ob_refcnt = addr.readU64()
    var ob_type = addr.add(0x8).readU64()
    var ob_size = addr.add(0x10).readS64().toNumber() // signed, not unsigned
    var numdigits = ob_size
    if(ob_size < 0){
        numdigits = -numdigits;
    }
    let val = 0
    if (numdigits > 0x10000){ // too big, maybe not a PyLong
        val = addr.add(0x18).readU32()
        console.log("unexpected data , " + "0x" + val.toString(16))
    }else{
        for(var i=0;i<numdigits;i++){
            val += addr.add(0x18 + 4*i).readU32() * (2 ** (30*i)) // plain Number, loses precision for big values
        }
    }
    if(ob_size == 0){
        return 0
    }else if(ob_size > 0){
        return val
    }else if(ob_size < 0){
        return -val
    }
}

for(let i=0;i<hook_list.length;i++){
    var funcAddress = Module.findExportByName('python311.dll', hook_list[i]);
    if(funcAddress !== null){
        const op = hook_list[i]
        Interceptor.attach(funcAddress,{
            onEnter: function(args){
                this.arg1 = parseInt_python11(args[0])
                this.arg2 = parseInt_python11(args[1])
            },
            onLeave: function(retval){
                var ret = parseInt_python11(retval)
                var pri = "0x" + this.arg1.toString(16).padEnd(20) + op.padEnd(25) + "0x" + this.arg2.toString(16).padEnd(20)
                console.log(pri + "==>\t" + "0x" + ret.toString(16))
            },
        })
    }else{
        console.log("fail find export fun");
    }
}

exp:

from ctypes import *
from struct import pack, unpack

def encrypt(v, key):
    v0, v1 = c_uint32(v[0]), c_uint32(v[1])
    delta = 0x54646454
    total = c_uint32(0)
    for i in range(64):
        v0.value += (((v1.value << 3) ^ (v1.value >> 6)) + v1.value) ^ (total.value + key[total.value & 3])
        # print(hex(v1.value << 3),hex(v1.value >> 6),hex((((v1.value << 3) ^ (v1.value >> 6)) + v1.value) ^ (total.value + key[total.value & 3])))
        total.value += delta
        v1.value += (((v0.value << 3) ^ (v0.value >> 6)) + v0.value) ^ (total.value + key[(total.value >> 11) & 3])
        # print(hex(v0.value << 3),hex(v0.value >> 6),hex((((v0.value << 3) ^ (v0.value >> 6)) + v0.value) ^ (total.value + key[total.value & 3])))
    print(hex(v0.value), hex(v1.value))
    return v0.value, v1.value

def decrypt(v, key):
    v0, v1 = c_uint32(v[0]), c_uint32(v[1])
    delta = 0x54646454
    total = c_uint32(delta * 64)
    for i in range(64):
        v1.value -= (((v0.value << 3) ^ (v0.value >> 6)) + v0.value) ^ (total.value + key[(total.value >> 11) & 3])
        total.value -= delta
        v0.value -= (((v1.value << 3) ^ (v1.value >> 6)) + v1.value) ^ (total.value + key[total.value & 3])
    return v0.value, v1.value

def xtea(inp: bytes, key: bytes):
    from struct import pack, unpack
    k = unpack("<4I", key)
    inp_len = len(inp) // 4
    value = unpack(f"<{inp_len}I", inp)
    res = b""
    for i in range(0, inp_len, 2):
        v = [value[i], value[i+1]]
        # x = encrypt(v,k)
        x = decrypt(v, k)
        res += pack("<2I", *x)
    return res

l2b = lambda lst: b''.join(i.to_bytes(4, 'little') for i in lst)

cip_li = [0x481F56C9, 0xc7EE5EF4, 0xa00A5B72, 0x7648F086, 0x307F948, 0x29B379B0]
cip = l2b(cip_li)
key = b"f\x00\x00\x00l\x00\x00\x00a\x00\x00\x00g\x00\x00\x00"
ret = xtea(cip, key)

flag = b""
for i in range(0, len(ret), 4):
    x = (ret[i:i+4])[::-1]
    flag += x
print(flag)
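To convince yourself that the decryption above really inverts the encryption, a round-trip check is useful. The sketch below restates the same modified XTEA (delta 0x54646454, 64 rounds, shifts of 3 and 6) with explicit 32-bit masking instead of ctypes. The function names are mine, and it is a restatement for checking, not the challenge's own code.

```python
MASK32 = 0xFFFFFFFF
DELTA = 0x54646454
ROUNDS = 64


def mix(v):
    """Modified round function: shifts of 3 and 6 instead of XTEA's 4 and 5."""
    return (((v << 3) ^ (v >> 6)) + v) & MASK32


def encrypt_block(v0, v1, key):
    total = 0
    for _ in range(ROUNDS):
        v0 = (v0 + (mix(v1) ^ (total + key[total & 3]))) & MASK32
        total = (total + DELTA) & MASK32
        v1 = (v1 + (mix(v0) ^ (total + key[(total >> 11) & 3]))) & MASK32
    return v0, v1


def decrypt_block(v0, v1, key):
    total = (DELTA * ROUNDS) & MASK32
    for _ in range(ROUNDS):
        # Undo the round steps in reverse order
        v1 = (v1 - (mix(v0) ^ (total + key[(total >> 11) & 3]))) & MASK32
        total = (total - DELTA) & MASK32
        v0 = (v0 - (mix(v1) ^ (total + key[total & 3]))) & MASK32
    return v0, v1


if __name__ == "__main__":
    key = (0x66, 0x6C, 0x61, 0x67)   # "flag" as four little-endian words
    block = (0x11223344, 0x55667788)
    print(decrypt_block(*encrypt_block(*block, key), key) == block)
```

Masking after every step keeps all intermediate values within 32 bits, which is what the `c_uint32` wrappers do implicitly in the exp.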

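How a pyd handles Long (integer) values

Reference: https://tenthousandmeters.com/blog/python-behind-the-scenes-8-how-python-integers-work/

[Image: image-20241218103131993]

In fact, for 64-bit builds a PyLong digit is a 4-byte type, but each digit only stores 30 bits of data. This is why the hook scripts above weight digit i by 2 ** (30*i).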
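The 30-bit digit representation can be reproduced in a few lines. This is a sketch assuming the usual 64-bit configuration; `sys.int_info` reports the actual values for the running interpreter, and the helper names are mine.

```python
import sys


def to_digits(n, bits=30):
    """Split a non-negative integer into little-endian base-2**bits digits."""
    mask = (1 << bits) - 1
    digits = []
    while n:
        digits.append(n & mask)
        n >>= bits
    return digits


def from_digits(digits, bits=30):
    """Recombine little-endian digits: sum(d[i] * 2**(bits*i))."""
    return sum(d << (bits * i) for i, d in enumerate(digits))


if __name__ == "__main__":
    print(sys.int_info.bits_per_digit, sys.int_info.sizeof_digit)
    print([hex(d) for d in to_digits(0x9E3779B9)])
```

A 32-bit constant such as 0x9E3779B9 therefore already needs two digits, which is worth keeping in mind when reading ob_digit in a debugger.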

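PyLongObject structure

The relevant definitions can be found in Include\pytypedefs.h and Include\longintrepr.h (in CPython 3.11 and later, longintrepr.h lives under Include\cpython\).

The comments above the definitions explain each field; reading them is enough.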



Author: _TLSN
Link: https://www.cnblogs.com/lordtianqiyi/p/18614184.html
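Copyright notice: unless otherwise stated, all articles on this blog are licensed under the BY-NC-SA license. Please credit the source when reposting.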