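Python crawler: using JS reverse engineering to uncover how Douyu encrypts its video request parameters
1. Find the request URL of the video's m3u8 file
The KPL tournament was on a while ago, and after each match Douyu Video posts the replay for everyone to watch. Suppose I want to download one of those videos. How can that be done? The walkthrough follows.
Press F12 to open the developer tools, select XHR under the Network tab, and press F5 to refresh (some intermediate steps are skipped here). The m3u8 file of the video being played turns out to come from a request named getStreamUrl.
getStreamUrl is a POST request to https://v.douyu.com/api/stream/getStreamUrl (the URL used in the complete code in section 3); its request parameters are analysed in the table below.
Looking at the request parameters, my first thought was that sign must be encrypted. As for the others, my analysis showed that some are fixed values and the rest can be obtained by fetching the video page with the requests module and parsing the response.
The conclusions are as follows: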
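Parameter | Meaning |
---|---|
sign: afff8a5abe2ceccf2ecc327485e84ef5 | Encrypted. It changes on every request to this URL, so it presumably incorporates a timestamp, a random number, or similar information |
v: 220320220115 | Fixed value; probably short for version |
did: 10000000000000000000000000001501 | Fixed value |
tt: 1642225268 | Timestamp in seconds. In JS: parseInt(new Date().getTime()/1000); in Python (after importing the time module): int(time.time()) |
vid: 8pa9v5Zqn5D7VrqA | The video ID (probably short for video_id). It can be extracted from the video URL https://v.douyu.com/show/8pa9v5Zqn5D7VrqA with re (the regular expression module) |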
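The parameters other than sign can be assembled in a few lines. The following is a minimal sketch under the assumptions in the table (v and did treated as fixed values); the helper names extract_vid and build_unsigned_params are hypothetical and not part of any Douyu API.

```python
import re
import time

# Fixed values observed in the captured request (taken from the table above).
FIXED_PARAMS = {
    "v": "220320220115",
    "did": "10000000000000000000000000001501",
}


def extract_vid(video_url):
    """Return the video ID that follows /show/ in a Douyu video URL."""
    match = re.search(r"https://v\.douyu\.com/show/(\w+)", video_url)
    if match is None:
        raise ValueError("not a Douyu video URL: " + video_url)
    return match.group(1)


def build_unsigned_params(video_url):
    """Hypothetical helper: collect every parameter except sign."""
    params = dict(FIXED_PARAMS)
    params["tt"] = int(time.time())  # seconds since the epoch, like the JS version
    params["vid"] = extract_vid(video_url)
    return params


params = build_unsigned_params("https://v.douyu.com/show/8pa9v5Zqn5D7VrqA")
print(params["vid"])
```

Only sign is still missing from this dictionary; producing it is the subject of the next section.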
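2. Understand how the sign parameter is encrypted (JS reverse engineering)
Click the Initiator tab of the request.
The Initiator lists a long chain of JS file links. Open the sixth link in the list (I clicked through the first five one by one and found nothing useful until I reached the sixth).
I set breakpoints on every line of that code section and watched the variables.
Following window[l] leads to the encryption function, ub98484234, which is reproduced in full in the Python snippet further down.
Seeing such a long encryption function, I thought Douyu really went to great lengths here! A careful read, however, found no occurrence of the string sign anywhere in it, so I turned my attention to the function's last statement: return eval(strc)(c030b390, c030b391, c030b392);
It simply uses JavaScript's eval to execute a string of JS code. The string strc must contain sign, so I ran the encryption function from Python to see whether it returns the same result I had just seen in the debugger.
The code is as follows: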
`
import execjs
js_str = '''var vdwdae325w_64we = "220320220115";
var c030b39 = [0x6fbfb911, 0x633d16f, 0x7bbfb2a1, 0xad6384c5, 0x30c47c2c, 0xac8f216e, 0xff2384b4, 0xffb13f26, 0xaeef3c6, 0x8333f31d, 0x756526a, 0x293a366f, 0xdfb82e76, 0x1a091d2e, 0x568ae74e, 0xc88fa782, 0x7da6168c, 0x6eca0125, 0x28c9be39, 0xf3fc9111, 0xb864440e, 0x1e011e7f, 0xf5102339, 0x247bac9, 0xff904227, 0x6f466d95, 0xd14458d4, 0xd15e30ea, 0x2769d289, 0x992151f7, 0xba68c2e0, 0x487e3290, 0x4cc0b13a, 0x41efc04, 0x4e2bc8be, 0x96cd5994, 0xa15e7be8, 0xe79ff2fb, 0x71aa9ae1, 0xe182ca8e, 0xcdde2608, 0xfd0ded5e, 0xb93434c9, 0x4514a45a, 0x238f5ab3, 0x9c0fc6e9, 0x1444d134, 0x93c382d, 0x9d95e0d8, 0x7082526a, 0xfa121, 0xecde8477, 0x9d66c44e, 0x645f8766, 0xd2fd2e5f, 0xfcb7c021, 0xc808e41c, 0x216c964, 0xfe957ec2, 0x631e4791, 0x4b95a934, 0x3a4231eb, 0x386ec5c1, 0x3b83f005, 0x51c97851, 0x86513b4c, 0xffbbb5a9, 0xe733881a, 0x2925339f, 0xffaf6650, 0x301a6adf, 0x1fc2209a, 0x5d964d95, 0x867e6534, 0x884b5bf3, 0x83fd9705, 0xde93ec0a, 0xec525586, 0x58da1126, 0x3d30894a, 0xcdfa4b70, 0xac4923bf, 0x8860665b, 0x9138c21a, 0x42bf9b11, 0xc2b7ee6a, 0x693d6c2d, 0x7ea9d38c, 0x84d8b976, 0xb30f9fc, 0x9979c3fc, 0x76271eb, 0xb750bf61, 0xd7381557, 0x2d317e6b, 0x5b528fb8, 0xc24a3ecf, 0x714a9e11, 0x36c17e9f, 0xcf1123a5, 0x15be9207, 0xef9a66d5, 0x73879337, 0x8c02b9c, 0x6d30cf, 0xadd8689c, 0x7305ab7d, 0xe14d8a47, 0x48722c86, 0x891330b7, 0xc57f418e, 0xe5ff642c, 0x98adee85, 0x3735930f, 0x217cce7, 0x35dbc148, 0xa2ae2d0f, 0x607c20e1, 0x31a3484, 0x2611dbd6, 0xbd7723b0, 0x1ea5ba26, 0x9b584717, 0x25a5261b, 0x9f681e70, 0x307ddb95, 0x956c1a40, 0x17e751a5, 0xa09ee85c, 0x69122c83, 0xecd75754, 0x38e40854, 0x1af42614, 0xee471ac1, 0x51c17e9f, 0x3964ae00, 0x4e15f7d4, 0x5b0f119f, 0xd6520721, 0xcc4c2ac, 0xaf3aed2f, 0xd596fae1, 0x8f4fa4bf, 0xead00739, 0x5f6a4979, 0xb81f95cf, 0xabafafa2, 0x112ff59, 0x26342f2c, 0x5e1d3eb6, 0x1fdd15c7, 0x1129a028, 0xbf9aeebf, 0x18ff3ba3, 0x88b477aa, 0x25a74729, 0xc29c3c94, 0x33f22726, 0xf17b1fd2, 0x6e71d6b, 0xdbbacb25, 0x481cb29d, 0x936f3f3d, 0xa2d18ead, 0xbde185d, 0x6408d8fc, 0xb8034882, 
0x9f7e499c, 0x73af78c6, 0xbd445423, 0xf1498eeb, 0x74e6de1f, 0xb71d2de, 0x19a649dc, 0xbd101153, 0x9c310b4c, 0xef14a401, 0xdc5dbcb0, 0xbf638641, 0x7beaef7e, 0x2f13922b, 0x5504ac9d, 0x40558c96, 0xf1d64153, 0x8ca05a73, 0xf0453c2e, 0xf0bef25, 0x9a557c5d, 0xe475cf37, 0xb27d08a8, 0x97e62a1e, 0x42b8d679, 0x65363687, 0x8f47c7db, 0x640d169f, 0x9a5b6e38, 0x6abe0558, 0xe327a5c4, 0xbdd2eeaf, 0x963489e9, 0xbc0aeb70, 0x9fb4a09, 0x61384733, 0xbab80ce7, 0x29ad2369, 0xbd64f138, 0x345db802, 0x971d40a4, 0x5aaac376, 0x6ae39936, 0xe9b11bc7, 0x3adce8ef, 0xa6377014, 0x82efbadc, 0x35810196, 0xf1cc9fc, 0x16d0e2fb, 0x125f038e, 0x68560794, 0x780a2f8, 0x8dfb433d, 0xce4a8c0, 0x2dd1c455, 0x488e779c, 0xfa0503af, 0xcf65b956, 0x1b8f4097, 0x7e59c553, 0x6e32534a, 0xb508a53, 0xad3fd69a, 0xb4d57093, 0x1b4030ac, 0x771f07c8, 0xfd0f151a, 0xe2d23460, 0x733a5f1f, 0xb2fb1075, 0xdd6eb088, 0x6c7f4f0f, 0x2734304a, 0xf90607ff, 0x51b1c58f, 0x754ecb10, 0x96ad1882, 0xeda1274e, 0xb0b23895, 0x31b41727, 0xcf765728, 0xd0645dcd, 0x8d1224f6, 0x5cccf3d, 0x378bf872, 0x98eb5172, 0x3c4ff55e, 0x6107e27a, 0x8d3076e, 0x1d323224, 0x809a36f0, 0x728711a7, 0xd9412a42, 0xf5ee597e, 0x960ea359, 0x7341987a, 0xfe6e7d95, 0x49331f46, 0x5036870a, 0x462d0b31, 0xa75dee4e, 0xe8bac2dc, 0xe6110e96, 0x9b8732de, 0x78d5bad5, 0xa578ce0f, 0x892a2fba, 0xa08e887a, 0x8cf4dc98, 0x6c00e332, 0x4c7d6937, 0x509d1dc4, 0x4e59ca6a, 0xe63c33bb, 0xed112d0f, 0xa14df6ec, 0x7c33840d, 0x11786f09, 0x8c51a724, 0x371fcaa4, 0x1a9d695c, 0x1cc0ed4e, 0xf7d610a9, 0x2a1b559a, 0xf8deb155, 0xa0c605bc, 0xa61c09e9, 0xb807daf, 0x3cca9d69, 0xd88f971d, 0xefaaaa38, 0x9c597b8d, 0x55ccc315, 0x60f99381, 0x66070ed4, 0xb9167fde, 0xf3ca85a2, 0x53ed971f, 0x4d8bd81e, 0x2f347103, 0x540b3f38, 0x7d59d5f, 0x46fe9993, 0x456a9834, 0x8b4f91c6, 0x850567fc, 0xb49b3731, 0xde6f6902, 0xa701ffc2, 0xc475ffb2, 0xa2018f39, 0x4cda6409, 0xfc87fc1a, 0xc5f70250, 0xc9da4751, 0x437a9130, 0x1cfbda7b, 0xc8e57142, 0x7119c24a, 0x1acfe20b, 0xb5ba754d, 0x252197e4, 0xda1082af, 0xb30f4ff5, 0xbe34d699, 0x7a895d75, 
0xaf8d69ea, 0x65bb395e, 0xf06638fb, 0x8c9b7895, 0x67356f4e, 0xb195c907, 0x990c1bb0, 0x9dd84e05, 0xc4fa1ccd, 0xde7e4556, 0xa35e7cb3, 0xc9e64b5, 0x37cd330b, 0x89dfe70, 0x40f7d7c2, 0xb3fa8160, 0xecadff03, 0x7779e974, 0x58cc1cc1, 0x2976b655, 0x4457c4e6, 0x134be486, 0x494850cb, 0x66d34fa6, 0x2317efcd, 0xe03a32c, 0x33699bca, 0xbc55a8f8, 0xb2094c93, 0xbc7e0d2d, 0x4c0aa51, 0xc3ba14d8, 0xd3127386, 0x4eb479a0];
function ub98484234(c030b390, c030b391, c030b392) {
var rk = [41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15, 33, 14, 34, 19, 10, 41, 29, 40, 15];
var k2 = [0xf38a929, 0x643b2580];
var lk = [0xf38a929, 0x643b2580];
var v = c030b39.slice(0);
var k = [0xbd7375f8, 0xdfcfb71b, 0x3f7fed8e, 0xf7bb9631];
for (var O = 0; O < 368; O++) {
v[O] ^= 0x2a976da5;
}
v[1] -= lk[1];
v[0] -= lk[0];
v[3] += lk[1];
v[2] -= lk[0];
v[5] += lk[1];
v[4] -= lk[0];
v[7] += lk[1];
v[6] ^= lk[0];
v[9] = (v[9] >>> (lk[1] % 16)) | (v[9] << (32 - (lk[1] % 16)));
v[8] = (v[8] >>> (lk[0] % 16)) | (v[8] << (32 - (lk[0] % 16)));
v[11] += lk[1];
v[10] = (v[10] << (lk[0] % 16)) | (v[10] >>> (32 - (lk[0] % 16)));
v[13] ^= lk[1];
v[12] ^= lk[0];
v[15] ^= lk[1];
v[14] = (v[14] >>> (lk[0] % 16)) | (v[14] << (32 - (lk[0] % 16)));
v[17] -= lk[1];
v[16] -= lk[0];
v[19] -= lk[1];
v[18] -= lk[0];
v[21] += lk[1];
v[20] -= lk[0];
v[23] += lk[1];
v[22] -= lk[0];
v[25] += lk[1];
v[24] ^= lk[0];
v[27] = (v[27] >>> (lk[1] % 16)) | (v[27] << (32 - (lk[1] % 16)));
v[26] = (v[26] >>> (lk[0] % 16)) | (v[26] << (32 - (lk[0] % 16)));
v[29] += lk[1];
v[28] = (v[28] << (lk[0] % 16)) | (v[28] >>> (32 - (lk[0] % 16)));
v[31] ^= lk[1];
v[30] ^= lk[0];
v[33] ^= lk[1];
v[32] = (v[32] >>> (lk[0] % 16)) | (v[32] << (32 - (lk[0] % 16)));
v[35] -= lk[1];
v[34] -= lk[0];
v[37] -= lk[1];
v[36] -= lk[0];
v[39] += lk[1];
v[38] -= lk[0];
v[41] += lk[1];
v[40] -= lk[0];
v[43] += lk[1];
v[42] ^= lk[0];
v[45] = (v[45] >>> (lk[1] % 16)) | (v[45] << (32 - (lk[1] % 16)));
v[44] = (v[44] >>> (lk[0] % 16)) | (v[44] << (32 - (lk[0] % 16)));
v[47] += lk[1];
v[46] = (v[46] << (lk[0] % 16)) | (v[46] >>> (32 - (lk[0] % 16)));
v[49] ^= lk[1];
v[48] ^= lk[0];
v[51] ^= lk[1];
v[50] = (v[50] >>> (lk[0] % 16)) | (v[50] << (32 - (lk[0] % 16)));
v[53] -= lk[1];
v[52] -= lk[0];
v[55] -= lk[1];
v[54] -= lk[0];
v[57] += lk[1];
v[56] -= lk[0];
v[59] += lk[1];
v[58] -= lk[0];
v[61] += lk[1];
v[60] ^= lk[0];
v[63] = (v[63] >>> (lk[1] % 16)) | (v[63] << (32 - (lk[1] % 16)));
v[62] = (v[62] >>> (lk[0] % 16)) | (v[62] << (32 - (lk[0] % 16)));
v[65] += lk[1];
v[64] = (v[64] << (lk[0] % 16)) | (v[64] >>> (32 - (lk[0] % 16)));
v[67] ^= lk[1];
v[66] ^= lk[0];
v[69] ^= lk[1];
v[68] = (v[68] >>> (lk[0] % 16)) | (v[68] << (32 - (lk[0] % 16)));
v[71] -= lk[1];
v[70] -= lk[0];
v[73] -= lk[1];
v[72] -= lk[0];
v[75] += lk[1];
v[74] -= lk[0];
v[77] += lk[1];
v[76] -= lk[0];
v[79] += lk[1];
v[78] ^= lk[0];
v[81] = (v[81] >>> (lk[1] % 16)) | (v[81] << (32 - (lk[1] % 16)));
v[80] = (v[80] >>> (lk[0] % 16)) | (v[80] << (32 - (lk[0] % 16)));
v[83] += lk[1];
v[82] = (v[82] << (lk[0] % 16)) | (v[82] >>> (32 - (lk[0] % 16)));
v[85] ^= lk[1];
v[84] ^= lk[0];
v[87] ^= lk[1];
v[86] = (v[86] >>> (lk[0] % 16)) | (v[86] << (32 - (lk[0] % 16)));
v[89] -= lk[1];
v[88] -= lk[0];
v[91] -= lk[1];
v[90] -= lk[0];
v[93] += lk[1];
v[92] -= lk[0];
v[95] += lk[1];
v[94] -= lk[0];
v[97] += lk[1];
v[96] ^= lk[0];
v[99] = (v[99] >>> (lk[1] % 16)) | (v[99] << (32 - (lk[1] % 16)));
v[98] = (v[98] >>> (lk[0] % 16)) | (v[98] << (32 - (lk[0] % 16)));
v[101] += lk[1];
v[100] = (v[100] << (lk[0] % 16)) | (v[100] >>> (32 - (lk[0] % 16)));
v[103] ^= lk[1];
v[102] ^= lk[0];
v[105] ^= lk[1];
v[104] = (v[104] >>> (lk[0] % 16)) | (v[104] << (32 - (lk[0] % 16)));
v[107] -= lk[1];
v[106] -= lk[0];
v[109] -= lk[1];
v[108] -= lk[0];
v[111] += lk[1];
v[110] -= lk[0];
v[113] += lk[1];
v[112] -= lk[0];
v[115] += lk[1];
v[114] ^= lk[0];
v[117] = (v[117] >>> (lk[1] % 16)) | (v[117] << (32 - (lk[1] % 16)));
v[116] = (v[116] >>> (lk[0] % 16)) | (v[116] << (32 - (lk[0] % 16)));
v[119] += lk[1];
v[118] = (v[118] << (lk[0] % 16)) | (v[118] >>> (32 - (lk[0] % 16)));
v[121] ^= lk[1];
v[120] ^= lk[0];
v[123] ^= lk[1];
v[122] = (v[122] >>> (lk[0] % 16)) | (v[122] << (32 - (lk[0] % 16)));
v[125] -= lk[1];
v[124] -= lk[0];
v[127] -= lk[1];
v[126] -= lk[0];
v[129] += lk[1];
v[128] -= lk[0];
v[131] += lk[1];
v[130] -= lk[0];
v[133] += lk[1];
v[132] ^= lk[0];
v[135] = (v[135] >>> (lk[1] % 16)) | (v[135] << (32 - (lk[1] % 16)));
v[134] = (v[134] >>> (lk[0] % 16)) | (v[134] << (32 - (lk[0] % 16)));
v[137] += lk[1];
v[136] = (v[136] << (lk[0] % 16)) | (v[136] >>> (32 - (lk[0] % 16)));
v[139] ^= lk[1];
v[138] ^= lk[0];
v[141] ^= lk[1];
v[140] = (v[140] >>> (lk[0] % 16)) | (v[140] << (32 - (lk[0] % 16)));
v[143] -= lk[1];
v[142] -= lk[0];
v[145] -= lk[1];
v[144] -= lk[0];
v[147] += lk[1];
v[146] -= lk[0];
v[149] += lk[1];
v[148] -= lk[0];
v[151] += lk[1];
v[150] ^= lk[0];
v[153] = (v[153] >>> (lk[1] % 16)) | (v[153] << (32 - (lk[1] % 16)));
v[152] = (v[152] >>> (lk[0] % 16)) | (v[152] << (32 - (lk[0] % 16)));
v[155] += lk[1];
v[154] = (v[154] << (lk[0] % 16)) | (v[154] >>> (32 - (lk[0] % 16)));
v[157] ^= lk[1];
v[156] ^= lk[0];
v[159] ^= lk[1];
v[158] = (v[158] >>> (lk[0] % 16)) | (v[158] << (32 - (lk[0] % 16)));
v[161] -= lk[1];
v[160] -= lk[0];
v[163] -= lk[1];
v[162] -= lk[0];
v[165] += lk[1];
v[164] -= lk[0];
v[167] += lk[1];
v[166] -= lk[0];
v[169] += lk[1];
v[168] ^= lk[0];
v[171] = (v[171] >>> (lk[1] % 16)) | (v[171] << (32 - (lk[1] % 16)));
v[170] = (v[170] >>> (lk[0] % 16)) | (v[170] << (32 - (lk[0] % 16)));
v[173] += lk[1];
v[172] = (v[172] << (lk[0] % 16)) | (v[172] >>> (32 - (lk[0] % 16)));
v[175] ^= lk[1];
v[174] ^= lk[0];
v[177] ^= lk[1];
v[176] = (v[176] >>> (lk[0] % 16)) | (v[176] << (32 - (lk[0] % 16)));
v[179] -= lk[1];
v[178] -= lk[0];
v[181] -= lk[1];
v[180] -= lk[0];
v[183] += lk[1];
v[182] -= lk[0];
v[185] += lk[1];
v[184] -= lk[0];
v[187] += lk[1];
v[186] ^= lk[0];
v[189] = (v[189] >>> (lk[1] % 16)) | (v[189] << (32 - (lk[1] % 16)));
v[188] = (v[188] >>> (lk[0] % 16)) | (v[188] << (32 - (lk[0] % 16)));
v[191] += lk[1];
v[190] = (v[190] << (lk[0] % 16)) | (v[190] >>> (32 - (lk[0] % 16)));
v[193] ^= lk[1];
v[192] ^= lk[0];
v[195] ^= lk[1];
v[194] = (v[194] >>> (lk[0] % 16)) | (v[194] << (32 - (lk[0] % 16)));
v[197] -= lk[1];
v[196] -= lk[0];
v[199] -= lk[1];
v[198] -= lk[0];
v[201] += lk[1];
v[200] -= lk[0];
v[203] += lk[1];
v[202] -= lk[0];
v[205] += lk[1];
v[204] ^= lk[0];
v[207] = (v[207] >>> (lk[1] % 16)) | (v[207] << (32 - (lk[1] % 16)));
v[206] = (v[206] >>> (lk[0] % 16)) | (v[206] << (32 - (lk[0] % 16)));
v[209] += lk[1];
v[208] = (v[208] << (lk[0] % 16)) | (v[208] >>> (32 - (lk[0] % 16)));
v[211] ^= lk[1];
v[210] ^= lk[0];
v[213] ^= lk[1];
v[212] = (v[212] >>> (lk[0] % 16)) | (v[212] << (32 - (lk[0] % 16)));
v[215] -= lk[1];
v[214] -= lk[0];
v[217] -= lk[1];
v[216] -= lk[0];
v[219] += lk[1];
v[218] -= lk[0];
v[221] += lk[1];
v[220] -= lk[0];
v[223] += lk[1];
v[222] ^= lk[0];
v[225] = (v[225] >>> (lk[1] % 16)) | (v[225] << (32 - (lk[1] % 16)));
v[224] = (v[224] >>> (lk[0] % 16)) | (v[224] << (32 - (lk[0] % 16)));
v[227] += lk[1];
v[226] = (v[226] << (lk[0] % 16)) | (v[226] >>> (32 - (lk[0] % 16)));
v[229] ^= lk[1];
v[228] ^= lk[0];
v[231] ^= lk[1];
v[230] = (v[230] >>> (lk[0] % 16)) | (v[230] << (32 - (lk[0] % 16)));
v[233] -= lk[1];
v[232] -= lk[0];
v[235] -= lk[1];
v[234] -= lk[0];
v[237] += lk[1];
v[236] -= lk[0];
v[239] += lk[1];
v[238] -= lk[0];
v[241] += lk[1];
v[240] ^= lk[0];
v[243] = (v[243] >>> (lk[1] % 16)) | (v[243] << (32 - (lk[1] % 16)));
v[242] = (v[242] >>> (lk[0] % 16)) | (v[242] << (32 - (lk[0] % 16)));
v[245] += lk[1];
v[244] = (v[244] << (lk[0] % 16)) | (v[244] >>> (32 - (lk[0] % 16)));
v[247] ^= lk[1];
v[246] ^= lk[0];
v[249] ^= lk[1];
v[248] = (v[248] >>> (lk[0] % 16)) | (v[248] << (32 - (lk[0] % 16)));
v[251] -= lk[1];
v[250] -= lk[0];
v[253] -= lk[1];
v[252] -= lk[0];
v[255] += lk[1];
v[254] -= lk[0];
v[257] += lk[1];
v[256] -= lk[0];
v[259] += lk[1];
v[258] ^= lk[0];
v[261] = (v[261] >>> (lk[1] % 16)) | (v[261] << (32 - (lk[1] % 16)));
v[260] = (v[260] >>> (lk[0] % 16)) | (v[260] << (32 - (lk[0] % 16)));
v[263] += lk[1];
v[262] = (v[262] << (lk[0] % 16)) | (v[262] >>> (32 - (lk[0] % 16)));
v[265] ^= lk[1];
v[264] ^= lk[0];
v[267] ^= lk[1];
v[266] = (v[266] >>> (lk[0] % 16)) | (v[266] << (32 - (lk[0] % 16)));
v[269] -= lk[1];
v[268] -= lk[0];
v[271] -= lk[1];
v[270] -= lk[0];
v[273] += lk[1];
v[272] -= lk[0];
v[275] += lk[1];
v[274] -= lk[0];
v[277] += lk[1];
v[276] ^= lk[0];
v[279] = (v[279] >>> (lk[1] % 16)) | (v[279] << (32 - (lk[1] % 16)));
v[278] = (v[278] >>> (lk[0] % 16)) | (v[278] << (32 - (lk[0] % 16)));
v[281] += lk[1];
v[280] = (v[280] << (lk[0] % 16)) | (v[280] >>> (32 - (lk[0] % 16)));
v[283] ^= lk[1];
v[282] ^= lk[0];
v[285] ^= lk[1];
v[284] = (v[284] >>> (lk[0] % 16)) | (v[284] << (32 - (lk[0] % 16)));
v[287] -= lk[1];
v[286] -= lk[0];
v[289] -= lk[1];
v[288] -= lk[0];
v[291] += lk[1];
v[290] -= lk[0];
v[293] += lk[1];
v[292] -= lk[0];
v[295] += lk[1];
v[294] ^= lk[0];
v[297] = (v[297] >>> (lk[1] % 16)) | (v[297] << (32 - (lk[1] % 16)));
v[296] = (v[296] >>> (lk[0] % 16)) | (v[296] << (32 - (lk[0] % 16)));
v[299] += lk[1];
v[298] = (v[298] << (lk[0] % 16)) | (v[298] >>> (32 - (lk[0] % 16)));
v[301] ^= lk[1];
v[300] ^= lk[0];
v[303] ^= lk[1];
v[302] = (v[302] >>> (lk[0] % 16)) | (v[302] << (32 - (lk[0] % 16)));
v[305] -= lk[1];
v[304] -= lk[0];
v[307] -= lk[1];
v[306] -= lk[0];
v[309] += lk[1];
v[308] -= lk[0];
v[311] += lk[1];
v[310] -= lk[0];
v[313] += lk[1];
v[312] ^= lk[0];
v[315] = (v[315] >>> (lk[1] % 16)) | (v[315] << (32 - (lk[1] % 16)));
v[314] = (v[314] >>> (lk[0] % 16)) | (v[314] << (32 - (lk[0] % 16)));
v[317] += lk[1];
v[316] = (v[316] << (lk[0] % 16)) | (v[316] >>> (32 - (lk[0] % 16)));
v[319] ^= lk[1];
v[318] ^= lk[0];
v[321] ^= lk[1];
v[320] = (v[320] >>> (lk[0] % 16)) | (v[320] << (32 - (lk[0] % 16)));
v[323] -= lk[1];
v[322] -= lk[0];
v[325] -= lk[1];
v[324] -= lk[0];
v[327] += lk[1];
v[326] -= lk[0];
v[329] += lk[1];
v[328] -= lk[0];
v[331] += lk[1];
v[330] ^= lk[0];
v[333] = (v[333] >>> (lk[1] % 16)) | (v[333] << (32 - (lk[1] % 16)));
v[332] = (v[332] >>> (lk[0] % 16)) | (v[332] << (32 - (lk[0] % 16)));
v[335] += lk[1];
v[334] = (v[334] << (lk[0] % 16)) | (v[334] >>> (32 - (lk[0] % 16)));
v[337] ^= lk[1];
v[336] ^= lk[0];
v[339] ^= lk[1];
v[338] = (v[338] >>> (lk[0] % 16)) | (v[338] << (32 - (lk[0] % 16)));
v[341] -= lk[1];
v[340] -= lk[0];
v[343] -= lk[1];
v[342] -= lk[0];
v[345] += lk[1];
v[344] -= lk[0];
v[347] += lk[1];
v[346] -= lk[0];
v[349] += lk[1];
v[348] ^= lk[0];
v[351] = (v[351] >>> (lk[1] % 16)) | (v[351] << (32 - (lk[1] % 16)));
v[350] = (v[350] >>> (lk[0] % 16)) | (v[350] << (32 - (lk[0] % 16)));
v[353] += lk[1];
v[352] = (v[352] << (lk[0] % 16)) | (v[352] >>> (32 - (lk[0] % 16)));
v[355] ^= lk[1];
v[354] ^= lk[0];
v[357] ^= lk[1];
v[356] = (v[356] >>> (lk[0] % 16)) | (v[356] << (32 - (lk[0] % 16)));
v[359] -= lk[1];
v[358] -= lk[0];
v[361] -= lk[1];
v[360] -= lk[0];
v[363] += lk[1];
v[362] -= lk[0];
v[365] += lk[1];
v[364] -= lk[0];
v[367] += lk[1];
v[366] ^= lk[0];
for (var I = 0; I < 368; I += 2) {
var i, v0 = v[I] ^ k2[0], v1 = v[I + 1] ^ k2[1], d = 0x9E3779B9, sum = d * rk[I / 2];
for (i = 0; i < rk[I / 2]; i++) {
v1 -= (((v0 << 4) ^ (v0 >>> 5)) + v0) ^ (sum + k[(sum >>> 11) & 3]);
sum -= d;
v0 -= (((v1 << 4) ^ (v1 >>> 5)) + v1) ^ (sum + k[sum & 3]);
}
v[I] = v0 ^ k2[1];
v[I + 1] = v1 ^ k2[0];
}
for (var O = 367; O > 0; O--) {
v[O] ^= v[O - 1];
}
v[0] ^= 0x2a976da5;
var strc = "";
for (var i = 0; i < v.length; i++) {
strc += String.fromCharCode(v[i] & 0xff, v[i] >>> 8 & 0xff, v[i] >>> 16 & 0xff, v[i] >>> 24 & 0xff);
}
return eval(strc)(c030b390, c030b391, c030b392);
}
'''
a = 25685173
o = '10000000000000000000000000001501'
s = 1642228425
ctx = execjs.compile(js_str)
md5_str = ctx.call('ub98484234',a,o,s)
print(md5_str)
`
Running this raises an error saying that CryptoJS is not defined.
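So I changed the JS of the encryption function so that it returns strc instead of evaluating it (the last statement becomes return strc;).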
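That change can be made with a single regular-expression substitution. The sketch below uses a toy stand-in for the real function (toy_js is an assumption made only for illustration) and shows that the parentheses have to be escaped, because they are regex metacharacters.

```python
import re

# A toy stand-in for the real function body; only the last statement matters here.
toy_js = (
    "function ub98484234(c030b390, c030b391, c030b392) {"
    " var strc = 'generated code';"
    " return eval(strc)(c030b390, c030b391, c030b392);"
    "}"
)

# Escape the literal parentheses; .*? matches the argument list non-greedily.
patched_js = re.sub(r"return eval\(strc\)\(.*?\);", "return strc;", toy_js)
print(patched_js)
```

After the substitution, calling the function yields the generated source code as a string rather than executing it, which makes it possible to inspect what eval would have run.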
The output does contain CryptoJS, which I had indeed never defined:
var cb=xx0+xx1+xx2+"220320220115";
var rb=CryptoJS.MD5(cb).toString();
So I decided to reproduce the effect of these two lines in Python and splice the result into the string returned above.
import hashlib
import execjs
import re
js_str = '''
(function (xx0,xx1,xx2){var cb=xx0+xx1+xx2+"220320220115";var rb=CryptoJS.MD5(cb).toString();var re=[];for(var i=0;i<rb.length/8;i++)re[i]=(parseInt(rb.substr(i*8,2),16)&0xff)|((parseInt(rb.substr(i*8+2,2),16)<<8)&0xff00)|((parseInt(rb.substr(i*8+4,2),16)<<24)>>>8)|(parseInt(rb.substr(i*8+6,2),16)<<24);var k2=[0x145ac5cc,0x2a3656bd,0x1920e1,0x6bf023c3];for(var I=0;I<2;I++){var v0=re[I*2],v1=re[I*2+1],sum=0,i=0;var delta=0x9e3779b9;for(i=0;i<32;i++){sum+=delta;v0+=((v1<<4)+k2[0])^(v1+sum)^((v1>>>5)+k2[1]);v1+=((v0<<4)+k2[2])^(v0+sum)^((v0>>>5)+k2[3]);}re[I*2]=v0;re[I*2+1]=v1;}re[0]+=k2[0];re[0]+=k2[2];re[0]+=k2[2];re[1]+=k2[1];re[1]-=k2[3];re[1]^=k2[3];re[2]+=k2[0];re[2]-=k2[2];re[2]-=k2[0];re[2]-=k2[2];re[3]^=k2[1];re[3]-=k2[3];re[3]-=k2[1];re[3]=(re[3]>>>(k2[3]%16))|(re[3]<<(32-(k2[3]%16)));re[0]=(re[0]<<(k2[0]%16))|(re[0]>>>(32-(k2[0]%16)));re[0]=(re[0]<<(k2[2]%16))|(re[0]>>>(32-(k2[2]%16)));re[0]-=k2[2];re[1]=(re[1]>>>(k2[1]%16))|(re[1]<<(32-(k2[1]%16)));re[1]-=k2[3];re[1]-=k2[1];re[1]=(re[1]>>>(k2[3]%16))|(re[1]<<(32-(k2[3]%16)));re[1]^=k2[3];re[2]^=k2[0];re[2]^=k2[2];re[2]^=k2[2];re[2]^=k2[2];re[3]=(re[3]<<(k2[1]%16))|(re[3]>>>(32-(k2[1]%16)));re[3]^=k2[3];re[3]=(re[3]>>>(k2[3]%16))|(re[3]<<(32-(k2[3]%16)));{var hc='0123456789abcdef'.split('');for(var i=0;i<re.length;i++){var j=0,s='';for(;j<4;j++)s+=hc[(re[i]>>(j*8+4))&15]+hc[(re[i]>>(j*8))&15];re[i]=s;}re=re.join('');}var rt="v=220320220115"+"&did="+xx1+"&tt="+xx2+"&sign="+re;return rt;});
'''
a = 25685173
o = '10000000000000000000000000001501'
s = 1642228425
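# Build the string to be hashed exactly as the JS does: xx0 + xx1 + xx2 + version
cb = str(a)+o+str(s)+'220320220115'
rb = hashlib.md5(cb.encode('utf-8')).hexdigest()
# Strip the outer parentheses that wrap the anonymous function
rindex = js_str.rfind(')')
index = js_str.find('(')
# Hard-code the MD5 digest in place of the CryptoJS call and give the function a name
js_str = js_str[index+1:rindex].replace('var rb=CryptoJS.MD5(cb).toString()',"var rb='{}'".format(rb)).replace('function (xx0,xx1,xx2){','function md5(xx0,xx1,xx2){')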
ctx = execjs.compile(js_str)
md5_str = ctx.call('md5',a,o,s)
print(md5_str)
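Result: the output is identical to the value observed at the JS breakpoint during debugging.
We have now reproduced Douyu's encryption and could in principle send the request directly. However, another problem turned up.
When I used the code above to request a different Douyu video, the server kept reporting a request parameter error. After careful analysis I found that the encryption function is not the same for every video: the function served with another video page differs in some of its details.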
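Because the function changes from page to page, it has to be pulled out of each page's HTML rather than hard-coded. The following is a minimal sketch of that extraction on a toy page; toy_html and extract_sign_script are hypothetical, and the assumed shape of the script tag follows the pattern used in the complete code in section 3.

```python
import re

# Toy HTML standing in for a real video page (the real script is far longer).
toy_html = (
    '<html><body>'
    '<script> var vdwdae325w_64we = "220320220115";'
    ' function ub98484234(a, b, c) { return "stub"; } ;</script>'
    '</body></html>'
)


def extract_sign_script(html):
    """Return the JS inside the script tag that defines the sign function."""
    match = re.search(r"<script> (var.*?\} ;)</script>", html)
    if match is None:
        raise ValueError("sign script not found; the page layout may have changed")
    return match.group(1)


script = extract_sign_script(toy_html)
print(script)
```

Raising an explicit error when nothing matches makes a layout change on the site easier to diagnose than an IndexError from indexing an empty list.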
3. Complete implementation
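import requests
from crawlers.userAgent import useragent # apparently the author's own helper module; it supplies a User-Agent string
import re
import demjson # decodes JSON data
import execjs # runs JS code
import hashlib # reproduces the MD5 hash done in JS
import time
import m3u8 # parses m3u8 files
from queue import Queue # queue
import os
import threading # multithreading, used to download the .ts files in parallel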
useragent_1 = useragent()
url_1 = input('Enter the video URL: ')
vid = re.findall(r'https://v\.douyu\.com/show/(\w+)', url_1)[0]
rsp = requests.get(url=url_1, headers={
    'user-agent': useragent_1.getUserAgent()
})
rsp.encoding = 'utf-8'
html_1 = rsp.text
# $ is a regex metacharacter and must be escaped to match window.$DATA literally
json_1 = re.findall(r'window\.\$DATA=(.*?);</script>', html_1)[0]
dict_1 = demjson.decode(json_1)
a = dict_1['ROOM']['point_id']
o = "10000000000000000000000000001501" # fixed value (did)
s = int(time.time())
# The encryption function differs per video, so take it from this page
script_1 = re.findall(r'<script> var.*} ;</script>', html_1)[0]
js_1 = script_1[9:-9] # strip '<script> ' and '</script>'
# Return the generated code instead of evaluating it
js_2 = re.sub(r'return eval\(strc\)\(.*?\);', 'return strc;', js_1)
ctx = execjs.compile(js_2)
js_3 = ctx.call('ub98484234',a,o,s)
# Pull out the statement that builds cb and wrap it in a function that returns cb
js_4 = re.findall(r'function \(xx0,xx1,xx2\)\{var cb=.*?;', js_3)[0]
js_4 += 'return cb;}'
js_4 = js_4.replace('function (','function cb2(')
ctx_2 = execjs.compile(js_4)
to_Be_EncryptedStr = ctx_2.call('cb2',a,o,s) # the string to be hashed
md5Str = hashlib.md5(to_Be_EncryptedStr.encode('utf-8')).hexdigest() # MD5 hash
rindex = js_3.rfind(')')
index = js_3.find('(')
js_31 = js_3[index+1:rindex].replace('var rb=CryptoJS.MD5(cb).toString()',"var rb='{}'".format(md5Str)).replace('function (xx0,xx1,xx2){','function md52(xx0,xx1,xx2){')
ctx_3 = execjs.compile(js_31)
str_1 = ctx_3.call('md52',a,o,s)
data = {}
list_1 = str_1.split('&')
for str2 in list_1:
    index1 = str2.find('=')
    data[str2[:index1]] = str2[index1+1:]
data['vid'] = vid
rsp2 = requests.post(url='https://v.douyu.com/api/stream/getStreamUrl',
                     headers={
                         'user-agent':useragent_1.getUserAgent()
                     },data=data)
info_1 = demjson.decode(rsp2.text)
rsp_data = info_1['data']['thumb_video']
print('m3u8 download links:')
for key in rsp_data:
    try:
        print(key,rsp_data[key]['url'])
    except (KeyError, TypeError):
        pass
m3u8_uri = rsp_data['high']['url']
pre_uri = re.findall(r'.*\.m3u8',m3u8_uri)[0]
rindex = m3u8_uri.rfind('/')
pre_uri = pre_uri[:rindex] # prefix of the .ts links
m3u8_obj = m3u8.load(uri=m3u8_uri)
q = Queue(len(m3u8_obj.files))
# https://play-tx-recpub.douyucdn2.cn/wsd-tx-rec-pub/record/HLS/live-1863767rkpl_2010
for seg in m3u8_obj.segments:
    ts_uri = pre_uri+'/'+seg.uri
    q.put(ts_uri)
print('Download stage ------')
dir_name = input('Name of the folder to create: ')
try:
    os.mkdir('./{}'.format(dir_name))
except FileExistsError:
    pass
from queue import Empty # raised by get_nowait() when the queue is empty

def run(ts_queue:Queue,dir_name:str):
    tt_name = threading.current_thread().name
    # name of the current thread
    while True:
        try:
            # get_nowait() avoids blocking forever when another thread takes the last item
            uri = ts_queue.get_nowait()
        except Empty:
            break
        rsp = requests.get(url=uri,headers={
            'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36'
        },stream=True)
        filename = re.findall(r'[a-zA-Z0-9-]+\.ts',uri)[0].strip()
        with open(file='./{}/{}'.format(dir_name,filename),mode='wb') as f:
            f.write(rsp.content)
        print(tt_name+' '+filename+' downloaded successfully!')
threads =[] # list of threads
for i in range(15):
    thread = threading.Thread(target=run,args=(q,dir_name))
    thread.name = 'thread-%d'%(i)
    threads.append(thread)
for t in threads:
    t.start()
for t in threads:
    t.join()
print('Download complete!')
# keep only the .ts files so that a leftover file_list.txt does not break the numeric sort
list_2 = [name for name in os.listdir('./{}'.format(dir_name)) if name.endswith('.ts')]
list_2.sort(key=lambda x:int(x[:x.rfind('.ts')]))
file_str = ''
for ts_name in list_2:
    file_str = file_str + "file '{}'\n".format(ts_name)
with open(file='./{}/file_list.txt'.format(dir_name),mode='w',encoding='utf-8') as f:
    f.write(file_str)
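The loop in the listing above that splits str_1 on & and = could also be handled by the standard library. This is a minimal sketch, assuming the returned string has the form v=...&did=...&tt=...&sign=...; the sample values are the ones from the table in section 1.

```python
from urllib.parse import parse_qsl

# Sample string in the format returned by the patched JS function.
signed = (
    "v=220320220115&did=10000000000000000000000000001501"
    "&tt=1642225268&sign=afff8a5abe2ceccf2ecc327485e84ef5"
)

# parse_qsl returns (key, value) pairs; dict() turns them into the POST body.
data = dict(parse_qsl(signed))
data["vid"] = "8pa9v5Zqn5D7VrqA"
print(data)
```

parse_qsl also decodes percent-escapes, which makes no difference for the plain hexadecimal and numeric values seen here.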
Because the focus here is the encryption, I have not explained the multithreading and a few other parts; I hope readers will understand.
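For readers who want the gist of the multithreaded part anyway, the pattern is a shared queue drained by several worker threads. The following is a minimal offline sketch; fake_fetch is a hypothetical stand-in for the real HTTP download, and run_all and worker are illustrative names.

```python
import threading
from queue import Queue, Empty


def worker(tasks, results, fetch):
    """Take items until the queue is empty; get_nowait() never blocks."""
    while True:
        try:
            item = tasks.get_nowait()
        except Empty:
            return
        results.append((item, fetch(item)))  # list.append is thread-safe in CPython


def run_all(items, fetch, n_threads=4):
    """Fill a queue with items and let n_threads workers process it."""
    tasks = Queue()
    for item in items:
        tasks.put(item)
    results = []
    threads = [
        threading.Thread(target=worker, args=(tasks, results, fetch))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def fake_fetch(uri):
    """Stand-in for requests.get(uri).content so that the sketch runs offline."""
    return len(uri)


done = run_all(["0.ts", "1.ts", "10.ts"], fake_fetch)
print(sorted(done))
```

Using get_nowait() with the Empty exception, rather than checking empty() and then calling get(), avoids a worker waiting forever when another thread takes the last item in between.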
Code for merging all the .ts files
import os
os.system('ffmpeg -f concat -i file_list.txt -c copy output.mp4')
# del is a Windows command; on Linux or macOS use rm instead
os.system('del *.ts')
os.system('del file_list.txt')
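Note that the script above must be placed in the same directory as the .ts files, and ffmpeg must be installed and added to the PATH environment variable.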
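A platform-independent variant of the merge step might look like the sketch below. It assumes ffmpeg is on the PATH and that file_list.txt sits in the download directory; concat_command and merge_and_clean are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def concat_command(list_file="file_list.txt", output="output.mp4"):
    """Build the same ffmpeg command as above, as an argument list."""
    return ["ffmpeg", "-f", "concat", "-i", list_file, "-c", "copy", output]


def merge_and_clean(directory):
    """Run ffmpeg inside directory, then delete the .ts files and the list file."""
    directory = Path(directory)
    # check=True raises CalledProcessError if ffmpeg fails, so nothing is deleted.
    subprocess.run(concat_command(), cwd=directory, check=True)
    for ts_file in directory.glob("*.ts"):
        ts_file.unlink()
    (directory / "file_list.txt").unlink()


print(" ".join(concat_command()))
```

Passing cwd to subprocess.run removes the need to copy the script next to the .ts files, and the segments are deleted only after ffmpeg has exited successfully.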
The result of a run can be seen in the post: Downloading Douyu videos with a Python crawler (运用python爬虫下载斗鱼视频)