Scraping Youdao Translate

URL: http://fanyi.youdao.com/

Press F12 to open the browser developer tools and capture the translation request.

 

 

 

The request carries 4 encrypted parameters.

Press Ctrl + Shift + F and search for salt to find where it is generated.

 

You can see that r is the current time, and i is the current time plus a random integer.
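That step can be sketched in Python roughly as follows. This is only an illustration: it assumes, as this post describes it, that the timestamp is in milliseconds and that the random part is an integer from 0 to 9. `make_salt` and its `now_ms` parameter are hypothetical names and do not come from the site's script.

```python
import random
import time


def make_salt(now_ms=None):
    """Return (r, i): r is a millisecond timestamp, i is r plus a random integer in 0..9.

    now_ms is a hypothetical hook for passing a fixed timestamp when testing.
    """
    r = int(time.time() * 1000) if now_ms is None else now_ms
    i = r + random.randint(0, 9)  # assumption: the random part is a single digit
    return r, i


r, i = make_salt()
print(r, i)
```

Returning both values keeps the timestamp available in case the request also needs it.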

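t is version information, so this parameter never actually changes and can simply be copied as-is.

sign: clearly the MD5 hash of e and i wrapped between two fixed strings, one at each end.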
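The sign computation on its own might look like the sketch below. The two fixed strings are the ones used in the full script in this post; `make_sign`, `CLIENT`, and `SECRET` are hypothetical names chosen for illustration, and the site may change these constants at any time.

```python
import hashlib

# Both constants are copied from the full script in this post.
CLIENT = "fanyideskweb"
SECRET = "Nw(nmmbP%A-r6U3EUn]Aj"


def make_sign(e, i):
    """Return the MD5 hex digest of CLIENT + e + str(i) + SECRET."""
    raw = CLIENT + e + str(i) + SECRET
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


# e is the text to translate, i is the salt value.
print(make_sign("hello", 15871974442021))
```

Because the hash covers both the input text and the salt, the sign has to be recomputed for every request.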

Set a breakpoint here and step through the code.

 

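The value of e is exactly the text I entered for translation, so all that remains is to write code that reproduces this encryption process.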

import hashlib
import random
import time

import requests


def main():
    r = int(time.time() * 1000)        # current time in milliseconds
    i = r + int(random.random() * 10)  # timestamp plus a random integer in 0..9
    salt = i
    e = input('Text to translate:\n')
    # sign = md5("fanyideskweb" + text + salt + fixed secret string)
    sign = hashlib.md5(("fanyideskweb" + e + str(i) + "Nw(nmmbP%A-r6U3EUn]Aj").encode('utf-8')).hexdigest()
    url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
    # Content-Length is omitted; requests computes it from the body.
    headers = {
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
        'Connection': 'keep-alive',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'Cookie': 'OUTFOX_SEARCH_USER_ID=-446566454@10.108.160.18; JSESSIONID=aaahlBDeJ38SEg_0B8gbx; OUTFOX_SEARCH_USER_ID_NCOO=668509478.1310366; DICT_UGC=be3af0da19b5c5e6aa4e17bd8d90b28a|; JSESSIONID=abcEY-8G7W-aseXSei_ex; _ntes_nnid=7705a1ceb59666b8545bc121466fe1ed,1586915641487; ___rl__test__cookies=1587197444202',
        'Host': 'fanyi.youdao.com',
        'Referer': 'http://fanyi.youdao.com/',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36 Edg/81.0.416.58',
        'X-Requested-With': 'XMLHttpRequest'
    }
    data = {
        'i': e,
        'client': 'fanyideskweb',
        'salt': salt,
        'sign': sign,
        'version': '2.1',
        'keyfrom': 'fanyi.web',
    }
    html = requests.post(url, headers=headers, data=data).json()
    translated = html['translateResult'][0][0]['tgt']
    print(translated)

    # Request the audio file for the translated text
    # (kept inside main(), since `translated` only exists here)
    words = translated.replace(" ", '%20')
    url = 'http://tts.youdao.com/fanyivoice?word={}&le=eng&keyfrom=speaker-target'.format(words)
    audio = requests.get(url).content
    with open('{}.mp3'.format(words), 'wb') as f:
        f.write(audio)


if __name__ == '__main__':
    while True:
        main()

Tested, and it works.
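The script above builds the audio URL by replacing spaces with %20, which leaves other special characters (such as & or ?) unescaped. A possible alternative, offered only as a sketch, is to let the standard library encode the whole query string. `build_tts_url` and `TTS_ENDPOINT` are hypothetical names; the endpoint and query parameters are the ones used in the script.

```python
from urllib.parse import quote, urlencode

# Endpoint and parameter names are taken from the script in this post.
TTS_ENDPOINT = "http://tts.youdao.com/fanyivoice"


def build_tts_url(text):
    """Build the audio URL with every query value percent-encoded."""
    query = urlencode(
        {"word": text, "le": "eng", "keyfrom": "speaker-target"},
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return TTS_ENDPOINT + "?" + query


print(build_tts_url("hello world"))
```

Passing `quote_via=quote` keeps the %20 form the original script produces for spaces, while also escaping characters that would otherwise break the query string.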

 

posted @ 2020-04-24 09:58  TrueDZ