JS Reverse Engineering: the Kugou signature & a Kugou Music Crawler
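Winter break is of course the time to grind. Today we are going to crawl songs from Kugou Music, which I personally find fairly easy. I did not find the API for paid music, but a membership lets you listen to it, and anything you can listen to you can download, so there is no need to buy single tracks, and the downloaded paid music stays playable after the membership expires. Not bad at all.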
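OK, long story short: go to Kugou Music, search for the song you want, click play, and enter the playback page. Here I searched for 晴天 by Jay Chou (周杰伦). The index.php?r request only appears after you click play. You can see that its response contains the playback URL of the song.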
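This is the request in question.
The request URL is https://wwwapi.kugou.com/yy/index.php. Of course, requesting it directly returns no data, because no parameters are sent with it.
As usual, let's see which of the required parameters change between requests.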
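Only encode_album_audio_id and the timestamp change. However, encode_album_audio_id is probably not encrypted. Why? Because the URL we request already contains this encode_album_audio_id, and whenever it changes, the song changes. So encode_album_audio_id should be the "ID card" of a single song.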
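As a minimal sketch of that observation, the query for the play/getdata request can be built from a fixed template in which only the song id and the timestamp vary. The helper name `build_play_params` is hypothetical, and the fixed values (dfid, mid, callback name) are the sample values from the author's own session used in the script later in this post; they may differ for other sessions.

```python
import time
from urllib.parse import urlencode

PLAY_API = "https://wwwapi.kugou.com/yy/index.php"


def build_play_params(encode_album_audio_id: str) -> dict:
    """Query parameters for play/getdata; only the song id and the timestamp vary."""
    return {
        "r": "play/getdata",
        # JSONP callback name, copied from the author's session
        "callback": "jQuery19101351666471912658_1674051302167",
        "dfid": "4XSnWz14ZQos2PYFIl2MiDLH",         # sample value from the author's session
        "appid": "1014",
        "mid": "8a6709b0f4f0674f12dabeb3a710313a",  # sample value from the author's session
        "platid": "4",
        "encode_album_audio_id": encode_album_audio_id,
        "_": str(int(time.time() * 1000)),          # millisecond timestamp
    }


# "example_id" is a placeholder, not a real song id
print(PLAY_API + "?" + urlencode(build_play_params("example_id")))
```

Swapping in a different encode_album_audio_id is then the only change needed to address a different song.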
Back on the search page,
Good, it looks like encode_album_audio_id really is fixed. Likewise, we only need to request https://complexsearch.kugou.com/v2/search/song with the right parameters. The parameters are as follows:
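Only signature and keyword differ between requests, plus the clienttime timestamp. The signature is just 32 characters long, so my guess is that it is an MD5 hash. Here we follow the call stack (the request is sent once the last frame on the stack has run, so walking back up the stack leads quite precisely to where the parameter is generated; at least that is my understanding) and find where signature is computed, shown below. Just above it is how the timestamp is generated: clienttime = (new Date).getTime().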
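Both observations are easy to check in Python. The sketch below is only an illustration: `client_time_ms` and `looks_like_md5` are hypothetical helper names. The first mirrors (new Date).getTime(), which returns milliseconds since the Unix epoch, and the second tests whether a value has the shape of an MD5 hex digest (32 hexadecimal characters). A matching shape is only a hint, not proof that MD5 is in use.

```python
import hashlib
import string
import time


def client_time_ms() -> int:
    """Milliseconds since the Unix epoch, like (new Date).getTime() in JavaScript."""
    return int(time.time() * 1000)


def looks_like_md5(value: str) -> bool:
    """True if value has the shape of an MD5 hex digest: 32 hexadecimal characters."""
    return len(value) == 32 and all(c in string.hexdigits for c in value)


sample = hashlib.md5(b"example").hexdigest()
print(client_time_ms(), looks_like_md5(sample))
```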
Print s in the console:
You can see that s contains exactly what the request to https://complexsearch.kugou.com/v2/search/song needs.
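Then s.join("") concatenates the request parameters into a single string, which is MD5-hashed (a later line contains the log text "H5签名前参数", meaning "parameters before H5 signing", so the function d should be the MD5 algorithm). The return value is the signature.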
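The signing step described above can be sketched in Python as follows. This is a sketch under assumptions rather than the site's exact logic: the function name `make_signature` is hypothetical, the salt string is the one that appears at both ends of the parameter list in this post's script, and the sorting is inferred from the fact that the captured list appears in alphabetical order.

```python
import hashlib

# Salt observed at both ends of the parameter list in this post's script
SALT = "NVPh5oo715z5DIWAeQlhMDsWXXQV4hwt"


def make_signature(params: dict, salt: str = SALT) -> str:
    """MD5 hex digest of salt + sorted 'key=value' pairs + salt."""
    # Assumption: the pairs are sorted, matching the alphabetical order of the captured list
    pairs = sorted(f"{key}={value}" for key, value in params.items())
    raw = salt + "".join(pairs) + salt
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


# Illustrative parameters only; a real request needs the full parameter set
print(make_signature({"keyword": "晴天", "appid": "1014", "clienttime": 1674051302167}))
```

The same clienttime value has to be used for the signature and for the request itself, otherwise the server computes a different hash.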
Analysis done! The full code is as follows:
import hashlib
import json
import time

import requests

name = input('Enter a song name: ')
# Use one timestamp for both the signature and the request so they always match
clienttime = int(time.time() * 1000)
text = [
    "NVPh5oo715z5DIWAeQlhMDsWXXQV4hwt",
    "appid=1014",
    "bitrate=0",
    "callback=callback123",
    "clienttime={}".format(clienttime),
    "clientver=1000",
    "dfid=4XSnWz14ZQos2PYFIl2MiDLH",
    "filter=10",
    "inputtype=0",
    "iscorrection=1",
    "isfuzzy=0",
    "keyword={}".format(name),
    "mid=8a6709b0f4f0674f12dabeb3a710313a",
    "page=1",
    "pagesize=30",
    "platform=WebFilter",
    "privilege_filter=0",
    "srcappid=2919",
    "token=",
    "userid=0",
    "uuid=8a6709b0f4f0674f12dabeb3a710313a",
    "NVPh5oo715z5DIWAeQlhMDsWXXQV4hwt"
]
data = "".join(text)  # join into a single string
md5 = hashlib.md5(data.encode(encoding='utf-8')).hexdigest()  # MD5 hash
url = "https://complexsearch.kugou.com/v2/search/song"
headers = {
    # Add your own cookie here. With a membership you can download paid music;
    # whether you pay for it yourself or use someone else's is up to you.
    "cookie": "",
    "authority": "complexsearch.kugou.com",
    "referer": "https://www.kugou.com/",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36"
}
params = {
    "callback": "callback123",
    "srcappid": "2919",
    "clientver": "1000",
    "clienttime": clienttime,
    "mid": "8a6709b0f4f0674f12dabeb3a710313a",
    "uuid": "8a6709b0f4f0674f12dabeb3a710313a",
    "dfid": "4XSnWz14ZQos2PYFIl2MiDLH",
    "keyword": f"{name}",
    "page": "1",
    "pagesize": "30",
    "bitrate": "0",
    "isfuzzy": "0",
    "inputtype": "0",
    "platform": "WebFilter",
    "userid": "0",
    "iscorrection": "1",
    "privilege_filter": "0",
    "filter": "10",
    "token": "",
    "appid": "1014",
    "signature": md5
}
# [12:-2] strips the JSONP wrapper "callback123(" ... ")" around the JSON body
lll = json.loads(requests.get(url=url, headers=headers, params=params).text[12:-2])
kkk = lll['data']['lists']
for s, li in enumerate(kkk):
    AlbumName = li['SongName']
    singername = li['SingerName']
    print(s + 1, AlbumName, singername)
num = input('Which one to download: ')
ID = kkk[int(num) - 1]['EMixSongID']
name = kkk[int(num) - 1]['SongName']
# Pass the query only through params; also putting it in the URL would send every parameter twice
urls = 'https://wwwapi.kugou.com/yy/index.php'
params = {
    "r": "play/getdata",
    "callback": "jQuery19101351666471912658_1674051302167",
    "dfid": "4XSnWz14ZQos2PYFIl2MiDLH",
    "appid": "1014",
    "mid": "8a6709b0f4f0674f12dabeb3a710313a",
    "platid": "4",
    "encode_album_audio_id": f"{ID}",
    "_": "1674051302168"
}
# [41:-2] strips the 41-character "jQuery...(" prefix of the JSONP wrapper
respons = json.loads(requests.get(url=urls, headers=headers, params=params).text[41:-2])
last = respons['data']['play_url']
downlode = requests.get(url=last, headers=headers).content
with open('D:/音乐/' + f'{name}.mp3', 'wb') as sp:  # the target folder must already exist
    sp.write(downlode)
print(last)
print(f'{name} downloaded')
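The fixed slices text[12:-2] and text[41:-2] only work while the callback name keeps exactly that length. A more tolerant alternative, sketched below with a hypothetical helper `parse_jsonp`, strips whatever callback wrapper surrounds the JSON body using a regular expression. It assumes the response has the usual JSONP shape name(...), optionally followed by a semicolon.

```python
import json
import re


def parse_jsonp(text: str):
    """Extract and decode the JSON payload from a JSONP response such as cb({...});"""
    # Everything before the first "(" is the callback name; the payload runs to the last ")"
    match = re.search(r"^[^(]*\((.*)\)\s*;?\s*$", text, re.DOTALL)
    if match is None:
        raise ValueError("response is not in JSONP form")
    return json.loads(match.group(1))


print(parse_jsonp('callback123({"data": {"lists": []}})'))
```

With such a helper, json.loads(response.text[12:-2]) could be replaced by parse_jsonp(response.text) regardless of the callback name.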
OK, all done!