[JS Reverse Engineering] Fetching Comment Data from 新X社: Reversing the Signature Parameter
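Disclaimer
The technical articles published by this account are for reference only. The information in this article is intended solely to help network security personnel test or maintain the websites, servers, and other systems (including but not limited to these) that they are responsible for. Do not use the technical material in this article to intrude into any computer system without authorization. The user alone bears responsibility for any direct or indirect consequences and losses caused by using the information provided here. The tools provided in this article are for learning only; any other use is prohibited.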
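URL Analysis
This article analyzes a single news article only.
Request the article whose id is 134d749 and inspect the traffic with the browser's developer tools (F12) to find the request that carries the comment data.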
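Locating the Request
Search the developer tools directly for a keyword, for example the text of a comment, to find the request whose response contains the comment data.
The URL path of that request is 1014/n/newsapi/h5/news-detail/newscomment
Copy the request into a crawler toolkit to generate Python code for it based on the requests library.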
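Parameter Analysis
Testing the request headers and the parameters submitted with the request one at a time shows that only Signature and Timestamp affect validation.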
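One way to run this kind of test systematically is to replay the captured request once per header, with a single header altered each time, and compare the responses. The sketch below only generates the variants and sends nothing; the helper name single_header_variants and the sample header values are illustrative assumptions, not part of the original capture.

```python
import random
import string
from typing import Dict, Iterator, Tuple

ALPHABET = string.ascii_lowercase + string.digits


def single_header_variants(headers: Dict[str, str]) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (header_name, mutated_headers), changing exactly one header per variant."""
    for name, value in headers.items():
        mutated = dict(headers)  # copy so the captured headers stay untouched
        # Replace the value with random characters of the same length (at least 1);
        # retry in the unlikely case the random value equals the original
        junk = value
        while junk == value:
            junk = "".join(random.choices(ALPHABET, k=max(len(value), 1)))
        mutated[name] = junk
        yield name, mutated


if __name__ == "__main__":
    # Hypothetical sample values standing in for a captured request
    captured = {"Signature": "0" * 64, "Timestamp": "1706781303802", "Accept-Language": "zh-CN"}
    for changed, variant in single_header_variants(captured):
        print(changed, "->", variant[changed])
```

Each variant would then be sent separately; a header matters for validation only if altering it changes the server's response.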
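Changing Signature to a random value and resending the request returns an "invalid data signature" message.
Changing Timestamp to a random value returns a "request timed out" message, which indicates that this parameter is a timestamp.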
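The Timestamp captured at the breakpoint later in this article (1706781303802) has 13 digits, that is, milliseconds since the Unix epoch, the same unit that JavaScript's Date.now() returns. A minimal Python sketch for producing such a value follows; the helper names are illustrative assumptions.

```python
import time


def ns_to_ms(ns: int) -> int:
    """Convert integer nanoseconds to whole milliseconds (integer math, no float rounding)."""
    return ns // 1_000_000


def timestamp_ms() -> str:
    """Current Unix time in milliseconds as a string, ready to use as a header value."""
    return str(ns_to_ms(time.time_ns()))


if __name__ == "__main__":
    print(timestamp_ms())
```

Note that int(time.time()) * 1000 also yields 13 digits but always ends in 000, because the fractional second is dropped before multiplying.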
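Reversing the Signature Parameter in JS
With the parameters analyzed, the only remaining question is how Signature is generated.
Search for Signature directly and set a breakpoint.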
The analysis shows that Signature is produced by this code: Signature: v = Object(y.sm3)(v), where v is built as follows:
var v = "Key=" + s + "&Timestamp=" + l + "&DeviceAccessId=" + o + "&DeviceNet=&Longitude=&Latitude=&Token=" + r + "&Request=" + d;
"" === o && (v = "Key=" + s + "&Timestamp=" + l + "&Token=" + r + "&Request=" + d);
When o (the DeviceAccessId) is an empty string, the second line replaces v with the shorter form.
Stepping through at the breakpoint, the value of v when requesting the current page is 'Key=4bb7c7298e0778524f45f240d922d85b5bbc525c313a2f011148273f4ccbd186&Timestamp=1706781303802&Token=&Request={"docid":"11880956","doctype":0,"loadtype":0,"lastcommid":0,"pageSize":20}'
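The two JavaScript branches can be mirrored in Python roughly as follows. This is a sketch under the assumption that the variables mean what the string labels suggest (s = key, l = timestamp, o = device access id, r = token, d = request body); the function name build_sign_string is illustrative.

```python
def build_sign_string(key: str, timestamp: str, token: str, request_body: str,
                      device_access_id: str = "") -> str:
    """Assemble the pre-hash string, mirroring the two branches of the JS snippet."""
    if device_access_id == "":
        # Short form used when DeviceAccessId is empty
        return f"Key={key}&Timestamp={timestamp}&Token={token}&Request={request_body}"
    # Long form; DeviceNet, Longitude and Latitude are empty in the JS as well
    return (f"Key={key}&Timestamp={timestamp}&DeviceAccessId={device_access_id}"
            f"&DeviceNet=&Longitude=&Latitude=&Token={token}&Request={request_body}")


if __name__ == "__main__":
    body = '{"docid":"11880956","doctype":0,"loadtype":0,"lastcommid":0,"pageSize":20}'
    v = build_sign_string(
        "4bb7c7298e0778524f45f240d922d85b5bbc525c313a2f011148273f4ccbd186",
        "1706781303802", "", body)
    print(v)
```

With the inputs shown, the short branch reproduces the string observed at the breakpoint.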
Key is a fixed value.
Request is the parameters submitted with the current request: {"docid":"11880956","doctype":0,"loadtype":0,"lastcommid":0,"pageSize":20}
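Because the Request part is hashed as text, the signed string presumably has to match the body that is sent character for character, including key order and the absence of spaces. The short sketch below shows how Python's json.dumps can reproduce the compact form seen at the breakpoint; the variable names are illustrative.

```python
import json

payload = {
    "docid": "11880956",
    "doctype": 0,
    "loadtype": 0,
    "lastcommid": 0,
    "pageSize": 20,
}

# Default separators are (', ', ': '), which insert spaces after commas and colons
default_form = json.dumps(payload)
# Compact separators remove those spaces, matching the captured Request string
compact_form = json.dumps(payload, separators=(",", ":"))

print(default_form)
print(compact_form)
```

Serializing once and reusing the same string for both the signature and the request body avoids any mismatch between the two.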
In Signature: v = Object(y.sm3)(v), inspecting y shows that its methods include sm2, sm3, and sm4; sm3 hashes v, and the resulting digest is the Signature value.
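SM3 is a 256-bit hash function from the Chinese national cryptography standards (GM/T 0004-2012), so its digest is 64 hex characters, the same length as the Signature value. For readers who want to cross-check a library result without extra dependencies, here is a compact pure-Python sketch of SM3 written from the published specification. It is an illustration only, not the site's code and not the gmssl implementation, and the function name sm3_hex is an assumption.

```python
import struct

MASK = 0xFFFFFFFF
IV = (0x7380166F, 0x4914B2B9, 0x172442D7, 0xDA8A0600,
      0xA96F30BC, 0x163138AA, 0xE38DEE4D, 0xB0FB0E4E)


def _rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits (n taken modulo 32)."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK


def _p0(x: int) -> int:
    return x ^ _rotl(x, 9) ^ _rotl(x, 17)


def _p1(x: int) -> int:
    return x ^ _rotl(x, 15) ^ _rotl(x, 23)


def _compress(v, block: bytes):
    """Apply the SM3 compression function to one 64-byte block."""
    # Message expansion: 16 big-endian words extended to 68
    w = list(struct.unpack(">16I", block))
    for j in range(16, 68):
        w.append(_p1(w[j - 16] ^ w[j - 9] ^ _rotl(w[j - 3], 15))
                 ^ _rotl(w[j - 13], 7) ^ w[j - 6])
    a, b, c, d, e, f, g, h = v
    for j in range(64):
        t = 0x79CC4519 if j < 16 else 0x7A879D8A
        a12 = _rotl(a, 12)
        ss1 = _rotl((a12 + e + _rotl(t, j)) & MASK, 7)
        ss2 = ss1 ^ a12
        if j < 16:
            ff, gg = a ^ b ^ c, e ^ f ^ g
        else:
            ff = (a & b) | (a & c) | (b & c)
            gg = (e & f) | (~e & g)
        tt1 = (ff + d + ss2 + (w[j] ^ w[j + 4])) & MASK
        tt2 = (gg + h + ss1 + w[j]) & MASK
        a, b, c, d = tt1, a, _rotl(b, 9), c
        e, f, g, h = _p0(tt2), e, _rotl(f, 19), g
    return tuple(x ^ y for x, y in zip(v, (a, b, c, d, e, f, g, h)))


def sm3_hex(data: bytes) -> str:
    """Return the SM3 digest of data as a 64-character lowercase hex string."""
    # Padding: 0x80, zero bytes, then the bit length as a 64-bit big-endian integer
    padded = (data + b"\x80" + b"\x00" * ((55 - len(data)) % 64)
              + struct.pack(">Q", len(data) * 8))
    v = IV
    for i in range(0, len(padded), 64):
        v = _compress(v, padded[i:i + 64])
    return "".join(f"{x:08x}" for x in v)


if __name__ == "__main__":
    print(sm3_hex(b"abc"))
```

The standard's first test vector lists 66c7f0f462eeedd9d1f2d46bdc10e4e24167c4875cf2f7a2297da02b8f4ba8e0 as the digest of the input "abc"; comparing against it is a quick sanity check for any SM3 implementation.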
Compute the Signature value in Python:
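import time
from gmssl import sm3, func

# Request body exactly as it is sent (compact JSON, no spaces)
param = '{"docid":"11880956","doctype":0,"loadtype":0,"lastcommid":0,"pageSize":20}'
# Millisecond timestamp, like the one captured at the breakpoint
times = str(int(time.time() * 1000))
# print(times)
key = f'Key=4bb7c7298e0778524f45f240d922d85b5bbc525c313a2f011148273f4ccbd186&Timestamp={times}&Token=&Request={param}'
Signature = sm3.sm3_hash(func.bytes_to_list(key.encode("utf-8")))
# One run produced the Signature 98d9f16745100bde7dbd6f1e50a9cd9316953757c4e0e0718075ceeb171450f0; the value changes with the timestamp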
Complete the earlier crawler code by replacing Signature and Timestamp with the computed values.
The complete code is as follows:
import json
import time

import requests
from gmssl import sm3, func


def get_pson(Signature, times):
    # Request body; it must serialize to the same string that was signed
    data = {
        "docid": "11880956",
        "doctype": 0,
        "loadtype": 0,
        "lastcommid": 0,
        "pageSize": 20
    }
    headers = {
        "Accept": "application/json, text/plain, */*",
        "Accept-Language": "zh-CN,zh;q=0.9",
        "Connection": "keep-alive",
        "Content-Type": "application/json;charset=UTF-8",
        "Device-Access-Id": "",
        "Origin": "https://h.xxxxx.com",
        "Referer": "https://h.xxxxx.com/vh512/share/11880956?d=134d749",
        "Sec-Fetch-Dest": "empty",
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Site": "same-origin",
        "Signature": Signature,
        "Timestamp": times,
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36",
        "sec-ch-ua": "\"Not A(Brand\";v=\"99\", \"Google Chrome\";v=\"121\", \"Chromium\";v=\"121\"",
        "sec-ch-ua-mobile": "?0",
        "sec-ch-ua-platform": "\"Windows\""
    }
    url = "https://xxxxx/1014/n/newsapi/h5/news-detail/newscomment"
    # Compact separators reproduce the JSON string used in the signature
    data = json.dumps(data, separators=(',', ':'))
    response = requests.post(url, headers=headers, data=data)
    print(response.text)
    print(response)


if __name__ == "__main__":
    param = '{"docid":"11880956","doctype":0,"loadtype":0,"lastcommid":0,"pageSize":20}'
    # Millisecond timestamp; the same value goes into the signature and the header
    times = str(int(time.time() * 1000))
    # print(times)
    key = f'Key=4bb7c7298e0778524f45f240d922d85b5bbc525c313a2f011148273f4ccbd186&Timestamp={times}&Token=&Request={param}'
    Signature = sm3.sm3_hash(func.bytes_to_list(key.encode("utf-8")))
    print(Signature)
    get_pson(Signature, times)