How to scrape Huawei Honor store information
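1. The URL we want to scrape is: https://m.vmall.com/help/hnrstoreaddr.htm
2. Open one specific store page to see the store information we want to scrape.
3. Check whether that information is loaded dynamically. There are two ways to do this in the browser's developer tools: first, look at the Preview tab of the page request and see whether it contains the content we want; second, search for the content in the Response tab. (In this case the content is loaded dynamically.)
4. Look at the request URL and its parameters: a different ID returns different content, so to scrape everything we have to request the data by ID.
5. Get the IDs: they can be scraped from the store list on the home page, where each ID usually appears together with the store name.
6. Loop over the IDs and scrape each store.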
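The check in step 3 can also be done from code instead of the developer tools. The sketch below is only an illustration: the helper names `content_in_raw_html` and `fetch_raw_html` are made up for this example, and the keyword you search for is whatever store name or address you saw on the rendered page. If the keyword is missing from the raw HTML, the content is most likely loaded by a separate request.

```python
def content_in_raw_html(html: str, keyword: str) -> bool:
    """Return True if the keyword already appears in the raw page source."""
    return keyword in html


def fetch_raw_html(url: str, timeout: int = 10) -> str:
    """Download the page source without executing any JavaScript."""
    # Imported here so the pure helper above works without the dependency.
    import requests

    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()
    # Let requests guess the encoding so Chinese text is decoded correctly.
    response.encoding = response.apparent_encoding
    return response.text


# Offline illustration with two hand-written snippets (not real page content):
static_page = "<ul><li>岳阳 sample store</li></ul>"
dynamic_page = '<div id="app"></div><script src="app.js"></script>'
print(content_in_raw_html(static_page, "岳阳"))   # keyword present: static
print(content_in_raw_html(dynamic_page, "岳阳"))  # keyword absent: dynamic
```

To try it against the real page, call `content_in_raw_html(fetch_raw_html("https://m.vmall.com/help/hnrstoreaddr.htm"), keyword)` with a keyword taken from the rendered page.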
import json

import requests

headers = {
    'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36"
}

# Step 5: request the store list for one city to collect the store IDs.
list_url = 'https://openapi.vmall.com/mcp/offlineshop/getShopList'
data = {"portal": 2, "lang": "zh-CN", "country": "CN", "brand": 1,
        "province": "湖南", "city": "岳阳", "pageNo": 1, "pageSize": 20}
response = requests.post(url=list_url, headers=headers, data=json.dumps(data))
response_detail = response.json()

# Step 6: request the details of each store by its ID.
detail_url = 'https://openapi.vmall.com/mcp/offlineshop/getShopById'
for shop in response_detail['shopInfos']:
    shop_id = shop['id']  # renamed from "id" to avoid shadowing the built-in
    params = {
        'portal': '2',
        'version': '10',
        'country': 'CN',
        'shopId': str(shop_id),
        'lang': 'zh-CN'
    }
    response = requests.get(url=detail_url, headers=headers, params=params)
    shop_info = response.json()['shopInfo']
    print(shop_info['address'], shop_info['name'], shop_info['serviceTime'], shop_id)
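The script above only requests `pageNo` 1 with a `pageSize` of 20, so a city with more stores would be cut off. Below is a minimal sketch of how the list request could be paged. It assumes that the API returns an empty or missing `shopInfos` list once the last page has been passed, which has not been verified here. The names `iter_shop_ids` and `make_fetcher` are hypothetical helpers for this example; the fetching function is passed in so that the paging logic can be tried without network access.

```python
from typing import Callable, Dict, Iterator


def iter_shop_ids(fetch_page: Callable[[int], Dict], max_pages: int = 50) -> Iterator:
    """Yield store IDs page by page until a page comes back empty."""
    for page_no in range(1, max_pages + 1):
        payload = fetch_page(page_no)
        # Assumption: past the last page, "shopInfos" is empty or absent.
        shops = payload.get("shopInfos") or []
        if not shops:
            break
        for shop in shops:
            yield shop["id"]


def make_fetcher(province: str, city: str, page_size: int = 20) -> Callable[[int], Dict]:
    """Build a fetch_page function that calls the list endpoint used above."""
    import json

    import requests  # imported here so iter_shop_ids can be used without it

    url = "https://openapi.vmall.com/mcp/offlineshop/getShopList"
    headers = {"User-Agent": "Mozilla/5.0"}

    def fetch_page(page_no: int) -> Dict:
        data = {"portal": 2, "lang": "zh-CN", "country": "CN", "brand": 1,
                "province": province, "city": city,
                "pageNo": page_no, "pageSize": page_size}
        response = requests.post(url, headers=headers,
                                 data=json.dumps(data), timeout=10)
        response.raise_for_status()
        return response.json()

    return fetch_page


# Offline illustration with made-up pages instead of real API responses:
fake_pages = {
    1: {"shopInfos": [{"id": 101}, {"id": 102}]},
    2: {"shopInfos": [{"id": 103}]},
}
print(list(iter_shop_ids(lambda page_no: fake_pages.get(page_no, {}))))
```

Against the real endpoint the call would look like `iter_shop_ids(make_fetcher("湖南", "岳阳"))`. The `max_pages` cap is there only as a safety stop in case the empty-page assumption turns out to be wrong.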