Summary: # It worked for one run, and then I got banned. I have started preparing to crawl the title and grade of every movie on Douban. import requests from bs4 import BeautifulSoup import random headers = {'user_agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Ge...
posted @ 2016-11-09 18:05 JessisLong Views (186) Comments (0) Recommendations (0)
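Since the post above mentions getting banned after one run, here is a minimal sketch of two common mitigations: sending a real `User-Agent` header and pausing a random interval between requests. Note that `requests` sends header names exactly as given, so a key spelled `user_agent` does not replace the default `User-Agent`. The names `USER_AGENTS`, `build_headers`, and `polite_delay`, the example strings, and the delay bounds are all assumptions for illustration, not taken from the original post.

```python
import random

# Hypothetical pool of browser User-Agent strings (illustrative values only).
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12) ExampleBrowser/2.0",
]


def build_headers(rng=random):
    """Return request headers with a randomly chosen User-Agent.

    The key must be spelled 'User-Agent'; any other spelling is sent
    as a separate, unrelated header.
    """
    return {"User-Agent": rng.choice(USER_AGENTS)}


def polite_delay(min_s=1.0, max_s=3.0, rng=random):
    """Return a random pause length in seconds between min_s and max_s."""
    return rng.uniform(min_s, max_s)


# Possible usage with requests (not executed here, needs network access):
#   import time, requests
#   response = requests.get(url, headers=build_headers())
#   time.sleep(polite_delay())
```

Rotating the header and spacing out requests may reduce the chance of a block, but neither guarantees it, and the site's terms of use still apply.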
Summary: # Module 1: fetch all channel links. from bs4 import BeautifulSoup import requests start_url = 'http://bj.58.com/sale.shtml' url_host = 'http://bj.58.com' def get_index_url(url): wb_data = requests.get(url) so...
posted @ 2016-11-09 09:20 JessisLong Views (421) Comments (0) Recommendations (0)
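The excerpt above is cut off before the link extraction step, so the following is only a sketch of how channel links might be collected and resolved against the host. It uses the standard-library `HTMLParser` in place of BeautifulSoup so that it runs without extra dependencies. The class `LinkCollector`, the function `extract_channel_urls`, and the sample paths are hypothetical, and the selector the original post used is unknown, so this version simply gathers every `<a href>`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

URL_HOST = "http://bj.58.com"  # host taken from the excerpt above


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_channel_urls(html, host=URL_HOST):
    """Return absolute, de-duplicated link URLs in document order."""
    parser = LinkCollector()
    parser.feed(html)
    urls = []
    for href in parser.hrefs:
        full = urljoin(host + "/", href)  # resolve relative paths against the host
        if full not in urls:
            urls.append(full)
    return urls


# Hypothetical sample markup; the real page structure may differ.
sample = '<ul><li><a href="/shouji/">A</a></li><li><a href="/diannao/">B</a></li></ul>'
print(extract_channel_urls(sample))
```

In a real run the HTML would come from `requests.get(start_url).text`, and the links would likely need filtering down to the channel menu only.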