晨曦yd

August 31, 2019

Summary: import urllib.request import urllib from lxml import etree import requests url="https://tieba.baidu.com/f?kw=%E6%A1%8C%E9%9D%A2&ie=utf-8&pn=50" headers={ 'User-Agent':'Mozilla/5.0 (Windows NT 6.1) A...

posted @ 2019-08-31 21:11 晨曦yd Views (200) Comments (0) Recommendations (0)

June 20, 2019

Maze

Summary: #include #include #include #include using namespace std; typedef struct node { int row; int col; int direction; }; stack path; void ok() { srand(time(NULL)); int i,j,m,n; ...

posted @ 2019-06-20 12:49 晨曦yd Views (125) Comments (0) Recommendations (0)
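
The maze summary above only shows a `node` record with `row`, `col`, and `direction` fields and a stack named `path`; the rest of the post is truncated. The following is a minimal sketch of how such a stack can drive a backtracking maze search. It is written in Python rather than the post's C++, and the function name `solve_maze`, the `MOVES` table, the 0/1 grid encoding, and the sample maze are all assumptions made for illustration, not the author's code.

```python
# Direction codes 0..3 map to (row delta, col delta): right, down, left, up.
# This ordering is an assumption; the original post does not show it.
MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def solve_maze(grid, start, goal):
    """Return a list of (row, col) cells from start to goal, or None.

    grid uses 0 for open cells and 1 for walls. Each stack entry is
    [row, col, direction], where direction is the next move to try.
    """
    rows, cols = len(grid), len(grid[0])
    visited = {start}
    path = [[start[0], start[1], 0]]  # the stack of nodes on the current route
    while path:
        node = path[-1]
        row, col, direction = node
        if (row, col) == goal:
            return [(r, c) for r, c, _ in path]
        if direction == len(MOVES):
            path.pop()  # every direction tried from this cell: backtrack
            continue
        node[2] += 1  # remember which direction to try on the next visit
        d_row, d_col = MOVES[direction]
        nxt = (row + d_row, col + d_col)
        if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                and grid[nxt[0]][nxt[1]] == 0 and nxt not in visited):
            visited.add(nxt)
            path.append([nxt[0], nxt[1], 0])
    return None


# Hypothetical 3x3 maze used only for this demonstration.
maze = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
route = solve_maze(maze, (0, 0), (2, 2))
print(route)
```

Storing the next direction inside each stack entry is what lets the search resume from a cell after backtracking without retrying moves it has already made.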

June 15, 2019

Crawling air quality data for cities nationwide over the past five to six years

Summary: import urllib.request import urllib.parse import requests import csv from lxml import etree from selenium import webdriver import time url='https://www.aqistudy.cn/historydata/index.php' #broswer = ...

posted @ 2019-06-15 22:55 晨曦yd Views (305) Comments (0) Recommendations (0)

June 5, 2019

Crawling scenic spot data from Qunar

Summary: import urllib.parse import urllib.request import requests from bs4 import BeautifulSoup import csv import time import re sd=['名字','地址','价格','月销量','景点概述'] with open('C:\\Users\\惠普\\Desktop\\ac2.csv',...

posted @ 2019-06-05 22:37 晨曦yd Views (279) Comments (0) Recommendations (0)

A simple crawl of NetEase Cloud Music comments

Summary: # Crawl the NetEase Cloud Music comments for one song import urllib.request import csv import requests import re from lxml import etree from selenium import webdriver import time url='https://music.163.com/#/song?id=435305106' headers={ 'Us...

posted @ 2019-06-05 17:16 晨曦yd Views (659) Comments (0) Recommendations (0)

A simple crawl of Meituan food listings

Summary: # selenium automated testing import urllib.request import requests import csv import time from selenium import webdriver header={'Accept':'application/json', 'Accept-Encoding':'gzip, deflate, br', 'Accept-Language'...

posted @ 2019-06-05 17:05 晨曦yd Views (621) Comments (0) Recommendations (0)

May 30, 2019

Crawling top-rated Douban movies

Summary: import requests from bs4 import BeautifulSoup import time import re import json import csv urls=[] tc=['名字','评分','导演','演员','时长'] with open('C:\\Users\\lenovo\\Desktop\\go1.csv', 'a+', newline='', en...

posted @ 2019-05-30 14:26 晨曦yd Views (374) Comments (0) Recommendations (0)

May 26, 2019

Crawling air quality data, part 1

Summary: import urllib.request import requests import csv import re from lxml import etree url='http://www.air-level.com' response=urllib.request.urlopen(url+'/').read().decode() hrefs=re.findall(r'',respon...

posted @ 2019-05-26 16:26 晨曦yd Views (205) Comments (0) Recommendations (0)

May 25, 2019

A simple novel crawler

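Summary: import urllib.request import re # Crawling a novel is the most basic kind of crawler. Once you understand the approach, you can move on to more advanced crawlers: the approach is the same, and only the libraries used, JavaScript handling, or asynchronous loading differ. url = "https://www.qb5200.tw/xiaoshuo/36/36143/" # URL of the novel to crawl with urllib.request.urlopen(url) as doc: html = doc...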

posted @ 2019-05-25 22:37 晨曦yd Views (327) Comments (0) Recommendations (0)
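
The novel-crawler summary above describes the basic approach (fetch an index page with `urllib.request`, then pull links out with `re`) but is cut off before the parsing step. The sketch below illustrates that general pattern only. The link pattern `CHAPTER_LINK`, the helper names `fetch_html` and `extract_chapters`, the `example.com` base URL, and the sample HTML are hypothetical; the real site's markup and encoding may well differ.

```python
import re
import urllib.request
from urllib.parse import urljoin

# Hypothetical pattern: assumes chapter links look like <a href="123.html">Title</a>.
CHAPTER_LINK = re.compile(r'<a\s+href="([^"]+\.html)"[^>]*>([^<]+)</a>')


def fetch_html(url, encoding="utf-8"):
    """Download a page and decode it (the encoding is an assumption)."""
    with urllib.request.urlopen(url) as doc:
        return doc.read().decode(encoding, errors="replace")


def extract_chapters(index_html, base_url):
    """Return (absolute_url, title) pairs for every chapter link found."""
    return [
        (urljoin(base_url, href), title.strip())
        for href, title in CHAPTER_LINK.findall(index_html)
    ]


# Offline demonstration on a made-up index page, so no network access is needed.
SAMPLE_INDEX = """
<dl>
  <dd><a href="1001.html">Chapter 1</a></dd>
  <dd><a href="1002.html">Chapter 2</a></dd>
</dl>
"""

chapters = extract_chapters(SAMPLE_INDEX, "https://example.com/book/36143/")
for chapter_url, chapter_title in chapters:
    print(chapter_title, chapter_url)
```

In a real run, the index page would come from `fetch_html(url)` and each chapter URL would be fetched the same way; keeping the parsing in a separate function makes it possible to check the regular expression without touching the network.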

Crawling Pear Video

Summary: # Download the videos on a web page import urllib.request import re # regular expressions import os # find the starting page url ='https://www.pearvideo.com/category_8' html = urllib.request.urlopen(url).read(

posted @ 2019-05-25 22:25 晨曦yd Views (154) Comments (0) Recommendations (0)
