hefany

January 7, 2021

20. Scrapy project creation fails with ImportError: DLL load failed: The operating system cannot run %1

Summary: Error preview. Solution: pip install -I cryptography --user
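After reinstalling, it can help to confirm that the package now imports cleanly before retrying Scrapy. The following is a minimal sketch; the helper name `can_import` is hypothetical and does not come from the original post.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if the module imports without raising ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # On Windows, "DLL load failed" is raised as an ImportError.
        return False
    return True


if __name__ == "__main__":
    if can_import("cryptography"):
        print("cryptography imports cleanly")
    else:
        print("still failing; try: pip install -I cryptography --user")
```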

posted @ 2021-01-07 11:48 hefany Views (99) Comments (0) Recommendations (0)

Summary: Selenium practice. # coding="utf-8" from selenium import webdriver from lxml import etree import json import time class Tiantian_spider(): def __init__(self)

posted @ 2021-01-07 10:50 hefany Views (137) Comments (0) Recommendations (0)

January 6, 2021

18. Making Python use mirrors hosted in China

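Summary: Mirror sources in China. Tsinghua: https://pypi.tuna.tsinghua.edu.cn/simple Alibaba Cloud: http://mirrors.aliyun.com/pypi/simple/ University of Science and Technology of China: https://pypi.mirrors.ustc.edu.cn/simple/ Huazhong University of Science and Technology: ht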
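As an illustration of how one of these mirrors might be used for a single install, the sketch below builds a pip command with the `-i` (index URL) option. The function name `pip_install_command` and the choice of `selenium` as the example package are assumptions made for illustration.

```python
import subprocess
import sys
from typing import List

# Mirror URL taken from the list above (Tsinghua).
TSINGHUA_MIRROR = "https://pypi.tuna.tsinghua.edu.cn/simple"


def pip_install_command(package: str, index_url: str = TSINGHUA_MIRROR) -> List[str]:
    """Build a pip command that installs `package` from the given index."""
    return [sys.executable, "-m", "pip", "install", "-i", index_url, package]


if __name__ == "__main__":
    # Example only: install selenium through the mirror.
    subprocess.run(pip_install_command("selenium"), check=True)
```

Running pip through `sys.executable -m pip` keeps the install tied to the interpreter that is running the script.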

posted @ 2021-01-06 21:55 hefany Views (118) Comments (0) Recommendations (0)

17. Setting the request User-Agent for PhantomJS

Summary: dcap = dict(DesiredCapabilities.PHANTOMJS) dcap["phantomjs.page.settings.userAgent"] = ( "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHT
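The idea in the snippet above is to copy a capabilities dictionary and add a User-Agent entry under the `phantomjs.page.settings.userAgent` key. The sketch below shows only that dictionary step and deliberately avoids importing Selenium so it runs anywhere. The helper name, the base `browserName` entry, and the placeholder User-Agent are assumptions; the post's own User-Agent string is truncated and is not reproduced here.

```python
# Hypothetical placeholder; the post's own User-Agent string is truncated.
EXAMPLE_USER_AGENT = "ExampleBrowser/1.0"


def phantomjs_capabilities(user_agent: str) -> dict:
    """Return a capabilities dict carrying a custom User-Agent for PhantomJS."""
    # Assumed base entry; the post copies DesiredCapabilities.PHANTOMJS instead.
    caps = {"browserName": "phantomjs"}
    caps["phantomjs.page.settings.userAgent"] = user_agent
    return caps


if __name__ == "__main__":
    print(phantomjs_capabilities(EXAMPLE_USER_AGENT))
```

With Selenium 3.x, a dictionary like this could be passed as `webdriver.PhantomJS(desired_capabilities=caps)`; PhantomJS support was removed in Selenium 4, so this applies to older installations only.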

posted @ 2021-01-06 20:09 hefany Views (127) Comments (0) Recommendations (0)

16. Using Selenium with PhantomJS

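Summary: Selenium is a tool for testing web applications. Selenium tests run directly in the browser, just as a real user would operate it. Install the selenium library: pip install selenium. PhantomJS is a separate program (download: Taobao mirror). After downloading and extracting it, add its directory to the PATH environment variable.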
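One way to check that the PATH step worked is to ask Python where it finds the executable. This is a small sketch using the standard library; the function name `find_executable` is hypothetical.

```python
import shutil
from typing import Optional


def find_executable(name: str = "phantomjs") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("phantomjs not found; check the PATH environment variable")
    else:
        print("phantomjs found at", path)
```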

posted @ 2021-01-06 16:59 hefany Views (142) Comments (0) Recommendations (0)

January 5, 2021

15. Basic arithmetic in Python

Summary: Basic Python operations. Without further ado, straight to the results: addition +, subtraction -, multiplication *, division /
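A short worked example of the four operators; the operand values 7 and 2 are chosen here purely for illustration.

```python
a, b = 7, 2

total = a + b        # addition: 9
difference = a - b   # subtraction: 5
product = a * b      # multiplication: 14
quotient = a / b     # division: 3.5 (in Python 3, / always returns a float)

print(total, difference, product, quotient)
```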

posted @ 2021-01-05 22:44 hefany Views (72) Comments (0) Recommendations (0)

14. How to save the output of help() in Python to a file

Summary: Saving the output of help() to a file. import sys; out = sys.stdout; sys.stdout = open("help.txt", "w"); help(help); sys.stdout.close(); sys.stdout = out; exit()
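An alternative that avoids reassigning `sys.stdout` is to render the help text as a string and write it out directly. Note that this swaps in a different technique from the post's: it uses `pydoc.render_doc` rather than redirecting standard output. The function names `render_help_text` and `save_help` are hypothetical.

```python
import pydoc
from pathlib import Path


def render_help_text(obj) -> str:
    """Return the text that help(obj) would display, without terminal formatting."""
    return pydoc.render_doc(obj, title="Help on %s:", renderer=pydoc.plaintext)


def save_help(obj, path: str = "help.txt") -> None:
    """Write the help text for `obj` to a file."""
    Path(path).write_text(render_help_text(obj), encoding="utf-8")


if __name__ == "__main__":
    save_help(len)  # writes help.txt in the current directory
```

Because nothing global is modified, there is no need to restore `sys.stdout` afterwards, even if an error occurs partway through.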

posted @ 2021-01-05 14:53 hefany Views (64) Comments (0) Recommendations (0)

January 4, 2021

13. Data extraction 2: BeautifulSoup

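Summary: A brief introduction to Beautiful Soup. Purpose: use BeautifulSoup to extract scraped data, which is usually web page data, that is, HTML text. A quick primer: the form <></> is called a paired tag. <p></p> is a p tag, where p is the tag's name, and the same applies to other tags. <p class = "one"> ...</p> cla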
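To make the tag-name and attribute terminology concrete, the sketch below parses a small fragment and records each opening tag with its attributes. It uses the standard library's `html.parser` rather than BeautifulSoup, so that it runs without third-party packages; the class name `TagCollector` and the sample fragment are illustrative assumptions.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the name and attributes of every opening tag."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.tags.append((tag, dict(attrs)))


collector = TagCollector()
collector.feed('<p class="one">hello</p>')
print(collector.tags)  # the tag name is "p"; its class attribute is "one"
```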

posted @ 2021-01-04 20:47 hefany Views (68) Comments (0) Recommendations (0)

12. Crawler practice 1: scraping static web page data

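Summary: Crawler practice: scraping a static web page. Target URL: https://movie.douban.com/top250 Target data: movie rank, movie title, rating, number of ratings. Page analysis: each page shows 25 entries across 10 pages, for 250 entries in total. Checking the page source: all the required data is present in the source. Checking the page links: first page: https://m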
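The page arithmetic above (25 entries per page, 10 pages, 250 entries) can be sketched as a list of offsets. The per-page link pattern is truncated in the summary, so the query parameter name `start` used below is an assumption for illustration only, as are the function names.

```python
from urllib.parse import urlencode

BASE_URL = "https://movie.douban.com/top250"  # target URL from the post
PER_PAGE = 25
TOTAL = 250


def page_offsets(total: int = TOTAL, per_page: int = PER_PAGE) -> list:
    """Return the index of the first entry on each page: 0, 25, ..., 225."""
    return list(range(0, total, per_page))


def page_urls(param: str = "start") -> list:
    """Build one URL per page.

    ASSUMPTION: the parameter name is hypothetical; the post's own link
    pattern is truncated and not reproduced here.
    """
    return [f"{BASE_URL}?{urlencode({param: offset})}" for offset in page_offsets()]


if __name__ == "__main__":
    for url in page_urls():
        print(url)
```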

posted @ 2021-01-04 18:57 hefany Views (412) Comments (0) Recommendations (0)

11. Crawler data extraction 1

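Summary: Data extraction for Python crawlers. There are three common ways to extract data in a crawler: regular expressions, the beautifulsoup module, and the lxml module. Regular expressions: see the regular expression manual; follow the link for details and read it carefully. One point worth noting: the data a regular expression is matched against must be of type str. beautifulsoup: see the official beautifulsoup manual. When using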
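The point about str can be illustrated with a short sketch: a str pattern is matched against str data, so raw bytes from a response are decoded first. The sample HTML fragments and variable names are made up for illustration.

```python
import re

# A str pattern matched against str data.
html = "<p>first</p><p>second</p>"
paragraphs = re.findall(r"<p>(.*?)</p>", html)

# Raw bytes must be decoded to str before a str pattern can be applied
# (alternatively, a bytes pattern could be used on the bytes directly).
raw = b"<p>third</p>"
decoded = re.findall(r"<p>(.*?)</p>", raw.decode("utf-8"))

print(paragraphs, decoded)
```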

posted @ 2021-01-04 16:32 hefany Views (97) Comments (0) Recommendations (0)
