Post Archive: May 2020 - peng_li

(6) Everyday pandas Tips

Summary: pandas data processing. 1. Removing duplicate elements: import numpy as np; import pandas as pd; from pandas import Series, DataFrame; df = DataFrame({"color":["red","white","red","green"], …
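
The summary above is cut off before the frame is complete, so the following is only a minimal sketch of removing duplicate rows. The `size` column and its values are assumptions added for illustration; `duplicated()` and `drop_duplicates()` are the standard pandas methods for this task.

```python
import pandas as pd

# Hypothetical data: row 2 repeats row 0 exactly.
df = pd.DataFrame({
    "color": ["red", "white", "red", "green"],
    "size": [10, 20, 10, 30],  # assumed column, not from the original post
})

# duplicated() flags every row that repeats an earlier row.
mask = df.duplicated()

# drop_duplicates() returns a new frame without the flagged rows;
# the original frame is left unchanged.
deduped = df.drop_duplicates()

print(mask.tolist())
print(deduped)
```

By default the first occurrence is kept; passing `keep="last"` keeps the last one instead.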

posted @ 2020-05-28 17:35 peng_li Views(401) Comments(0) Recommended(0)

(4) Combining Data in pandas

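Summary: Combining data in pandas. # Key point: pandas combines data in two ways. Concatenation: pd.concat, DataFrame.append. Merging: pd.merge, DataFrame.join. 0. Review of NumPy concatenation: import numpy as np; import pandas as pd; from pandas import Se…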
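
To make the distinction between concatenation and merging concrete, here is a small sketch. The frames `left` and `right` and their columns are hypothetical. Only `pd.concat` and `pd.merge` are shown, because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical frames sharing a "key" column.
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "y": [20, 30, 40]})

# Concatenation stacks the frames along an axis (rows by default).
# Columns missing from one frame are filled with NaN.
stacked = pd.concat([left, right], ignore_index=True)

# Merging aligns rows by the values in "key" (inner join by default),
# so only keys present in both frames survive.
merged = pd.merge(left, right, on="key")

print(stacked)
print(merged)
```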

posted @ 2020-05-28 17:33 peng_li Views(477) Comments(0) Recommended(0)

(3) Hierarchical Indexing in pandas

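Summary: Hierarchical indexing in pandas. 1. Creating a multi-level row index. 1) Implicit construction: the most common approach is to pass two or more arrays to the index parameter of the DataFrame constructor. A Series can also have a multi-level index. import numpy as np; import pandas as pd; from pandas import Serie…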
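
The implicit construction described above can be sketched as follows. The labels and scores are made-up illustration data; the point is only that a list of parallel arrays passed as `index` becomes a `MultiIndex`.

```python
import pandas as pd

# Hypothetical labels: two parallel lists become the two index levels.
index = [["a", "a", "b", "b"], ["mid", "final", "mid", "final"]]

# Both Series and DataFrame accept the list of arrays directly.
s = pd.Series([90, 85, 70, 75], index=index)
df = pd.DataFrame({"score": [90, 85, 70, 75]}, index=index)

# Select one element with a tuple holding one label per level.
b_mid = s.loc[("b", "mid")]

print(df)
print(b_mid)
```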

posted @ 2020-05-28 17:32 peng_li Views(882) Comments(0) Recommended(0)

(2) Handling Missing Data in pandas

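Summary: Handling missing data. There are two kinds of missing data: None and np.nan (NaN). import numpy as np; type(None) returns NoneType; type(np.nan) returns float. 1. None: None is built into Python and its type is a Python object, so None cannot take part in any calculation.…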
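
A short sketch of the difference between the two markers in practice. The arrays are illustrative only: `np.nan` is a float and propagates through arithmetic, while `None` forces an object array whose sum fails.

```python
import numpy as np

print(type(None))
print(type(np.nan))

# np.nan is a float, so arithmetic works and the result is NaN.
nan_sum = np.array([1.0, np.nan, 3.0]).sum()

# None turns the array into dtype=object; adding None to a number
# raises TypeError.
none_array = np.array([1, None, 3])
try:
    none_array.sum()
    none_failed = False
except TypeError:
    none_failed = True

print(nan_sum, none_array.dtype, none_failed)
```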

posted @ 2020-05-28 17:30 peng_li Views(395) Comments(0) Recommended(0)

(1) The Two pandas Objects

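Summary: pandas data structures. import pandas as pd; from pandas import Series, DataFrame; import numpy as np. 1. Series: a Series is an object similar to a one-dimensional array, made up of the following two parts: values: a set of data (of type ndarray)…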
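
As a quick illustration of the `values` part described above, with made-up data and labels:

```python
import numpy as np
import pandas as pd

# Hypothetical data with explicit labels.
s = pd.Series([1, 2, 3], index=["a", "b", "c"])

# .values exposes the underlying data as a NumPy ndarray.
values = s.values

print(type(values))
print(s["b"])
```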

posted @ 2020-05-28 17:28 peng_li Views(319) Comments(0) Recommended(0)

Learning Basic NumPy Usage

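Summary: NumPy: getting started. Import the numpy library and check its version: import numpy as np; np.__version__ returns '1.14.0'. I. Creating an ndarray. 1. Use np.array() to create one from a Python list; the argument is a list: [1, 4, 2, 5, 3]. Note:…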
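
The first step above can be run as follows. The printed version depends on the installed NumPy and will most likely differ from the '1.14.0' shown in the summary.

```python
import numpy as np

# The version string depends on your installation.
print(np.__version__)

# Create a one-dimensional ndarray from a Python list.
arr = np.array([1, 4, 2, 5, 3])

print(arr.ndim, arr.shape, arr.dtype)
```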

posted @ 2020-05-28 17:03 peng_li Views(183) Comments(0) Recommended(0)

scrapy (6) - Using CrawlSpider

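Summary: "Python Crawler Series" contents: Python Crawler (1) - Essential Basics; Python Crawler (2) - The Requests Library and the XPath Parsing Tool; Python Crawler (3) - The Scrapy Framework Series: scrapy (1) - Basic Usage; scrapy (2) - GET Requests; scrapy (3) - POST Requests; s…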

posted @ 2020-05-26 14:51 peng_li Views(536) Comments(0) Recommended(0)

scrapy (5) - Crawling Second-Level Pages

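Summary: "Python Crawler Series" contents: Python Crawler (1) - Essential Basics; Python Crawler (2) - The Requests Library and the XPath Parsing Tool; Python Crawler (3) - The Scrapy Framework Series: scrapy (1) - Basic Usage; scrapy (2) - GET Requests; scrapy (3) - POST Requests; s…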

posted @ 2020-05-26 13:18 peng_li Views(3749) Comments(0) Recommended(0)

scrapy (4) - Passing Arguments Between Requests

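Summary: "Python Crawler Series" contents: Python Crawler (1) - Essential Basics; Python Crawler (2) - The Requests Library and the XPath Parsing Tool; Python Crawler (3) - The Scrapy Framework Series: scrapy (1) - Basic Usage; scrapy (2) - GET Requests; scrapy (3) - POST Requests; s…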

posted @ 2020-05-26 13:17 peng_li Views(379) Comments(0) Recommended(0)

scrapy (3) - POST Requests

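Summary: "Python Crawler Series" contents: Python Crawler (1) - Essential Basics; Python Crawler (2) - The Requests Library and the XPath Parsing Tool; Python Crawler (3) - The Scrapy Framework Series: scrapy (1) - Basic Usage; scrapy (2) - GET Requests; scrapy (3) - POST Requests; s…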

posted @ 2020-05-26 13:15 peng_li Views(317) Comments(2) Recommended(0)

scrapy (2) - GET Requests

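Summary: "Python Crawler Series" contents: Python Crawler (1) - Essential Basics; Python Crawler (2) - The Requests Library and the XPath Parsing Tool; Python Crawler (3) - The Scrapy Framework Series: scrapy (1) - Basic Usage; scrapy (2) - GET Requests; scrapy (3) - POST Requests; s…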

posted @ 2020-05-26 13:14 peng_li Views(1517) Comments(0) Recommended(0)

Using the join() Function in Python

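Summary: Python has two functions, join() and os.path.join(), which work as follows. join(): joins a sequence of strings; it concatenates the elements of a string, tuple, or list with the specified character (the separator) to produce a new string. os.path.join(): combines multiple path components and returns the result. I. Function description. 1. join() syntax: 'sep'.…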
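
Both functions described above can be tried in a few lines. The strings and path components here are arbitrary examples. Note that `join()` is a method called on the separator string.

```python
import os.path

# str.join: the separator string joins the elements of an iterable.
from_list = "-".join(["a", "b", "c"])
from_tuple = ", ".join(("x", "y"))
from_string = ".".join("abc")  # a string is iterated character by character

# os.path.join: combines path components with the platform's separator.
path = os.path.join("data", "raw", "file.csv")

print(from_list, from_tuple, from_string, path)
```

Every element passed to `str.join` must already be a string; numbers need converting first, for example `",".join(str(n) for n in [1, 2, 3])`.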

posted @ 2020-05-19 23:01 peng_li Views(787) Comments(0) Recommended(0)

Session Persistence with requests.session()

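Summary: First, why keep a session alive at all? The requests library's Session object persists certain parameters across requests. Put simply, once you have logged in to a site through a session, later requests to other pages of that site made with the same Session object reuse, by default, the cookies and other parameters the session used before. This is especially so when maintaining a logged-in state…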
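
The persistence described above can be observed without touching the network. In this sketch the cookie name and value and the example.com URL are placeholders, and the cookie is set by hand to stand in for what a real login response would set. The second request is only prepared, never sent.

```python
import requests

session = requests.Session()

# Stand-in for a cookie that a successful login would normally set
# (hypothetical name and value).
session.cookies.set("sessionid", "abc123")

# Build, but do not send, a later request to another page of the site.
# example.com is a placeholder URL.
request = requests.Request("GET", "https://example.com/profile")
prepared = session.prepare_request(request)

# The session attached its stored cookie to the new request.
cookie_header = prepared.headers.get("Cookie")
print(cookie_header)
```

In real use you would call `session.post(...)` to log in and then `session.get(...)` for later pages; the session carries the cookies over in the same way.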

posted @ 2020-05-19 22:43 peng_li Views(2608) Comments(0) Recommended(0)

scrapy (1) - Basic Usage

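Summary: "Python Crawler Series" contents: Python Crawler (1) - Essential Basics; Python Crawler (2) - The Requests Library and the XPath Parsing Tool; Python Crawler (3) - The Scrapy Framework Series: scrapy (1) - Basic Usage; scrapy (2) - GET Requests; scrapy (3) - POST Requests; s…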

posted @ 2020-05-19 18:29 peng_li Views(434) Comments(0) Recommended(0)

Python Crawlers: Handling Failures Caused by Network or Proxy Problems

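Summary: When crawling some pages, requests can fail because the network is poor or because a proxy in the proxy pool is unusable. In that case we need to repeat the request several times, and Python has ready-made packages for this: 1. We can use the retry module to make multiple attempts, raising an error only if all of them fail. The retry library has to be installed before use, of course, e.g.:…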
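
The summary is cut off before its own example, so the following is not the `retry` package but a hand-rolled, standard-library sketch of the same idea: try several times and raise only if every attempt fails. The decorator's parameter names and the simulated `flaky_fetch` function are assumptions made for illustration.

```python
import functools
import time


def retry(tries=3, delay=0.0, exceptions=(Exception,)):
    """Hand-rolled retry decorator (not the third-party retry package)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == tries:
                        raise  # every attempt failed: surface the last error
                    time.sleep(delay)  # wait before the next attempt
        return wrapper
    return decorator


calls = []


@retry(tries=3)
def flaky_fetch():
    """Simulated request that fails twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


result = flaky_fetch()
print(result, len(calls))
```

A real crawler would usually pass a non-zero `delay` and restrict `exceptions` to network errors, so that programming mistakes are not retried.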

posted @ 2020-05-18 18:25 peng_li Views(1726) Comments(0) Recommended(0)

The Differences Between Iterables, Iterators, and Generators

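Summary: Iteration: a way of accessing the elements of a collection. Iterable: an object that can be iterated over is called an iterable. Iterator: an object that remembers its position in a traversal; it starts at the first element of the collection and continues until every element has been visited, and it can only move forward, never backward. How to tell whether an object is iterable: Iterables in Python include: list…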
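
The three terms can be told apart with a short experiment. One common check, assumed here because the summary is cut off before it names its own method, is `isinstance` against the abstract classes in `collections.abc`. The `countdown` generator is a made-up example.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]

# A list is iterable, but it is not itself an iterator.
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)

# iter() returns an iterator that remembers its position
# and only moves forward.
it = iter(numbers)
first = next(it)
rest = list(it)        # consumes the remaining elements
exhausted = list(it)   # nothing is left; it cannot rewind


def countdown(n):
    """Generator function: each yield hands back one value lazily."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
gen_is_iterator = isinstance(gen, Iterator)
gen_values = list(gen)

print(list_is_iterable, list_is_iterator, first, rest, exhausted)
print(gen_is_iterator, gen_values)
```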

posted @ 2020-05-07 12:47 peng_li Views(376) Comments(0) Recommended(0)

Using scrapy shell

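Summary: What is it? A terminal-based debugging tool for debugging scrapy. Install ipython: pip install ipython. Launch: scrapy shell followed by the URL to request. Once inside, response is the response object and can be used directly: response.text, response.body, response…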

posted @ 2020-05-06 16:58 peng_li Views(235) Comments(0) Recommended(0)

Mutable and Immutable Types in Python

Summary: 1. Mutable and immutable types in Python. Python data types fall roughly into six categories: 1. Number, 2. String, 3. Tuple, 4. List, 5. Dictionary, 6. Set (bool, int, float…
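
A small sketch of what the distinction means in practice, using a list as the mutable example and a string and a tuple as the immutable ones. All values are arbitrary.

```python
# Mutable: a list is changed in place, so its identity stays the same.
items = [1, 2]
id_before = id(items)
items.append(3)
list_same_object = id(items) == id_before

# Immutable: "changing" a string builds a new object; other references
# to the old string are unaffected.
text = "ab"
alias = text
text += "c"

# Immutable: item assignment on a tuple raises TypeError.
point = (1, 2)
try:
    point[0] = 9
    tuple_assignment_failed = False
except TypeError:
    tuple_assignment_failed = True

print(list_same_object, text, alias, tuple_assignment_failed)
```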

posted @ 2020-05-06 15:49 peng_li Views(6582) Comments(0) Recommended(0)

Python PEP 8 Coding Conventions

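Summary: To make code cleaner and easier to read, it is recommended to follow the PEP 8 conventions. 1. Keep each line to at most 79 characters. 2. Lines can be continued with a backslash; break after the operator. 3. Leave two blank lines between class and top-level function definitions, one blank line between method definitions inside a class, and one blank line between logically unrelated blocks of code inside a function; avoid blank lines elsewhere as far as possible. 4. Module import order:…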

posted @ 2020-05-06 14:48 peng_li Views(241) Comments(0) Recommended(0)

PengLi

A programmer who studies biology
