October 2019 Post Archive - 市丸银

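A First Look at NumPy (old)

Summary: I. Creating an ndarray 1. Creating with np.array() 1) One-dimensional array: import numpy as np np.array([1, 2, 3, 4]) 2) Two-dimensional array: np.array([[1, 2, 3], [3, 8, 0], [3, 2, 5]]) Note: a. When creating an array, the data type ...

posted @ 2019-10-31 23:47 市丸银 Views (134) Comments (0) Recommendations (0)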
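
As a minimal sketch of the array creation described in the NumPy entry above, the following builds the same one- and two-dimensional arrays and inspects their shape. The mixed int/float array at the end is an added illustration and only a guess at what the truncated note about data types goes on to say.

```python
import numpy as np

# One-dimensional array, as in the entry above.
a1 = np.array([1, 2, 3, 4])

# Two-dimensional array: 3 rows by 3 columns.
a2 = np.array([[1, 2, 3], [3, 8, 0], [3, 2, 5]])

print(a1.shape)  # (4,)
print(a2.shape)  # (3, 3)

# Added illustration (assumption, not from the original post):
# an ndarray holds one dtype, so mixing ints and floats upcasts to float.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64
```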

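Jupyter Notebook Keyboard Shortcuts

Summary: Add a cell below: b; enter edit mode: Enter; exit edit mode: Esc; run: Ctrl + Enter; view usage: Shift + Tab (press Tab twice for detailed usage); completion hints: Tab. To be continued.

posted @ 2019-10-31 22:46 市丸银 Views (138) Comments (0) Recommendations (0)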

Installing numpy and matplotlib

Summary: I. Installing numpy 1. Download: https://pypi.org/project/numpy/#files 2. Install 3. Verify II. Installing matplotlib

posted @ 2019-10-31 21:21 市丸银 Views (144) Comments (0) Recommendations (0)

Installing Jupyter Notebook

Summary: 1. Install IPython 2. Install jupyter 3. Run jupyter

posted @ 2019-10-31 20:41 市丸银 Views (127) Comments (0) Recommendations (0)

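Installing lxml

Summary: 1. Install wheel 2. Download the whl file for the lxml library. Download URL: https://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml Version: Python 3.7, 64-bit 3. Install lxml 4. Verify

posted @ 2019-10-31 20:37 市丸银 Views (573) Comments (0) Recommendations (0)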

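Installing Anaconda

Summary: Tsinghua mirror: https://blog.csdn.net/u014061630/article/details/92744781#21_anaconda_5 I. Installation 1. Install (see the official site). Official: https://docs.anaconda.com/anaconda/ 2. Verify: search the Start menu for Anacon ...

posted @ 2019-10-31 20:35 市丸银 Views (126) Comments (0) Recommendations (0)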

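The scrapy-redis Component

Summary: Core idea: a shared crawl queue. Goal: distributed crawling. I. Installation: pip3 install -i https://pypi.douban.com/simple scrapy-redis II. Deduplication 1. Settings file: scrapy deduplication DUPEFILTER_KEY = 'dupefilter:%(timestamp)s' ...

posted @ 2019-10-28 23:47 市丸银 Views (211) Comments (0) Recommendations (0)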

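Redis Sets

Summary: Storing values: if the value being added to the set already exists (in Redis), the return value r1 or r2 is 0.

posted @ 2019-10-28 23:32 市丸银 Views (109) Comments (0) Recommendations (0)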

Scrapy Signals

Summary: 1. Class 2. Settings file

posted @ 2019-10-28 23:24 市丸银 Views (239) Comments (0) Recommendations (0)

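Starting Scrapy Spiders with Custom Commands

Summary: I. Running a single spider: typing the command in a terminal every time you run scrapy is tedious, so create manager.py (any name will do) in the project directory. II. Running all spiders 1. Create a commands directory (any name) at the same level as spiders 2. Create crawlall.py inside it; this file defines what the command does 3. Settings file 4. manager.py

posted @ 2019-10-28 23:11 市丸银 Views (254) Comments (0) Recommendations (0)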

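Scrapy Middleware

Summary: I. Downloader middleware 1. Use cases: proxies; USER_AGENT (this can simply be configured in the settings file) 2. Defining the class a. process_request returns None; execution order: md1 request -> md2 request -> md2 response -> md1 response b. process ...

posted @ 2019-10-28 22:56 市丸银 Views (238) Comments (0) Recommendations (0)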
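
The execution order quoted in the middleware entry above can be reproduced with a toy chain. This is a plain-Python sketch, not Scrapy code: LoggingMiddleware and run_chain are hypothetical names, the download is faked, and only the case where process_request returns None is modelled.

```python
class LoggingMiddleware:
    """Hypothetical stand-in for a downloader middleware."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def process_request(self, request):
        self.log.append(f"{self.name} request")
        return None  # None means: pass the request on to the next middleware

    def process_response(self, request, response):
        self.log.append(f"{self.name} response")
        return response


def run_chain(request, middlewares):
    # Requests travel through the middlewares in order ...
    for mw in middlewares:
        mw.process_request(request)
    # ... the "download" happens (faked here) ...
    response = {"url": request, "status": 200}
    # ... and responses travel back in reverse order.
    for mw in reversed(middlewares):
        response = mw.process_response(request, response)
    return response


log = []
chain = [LoggingMiddleware("md1", log), LoggingMiddleware("md2", log)]
run_chain("http://example.com", chain)
print(" -> ".join(log))
# md1 request -> md2 request -> md2 response -> md1 response
```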

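Introduction to Scrapy

Summary: I. Architecture diagram II. Workflow 1. The engine takes a URL from the scheduler to crawl 2. The engine wraps the URL in a request (start_requests) and passes it to the downloader 3. The downloader fetches the resource and wraps it in a Response 4. The spider parses (parse) the Response 5. Parsed entities (yield Item) are handed to the pipelin ...

posted @ 2019-10-27 23:25 市丸银 Views (139) Comments (0) Recommendations (0)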

The Scrapy xpath Selector

Summary: I. Using xpath outside the scrapy framework, via a response: HtmlResponse -> TextResponse -> self.selector.xpath(query, **kwargs) -> selector(self) -> from scrapy.selector import Selector ...

posted @ 2019-10-27 23:04 市丸银 Views (2932) Comments (0) Recommendations (0)

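Setting a Proxy in Scrapy

Summary: Where to set the proxy: in a downloader middleware. I. Built-in proxy (pro: simple; con: can only proxy through one IP) 1. Source analysis: process_request(self, request, spider) runs before the downloader executes; the _set_proxy method (sets the proxy) -> self.proxies[scheme] -> self.proxies ...

posted @ 2019-10-27 22:15 市丸银 Views (2593) Comments (0) Recommendations (0)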

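Customizing Start Requests in Scrapy

Summary: The Scrapy engine fetches the start URLs from the spider: 1. It calls the start_requests method (of the parent class) and gets the return value 2. It turns the return value into an iterator via iter() 3. It calls __next__() to retrieve the values 4. It puts all returned values into the scheduler. Override start_requests in the spider class: from scrapy import ...

posted @ 2019-10-26 20:00 市丸银 Views (208) Comments (0) Recommendations (0)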
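
The four steps in the start-requests entry above can be sketched in plain Python. ToySpider and load_start_requests are hypothetical names, the page URLs are illustrative, a dict stands in for a Request object, and a deque stands in for the scheduler; none of this is Scrapy's own code.

```python
from collections import deque


class ToySpider:
    """Hypothetical spider; dicts stand in for Request objects."""

    start_urls = [
        "http://quotes.toscrape.com/page/1/",
        "http://quotes.toscrape.com/page/2/",
    ]

    def start_requests(self):
        for url in self.start_urls:
            yield {"url": url}


def load_start_requests(spider):
    scheduler = deque()                 # stand-in for the scheduler queue
    result = spider.start_requests()    # step 1: call start_requests
    iterator = iter(result)             # step 2: turn the result into an iterator
    while True:
        try:
            request = iterator.__next__()   # step 3: pull the next value
        except StopIteration:
            break
        scheduler.append(request)       # step 4: hand it to the scheduler
    return scheduler


queue = load_start_requests(ToySpider())
print(len(queue))  # 2
```

Because the result goes through iter(), a start_requests that returns a list works just as well as one written as a generator.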

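Depth and Priority in Scrapy

Summary: I. Depth: settings file settings.py II. Priority: settings file. When the priority setting is positive, the priority drops as the depth grows. In the source code: priority. III. Source analysis 1. Depth. Premise: scrapy yield Request object -> middleware -> scheduler... When the yielded Request object has no meta value set, meta defaults to N ...

posted @ 2019-10-26 16:29 市丸银 Views (1409) Comments (0) Recommendations (0)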
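
To make the depth and priority relationship above concrete, here is a small sketch. It assumes the rule used by Scrapy's DepthMiddleware as I understand it, request.priority -= depth * DEPTH_PRIORITY; the entry's summary is cut off before the formula, so treat this as an assumption. adjusted_priority is a hypothetical helper, not a Scrapy function.

```python
def adjusted_priority(priority, depth, depth_priority):
    """Assumed rule: lower the priority by depth * depth_priority."""
    return priority - depth * depth_priority


# With a positive setting, deeper requests get a lower priority.
for depth in range(1, 4):
    print(depth, adjusted_priority(0, depth, depth_priority=1))
# 1 -1
# 2 -2
# 3 -3

# With a negative setting, deeper requests are favoured instead.
print(adjusted_priority(0, 3, depth_priority=-1))  # 3
```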

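Deduplication in Scrapy

Summary: I. Native 1. Module 2. RFPDupeFilter methods a. request_seen. Core: each time the spider yields a Request object, request_seen runs once. Purpose: deduplication, so the same url can be visited only once. Implementation: convert the url into a fixed-length, unique value; if that value already exists, return True to indicate it has already ...

posted @ 2019-10-25 23:45 市丸银 Views (686) Comments (0) Recommendations (0)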
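
The request_seen idea described above (hash the url to a fixed-length value, return True if it was already recorded) can be sketched as follows. ToyDupeFilter is a hypothetical class and hashes only the URL string with SHA-1, which is a simplification rather than Scrapy's actual fingerprinting.

```python
import hashlib


class ToyDupeFilter:
    """Hypothetical, simplified duplicate filter keyed on the URL only."""

    def __init__(self):
        self.fingerprints = set()

    def request_seen(self, url):
        # Fixed-length value derived from the URL.
        fp = hashlib.sha1(url.encode("utf-8")).hexdigest()
        if fp in self.fingerprints:
            return True   # already visited: the caller should drop the request
        self.fingerprints.add(fp)
        return False      # first time: let the request through


dupefilter = ToyDupeFilter()
print(dupefilter.request_seen("http://quotes.toscrape.com/"))  # False
print(dupefilter.request_seen("http://quotes.toscrape.com/"))  # True
```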

Persistence in Scrapy (items + pipelines)

Summary: I. items: holds the scraped content. items.py import scrapy class QuoteItem(scrapy.Item): # define the fields for your item here like: # name = scrapy.Field() text = scrapy ...

posted @ 2019-10-23 23:13 市丸银 Views (313) Comments (0) Recommendations (0)

Basic Usage of Scrapy

Summary: Crawl target: http://quotes.toscrape.com Single page: # -*- coding: utf-8 -*- import scrapy class QuoteSpider(scrapy.Spider): name = 'quote' allowed_domains = ['quotes.to ...

posted @ 2019-10-23 22:41 市丸银 Views (159) Comments (0) Recommendations (0)

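Installing the scrapy Framework and Creating a Project

Summary: Introduction: a large, full-featured crawling framework. With Anaconda: conda install -c conda-forge scrapy I. Installation: Windows 1. Download https://www.lfd.uci.edu/~gohlke/pythonlibs/#twisted (wait patiently for the page to load) pip3 instal ...

posted @ 2019-10-22 22:47 市丸银 Views (195) Comments (0) Recommendations (0)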

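Making Requests with requests

Summary: requests: impersonates browser requests. Requests: 1. get: requests.get( url='', params={ 'k1': 'v1', 'k2': 'v2' } ), i.e. url?k1=v1&k2=v2 2. post: requests.post( url='', # data: the data to submit data={key: value}, # request headers headers={}, # the cookies value must be obtained from the get req ...

posted @ 2019-10-22 15:28 市丸银 Views (159) Comments (0) Recommendations (0)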
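
The mapping from a params dict to a query string mentioned above can be checked without a network call. This sketch uses the standard library's urllib.parse.urlencode rather than requests itself, and base_url is a hypothetical address.

```python
from urllib.parse import urlencode

base_url = "http://example.com/search"  # hypothetical URL for illustration
params = {"k1": "v1", "k2": "v2"}

# Each key/value pair becomes key=value, joined with "&".
full_url = f"{base_url}?{urlencode(params)}"
print(full_url)  # http://example.com/search?k1=v1&k2=v2
```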

Simple Crawler Usage

Summary: I. Background knowledge II. Examples

posted @ 2019-10-19 22:37 市丸银 Views (191) Comments (0) Recommendations (0)

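Solving Cross-Origin Issues with django-cors-headers

Summary: Install. Register the app. Add the middleware: it must be placed first, because the cross-origin issue has to be handled before anything else. Only when cross-origin requests are allowed will the subsequent middleware run properly. Configuration: you can either leave cross-origin access unrestricted or set a whitelist of allowed origins.

posted @ 2019-10-10 20:29 市丸银 Views (288) Comments (0) Recommendations (0)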
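
The steps in the django-cors-headers entry above might look like this in settings.py. It is a fragment, not a full settings file: the other app and middleware entries and the whitelisted origin are placeholders. The setting names are the ones current in 2019; newer releases of the package also accept CORS_ALLOW_ALL_ORIGINS and CORS_ALLOWED_ORIGINS.

```python
# settings.py (fragment; other entries are placeholders)

INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "corsheaders",  # register the app
]

MIDDLEWARE = [
    # Placed first so CORS handling runs before every other middleware.
    "corsheaders.middleware.CorsMiddleware",
    "django.middleware.common.CommonMiddleware",
]

# Option 1: do not restrict cross-origin access.
CORS_ORIGIN_ALLOW_ALL = True

# Option 2: allow only whitelisted origins (use instead of option 1).
# CORS_ORIGIN_WHITELIST = ["http://localhost:8080"]  # hypothetical origin
```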

Custom Status Codes in Django

Summary: class BaseResponse: def __init__(self): self.code = 1000 self.data = None self.error = None @property def dict(self): return self.__dict__

posted @ 2019-10-10 20:25 市丸银 Views (1299) Comments (0) Recommendations (0)
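
Laid out as a runnable block, the BaseResponse class from the entry above could be used like this. load_course, the error code 1001, and the error message are hypothetical, added only to show the class in use.

```python
import json


class BaseResponse:
    def __init__(self):
        self.code = 1000   # default code taken from the entry above
        self.data = None
        self.error = None

    @property
    def dict(self):
        # Expose the instance attributes as a plain dict for serialization.
        return self.__dict__


def load_course(course_id):
    """Hypothetical helper showing success and failure paths."""
    res = BaseResponse()
    if course_id <= 0:
        res.code = 1001                  # hypothetical error code
        res.error = "invalid course id"
    else:
        res.data = {"id": course_id}
    return res.dict


print(json.dumps(load_course(3)))
# {"code": 1000, "data": {"id": 3}, "error": null}
```

In a Django view, the same dict could presumably be handed to JsonResponse instead of json.dumps.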

Removing ^M Characters on Linux

Summary: (1) Install tofrodos: sudo apt-get install tofrodos (2) Some tweaks: ln -s /usr/bin/todos /usr/bin/unix2dos ln -s /usr/bin/fromdos /usr/bin/dos2unix Method 1: cat -A filename shows the Windows line-ending character ^M. The simplest way to remove it is the following command: dos2un ...

posted @ 2019-10-10 11:50 市丸银 Views (329) Comments (0) Recommendations (0)
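
The ^M shown by cat -A is the carriage return (\r) of a Windows \r\n line ending. As an alternative sketch to the command-line tools in the entry above, the same conversion can be done in Python; strip_carriage_returns is a hypothetical helper and works on bytes already read from a file.

```python
def strip_carriage_returns(data: bytes) -> bytes:
    """Convert Windows line endings (\\r\\n) to Unix line endings (\\n)."""
    return data.replace(b"\r\n", b"\n")


sample = b"line one\r\nline two\r\n"
print(strip_carriage_returns(sample))  # b'line one\nline two\n'
```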

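Django Signals

Summary: Reference: https://www.cnblogs.com/wupeiqi/articles/5246483.html I. Signals: when certain actions occur, signals let specific senders notify a set of receivers. For example: writing a log entry before or after a SQL statement executes. II. Usage 1. Location 2. Import the module 3. Define a custom function 4. Register III. Djan ...

posted @ 2019-10-09 22:27 市丸银 Views (616) Comments (0) Recommendations (0)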

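I/O Multiplexing in Python

Summary: Based on select. Purpose: I/O multiplexing improves efficiency by letting a single process monitor multiple network connections at once. Server side. Client side. Main use: building custom asynchronous frameworks.

posted @ 2019-10-05 13:01 市丸银 Views (164) Comments (0) Recommendations (0)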
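
A minimal sketch of the select-based approach named above: one call watches several sockets and reports only those with data waiting. To stay self-contained it uses socket.socketpair() instead of a real server and client, so the variable names are illustrative.

```python
import select
import socket

# Two connected pairs stand in for two client connections.
server_side, client_side = socket.socketpair()
idle_a, idle_b = socket.socketpair()

# Only the first "client" sends anything.
client_side.sendall(b"ping")

# One select() call monitors both sockets; timeout of 1 second.
readable, _, _ = select.select([server_side, idle_a], [], [], 1.0)

# Only sockets with pending data are returned, so recv() will not block.
received = [sock.recv(1024) for sock in readable]
print(received)  # [b'ping']

for sock in (server_side, client_side, idle_a, idle_b):
    sock.close()
```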

The Non-Blocking I/O Model in Python

Summary: Server side. Client side.

posted @ 2019-10-05 12:56 市丸银 Views (166) Comments (0) Recommendations (0)
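
As a rough illustration of the non-blocking model, the sketch below switches a socket to non-blocking mode, shows that recv() raises BlockingIOError when nothing has arrived, and then polls until data is available. It uses socket.socketpair() in place of the server and client code from the post, which is not shown in the summary.

```python
import socket
import time

receiver, sender = socket.socketpair()
receiver.setblocking(False)  # recv() now returns immediately

# Nothing has been sent yet, so a non-blocking recv() raises instead of waiting.
try:
    receiver.recv(1024)
    first_attempt = "data"
except BlockingIOError:
    first_attempt = "would block"
print(first_attempt)  # would block

sender.sendall(b"hello")

# Poll: between attempts the program is free to do other work.
data = None
while data is None:
    try:
        data = receiver.recv(1024)
    except BlockingIOError:
        time.sleep(0.01)
print(data)  # b'hello'

receiver.close()
sender.close()
```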

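Quickly Creating a Dict in Python with fromkeys()

Summary: Purpose: quickly create a dict. Characteristic: all keys share the same value.

posted @ 2019-10-04 10:08 市丸银 Views (602) Comments (0) Recommendations (0)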
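
The shared-value behaviour noted above matters most when the value is mutable. A short sketch, with a dict comprehension shown as one way to get independent values:

```python
# Every key points at the same list object.
shared = dict.fromkeys(["a", "b"], [])
shared["a"].append(1)
print(shared)  # {'a': [1], 'b': [1]}

# A dict comprehension creates a fresh list per key.
separate = {key: [] for key in ["a", "b"]}
separate["a"].append(1)
print(separate)  # {'a': [1], 'b': []}
```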

Custom Exceptions in Python

Summary: class PricePolicyInvalid(Exception): def __init__(self, msg): self.msg = msg

posted @ 2019-10-03 20:16 市丸银 Views (333) Comments (0) Recommendations (0)
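
A brief sketch of how the PricePolicyInvalid class above might be raised and caught. check_policy, the policy ids, and the message text are hypothetical.

```python
class PricePolicyInvalid(Exception):
    def __init__(self, msg):
        self.msg = msg


def check_policy(policy_id, valid_ids):
    """Hypothetical validator that raises the custom exception."""
    if policy_id not in valid_ids:
        raise PricePolicyInvalid("price policy does not exist")
    return policy_id


try:
    check_policy(9, valid_ids={1, 2, 3})
except PricePolicyInvalid as exc:
    # Catching the specific class keeps unrelated errors from being swallowed.
    message = exc.msg

print(message)  # price policy does not exist
```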

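Python Virtual Environments

Summary: Purpose: run different versions of modules on one server. 1. Install 2. Procedure a. Create a folder to hold the virtual environments b. Change into that folder c. Create the environment with no-site-packages e. Activate f. Install modules g. Deactivate

posted @ 2019-10-02 23:34 市丸银 Views (152) Comments (0) Recommendations (0)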

Finding All Packages and Versions Used by a Python Project

Summary: 1. Install the module 2. Generate the file

posted @ 2019-10-02 23:27 市丸银 Views (465) Comments (0) Recommendations (0)

flask-migrate

Summary: I. Installation: pip3 install -i https://pypi.douban.com/simple flask-migrate Note: depends on flask-script II. Usage: manage.py from flask_script import Manager from flask_migrat ...

posted @ 2019-10-02 23:17 市丸银 Views (143) Comments (0) Recommendations (0)

flask-script

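Summary: I. Install the module II. Features: 1. Adds runserver (important): the manage.py command-line command 2. Run a function with positional arguments 3. Run a function with keyword arguments

posted @ 2019-10-02 23:09 市丸银 Views (163) Comments (0) Recommendations (0)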

flask-sqlalchemy

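Summary: I. Installation II. Usage (file structure: blueprint) 1. __init__.py Note: a. SQLAlchemy must be instantiated before the blueprints are imported b. models.py must be imported, since it holds the ORM classes 2. models.py a. Import db b. Classes must inherit from db.Model 3. create_a in __init__.py ...

posted @ 2019-10-02 22:55 市丸银 Views (195) Comments (0) Recommendations (0)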

Executing Raw SQL with sqlalchemy

Summary: 1. Method one 2. Method two

posted @ 2019-10-02 17:43 市丸银 Views (15328) Comments (0) Recommendations (1)

Creating sqlalchemy Sessions Across Multiple Threads

Summary: 1. Based on threading.local (recommended) 2. Based on multiple threads

posted @ 2019-10-02 17:36 市丸银 Views (2182) Comments (0) Recommendations (0)
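
The threading.local mechanism behind the recommended approach can be shown with the standard library alone. In this sketch a string stands in for a real sqlalchemy session, and worker and seen are hypothetical names. As far as I know, sqlalchemy's scoped_session builds on the same idea, but that code is not reproduced here.

```python
import threading

local = threading.local()  # attributes set on it are private to each thread
seen = {}


def worker(name):
    # Each thread stores its own "session" under the same attribute name.
    local.session = f"session-for-{name}"  # stand-in for a real Session
    seen[name] = local.session


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(seen.values()))
# ['session-for-t0', 'session-for-t1', 'session-for-t2']

# The main thread never set local.session, so it does not see one.
print(hasattr(local, "session"))  # False
```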

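Many-to-Many Relationships in sqlalchemy

Summary: I. Table relationships. Note: you must create the third (association) table yourself II. Working with the data

posted @ 2019-10-02 17:31 市丸银 Views (248) Comments (0) Recommendations (0)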

Foreign Keys in sqlalchemy

Summary: I. Tables II. Data operations

posted @ 2019-10-02 17:22 市丸银 Views (257) Comments (0) Recommendations (0)

wtforms Hook Functions

Summary: Reference: https://www.cnblogs.com/wupeiqi/articles/8202357.html

posted @ 2019-10-02 16:18 市丸银 Views (190) Comments (0) Recommendations (0)

Single-Table CRUD with sqlalchemy

Summary: 1. Connect to the database and create a session: from sqlalchemy.orm import sessionmaker from sqlalchemy import create_engine engine = create_engine( "mysql+pymysql://root:<password>@127.0 ...

posted @ 2019-10-02 00:08 市丸银 Views (238) Comments (0) Recommendations (0)

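Database Operations with sqlalchemy

Summary: 1. Overview: an ORM 2. Installation 3. Connecting to the database 4. Creating/dropping tables (including connecting to the database) a. Table classes b. Creating/dropping tables. Note: unlike Django's ORM, sqlalchemy table classes cannot be updated in place; tables can only be dropped and recreated.

posted @ 2019-10-01 23:44 市丸银 Views (304) Comments (0) Recommendations (0)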
