Archive: April 2018

Summary: Create a project: scrapy startproject myproject, then cd myproject. Create a spider: scrapy genspider spidername spiderurl.com. List all commands: scrapy -h. Global commands: startproject settings ru...
posted @ 2018-04-26 14:39 qukaige Views (139) Comments (0) Recommendations (0)
Summary: import aiohttp import asyncio async def aaa(): async with aiohttp.ClientSession() as session: async with session.get('https://github.com') as response: if response.status ==...
posted @ 2018-04-14 16:51 qukaige Views (156) Comments (0) Recommendations (0)
Summary: import asyncio async def hello(): print('Hello start!') await asyncio.sleep(2) print('Hello end!') async def hello222(): print('Hello222 start!') result = await somework() ...
posted @ 2018-04-14 13:19 qukaige Views (141) Comments (0) Recommendations (0)
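The summary above is cut off before it shows how the two coroutines are run. As a minimal sketch, assuming Python 3.7+ for `asyncio.run`, they could be driven concurrently with `asyncio.gather`. The body of `somework()`, the return values, the closing print in `hello222()`, and the shortened sleep times are assumptions added for illustration and are not from the original post.

```python
import asyncio


async def somework():
    # Hypothetical stand-in: the original post elides what somework() does.
    await asyncio.sleep(0.1)
    return "done"


async def hello():
    print("Hello start!")
    await asyncio.sleep(0.2)  # shortened from the 2 seconds in the summary
    print("Hello end!")
    return "hello"  # return value added for illustration


async def hello222():
    print("Hello222 start!")
    result = await somework()
    print("Hello222 end!")  # assumed closing message
    return result


async def main():
    # gather() runs both coroutines concurrently and returns
    # their results in the order the coroutines were passed in.
    return await asyncio.gather(hello(), hello222())


results = asyncio.run(main())
print(results)
```

Because `hello222()` waits for less time than `hello()`, it finishes first even though it was started second, which is the point of awaiting instead of blocking.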
Summary: ret_top = {'111':20,'222':1,'233':5} ret = sorted(ret_top.items(), key=lambda e: e[1], reverse=True) print(ret) # [('111', 20), ('233', 5), ('222', 1)]
posted @ 2018-04-13 17:41 qukaige Views (129) Comments (0) Recommendations (0)
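The one-liner in the summary sorts a dictionary by value in descending order. The sketch below repeats it with `operator.itemgetter` in place of the lambda, which behaves the same here, and shows two common follow-ups. The names `by_value_desc`, `top_two`, and `as_dict` are illustrative and do not appear in the original post.

```python
from operator import itemgetter

ret_top = {'111': 20, '222': 1, '233': 5}

# items() yields (key, value) pairs; itemgetter(1) selects the value as the sort key.
by_value_desc = sorted(ret_top.items(), key=itemgetter(1), reverse=True)

# Slicing the sorted list gives the top-N entries.
top_two = by_value_desc[:2]

# On Python 3.7+, dict() keeps insertion order, so this dict iterates in sorted order.
as_dict = dict(by_value_desc)

print(by_value_desc)
print(top_two)
```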
Summary: ''' Driver objects http://selenium-python.readthedocs.io/ browser = webdriver.Chrome() browser = webdriver.Firefox() browser = webdriver.Edge() browser = webdriver.PhantomJS() browser = ...
posted @ 2018-04-05 18:42 qukaige Views (165) Comments (0) Recommendations (0)
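Summary: The built-in time module. The built-in datetime module, which supports date addition and subtraction. The Arrow library, installed with pip install arrow.
posted @ 2018-04-04 12:10 qukaige Views (1582) Comments (0) Recommendations (0)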
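The summary notes that the built-in datetime module can add and subtract dates. A minimal sketch of what that looks like with `timedelta` follows. The specific dates are chosen only for illustration, and Arrow is left out so the example runs on the standard library alone.

```python
from datetime import datetime, timedelta

start = datetime(2018, 4, 4, 12, 10)

# Adding a timedelta to a datetime produces a new datetime.
one_week_later = start + timedelta(days=7)

# Subtracting two datetimes produces a timedelta.
elapsed = datetime(2018, 4, 26, 14, 39) - start

print(one_week_later.strftime("%Y-%m-%d %H:%M"))
print(elapsed.days, "days,", elapsed.seconds // 3600, "hours")
```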
Summary: from pyquery import PyQuery as pq html = ''' 123 123 123 123 xcxxx ''' # # doc = pq(html) # get a single element # p_text = doc('#cc')[0].text # print(p_text) # initialize from a URL ...
posted @ 2018-04-03 19:14 qukaige Views (666) Comments (0) Recommendations (0)
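Summary: ''' Parsers: Python built-in standard library. Advantages: moderate speed, good tolerance of malformed documents. BeautifulSoup(html,'html.parser') Versions before Python 2.7.3 or 3.2.2 tolerate malformed documents poorly. lxml HTML parser: fast, good tolerance of malformed documents (most commonly used). BeautifulSoup(html,'lxml') ...
posted @ 2018-04-03 16:49 qukaige Views (1323) Comments (0) Recommendations (0)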
