Fun and Useful Python Libraries
Image processing
pip install pillow
from PIL import Image
import numpy as np
a = np.array(Image.open('test.jpg'))
b = [255,255,255] - a
im = Image.fromarray(b.astype('uint8'))
im.save('new.jpg')
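The subtraction above relies on NumPy broadcasting: the three-element list is subtracted from every RGB pixel, so each channel value v becomes 255 - v. Below is a minimal sketch of that same arithmetic in plain Python, without NumPy or Pillow. The 2x2 `pixels` image is made-up sample data and `invert` is a hypothetical helper, not part of either library.

```python
def invert(pixels):
    """Return the colour negative of an image given as rows of (r, g, b) tuples."""
    return [[tuple(255 - channel for channel in pixel) for pixel in row]
            for row in pixels]


# Made-up 2x2 RGB image: black, white, pure red, mid grey.
pixels = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (128, 128, 128)],
]

negative = invert(pixels)
for row in negative:
    print(row)
```

Because the list has exactly three entries, the original snippet assumes a three-channel RGB image. A grayscale or RGBA file would need a different constant, for example the scalar 255, which would also invert any alpha channel.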
Download overseas videos with youtube-dl
pip install youtube-dl  # install youtube-dl
pip install -U youtube-dl  # install youtube-dl, or upgrade it if already installed
youtube-dl "http://www.youtube.com/watch?v=-wNyEUrxzFU"
List all attributes and methods of an object
pip install pdir2
>>> import pdir,requests
>>> pdir(requests)
module attribute:
__cached__, __file__, __loader__, __name__, __package__, __path__, __spec__
other:
__author__, __build__, __builtins__, __copyright__, __license__, __title__,
__version__, _internal_utils, adapters, api, auth, certs, codes, compat, cookies,
exceptions, hooks, logging, models, packages, pyopenssl, sessions, status_codes,
structures, utils, warnings
special attribute:
__doc__
class:
NullHandler: This handler does nothing. It's intended to be used to avoid the
PreparedRequest: The fully mutable :class:`PreparedRequest <PreparedRequest>` object,
Request: A user-created :class:`Request <Request>` object.
Response: The :class:`Response <Response>` object, which contains a
Session: A Requests session.
exception:
ConnectTimeout: The request timed out while trying to connect to the remote server.
ConnectionError: A Connection error occurred.
DependencyWarning: Warned when an attempt is made to import a module with missing optional
FileModeWarning: A file was opened in text mode, but Requests determined its binary length.
HTTPError: An HTTP error occurred.
ReadTimeout: The server did not send any data in the allotted amount of time.
RequestException: There was an ambiguous exception that occurred while handling your
Timeout: The request timed out.
TooManyRedirects: Too many redirects.
URLRequired: A valid URL is required to make a request.
function:
delete: Sends a DELETE request.
get: Sends a GET request.
head: Sends a HEAD request.
options: Sends a OPTIONS request.
patch: Sends a PATCH request.
post: Sends a POST request.
put: Sends a PUT request.
request: Constructs and sends a :class:`Request <Request>`.
session: Returns a :class:`Session` for context-management.
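To make the grouping in that output more concrete, here is a rough standard-library approximation of the idea: walk `dir(obj)` and sort each name into a category. This is only a sketch and not pdir2's implementation. `group_attributes` is a hypothetical helper, the four categories are a simplification, and `json` stands in for `requests` so the example runs without third-party packages.

```python
import inspect
import json  # any importable module works; json is just a stdlib stand-in


def group_attributes(obj):
    """Group the names from dir(obj) into rough categories, similar in spirit to pdir."""
    groups = {"class": [], "exception": [], "function": [], "other": []}
    for name in dir(obj):
        value = getattr(obj, name)
        if inspect.isclass(value) and issubclass(value, BaseException):
            groups["exception"].append(name)
        elif inspect.isclass(value):
            groups["class"].append(name)
        elif inspect.isroutine(value):
            groups["function"].append(name)
        else:
            groups["other"].append(name)
    return groups


groups = group_attributes(json)
for category, names in groups.items():
    print(category + ":")
    print("    " + ", ".join(names))
```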
NetEase Cloud Music from Python
pip install ncmbot
import ncmbot
# Log in
bot = ncmbot.login(phone='xxx', password='yyy')
bot.content # bot.json()
# Get a user's playlists
ncmbot.user_play_list(uid='36554272')
Download video subtitles
pip install getsub
Financial data API package for Python
pip install tushare
import tushare as ts
# Fetch trade data for all stocks on the most recent trading day in one call
ts.get_today_all()
Columns: code, name, change percent, current price, open, high, low, previous close, volume, turnover rate
code name changepercent trade open high low settlement \
0 002738 中矿资源 10.023 19.32 19.32 19.32 19.32 17.56
1 300410 正业科技 10.022 25.03 25.03 25.03 25.03 22.75
2 002736 国信证券 10.013 16.37 16.37 16.37 16.37 14.88
3 300412 迦南科技 10.010 31.54 31.54 31.54 31.54 28.67
4 300411 金盾股份 10.007 29.68 29.68 29.68 29.68 26.98
5 603636 南威软件 10.006 38.15 38.15 38.15 38.15 34.68
6 002664 信质电机 10.004 30.68 29.00 30.68 28.30 27.89
7 300367 东方网力 10.004 86.76 78.00 86.76 77.87 78.87
8 601299 中国北车 10.000 11.44 11.44 11.44 11.29 10.40
9 601880 大连港 10.000 5.72 5.34 5.72 5.22 5.20
10 000856 冀东装备 10.000 8.91 8.18 8.91 8.18 8.10
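In the sample rows, changepercent appears to be the percentage change of trade (current price) relative to settlement (previous close). That relationship is inferred from the numbers shown here, not taken from the tushare documentation, and `change_percent` is a hypothetical helper. A small sketch that recomputes the column for the first three rows:

```python
def change_percent(trade, settlement):
    """Percentage change of the current price relative to the previous close."""
    return (trade - settlement) / settlement * 100


# (code, trade, settlement) copied from the first three rows of the table above.
rows = [
    ("002738", 19.32, 17.56),
    ("300410", 25.03, 22.75),
    ("002736", 16.37, 14.88),
]

for code, trade, settlement in rows:
    print(code, round(change_percent(trade, settlement), 3))
```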
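Open-source vulnerability lab
# Install pip
curl -s https://bootstrap.pypa.io/get-pip.py | python3
# Install docker
apt-get update && apt-get install docker.io
# Start the docker service
service docker start
# Install compose
pip install docker-compose
# Fetch the project
git clone git@github.com:phith0n/vulhub.git
cd vulhub
# Enter the directory of a specific vulnerability/environment
cd nginx_php5_mysql
# Build the environment automatically
docker-compose build
# Start the whole environment
docker-compose up -d
# Remove the whole environment when testing is done
docker-compose down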
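Beijing real-time bus
git clone https://github.com/XTAYJGDUFVF/beijing_bus.git
cd beijing_bus
pip install -r requirements.txt  # install dependencies
python manage.py build_cache  # fetch offline data and build the local cache
# The project ships with a terminal query tool as an example; run: python manage.py cli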
>>> from beijing_bus import BeijingBus
>>> lines = BeijingBus.get_all_lines()
>>> lines
[<Line: 运通122(农业展览馆-华纺易城公交场站)>, <Line: 运通101(广顺南大街北口-蓝龙家园)>, ...]
>>> lines = BeijingBus.search_lines('847')
>>> lines
[<Line: 847(马甸桥西-雷庄村)>, <Line: 847(雷庄村-马甸桥西)>]
>>> line = lines[0]
>>> print line.id, line.name
541 847(马甸桥西-雷庄村)
>>> line.stations
[<Station 马甸桥西>, <Station 马甸桥东>, <Station 安华桥西>, ...]
>>> station = line.stations[0]
>>> print station.name, station.lat, station.lon
马甸桥西 39.967721 116.372921
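>>> line.get_realtime_data(1) # the argument is the station's sequence number, starting from 1
[
{
'id': bus id,
'lat': bus position (latitude),
'lon': bus position (longitude),
'next_station_name': name of the next station,
'next_station_num': sequence number of the next station,
'next_station_distance': distance to the next station,
'next_station_arriving_time': estimated arrival time at the next station,
'station_distance': distance to this station,
'station_arriving_time': estimated arrival time at this station,
},
...
]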
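One thing you might do with such a list is pick the bus closest to the queried station. The sketch below assumes that 'station_distance' is a non-negative number, which the structure above does not confirm. `nearest_bus` is a hypothetical helper and the sample data is made up.

```python
def nearest_bus(buses):
    """Return the bus dict with the smallest 'station_distance', or None if empty."""
    return min(buses, key=lambda bus: bus["station_distance"], default=None)


# Made-up sample shaped like the structure above; real values come from
# line.get_realtime_data(...) and their exact types are an assumption here.
sample = [
    {"id": "bus-1", "next_station_name": "A", "station_distance": 1200},
    {"id": "bus-2", "next_station_name": "B", "station_distance": 350},
    {"id": "bus-3", "next_station_name": "C", "station_distance": 2400},
]

closest = nearest_bus(sample)
print(closest["id"], closest["station_distance"])
```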
Article extractor
git clone https://github.com/grangier/python-goose.git
cd python-goose
pip install -r requirements.txt
python setup.py install
>>> from goose import Goose
>>> from goose.text import StopWordsChinese
>>> url ='http://www.bbc.co.uk/zhongwen/simp/chinese_news/2012/12/121210_hongkong_politics.shtml'
>>> g = Goose({'stopwords_class': StopWordsChinese})
>>> article = g.extract(url=url)
>>> print(article.cleaned_text[:150])
香港行政长官梁振英在各方压力下就其大宅的违章建筑(僭建)问题到立法会接受质询,并向香港民众道歉。
梁振英在星期二(12月10日)的答问大会开始之际在其演说中道歉,但强调他在违章建筑问题上没有隐瞒的意图和动机。
一些亲北京阵营议员欢迎梁振英道歉,且认为应能获得香港民众接受,但这些议员也质问梁振英有
Artistic QR code generator in Python
pip install MyQR
myqr https://github.com
myqr https://github.com -v 10 -l Q
Spoof the browser's User-Agent
pip install fake-useragent
from fake_useragent import UserAgent
ua = UserAgent()
ua.ie
# Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US);
ua.msie
# Mozilla/5.0 (compatible; MSIE 10.0; Macintosh; Intel Mac OS X 10_7_3; Trident/6.0)
ua['Internet Explorer']
# Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; GTB7.4; InfoPath.2; SV1; .NET CLR 3.3.69573; WOW64; en-US)
ua.opera
# Opera/9.80 (X11; Linux i686; U; ru) Presto/2.8.131 Version/11.11
ua.chrome
# Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.2 (KHTML, like Gecko) Chrome/22.0.1216.0 Safari/537.2
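A string like these is normally sent as the User-Agent request header. The sketch below attaches one to a request using only the standard library. It hard-codes the Chrome string from the sample output so that it runs without fake-useragent installed, and httpbin.org is just a placeholder URL.

```python
import urllib.request

# User-Agent string copied from the ua.chrome sample output above; with
# fake-useragent installed you would pass ua.chrome or ua.random instead.
user_agent = ("Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.2 "
              "(KHTML, like Gecko) Chrome/22.0.1216.0 Safari/537.2")

# Building the Request does not touch the network; httpbin.org is a placeholder URL.
request = urllib.request.Request(
    "http://httpbin.org/get",
    headers={"User-Agent": user_agent},
)

# urllib stores header names capitalised as 'User-agent'.
print(request.get_header("User-agent"))
```

Passing `request` to `urllib.request.urlopen` would then send it with the spoofed header.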
A prettier curl
pip install httpstat
httpstat httpbin.org/get
Shell commands from Python
pip install sh
from sh import ifconfig
print(ifconfig("eth0"))
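For comparison, roughly the same effect (run a program, get its output back as a string) can be had with the standard library's subprocess module. This is a sketch for contrast, not a description of how sh works internally. `run` is a hypothetical helper, and the Python interpreter itself is used as a portable stand-in for ifconfig.

```python
import subprocess
import sys


def run(*command):
    """Run a command and return its standard output as text; raise on a non-zero exit."""
    completed = subprocess.run(
        list(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return completed.stdout


# Use the running Python interpreter as a portable stand-in for ifconfig.
output = run(sys.executable, "-c", "print('hello from a child process')")
print(output, end="")
```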
Processing Chinese text
pip install -U textblob  # sentiment analysis for English text
pip install snownlp  # sentiment analysis for Chinese text
from snownlp import SnowNLP
from textblob import TextBlob
text = "I am happy today. I feel sad today."
blob = TextBlob(text)
blob            # TextBlob("I am happy today. I feel sad today.")
blob.sentiment  # Sentiment(polarity=0.15000000000000002, subjectivity=1.0)
s = SnowNLP(u'这个东西真心很赞')
s.words # [u'这个', u'东西', u'真心',
# u'很', u'赞']
s.tags # [(u'这个', u'r'), (u'东西', u'n'),
# (u'真心', u'd'), (u'很', u'd'),
# (u'赞', u'Vg')]
s.sentiments # 0.9769663402895832, the probability that the text is positive
s.pinyin # [u'zhe', u'ge', u'dong', u'xi',
# u'zhen', u'xin', u'hen', u'zan']
s = SnowNLP(u'「繁體字」「繁體中文」的叫法在臺灣亦很常見。')
s.han # u'「繁体字」「繁体中文」的叫法
# 在台湾亦很常见。'
Scrape and serve proxies
pip install -U getproxy
➜ ~ getproxy --help
Usage: getproxy [OPTIONS]
Options:
--in-proxy TEXT Input proxy file
--out-proxy TEXT Output proxy file
--help Show this message and exit.
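- --in-proxy: optional; a file listing the proxies to validate
- --out-proxy: optional; a file to write the validated proxies to. If empty, the output goes straight to the terminal
- The --in-proxy file and the --out-proxy file use the same format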
Zhihu API
pip install git+https://github.com/lzjun567/zhihu-api --upgrade
from zhihu import Zhihu
zhihu = Zhihu()
zhihu.user(user_slug="xiaoxiaodouzi")
{'avatar_url_template': 'https://pic1.zhimg.com/v2-ca13758626bd7367febde704c66249ec_{size}.jpg',
'badge': [],
'name': '我是小号',
'headline': '程序员',
'gender': -1,
'user_type': 'people',
'is_advertiser': False,
'avatar_url': 'https://pic1.zhimg.com/v2-ca13758626bd7367febde704c66249ec_is.jpg',
'url': 'http://www.zhihu.com/api/v4/people/1da75b85900e00adb072e91c56fd9149', 'type': 'people',
'url_token': 'xiaoxiaodouzi',
'id': '1da75b85900e00adb072e91c56fd9149',
'is_org': False}
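The 'avatar_url_template' field carries a `{size}` placeholder, and in this sample substituting `is` reproduces the 'avatar_url' value exactly. Here is a small sketch of that substitution. `avatar_url` is a hypothetical helper, and `is` is the only size token visible in the response, so any other token would be a guess.

```python
def avatar_url(template, size):
    """Fill the {size} placeholder of an avatar_url_template string."""
    return template.replace("{size}", size)


# Template copied from the sample response above.
template = "https://pic1.zhimg.com/v2-ca13758626bd7367febde704c66249ec_{size}.jpg"
print(avatar_url(template, "is"))
```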
Password leak lookup module for Python
pip install leakPasswd
import leakPasswd
leakPasswd.findBreach('taobao')
Parse nginx access logs and print formatted output
pip install ngxtop
$ ngxtop
running for 411 seconds, 64332 records processed: 156.60 req/sec
Summary:
| count | avg_bytes_sent | 2xx | 3xx | 4xx | 5xx |
|---------+------------------+-------+-------+-------+-------|
| 64332 | 2775.251 | 61262 | 2994 | 71 | 5 |
Detailed:
| request_path | count | avg_bytes_sent | 2xx | 3xx | 4xx | 5xx |
|------------------------------------------+---------+------------------+-------+-------+-------+-------|
| /abc/xyz/xxxx | 20946 | 434.693 | 20935 | 0 | 11 | 0 |
| /xxxxx.json | 5633 | 1483.723 | 5633 | 0 | 0 | 0 |
| /xxxxx/xxx/xxxxxxxxxxxxx | 3629 | 6835.499 | 3626 | 0 | 3 | 0 |
| /xxxxx/xxx/xxxxxxxx | 3627 | 15971.885 | 3623 | 0 | 4 | 0 |
| /xxxxx/xxx/xxxxxxx | 3624 | 7830.236 | 3621 | 0 | 3 | 0 |
| /static/js/minified/utils.min.js | 3031 | 1781.155 | 2104 | 927 | 0 | 0 |
| /static/js/minified/xxxxxxx.min.v1.js | 2889 | 2210.235 | 2068 | 821 | 0 | 0 |
| /static/tracking/js/xxxxxxxx.js | 2594 | 1325.681 | 1927 | 667 | 0 | 0 |
| /xxxxx/xxx.html | 2521 | 573.597 | 2520 | 0 | 1 | 0 |
| /xxxxx/xxxx.json | 1840 | 800.542 | 1839 | 0 | 1 | 0 |
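The per-path breakdown above can be approximated with a short script. This is a minimal sketch and not ngxtop's implementation. It assumes log lines in nginx's default "combined" format, `summarize` is a hypothetical helper, and the sample lines are made up.

```python
import re
from collections import Counter, defaultdict

# Matches the start of nginx's default "combined" log format.
LINE_RE = re.compile(
    r'^(?P<addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)


def summarize(lines):
    """Count requests per path, split by status class (2xx, 3xx, 4xx, 5xx)."""
    summary = defaultdict(Counter)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        status_class = match.group("status")[0] + "xx"
        summary[match.group("path")]["count"] += 1
        summary[match.group("path")][status_class] += 1
    return summary


# Made-up log lines for illustration.
sample_lines = [
    '10.0.0.1 - - [01/Jan/2020:00:00:01 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl/7.0"',
    '10.0.0.2 - - [01/Jan/2020:00:00:02 +0000] "GET /index.html HTTP/1.1" 304 0 "-" "curl/7.0"',
    '10.0.0.1 - - [01/Jan/2020:00:00:03 +0000] "GET /missing HTTP/1.1" 404 162 "-" "curl/7.0"',
]

summary = summarize(sample_lines)
for path, counts in sorted(summary.items()):
    print(path, dict(counts))
```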
Train ticket availability lookup
pip install iquery
Usage:
iquery (-c|彩票)
iquery (-m|电影)
iquery -p <city>
iquery -l song [singer]
iquery -p <city> <hospital>
iquery <city> <show> [<days>]
iquery [-dgktz] <from> <to> <date>
Arguments:
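from departure station
to arrival station
date date to query
city city to query
show type of show
days look up shows within the next N days; defaults to 15 if omitted
city city name; after -p, lists all Putian-affiliated hospitals in that city
hospital hospital name; after city, checks whether that hospital is Putian-affiliated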
Options:
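-h, --help show this help menu.
-dgktz D (EMU), G (high-speed), K (fast), T (express), Z (direct) trains
-m look up movies now showing
-p look up Putian-affiliated hospitals
-l look up song lyrics
-c look up lottery results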
Show:
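演唱会 (pop concert) 音乐会 (classical concert) 音乐剧 (musical) 歌舞剧 (song-and-dance drama) 儿童剧 (children's theater) 话剧 (play)
歌剧 (opera) 比赛 (sports match) 舞蹈 (dance) 戏曲 (Chinese opera) 相声 (crosstalk) 杂技 (acrobatics) 马戏 (circus) 魔术 (magic)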