Common Python Development Packages
1. passlib (https://passlib.readthedocs.io/en/stable/)
passlib is a password hashing library for Python 2 & 3, providing cross-platform implementations of over 30 password hashing algorithms, as well as a framework for managing existing password hashes. It is designed to be useful for a wide range of tasks, from verifying a hash found in /etc/shadow to providing full-strength password hashing for multi-user applications.
Example:

from passlib.apps import custom_app_context

# Generate a password hash
pwd = '123456'
hash_str = custom_app_context.hash(pwd)  # encrypt() is a deprecated alias for hash()
print(hash_str)

# Verify the password against the stored hash
print(custom_app_context.verify(pwd, hash_str))  # True
2. Message queue: rq (https://github.com/rq/rq)
RQ (Redis Queue) is a simple Python library for queueing jobs and processing them in the background with workers. It is backed by Redis and designed to have a low barrier to entry, so it can be integrated into your web stack easily.
Install:
pip install rq
Usage example:

jobs.py:

import requests

def count_words(url):
    return len(requests.get(url).text)

app.py:

import time
from redis import Redis
from rq import Queue
from jobs import count_words

def run():
    rq = Queue('default', connection=Redis())
    for i in range(100):
        j = rq.enqueue(count_words, 'http://nvie.com')
        print('1 ', j.result)
        time.sleep(1)
        print('2 ', j.result)

if __name__ == '__main__':
    run()
Start a worker:
rq worker --with-scheduler
$ rq worker low high default
16:56:02 RQ worker 'rq:worker:s2.6443' started, version 0.8.1
16:56:02 Cleaning registries for queue: low
16:56:02 Cleaning registries for queue: high
16:56:02 Cleaning registries for queue: default
The trailing arguments low, high and default are the queues this worker will pull jobs from. The order matters: jobs in queues listed earlier are processed first.
Enqueue jobs:
python app.py
Scheduling jobs is also supported:
from datetime import datetime, timedelta

# Schedule the job to run at 9:15, October 8th
job = queue.enqueue_at(datetime(2019, 10, 8, 9, 15), say_hello)

# Schedule the job to run in 10 seconds
job = queue.enqueue_in(timedelta(seconds=10), say_hello)
Failure callbacks and retries:
from rq import Retry

# Retry up to 3 times; a failed job will be requeued immediately
queue.enqueue(say_hello, retry=Retry(max=3))

# Retry up to 3 times, with configurable intervals between retries
queue.enqueue(say_hello, retry=Retry(max=3, interval=[10, 30, 60]))
3. Remote SSH tunnels: sshtunnel (https://github.com/pahaz/sshtunnel)
4. Random User-Agent: fake_useragent

Install:

pip install fake_useragent

Example:

import requests
from fake_useragent import UserAgent

def get_mafengwo_content():
    # Create a UserAgent object
    user_agent = UserAgent()

    # Set the request headers with a random User-Agent
    headers = {
        'User-Agent': user_agent.random
    }

    # Target URL
    url = 'https://www.mafengwo.cn/'

    try:
        # Send the GET request
        response = requests.get(url, headers=headers)

        # Check whether the request succeeded
        if response.status_code == 200:
            # Print the page content
            print(response.text)
        else:
            print(f"Failed to fetch the page. Status code: {response.status_code}")
    except Exception as e:
        print(f"An error occurred: {e}")

if __name__ == "__main__":
    get_mafengwo_content()