celery — asynchronous and scheduled tasks in Python
01 Introduction
Celery is a distributed system written in Python for processing large volumes of messages. It focuses on real-time task processing and also supports task scheduling. It is commonly used to implement asynchronous tasks (async tasks) and scheduled tasks (crontab).
Note that Celery is not itself a task queue; it is a tool for managing distributed task queues. It is also language-agnostic and works as a standalone tool.
Typical use cases:
- Scheduled tasks: run something at a fixed time every day
- Turning synchronous operations into asynchronous ones: sending email, pushing notifications, etc.
Celery can be divided into four parts:
brokers
: the middleman; receives messages (i.e. tasks) from producers and stores them in a queue. Redis is commonly used.
backend
: the store where task results are kept.
workers
: the workers; they monitor the message queue in real time, fetch the tasks dispatched to the queue, and execute them.
tasks
: the tasks, including asynchronous tasks and scheduled tasks. Asynchronous tasks are usually triggered in business logic and sent to the task queue, while scheduled tasks are sent to the task queue periodically by the Celery Beat process.
02 Installation
- Install the Redis database
https://www.cnblogs.com/tianzhh/articles/13646848.html
- Install the Python redis package
pip3 install redis
- Install celery
pip3 install celery
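A quick way to verify the installation (a minimal check, assuming Redis runs locally on the default port with password 123456, matching the examples below):
import celery
import redis

print(celery.__version__)  # installed Celery version
r = redis.Redis(host="127.0.0.1", port=6379, password="123456", db=0)
print(r.ping())  # True if Redis is reachable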
03 Usage
01 Asynchronous tasks
- Create and send an asynchronous task
# tasks.py
from celery import Celery

broker = "redis://:123456@127.0.0.1:6379/0"
backend = "redis://:123456@127.0.0.1:6379/0"
app = Celery("tasks", broker=broker, backend=backend)  # the app name should match the module name

@app.task  # register the function as a Celery task
def add(x, y):
    return x + y
Note:
The first argument of Celery is the app name; it should match the module (file) name.
When a function has multiple decorators, app.task must be the outermost one.
- Start a worker to execute the tasks
celery -A tasks worker --loglevel=info
Note:
-A
specifies the module in which the Celery app was created
-Q
makes the worker consume only the specified queue(s), so that workers stay independent when different queues carry different tasks; if not set, the worker consumes tasks from all configured queues
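For example, a worker dedicated to a hypothetical queue named queue1 would be started like this:
celery -A tasks worker -Q queue1 --loglevel=info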
- Calling the task
from tasks import add

res = add.delay(1, 6)
print(res.id)
while True:
    if res.ready():  # True once the task has finished
        print(res.result)  # None while the task is still running or has no result
        print(res.get())  # blocks until the result is available
        break
Note:
delay
sends the function and its arguments to the workers
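delay is a shortcut for apply_async; the two calls below are equivalent, but apply_async additionally accepts options such as queue, eta, and countdown:
add.delay(1, 6)
add.apply_async(args=(1, 6))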
- Inspecting the return value
from celery.result import AsyncResult
from pro_celery import tasks

async_obj = AsyncResult(id="9a2c2c5e-764f-49f0-a70a-4144708517e5", app=tasks.app)
if async_obj.successful():
    result = async_obj.get()
    print(result)
    # async_obj.forget()  # delete the stored result
elif async_obj.failed():
    print('task failed')
elif async_obj.status == 'PENDING':
    print('task is waiting to be executed')
elif async_obj.status == 'RETRY':
    print('task is being retried after an error')
elif async_obj.status == 'STARTED':
    print('task has started executing')
- Multiple tasks
cele.py
from celery import Celery

broker = "redis://:123456@127.0.0.1:6379/0"
backend = "redis://:123456@127.0.0.1:6379/0"
app = Celery("cele", broker=broker, backend=backend, include=(
    "task1",
    "task2",
))
Note:
If the tasks are not defined in this file, include must list the modules where they live
Task one: task1.py
import time
from cele import app

@app.task
def add(x, y):
    time.sleep(x)
    return x + y
Task two: task2.py
import time
from cele import app

@app.task
def multi(x, y):
    time.sleep(x)
    return x * y
Start the workers
celery -A cele worker --loglevel=info
Note:
-A
is followed by the module that contains the Celery app
- Calling the tasks
import task1, task2

t = task2.multi.delay(5, 5)
t1 = task1.add.delay(1, 5)
print(t.get())
print(t1.get())
Note:
get is usually avoided because it blocks, which makes the call effectively synchronous
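When blocking is acceptable, passing a timeout to get avoids waiting forever; a minimal sketch:
from celery.exceptions import TimeoutError

t = task2.multi.delay(5, 5)
try:
    print(t.get(timeout=10))  # wait at most 10 seconds for the result
except TimeoutError:
    print('no result within 10 seconds')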
02 Scheduled tasks
- Plain approach
Run at a specific time, e.g. at 2020-09-15 15:54:12:
import task2
from datetime import datetime

tim = datetime(2020, 9, 15, 15, 54, 12)
v2 = datetime.utcfromtimestamp(tim.timestamp())  # convert local time to UTC
print(v2)
task2.math_add.apply_async(args=(3, 6), eta=v2)  # asynchronous
Note:
eta defaults to UTC time
apply_async
executes asynchronously
Run after an interval, e.g. 10 seconds from now:
import task2
from datetime import datetime, timedelta

ctime = datetime.now()
utc_ctime = datetime.utcfromtimestamp(ctime.timestamp())
delta_time = timedelta(seconds=10)  # days, weeks, hours, minutes also work
utc_time = utc_ctime + delta_time
print(utc_time)
t = task2.math_add.apply_async(args=(3, 6), eta=utc_time)
Note:
eta expects a datetime object (UTC by default), not a string
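For a simple relative delay, apply_async also accepts a countdown in seconds, which avoids the manual UTC conversion above:
t = task2.math_add.apply_async(args=(3, 6), countdown=10)  # run 10 seconds from now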
- Crontab-style scheduling
Configure it on the Celery app:
from celery import Celery
from celery.schedules import crontab

broker = "redis://:123456@127.0.0.1:6379/0"
backend = "redis://:123456@127.0.0.1:6379/0"
app = Celery("celery_crontab", broker=broker, backend=backend, include="tasks")
# app.conf.timezone = 'Asia/Shanghai'
# app.conf.enable_utc = False
app.conf.beat_schedule = {
    "crontab_task": {
        "task": "tasks.write_number",
        "schedule": crontab(minute=1),  # at minute 1 of every hour; use crontab() for every minute
        "args": (19, 3)
    },
    "each10s_task": {
        "task": "tasks.write_number",
        "schedule": 10,  # every 10 seconds
        "args": (10, 10)
    }
}
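The schedule above refers to tasks.write_number, which is not shown; a minimal sketch of what tasks.py could contain (the app module name celery_crontab and the function body are assumptions):
# tasks.py — sketch; assumes the app above lives in celery_crontab.py
from celery_crontab import app

@app.task
def write_number(x, y):
    # append the sum to a file so each beat run is visible
    with open('numbers.txt', 'a') as f:
        f.write(str(x + y) + '\n')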
Note:
A beat process must be created; at the configured times it submits the tasks to the queue for the workers.
Create the beat
celery -A tasks beat
Start the worker
celery -A tasks worker --loglevel=info
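For development, beat can also be embedded in the worker with the -B flag, so a single process does both (not recommended for production):
celery -A tasks worker -B --loglevel=info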
04 Celery configuration file
Basic configuration
# Note: since Celery 4 the uppercase names below have lowercase replacements (see the table further down)
BROKER_URL = 'amqp://username:passwd@host:port/vhost'
# where task results are stored
CELERY_RESULT_BACKEND = 'redis://username:passwd@host:port/db'
# task serialization format
CELERY_TASK_SERIALIZER = 'msgpack'
# result serialization format
CELERY_RESULT_SERIALIZER = 'msgpack'
# how long task results are kept before they expire
CELERY_TASK_RESULT_EXPIRES = 60 * 20
# content types the worker accepts
CELERY_ACCEPT_CONTENT = ["msgpack"]
# acknowledge messages after the task has executed instead of just before; has a small performance cost
CELERY_ACKS_LATE = True
# message compression: zlib or bzip2; by default messages are sent uncompressed
CELERY_MESSAGE_COMPRESSION = 'zlib'
# hard time limit for a task
CELERYD_TASK_TIME_LIMIT = 5  # if a task runs longer than 5s, the worker process running it is killed and replaced
# worker concurrency; defaults to the number of CPU cores, same as the -c command-line option
CELERYD_CONCURRENCY = 4
# how many tasks a worker prefetches from the broker (e.g. RabbitMQ) at a time
CELERYD_PREFETCH_MULTIPLIER = 4
# a worker process is replaced after executing this many tasks; unlimited by default
CELERYD_MAX_TASKS_PER_CHILD = 40
# default queue name; messages that match no other queue go here, and with no other configuration everything lands in the default queue
CELERY_DEFAULT_QUEUE = "default"
# detailed queue definitions
CELERY_QUEUES = {
    "default": {  # the default queue specified above
        "exchange": "default",
        "exchange_type": "direct",
        "routing_key": "default"
    },
    "topicqueue": {  # a topic queue: any routing key matching topic.# is delivered here
        "routing_key": "topic.#",
        "exchange": "topic_exchange",
        "exchange_type": "topic",
    },
    "task_eeg": {  # a fanout exchange
        "exchange": "tasks",
        "exchange_type": "fanout",
        "binding_key": "tasks",
    },
}
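With queues defined, tasks still need to be routed to them; a minimal sketch using the matching old-style CELERY_ROUTES setting (the task name tasks.add is an assumption):
CELERY_ROUTES = {
    'tasks.add': {
        'exchange': 'topic_exchange',
        'routing_key': 'topic.add',  # matches the topic.# pattern of topicqueue
    },
}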
Since Celery 4.0 the configuration keys are lowercase; the replacements for 4.0 and later are:
CELERY_ACCEPT_CONTENT accept_content
CELERY_ENABLE_UTC enable_utc
CELERY_IMPORTS imports
CELERY_INCLUDE include
CELERY_TIMEZONE timezone
CELERYBEAT_MAX_LOOP_INTERVAL beat_max_loop_interval
CELERYBEAT_SCHEDULE beat_schedule
CELERYBEAT_SCHEDULER beat_scheduler
CELERYBEAT_SCHEDULE_FILENAME beat_schedule_filename
CELERYBEAT_SYNC_EVERY beat_sync_every
BROKER_URL broker_url
BROKER_TRANSPORT broker_transport
BROKER_TRANSPORT_OPTIONS broker_transport_options
BROKER_CONNECTION_TIMEOUT broker_connection_timeout
BROKER_CONNECTION_RETRY broker_connection_retry
BROKER_CONNECTION_MAX_RETRIES broker_connection_max_retries
BROKER_FAILOVER_STRATEGY broker_failover_strategy
BROKER_HEARTBEAT broker_heartbeat
BROKER_LOGIN_METHOD broker_login_method
BROKER_POOL_LIMIT broker_pool_limit
BROKER_USE_SSL broker_use_ssl
CELERY_CACHE_BACKEND cache_backend
CELERY_CACHE_BACKEND_OPTIONS cache_backend_options
CASSANDRA_COLUMN_FAMILY cassandra_table
CASSANDRA_ENTRY_TTL cassandra_entry_ttl
CASSANDRA_KEYSPACE cassandra_keyspace
CASSANDRA_PORT cassandra_port
CASSANDRA_READ_CONSISTENCY cassandra_read_consistency
CASSANDRA_SERVERS cassandra_servers
CASSANDRA_WRITE_CONSISTENCY cassandra_write_consistency
CASSANDRA_OPTIONS cassandra_options
CELERY_COUCHBASE_BACKEND_SETTINGS couchbase_backend_settings
CELERY_MONGODB_BACKEND_SETTINGS mongodb_backend_settings
CELERY_EVENT_QUEUE_EXPIRES event_queue_expires
CELERY_EVENT_QUEUE_TTL event_queue_ttl
CELERY_EVENT_QUEUE_PREFIX event_queue_prefix
CELERY_EVENT_SERIALIZER event_serializer
CELERY_REDIS_DB redis_db
CELERY_REDIS_HOST redis_host
CELERY_REDIS_MAX_CONNECTIONS redis_max_connections
CELERY_REDIS_PASSWORD redis_password
CELERY_REDIS_PORT redis_port
CELERY_RESULT_BACKEND result_backend
CELERY_MAX_CACHED_RESULTS result_cache_max
CELERY_MESSAGE_COMPRESSION result_compression
CELERY_RESULT_EXCHANGE result_exchange
CELERY_RESULT_EXCHANGE_TYPE result_exchange_type
CELERY_TASK_RESULT_EXPIRES result_expires
CELERY_RESULT_PERSISTENT result_persistent
CELERY_RESULT_SERIALIZER result_serializer
CELERY_RESULT_DBURI (deprecated; use result_backend instead)
CELERY_RESULT_ENGINE_OPTIONS database_engine_options
[...]_DB_SHORT_LIVED_SESSIONS database_short_lived_sessions
CELERY_RESULT_DB_TABLE_NAMES database_db_names
CELERY_SECURITY_CERTIFICATE security_certificate
CELERY_SECURITY_CERT_STORE security_cert_store
CELERY_SECURITY_KEY security_key
CELERY_ACKS_LATE task_acks_late
CELERY_TASK_ALWAYS_EAGER task_always_eager
CELERY_TASK_ANNOTATIONS task_annotations
CELERY_TASK_COMPRESSION task_compression
CELERY_TASK_CREATE_MISSING_QUEUES task_create_missing_queues
CELERY_TASK_DEFAULT_DELIVERY_MODE task_default_delivery_mode
CELERY_TASK_DEFAULT_EXCHANGE task_default_exchange
CELERY_TASK_DEFAULT_EXCHANGE_TYPE task_default_exchange_type
CELERY_TASK_DEFAULT_QUEUE task_default_queue
CELERY_TASK_DEFAULT_RATE_LIMIT task_default_rate_limit
CELERY_TASK_DEFAULT_ROUTING_KEY task_default_routing_key
CELERY_TASK_EAGER_PROPAGATES task_eager_propagates
CELERY_TASK_IGNORE_RESULT task_ignore_result
CELERY_TASK_PUBLISH_RETRY task_publish_retry
CELERY_TASK_PUBLISH_RETRY_POLICY task_publish_retry_policy
CELERY_QUEUES task_queues
CELERY_ROUTES task_routes
CELERY_TASK_SEND_SENT_EVENT task_send_sent_event
CELERY_TASK_SERIALIZER task_serializer
CELERYD_TASK_SOFT_TIME_LIMIT task_soft_time_limit
CELERYD_TASK_TIME_LIMIT task_time_limit
CELERY_TRACK_STARTED task_track_started
CELERYD_AGENT worker_agent
CELERYD_AUTOSCALER worker_autoscaler
CELERYD_CONCURRENCY worker_concurrency
CELERYD_CONSUMER worker_consumer
CELERY_WORKER_DIRECT worker_direct
CELERY_DISABLE_RATE_LIMITS worker_disable_rate_limits
CELERY_ENABLE_REMOTE_CONTROL worker_enable_remote_control
CELERYD_HIJACK_ROOT_LOGGER worker_hijack_root_logger
CELERYD_LOG_COLOR worker_log_color
CELERYD_LOG_FORMAT worker_log_format
CELERYD_WORKER_LOST_WAIT worker_lost_wait
CELERYD_MAX_TASKS_PER_CHILD worker_max_tasks_per_child
CELERYD_POOL worker_pool
CELERYD_POOL_PUTLOCKS worker_pool_putlocks
CELERYD_POOL_RESTARTS worker_pool_restarts
CELERYD_PREFETCH_MULTIPLIER worker_prefetch_multiplier
CELERYD_REDIRECT_STDOUTS worker_redirect_stdouts
CELERYD_REDIRECT_STDOUTS_LEVEL worker_redirect_stdouts_level
CELERYD_SEND_EVENTS worker_send_task_events
CELERYD_STATE_DB worker_state_db
CELERYD_TASK_LOG_FORMAT worker_task_log_format
CELERYD_TIMER worker_timer
CELERYD_TIMER_PRECISION worker_timer_precision
Loading settings from a configuration file
from celery import Celery

broker = "redis://:123456@127.0.0.1:6379/0"
backend = "redis://:123456@127.0.0.1:6379/0"
app = Celery("celery_crontab", broker=broker, backend=backend, include="tasks")
# app.conf.timezone = 'Asia/Shanghai'
# app.conf.enable_utc = False
app.config_from_object("celery_conf")  # load settings from the celery_conf module
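A matching celery_conf.py could look like this (a minimal sketch using the lowercase Celery 4+ names; the chosen values are assumptions):
# celery_conf.py — sketch
task_serializer = 'json'
result_serializer = 'json'
accept_content = ['json']
result_expires = 60 * 20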
05 Using Celery in Django
Create a celery_disk directory under the app
Create celery_conf.py
import djcelery
djcelery.setup_loader()

CELERY_IMPORTS = (
    'app01.tasks',
)
# can prevent deadlocks in some situations
CELERYD_FORCE_EXECV = True
# number of concurrent worker processes
CELERYD_CONCURRENCY = 4
# acknowledge only after the task has run, so interrupted tasks can be retried
CELERY_ACKS_LATE = True
# replace each worker process after 100 tasks to guard against memory leaks
CELERYD_MAX_TASKS_PER_CHILD = 100
# task time limit in seconds
CELERYD_TASK_TIME_LIMIT = 12 * 30
Create tasks.py
from celery import task

@task
def add(a, b):
    with open('a.text', 'a', encoding='utf-8') as f:
        f.write('a')
    print(a + b)
View function: view.py
from django.shortcuts import render, HttpResponse
from app01.tasks import add
from datetime import datetime, timedelta

def test(request):
    # result = add.delay(2, 3)
    ctime = datetime.now()
    # eta uses UTC time by default
    utc_ctime = datetime.utcfromtimestamp(ctime.timestamp())
    time_delay = timedelta(seconds=5)
    task_time = utc_ctime + time_delay
    result = add.apply_async(args=[4, 3], eta=task_time)
    print(result.id)
    return HttpResponse('ok')
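To trigger the view, a URL pattern is needed; a minimal sketch for urls.py (assuming Django 2+ and the view module name above):
from django.urls import path
from app01.view import test

urlpatterns = [
    path('test/', test),
]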
settings.py
INSTALLED_APPS = [
    ...
    'djcelery',
    'app01'
]

from djagocele import celeryconfig
BROKER_BACKEND = 'redis'
BROKER_URL = 'redis://127.0.0.1:6379/1'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/2'
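With djcelery installed, the worker is usually started through manage.py so it picks up the Django settings (the exact command depends on the djcelery version):
python manage.py celery worker --loglevel=info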