3.6.3 RabbitMQ Tutorial Two - Work Queues
Work Queues
Prerequisites
As with the other Python tutorials, we will use the Pika RabbitMQ client, version 1.0.0.
What This Tutorial Focuses On
In the first tutorial we wrote programs to send and receive messages from a named queue. In this one we'll create a Work Queue that will be used to distribute time-consuming tasks among multiple workers.
The main idea behind Work Queues (also known as Task Queues) is to avoid doing a resource-intensive task immediately and having to wait for it to complete. Instead we schedule the task to be done later. We encapsulate a task as a message and send it to the queue. A worker process running in the background will pop the tasks and eventually execute the job. When you run many workers, the tasks will be shared between them.
This concept is especially useful in web applications, where it's impossible to handle a complex task during a short HTTP request window.
In the previous part of this tutorial we sent a message containing "Hello World!". Now we'll be sending strings that stand for complex tasks. We don't have a real-world task, like images to be resized or PDF files to be rendered, so let's fake it by pretending we're busy, using the time.sleep() function. We'll take the number of dots in the string as its complexity; every dot will account for one second of "work". For example, a fake task described by Hello... will take three seconds.
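The dot-counting rule can be sketched on its own, without a broker. This is only an illustration: the helper names work_seconds and do_fake_work are hypothetical and do not appear in the tutorial's scripts, and the sleep function is passed in as a parameter so the rule can be checked without actually waiting.

```python
import time


def work_seconds(body: bytes) -> int:
    """Return the simulated cost of a task: one second per dot in the body."""
    return body.count(b".")


def do_fake_work(body: bytes, sleep=time.sleep) -> int:
    """Pretend to be busy for as many seconds as the body has dots."""
    seconds = work_seconds(body)
    sleep(seconds)
    return seconds


print(work_seconds(b"Hello..."))      # 3
print(work_seconds(b"Hello World!"))  # 0
```

The body is treated as bytes because that is what a Pika consumer callback receives under Python 3.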
We will slightly modify the send.py code from our previous example to allow arbitrary messages to be sent from the command line. This program will schedule tasks to our work queue, so let's name it new_task.py:
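    import sys
    import pika

    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()

    # The queue name must match the routing key used below.
    channel.queue_declare(queue='hello')

    # Join the command-line words with spaces; fall back to a default message.
    message = ' '.join(sys.argv[1:]) or 'Hello World!'
    channel.basic_publish(exchange='',
                          routing_key='hello',
                          body=message)
    print(' [x] Sent %r' % message)

    # Close the connection so the message is flushed before the script exits.
    connection.close()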
Our old receive.py script also requires some changes: it needs to fake a second of work for every dot in the message body. It will pop messages from the queue and perform the task, so let's call it worker.py:
    import pika
    import time

    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()

    channel.queue_declare(queue='hello')


    def callback(ch, method, properties, body):
        print(' [x] Received %r' % body)
        # body is bytes: sleep one second per dot
        time.sleep(body.count(b'.'))
        print(' [x] Done')


    channel.basic_consume(queue='hello',
                          auto_ack=True,
                          on_message_callback=callback)

    print(' [*] Waiting for messages. To exit press CTRL+C')
    channel.start_consuming()
Round-robin dispatching
One of the advantages of using a Task Queue is the ability to easily parallelise work. If we are building up a backlog of work, we can just add more workers and scale easily that way.
First, let's try to run two worker.py scripts at the same time. They will both get messages from the queue, but how exactly? Let's see.
You need three consoles open. Two will run the worker.py script. These consoles will be our two consumers, C1 and C2.
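In the third console we'll publish new tasks. Once you've started the consumers you can publish a few messages by running new_task.py several times with different message text, for example "python new_task.py First message." followed by "python new_task.py Second message..", and so on.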
Then look at what each worker prints to see which messages were delivered to it.
By default, RabbitMQ will send each message to the next consumer, in sequence. On average every consumer will get the same number of messages. This way of distributing messages is called round-robin. Try this out with three or more workers.
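To make the dispatch order concrete, here is a small in-memory sketch of round-robin assignment. It is a simulation only, not RabbitMQ's implementation: the round_robin function and the sample message texts are assumptions made for illustration.

```python
from itertools import cycle


def round_robin(messages, consumers):
    """Assign each message to the next consumer in sequence."""
    assignment = {name: [] for name in consumers}
    # cycle() repeats C1, C2, C1, C2, ...; zip() stops when messages run out.
    for message, name in zip(messages, cycle(consumers)):
        assignment[name].append(message)
    return assignment


tasks = [
    "First message.",
    "Second message..",
    "Third message...",
    "Fourth message....",
    "Fifth message.....",
]
result = round_robin(tasks, ["C1", "C2"])
for name, received in result.items():
    print(name, received)
```

With two consumers, C1 ends up with the first, third and fifth messages and C2 with the second and fourth, regardless of how long each task takes.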
Message acknowledgment
Doing a task can take a few seconds. You may wonder what happens if one of the consumers starts a long task and dies with it only partly done. With our current code, once RabbitMQ delivers a message to the consumer it immediately marks it for deletion. In this case, if you kill a worker we will lose the message it was just processing. We'll also lose all the messages that were dispatched to this particular worker but were not yet handled.
But we don't want to lose any tasks. If a worker dies, we'd like the task to be delivered to another worker.
In order to make sure a message is never lost, RabbitMQ supports message acknowledgments. An ack(nowledgement) is sent back by the consumer to tell RabbitMQ that a particular message has been received and processed, and that RabbitMQ is free to delete it.
If a consumer dies (its channel is closed, its connection is closed, or the TCP connection is lost) without sending an ack, RabbitMQ will understand that a message wasn't processed fully and will re-queue it. If there are other consumers online at the same time, it will then quickly redeliver it to another consumer. That way you can be sure that no message is lost, even if the workers occasionally die.
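When this tutorial was written there were no message timeouts: RabbitMQ redelivers the message when the consumer dies, so it is fine even if processing a message takes a very, very long time. Newer RabbitMQ releases do enforce an acknowledgement timeout on consumers (30 minutes by default), so check the documentation for your broker version if your tasks run that long.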
Manual message acknowledgments are turned on by default. In the previous examples we explicitly turned them off via the auto_ack=True flag. It's time to remove this flag and send a proper acknowledgment from the worker once we're done with a task.
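    def callback(ch, method, properties, body):
        print(" [x] Received %r" % body)
        # body is bytes under Python 3, so count a bytes dot
        time.sleep(body.count(b'.'))
        print(" [x] Done")
        # Acknowledge only after the task has finished
        ch.basic_ack(delivery_tag=method.delivery_tag)


    # No auto_ack flag: manual acknowledgments are now in effect
    channel.basic_consume(queue='hello', on_message_callback=callback)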
Using this code we can be sure that even if you kill a worker with CTRL+C while it is processing a message, nothing will be lost. Soon after the worker dies, all unacknowledged messages will be redelivered.
An acknowledgement must be sent on the same channel that received the delivery. Attempts to acknowledge using a different channel will result in a channel-level protocol exception. See the doc guide on confirmations to learn more.
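The acknowledge-or-requeue behaviour described in this section can be modelled with a toy in-memory queue. This is a sketch for intuition only: the ToyQueue class and its method names are hypothetical, and a real broker tracks deliveries per channel rather than by consumer name.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a broker queue (illustration only)."""

    def __init__(self, messages):
        self.ready = deque(messages)   # messages waiting for delivery
        self.unacked = {}              # delivery tag -> (consumer, message)
        self._next_tag = 1

    def deliver(self, consumer):
        """Hand the next ready message to a consumer; return (tag, message)."""
        message = self.ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = (consumer, message)
        return tag, message

    def ack(self, tag):
        """The consumer finished the task, so the message can be deleted."""
        del self.unacked[tag]

    def consumer_died(self, consumer):
        """Requeue every message the dead consumer had not acknowledged."""
        lost = [tag for tag, (owner, _) in self.unacked.items()
                if owner == consumer]
        # Put the newest first so the original order is restored at the front.
        for tag in sorted(lost, reverse=True):
            _, message = self.unacked.pop(tag)
            self.ready.appendleft(message)


queue = ToyQueue(["task A", "task B"])
queue.deliver("C1")                       # C1 receives "task A"
queue.consumer_died("C1")                 # C1 dies before acknowledging it
tag, message = queue.deliver("C2")        # "task A" goes to C2 instead
queue.ack(tag)
print(message)            # task A
print(list(queue.ready))  # ['task B']
```

Because "task A" was never acknowledged by C1, it returns to the queue and is handed to C2, which is the guarantee the worker relies on.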
Forgotten acknowledgment
It's a common mistake to miss the basic_ack. It's an easy error to make, but the consequences are serious. Messages will be redelivered when your client quits (which may look like random redelivery), but RabbitMQ will eat more and more memory as it won't be able to release any unacked messages.
In order to debug this kind of mistake you can use rabbitmqctl to print the messages_unacknowledged field. The command below is the Windows form; on Linux or macOS drop the .bat suffix (sudo is usually needed):
rabbitmqctl.bat list_queues name messages_ready messages_unacknowledged
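One way to make a forgotten acknowledgment less likely is to route every exit path of the callback through either basic_ack or basic_nack. The sketch below is one possible pattern, not part of the tutorial's worker.py: make_callback and process are hypothetical helpers, while basic_ack and basic_nack are the Pika channel methods.

```python
import time


def process(body: bytes) -> None:
    """Stand-in task: one second of fake work per dot (hypothetical helper)."""
    time.sleep(body.count(b"."))


def make_callback(task=process):
    """Build a consumer callback that always acks or nacks the delivery."""

    def callback(ch, method, properties, body):
        try:
            task(body)
        except Exception:
            # The task failed: hand the message back to the queue for a retry.
            ch.basic_nack(delivery_tag=method.delivery_tag, requeue=True)
        else:
            # The task finished: tell RabbitMQ the message can be deleted.
            ch.basic_ack(delivery_tag=method.delivery_tag)

    return callback


# Usage with a Pika channel (not executed here):
# channel.basic_consume(queue='hello', on_message_callback=make_callback())
```

Requeueing on failure is a design choice with a cost: a message that always fails will be redelivered again and again, so a real worker would likely cap retries or reject such messages without requeueing.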
Message durability
We have learned how to make sure that even if the consumer dies, the task isn't lost. But our tasks will still be lost if the RabbitMQ server stops.
When RabbitMQ quits or crashes it will forget the queues and messages unless you tell it not to. Two things are required to make sure that messages aren't lost: we need to mark both the queue and the messages as durable.
First, we need to make sure that RabbitMQ will never lose our queue. In order to do so, we need to declare it as durable:
channel.queue_declare(queue='hello', durable=True)
Although this command is correct by itself, it won't work in our setup. That's because we've already defined a queue called hello which is not durable. RabbitMQ doesn't allow you to redefine an existing queue with different parameters and will return an error to any program that tries to do that. But there is a quick workaround: let's declare a queue with a different name, for example task_queue:
channel.queue_declare(queue='task_queue', durable=True)
This queue_declare change needs to be applied to both the producer and the consumer code.
At this point we're sure that the task_queue queue won't be lost even if RabbitMQ restarts. Now we need to mark our messages as persistent by supplying a delivery_mode property with a value of 2.
    channel.basic_publish(exchange='',
                          routing_key="task_queue",
                          body=message,
                          properties=pika.BasicProperties(
                              delivery_mode=2,  # make message persistent
                          ))
Note on message persistence
Marking messages as persistent doesn't fully guarantee that a message won't be lost. Although it tells RabbitMQ to save the message to disk, there is still a short time window when RabbitMQ has accepted a message and hasn't saved it yet. Also, RabbitMQ doesn't do fsync(2) for every message -- it may be just saved to cache and not really written to the disk. The persistence guarantees aren't strong, but they are more than enough for our simple task queue. If you need a stronger guarantee, you can use publisher confirms.
Fair dispatch
You might have noticed that the dispatching still doesn't work exactly as we want. For example, in a situation with two workers, when all odd messages are heavy and all even messages are light, one worker will be constantly busy and the other one will do hardly any work. RabbitMQ doesn't know anything about that and will still dispatch messages evenly.
This happens because RabbitMQ just dispatches a message when the message enters the queue. It doesn't look at the number of unacknowledged messages for a consumer. It just blindly dispatches every n-th message to the n-th consumer.
In order to defeat that we can use the basic.qos method (channel.basic_qos in Pika) with the prefetch_count=1 setting. This tells RabbitMQ not to give more than one message to a worker at a time. In other words, it won't dispatch a new message to a worker until that worker has processed and acknowledged the previous one. Instead, it will dispatch the message to the next worker that is not busy.
channel.basic_qos(prefetch_count=1)
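The difference between blind round-robin and prefetch_count=1 can be estimated with a small simulation. This is a simplified model under stated assumptions: all messages are already in the queue, each worker handles one message at a time, and the function names and the durations (5 seconds for heavy tasks, 1 second for light ones) are made up for illustration.

```python
def round_robin_load(durations, workers=2):
    """Seconds of work per worker when message i goes to worker i % workers."""
    load = [0] * workers
    for i, seconds in enumerate(durations):
        load[i % workers] += seconds
    return load


def prefetch_one_load(durations, workers=2):
    """Seconds of work per worker when each message goes to the first
    worker that becomes free (the effect of prefetch_count=1)."""
    free_at = [0] * workers   # time at which each worker finishes its task
    load = [0] * workers
    for seconds in durations:
        w = free_at.index(min(free_at))   # worker that is idle soonest
        free_at[w] += seconds
        load[w] += seconds
    return load


# Odd messages are heavy (5 s), even messages are light (1 s).
durations = [5, 1] * 4
print(round_robin_load(durations))    # [20, 4]
print(prefetch_one_load(durations))   # [12, 12]
```

In this model round-robin leaves one worker with 20 seconds of work and the other with 4, while dispatching to whichever worker is free splits the same tasks 12 and 12.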
Note about queue size
If all the workers are busy, your queue can fill up. You will want to keep an eye on that, and maybe add more workers or use message TTL (time to live).
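As a rough sketch of the message TTL option, a queue can be declared with the x-message-ttl argument, which RabbitMQ interprets in milliseconds. The helper ttl_arguments and the queue name task_queue_ttl are hypothetical; a new name is used because an existing queue cannot be redeclared with different arguments.

```python
def ttl_arguments(seconds: float) -> dict:
    """Build queue arguments that expire messages after `seconds`.

    RabbitMQ expects x-message-ttl as a non-negative integer in milliseconds.
    """
    if seconds < 0:
        raise ValueError("TTL must not be negative")
    return {"x-message-ttl": int(seconds * 1000)}


print(ttl_arguments(60))   # {'x-message-ttl': 60000}

# With Pika the dictionary would be passed when declaring the queue
# (not executed here, and the queue name is an assumption):
# channel.queue_declare(queue='task_queue_ttl', durable=True,
#                       arguments=ttl_arguments(60))
```

Messages that expire this way are dropped rather than processed, so a TTL only suits tasks that are safe to discard once they are stale.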
Putting it all together
new_task.py
    import pika
    import sys

    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()

    channel.queue_declare(queue='task_queue', durable=True)

    message = ' '.join(sys.argv[1:]) or "Hello World!"
    channel.basic_publish(
        exchange='',
        routing_key='task_queue',
        body=message,
        properties=pika.BasicProperties(
            delivery_mode=2,  # make message persistent
        ))
    print(" [x] Sent %r" % message)
    connection.close()
worker.py
    import pika
    import time

    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()

    channel.queue_declare(queue='task_queue', durable=True)
    print(' [*] Waiting for messages. To exit press CTRL+C')


    def callback(ch, method, properties, body):
        print(" [x] Received %r" % body)
        time.sleep(body.count(b'.'))
        print(" [x] Done")
        ch.basic_ack(delivery_tag=method.delivery_tag)


    channel.basic_qos(prefetch_count=1)
    channel.basic_consume(queue='task_queue', on_message_callback=callback)

    channel.start_consuming()
Using message acknowledgments and prefetch_count you can set up a work queue. The durability options let the tasks survive even if RabbitMQ is restarted.