zmq study notes


0. PUB/SUB, XPUB/XSUB

  • filtering happens on the publisher side when the sockets use a connected protocol (tcp, ipc or inproc).
  • SUB and PUB exchange meta-information internally, but at the API level messages are only allowed to flow from PUB to SUB.
  • a SUB socket cannot send messages at the API level, but when the user adds a filter, a subscription message is actually sent from the subscriber to the publisher. XPUB hands it to the application; PUB does not.
  • when XPUB receives this meta-information, it must be forwarded to the original PUB server. XSUB does this, as the following model illustrates: PUB->XSUB<-->XPUB->SUB. Since SUB cannot send, it is not capable of serving as a proxy.
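
A minimal pyzmq sketch of the subscription meta-information described above, assuming pyzmq is installed. The inproc endpoint name and the topic strings are made up for illustration. An XPUB socket exposes the subscription as an ordinary message whose first byte is 1 (subscribe), followed by the filter prefix.

```python
import zmq

ctx = zmq.Context()

xpub = ctx.socket(zmq.XPUB)
xpub.bind("inproc://notes-pubsub")        # hypothetical endpoint name

sub = ctx.socket(zmq.SUB)
sub.connect("inproc://notes-pubsub")
sub.setsockopt(zmq.SUBSCRIBE, b"weather")  # adding a filter sends meta-info upstream

# XPUB hands the subscription to the application (a plain PUB would not).
subscription = xpub.recv() if xpub.poll(1000) else None

# Only the message matching the filter should reach the subscriber.
xpub.send(b"sports 1:0")
xpub.send(b"weather sunny")
delivered = sub.recv() if sub.poll(1000) else None

print(subscription, delivered)

sub.close(0)
xpub.close(0)
ctx.term()
```

In a real XSUB/XPUB proxy the subscription message read from XPUB would be sent on through the XSUB socket to the upstream publisher, which is what zmq.proxy does when given the two sockets.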

1. REQ/REP works in a strict request-reply mode

  • when the server receives a request, it has to reply to it before it can issue another read.

  • one client socket can be connected to several servers at the same time; send() calls are distributed among all connections in round-robin order:

    • c1 connect() to s1.
    • c1 connect() to s2.
    • c1 send() will write to s1.
    • c1 recv() from s1.
    • c1 send() again, this time message is sent to s2.
  • if multiple clients are connected to the same REP server and send requests at the same time, the server has to handle these requests one by one.

    • A connects to C.
    • B connects to C.
    • A sends request r1 to C.
    • B sends request r2 to C.
    • C accepts r1.
    • C processes r1, and in the meantime C cannot read any other request before replying to r1.
    • C replies to r1.
    • C accepts r2, does some processing and replies to it.
    • by doing so a single REP socket can handle multiple connected REQ sockets (the POSIX socket API cannot do that; it requires one socket per connection on the server side).
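
Both walkthroughs in this section can be tried with pyzmq. The following is a sketch under assumptions: everything runs in one thread over inproc, and the endpoint names and reply payloads are invented for illustration. Part 1 connects one REQ socket to two REP servers; part 2 lets two REQ clients talk to one REP socket and shows that a second recv() before the reply fails with EFSM.

```python
import zmq

ctx = zmq.Context()

# Part 1: c1 is connected to s1 and s2; its requests alternate between them.
s1 = ctx.socket(zmq.REP)
s1.bind("inproc://notes-s1")
s2 = ctx.socket(zmq.REP)
s2.bind("inproc://notes-s2")
c1 = ctx.socket(zmq.REQ)
c1.connect("inproc://notes-s1")
c1.connect("inproc://notes-s2")

poller = zmq.Poller()
poller.register(s1, zmq.POLLIN)
poller.register(s2, zmq.POLLIN)

served_by = []
for request in (b"r1", b"r2"):
    c1.send(request)
    ready = dict(poller.poll(1000))
    name, server = next((n, s) for n, s in (("s1", s1), ("s2", s2)) if s in ready)
    server.send(b"reply to " + server.recv())
    c1.recv()                      # REQ must read the reply before the next send()
    served_by.append(name)

# Part 2: clients A and B share one REP socket C.
c = ctx.socket(zmq.REP)
c.bind("inproc://notes-c")
a = ctx.socket(zmq.REQ)
a.connect("inproc://notes-c")
b = ctx.socket(zmq.REQ)
b.connect("inproc://notes-c")

a.send(b"r1")
b.send(b"r2")

first = c.recv()
try:
    c.recv()                       # not allowed: the first request is still unanswered
    second_read_rejected = False
except zmq.ZMQError as err:
    second_read_rejected = err.errno == zmq.EFSM
c.send(b"done " + first)

second = c.recv()
c.send(b"done " + second)

a_reply = a.recv()                 # each reply goes back to the client that asked
b_reply = b.recv()
print(served_by, second_read_rejected, a_reply, b_reply)

for sock in (s1, s2, c1, a, b, c):
    sock.close(0)
ctx.term()
```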

2. identity/envelope of a message

  • why do we need ROUTER/DEALER sockets?

    • they are used to build a proxy.
    • REQ/REP works in a strict request/reply mode, which does not scale when many clients are connected to a single server acting as a proxy: it has to handle simultaneous requests one at a time (recv message, forward message, wait for the reply, forward the reply, and only then handle the next recv). In contrast, ROUTER is not restricted to this model and can perform several recv() calls in a row before replying to any of them.
      • r1 sends to server s0.
      • r2 sends to server s0.
      • s0 forwards r1 to some REP server s1.
      • s0 forwards r2 to some REP server s2. (a REP socket would not allow this, since r1 has not been replied to yet)
      • s0 receives the reply from s1 and forwards it to r1.
      • s0 receives the reply from s2 and forwards it to r2.
  • every ROUTER socket needs to prepend a unique id to the message before handing that message to the application.

    • this id can be set by the incoming socket by calling socket.setsockopt(zmq.IDENTITY, $id) on the client side before connecting.
    • if it is not set by the incoming socket, the ROUTER socket has to choose one for it.
    • if there is an id clash, the incoming connection will be turned down (connection refused).
  • when a REP socket receives a message, it strips ALL the prepended ids up to and including the empty delimiter (and holds them internally), and then hands the message to the caller. A REQ socket only adds the empty delimiter when sending and strips it when receiving.

    • the socket keeps the id(s) it stripped from the message for further use.
    • when the socket needs to send a message back, it prepends the id(s) it holds to the message before sending it out.
  • DEALER distributes outgoing messages among all connected peers in a round-robin way, and fair-queues the messages it receives.[5]

    • no extra info will be added to the message to send.
    • no extra info will be stripped from the received message.
    • DEALER is completely agnostic to all connected clients.
  • a simple example: c1 ---> proxy1(ROUTER/DEALER) ---> proxy2(ROUTER/DEALER) ---> s1

    • c1 sends message: delimiter|msg
    • proxy1's ROUTER socket receives the message delimiter|msg and prepends an id to it: id1|delimiter|msg.
    • proxy1 sends id1|delimiter|msg to proxy2 through its DEALER socket.
    • proxy2's ROUTER socket receives the message id1|delimiter|msg and prepends an id to it: id2|id1|delimiter|msg.
    • proxy2 sends id2|id1|delimiter|msg to s1 through its DEALER socket.
    • s1 receives the message id2|id1|delimiter|msg; it strips all ids and hands "msg" to the caller.

    ==> then s1 tries to reply to the message:

    • before the message is sent out, s1 prepends the ids it holds to the message.
    • every ROUTER strips the first id off the message when sending and uses it to identify which connection to send the message on.
    • that is, proxy2 receives id2|id1|delimiter|rep_msg and sends out id1|delimiter|rep_msg.
    • proxy1 receives id1|delimiter|rep_msg, works the same way and sends out delimiter|rep_msg.
    • c1 receives delimiter|rep_msg and strips the delimiter before handing the message to the caller.
  • router to router is possible, but it is tricky to get working, e.g. router1 connects to router2:

    • r1 connects to r2 successfully.
    • at this point r1 does not know the id of r2, so r1 cannot send a message to r2 (recall that every router needs to strip an id from the message to identify the connection to send it on).
    • r2 can send a message to r1 if and only if r2 SOMEHOW gets to know the id of r1 beforehand.
    • the only way to accomplish step 3 is to let r1 hardcode its identity; r2 can then use this hardcoded id to send a message to r1.
    • when r1 acts as the server, it binds to a local port, so setting r1's identity does not help.
    • the only way for r1 to get the id of r2 is to receive a message from r2 (ROUTER announces the identity of a connection only when receiving a message from the peer).[6]
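
The c1 ---> proxy1 ---> proxy2 ---> s1 walkthrough in this section can be traced frame by frame with pyzmq. This is a sketch under assumptions: the identities id1 and id2 are set explicitly so they match the labels used in the notes (otherwise ROUTER would generate ids), the proxies forward frames by hand in a single thread instead of running zmq.proxy, and the inproc endpoint names are hypothetical. The delimiter shows up as an empty frame.

```python
import zmq

ctx = zmq.Context()

s1 = ctx.socket(zmq.REP)                  # final server
s1.bind("inproc://notes-server")

p2_front = ctx.socket(zmq.ROUTER)         # proxy2
p2_front.bind("inproc://notes-proxy2")
p2_back = ctx.socket(zmq.DEALER)
p2_back.connect("inproc://notes-server")

p1_front = ctx.socket(zmq.ROUTER)         # proxy1
p1_front.bind("inproc://notes-proxy1")
p1_back = ctx.socket(zmq.DEALER)
p1_back.setsockopt(zmq.IDENTITY, b"id2")  # id that proxy2's ROUTER will prepend
p1_back.connect("inproc://notes-proxy2")

c1 = ctx.socket(zmq.REQ)                  # client
c1.setsockopt(zmq.IDENTITY, b"id1")       # id that proxy1's ROUTER will prepend
c1.connect("inproc://notes-proxy1")

# Request path: each ROUTER prepends the id of the connection it read from.
c1.send(b"msg")
at_proxy1 = p1_front.recv_multipart()     # id1 | delimiter | msg
p1_back.send_multipart(at_proxy1)         # DEALER adds and strips nothing
at_proxy2 = p2_front.recv_multipart()     # id2 | id1 | delimiter | msg
p2_back.send_multipart(at_proxy2)
at_s1 = s1.recv()                         # REP keeps the envelope, hands over msg

# Reply path: REP restores the envelope, each ROUTER pops one id to pick a connection.
s1.send(b"rep_msg")
back_at_proxy2 = p2_back.recv_multipart() # id2 | id1 | delimiter | rep_msg
p2_front.send_multipart(back_at_proxy2)
back_at_proxy1 = p1_back.recv_multipart() # id1 | delimiter | rep_msg
p1_front.send_multipart(back_at_proxy1)
reply = c1.recv()                         # REQ strips the delimiter

print(at_proxy1, at_proxy2, at_s1, reply)

for sock in (c1, p1_front, p1_back, p2_front, p2_back, s1):
    sock.close(0)
ctx.term()
```

Replacing s1 with a DEALER socket would show the full id2|id1|delimiter|msg frame list at the server as well, since DEALER does not strip anything.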

3. reliability at client side read

  • the server side uses poll to wait on the socket (or it can issue a blocking read or a read with a timeout); normally no exception is thrown.
  • the client should perform a read with a timeout and retry several times when the timeout occurs. If the read still fails in the end, the client should abort the connection and start a new connection to the server instead.
  • connections are cheap in this regard, and users are encouraged to kill a bad connection and start a new one.[4][7]
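
A rough pyzmq sketch of the client-side timeout and reconnect idea above, in the spirit of the Lazy Pirate pattern from the guide. The helper names request_with_retries and echo_server, the endpoints, the timeout and the retry count are all assumptions made for illustration. Because a REQ socket cannot send again before it has received a reply, each retry here throws the socket away and opens a new one.

```python
import threading
import zmq

def request_with_retries(ctx, endpoint, payload, timeout_ms=200, retries=3):
    """Send payload and wait for a reply; reconnect and resend on every timeout."""
    for _ in range(retries):
        client = ctx.socket(zmq.REQ)
        client.connect(endpoint)
        try:
            client.send(payload)
            if client.poll(timeout_ms):   # a reply arrived within the timeout
                return client.recv()
        finally:
            client.close(0)               # linger 0: drop anything still queued
    return None                           # give up after the last retry

def echo_server(ctx, endpoint, ready, stop):
    """Tiny REP server used only to exercise the success path."""
    rep = ctx.socket(zmq.REP)
    rep.bind(endpoint)
    ready.set()
    while not stop.is_set():
        if rep.poll(50):
            rep.send(rep.recv())
    rep.close(0)

ctx = zmq.Context()
ready, stop = threading.Event(), threading.Event()
server = threading.Thread(target=echo_server,
                          args=(ctx, "inproc://notes-echo", ready, stop))
server.start()
ready.wait()

ok = request_with_retries(ctx, "inproc://notes-echo", b"ping")

# Nothing is expected to listen on this port, so every attempt times out.
missing = request_with_retries(ctx, "tcp://127.0.0.1:1", b"ping", timeout_ms=100)

stop.set()
server.join()
ctx.term()
print(ok, missing)
```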

References:

  0. http://api.zeromq.org/
  1. http://zguide.zeromq.org/
  2. http://pyzmq.readthedocs.io/en/latest/api/
  3. https://news.ycombinator.com/item?id=4161073
  4. http://lucumr.pocoo.org/2012/6/26/disconnects-are-good-for-you/
  5. http://zguide.zeromq.org/page:all#Recap-of-Request-Reply-Sockets
  6. http://zguide.zeromq.org/page:all#Identities-and-Addresses
  7. http://zguide.zeromq.org/page:all#Client-side-Reliability-Lazy-Pirate-Pattern
posted on 2016-09-05 10:21 by twoon