Understanding and Using the zmq Module (Part 2)
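1. Problem description
In the previous article (Understanding and Using the zmq Module), I described the three messaging patterns zmq offers. At work I mainly use the PUB-SUB pattern. The typical setup is one server and several clients:
server: reads a video stream, runs object detection and tracking on every frame, and broadcasts the detection and tracking results through a PUB socket
client: there are several clients; each uses a SUB socket to receive the data from the PUB side and then processes it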
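I recently ran into a problem: a client sometimes stops receiving data from the server, and the program does not catch any exception. From what I found online, the cause is an unstable network. The TCP connection underneath zmq can die without either side being notified of the disconnect, so the SUB socket still considers the connection alive and zmq's automatic reconnection is never triggered. There are two kinds of solutions:
1. Use the TCP keepalive options that zmq provides
2. Implement a heartbeat yourself and reconnect on timeout
After comparing the two, I decided to go with the heartbeat approach, because it is more flexible and more reliable.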
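For comparison, option 1 can be sketched as follows. This is a minimal sketch, not the approach used in the rest of this article. The function name make_keepalive_sub and the timing values are assumptions chosen for illustration; the four socket options are the pyzmq constants for libzmq's TCP keepalive settings, and how they behave in practice depends on the operating system.

```python
import zmq


def make_keepalive_sub(ctx, endpoint, topics, idle=60, interval=10, count=3):
    """Create a SUB socket with TCP keepalive enabled (illustrative values)."""
    sock = ctx.socket(zmq.SUB)
    # Set the options before connect() so they apply to the new connection.
    sock.setsockopt(zmq.TCP_KEEPALIVE, 1)               # turn keepalive on
    sock.setsockopt(zmq.TCP_KEEPALIVE_IDLE, idle)       # idle seconds before the first probe
    sock.setsockopt(zmq.TCP_KEEPALIVE_INTVL, interval)  # seconds between probes
    sock.setsockopt(zmq.TCP_KEEPALIVE_CNT, count)       # unanswered probes before the link is dropped
    for topic in topics:
        sock.subscribe(topic)
    sock.connect(endpoint)
    return sock


if __name__ == "__main__":
    ctx = zmq.Context()
    sub = make_keepalive_sub(ctx, "tcp://127.0.0.1:4488", ["detection"])
    print("keepalive enabled:", sub.getsockopt(zmq.TCP_KEEPALIVE) == 1)
    sub.close(linger=0)
    ctx.term()
```

With these options the operating system probes an idle connection and closes it when the probes go unanswered, which lets zmq notice the failure and reconnect. The application itself still gets no signal, which is one reason an application-level heartbeat is easier to monitor and tune.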
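2. Testing the solution
I wrote a simple simulation and plan to run it for a few days to see how it behaves.
server:
The server publishes on two topics: one carries heartbeat messages (one per second), the other carries the business data (object detection results). Sample server code: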
import random
import time

import zmq


def start_server(topics, url, port):
    ctx = zmq.Context()
    send_socket = ctx.socket(zmq.PUB)
    responseUrl = "tcp://{}:{}".format(url, port)
    print("bind to: {}".format(responseUrl))
    send_socket.bind(responseUrl)
    last_heartbeat = time.time()
    i = 1
    while True:
        # Send a heartbeat message once per second
        if time.time() - last_heartbeat > 1:
            send_socket.send_multipart([topics[1].encode("utf-8"), b'heartbeat'])
            last_heartbeat = time.time()
            print(i, "send heartbeat")
        # Send detection data with some probability, to simulate video object detection
        if random.random() < 0.2:
            detection_message = "message{} for {}".format(i, topics[0])
            send_socket.send_multipart([topics[0].encode("utf-8"), detection_message.encode("utf-8")])
            print(i, "send detection_message")
        i += 1
        time.sleep(0.5)


if __name__ == "__main__":
    topics = ['detection', 'heartbeat']
    # 127.0.0.1 only accepts clients on the same machine; bind to the
    # server's LAN address (or 0.0.0.0) when clients run on other hosts
    url = "127.0.0.1"
    port = 4488
    start_server(topics, url, port)
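client:
The client subscribes to both server topics, so it receives the heartbeat messages and the business data on the same socket. On every loop iteration it checks how long it has been since the last heartbeat arrived, and reconnects if the heartbeat has timed out. Sample client code: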
import time

import zmq


def start_client1(topics, url, port):
    ctx = zmq.Context()
    recv_socket = ctx.socket(zmq.SUB)
    requestUrl = "tcp://{}:{}".format(url, port)
    print("connect to: {}".format(requestUrl))
    recv_socket.connect(requestUrl)
    for topic in topics:
        recv_socket.subscribe(topic)
    last_heartbeat = 0
    while True:
        # Reconnect if no heartbeat has arrived from the server for 30 seconds
        if last_heartbeat != 0 and time.time() - last_heartbeat > 30:
            recv_socket.disconnect(requestUrl)
            recv_socket.connect(requestUrl)
            for topic in topics:
                recv_socket.subscribe(topic)
            print("Reconnect pub server")
            # Retry every 2 seconds; reconnecting too often prevents data from being received
            time.sleep(2)
        try:
            # Non-blocking receive: raises zmq.Again when no message is waiting
            data = recv_socket.recv_multipart(flags=zmq.NOBLOCK)
            datatopic = data[0].decode()
            if datatopic.startswith("heartbeat"):
                # A heartbeat arrived: update the time it was last seen
                last_heartbeat = time.time()
            print("receive message: ", data[1].decode("utf-8"))
        except zmq.Again:
            # Nothing to read; sleep briefly so the loop does not spin at full CPU
            time.sleep(0.01)


if __name__ == "__main__":
    topics = ['detection', 'heartbeat']
    url = "192.168.2.139"
    port = 4488
    start_client1(topics, url, port)
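The timeout bookkeeping in the client can also be pulled out of the receive loop. The sketch below is a hypothetical helper (HeartbeatMonitor is not part of pyzmq) that uses only the standard library. It differs from the client above in two ways, both of which are my assumptions rather than tested behaviour: the timer starts when the monitor is created, so a server that never sends a first heartbeat is also detected, and the 2-second retry spacing is tracked with timestamps instead of time.sleep, so the loop can keep receiving between retries.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat and decides when a reconnect is due.

    Hypothetical helper for illustration; not part of pyzmq.
    """

    def __init__(self, timeout=30.0, retry_interval=2.0, clock=time.monotonic):
        self.timeout = timeout                # seconds without a heartbeat before reconnecting
        self.retry_interval = retry_interval  # minimum seconds between reconnect attempts
        self.clock = clock                    # injectable so the logic can be tested
        # Start timing immediately, so a server that never answers is detected too.
        self.last_seen = clock()
        self.last_retry = None

    def beat(self):
        """Record that a heartbeat message was just received."""
        self.last_seen = self.clock()
        self.last_retry = None

    def should_reconnect(self):
        """Return True when the heartbeat has timed out and a retry is due."""
        now = self.clock()
        if now - self.last_seen <= self.timeout:
            return False
        if self.last_retry is not None and now - self.last_retry < self.retry_interval:
            return False  # a reconnect was attempted too recently
        self.last_retry = now
        return True


if __name__ == "__main__":
    fake_now = [0.0]
    monitor = HeartbeatMonitor(clock=lambda: fake_now[0])
    fake_now[0] = 31.0
    print(monitor.should_reconnect())  # heartbeat overdue, first retry
    monitor.beat()
    print(monitor.should_reconnect())  # heartbeat just seen
```

In the client loop, monitor.beat() would go where last_heartbeat is updated, and "if monitor.should_reconnect():" would replace the 30-second check, with the disconnect and connect calls left as they are.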
References:
Reference 1: (PUB/SUB) Sub Silent Disconnect on Unreliable Connection · Issue #1199 · zeromq/libzmq · GitHub
Reference 2: https://blog.csdn.net/bubbleyang/article/details/107559224
Reference 3: https://blog.csdn.net/sinat_36265222/article/details/107252069