Tornado's "Connection reset by peer" - notes on fixing a Tornado bug
The error as captured on the live server:
[W 130107 15:59:42 iostream:425] Read error on 8: [Errno 104] Connection reset by peer
[W 130107 15:59:42 iostream:359] error on read
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/tornado/iostream.py", line 354, in _handle_read
    if self._read_to_buffer() == 0:
  File "/usr/lib/python2.7/dist-packages/tornado/iostream.py", line 421, in _read_to_buffer
    chunk = self._read_from_socket()
  File "/usr/lib/python2.7/dist-packages/tornado/iostream.py", line 402, in _read_from_socket
    chunk = self.socket.recv(self.read_chunk_size)
error: [Errno 104] Connection reset by peer
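To explain it, we have to start with how a TCP connection ends.
As everyone knows, the normal way to close a TCP connection is the four-way handshake, which goes like this: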
1. (B) --> ACK/FIN --> (A)
2. (B) <-- ACK <-- (A)
3. (B) <-- ACK/FIN <-- (A)
4. (B) --> ACK --> (A)
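The four-way handshake is not the only way to close a TCP connection, though. Sometimes, when a host needs to close the connection as fast as possible (or the connection times out, or the port or host is unreachable), an RST (Reset) packet is sent instead.
Note that because an RST packet is not a required part of a TCP connection, it can be sent on its own (that is, without the ACK flag). Within a normal TCP connection, however, an RST packet may also carry the ACK flag.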
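The difference between the two kinds of close can be observed from the receiving side with plain sockets. The following is a minimal Python 3 sketch, not Tornado code: the helper name close_and_observe is made up for this illustration, and it assumes a Unix-like system where struct linger is two C ints. Setting SO_LINGER with a zero timeout is one common way to make close() send an RST instead of a FIN.

```python
import socket
import struct


def close_and_observe(abortive):
    """Close a loopback client connection and report what the server's recv() sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)  # avoid hanging forever if nothing arrives
    try:
        if abortive:
            # l_onoff=1, l_linger=0: close() discards the connection and sends RST
            client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                              struct.pack("ii", 1, 0))
        client.close()
        try:
            data = server_side.recv(4096)
        except ConnectionResetError:
            # errno ECONNRESET: the peer aborted the connection with RST
            return "reset"
        # An empty result means the peer sent FIN (orderly close)
        return "closed" if data == b"" else "data"
    finally:
        server_side.close()
        listener.close()
```

On Linux, close_and_observe(False) should report "closed" (recv() returns an empty byte string after the FIN), while close_and_observe(True) should report "reset", which is the same [Errno 104] seen in the log above.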
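This is what happens with Internet Explorer: when it closes a connection, it first sends an RST packet to Tornado, and Tornado does not handle it correctly, so the recv() call fails with errno.ECONNRESET.
Chrome and Firefox, on the other hand, close the connection normally with the TCP four-way handshake, so Tornado reports no error for them.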
The offending code is at line 421 of iostream.py. Here is the snippet:
    try:
        chunk = self._read_from_socket()
    except socket.error, e:
        # ssl.SSLError is a subclass of socket.error
        logging.warning("Read error on %d: %s",
                        self.socket.fileno(), e)
        self.close()
        raise
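self._read_from_socket() receives data directly from the socket, but nothing here handles ECONNRESET specially, so the reset is logged as a warning and re-raised as an error.
The error only shows up on Ubuntu 12.10, though, which ships Tornado 2.3.
The latest version on GitHub has already fixed it. The fix landed on October 3, 2012, and here is the commit to prove it: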
commit 3258726fea5bcd1b401907653bc953ce63d5aeb2
Author: Ben Darnell <ben@bendarnell.com>
Date:   Wed Oct 3 22:36:46 2012 -0700

    Reduce log spam from closed client connections.

    Added a bunch of tests for keepalive functionality and fixed two cases
    where we'd log an exception when the client was gone. ECONNRESET
    errors in IOStream reads now just close the connection instead of
    logging an error (the exception information is still available on
    stream.error in the close callback for apps that want it).
    HTTPConnection now also checks for a closed connection and cleans up
    instead of logging an error.

    IOStream now raises a new exception class StreamClosedError instead of
    IOError.
The fixed code now looks like this:
    try:
        chunk = self.read_from_fd()
    except (socket.error, IOError, OSError), e:
        # ssl.SSLError is a subclass of socket.error
        if e.args[0] == errno.ECONNRESET:
            # Treat ECONNRESET as a connection close rather than
            # an error to minimize log spam (the exception will
            # be available on self.error for apps that care).
            self.close(exc_info=True)
            return
        self.close(exc_info=True)
        raise
The fix also improves the error handling: instead of raising IOError directly, IOStream now raises its own exception class, StreamClosedError.
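The idea behind the fix, treating ECONNRESET as an ordinary close rather than as an error, can also be sketched outside Tornado. The following Python 3 function is an illustration only: read_chunk is a hypothetical helper, not part of Tornado's API, and it assumes a blocking socket.

```python
import errno
import socket


def read_chunk(sock, size=4096):
    """Return the bytes read, or None if the peer closed or reset the connection."""
    try:
        chunk = sock.recv(size)
    except OSError as e:
        # In Python 3, socket.error is an alias of OSError
        sock.close()
        if e.errno == errno.ECONNRESET:
            # Peer sent RST: treat it like a normal close, no error raised
            return None
        raise  # any other socket error is still reported to the caller
    if not chunk:
        # Empty read: peer closed gracefully with FIN
        sock.close()
        return None
    return chunk
```

A caller can then loop on read_chunk() until it returns None, without special-casing clients that hang up with an RST.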