fbird


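Problem description:

Two machines communicate over a socket. While the socket server is reading data (especially with asynchronous reads), the socket client's network cable is unplugged. The socket server then keeps waiting and is not notified in time, even though the socket client gets an exception promptly.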

 

Cause:

http://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html

The article above describes the cause in detail and offers solutions. The relevant passage:

There is a three-way handshake to open a TCP/IP connection, and a four-way handshake to close it. However, once the connection has been established, if neither side sends any data, then no packets are sent over the connection. TCP is an “idle” protocol, happy to assume that the connection is active until proven otherwise.

 

TCP was designed this way for resiliency and efficiency. This design enables a graceful recovery from unplugged network cables and router crashes. e.g., a client may connect to a server, an intermediate router may be rebooted, and after the router comes back up, the original connection still exists (this is true unless data is sent across the connection while the router was down). This design is also efficient, since no “polling” packets are sent across the network just to check if the connection is still OK (reduces unnecessary network traffic).

 

TCP does have acknowledgments for data, so when one side sends data to the other side, it will receive an acknowledgment if the connection is still active (or an error if it is not). Thus, broken connections can be detected by sending out data. It is important to note that the act of receiving data is completely passive in TCP; a socket that only reads cannot detect a dropped connection.

 

This leads to a scenario known as a “half-open connection”. At any given point in most protocols, one side is expected to send a message and the other side is expecting to receive it. Consider what happens if an intermediate router is suddenly rebooted at that point: the receiving side will continue waiting for the message to arrive; the sending side will send its data, and receive an error indicating the connection was lost. Since broken connections can only be detected by sending data, the receiving side will wait forever. This scenario is called a “half-open connection” because one side realizes the connection was lost but the other side believes it is still active.

In short, this is a side effect of a reasonable design: the other side of the coin.
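The quoted passage says a broken connection can only be detected by sending data. A minimal sketch of that idea follows, assuming the application protocol tolerates a small heartbeat message. The `HeartbeatProbe` class and the heartbeat payload are hypothetical names introduced here for illustration, not something from the linked article.

```csharp
using System;
using System.Net.Sockets;

public static class HeartbeatProbe
{
    // Tries to send a small heartbeat message on a connected socket.
    // Returns false when the send fails, which indicates the local stack
    // already knows the connection is gone (or the socket was closed).
    // Returning true only means the data was handed to the local TCP stack;
    // a broken link may surface as an error on a later send.
    public static bool TrySend(Socket socket, byte[] heartbeat)
    {
        try
        {
            socket.Send(heartbeat);
            return true;
        }
        catch (SocketException)
        {
            return false;
        }
        catch (ObjectDisposedException)
        {
            return false;
        }
    }
}
```

Calling something like this on a timer from the side that normally only reads gives it a chance to notice the failure. The keep-alive approach described below does the same job inside the TCP stack, without changing the application protocol.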

 

Solution:

The link above already recommends solutions; the concrete implementation details follow.

For the full code, see: http://www.cnblogs.com/wzd24/archive/2007/05/22/755050.html

 

Socket.IOControl method:  https://msdn.microsoft.com/zh-cn/library/8a3744sh(v=vs.110).aspx 

IOControlCode.KeepAliveValues: https://msdn.microsoft.com/en-us/library/system.net.sockets.iocontrolcode.aspx

 

Syntax
C#

public int IOControl ( 
IOControlCode ioControlCode, 
byte[] optionInValue, 
byte[] optionOutValue 
) 

 

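Parameters
ioControlCode
An IOControlCode value that specifies the control code of the operation to perform.

optionInValue
An array of type Byte that contains the input data required by the operation.

optionOutValue
An array of type Byte that contains the output data returned by the operation.

Return value
The number of bytes in the optionOutValue parameter.

For example: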

socket.IOControl(IOControlCode.KeepAliveValues, inOptionValues, null);

What we need to understand is how inOptionValues is defined. In C++ it is a struct, so let's look at that struct:

struct tcp_keepalive 
{ 
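    u_long  onoff; // whether keep-alive is enabled (non-zero = on)
    u_long  keepalivetime; // idle time before the first probe is sent (milliseconds)
    u_long  keepaliveinterval; // interval between successive probes (milliseconds)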
}; 
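On Windows, `u_long` is a 32-bit unsigned integer, so the struct occupies 12 bytes. As an illustration of that layout, here is a sketch of a C# mirror of the struct that is marshalled into a byte array. The `TcpKeepAlive` type and its `ToBytes` method are hypothetical names introduced here and are not part of the .NET API.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical C# mirror of the native tcp_keepalive struct:
// three consecutive 32-bit unsigned integers, 12 bytes in total.
[StructLayout(LayoutKind.Sequential)]
public struct TcpKeepAlive
{
    public uint OnOff;              // non-zero enables keep-alive
    public uint KeepAliveTime;      // idle time before the first probe (ms)
    public uint KeepAliveInterval;  // interval between probes (ms)

    // Copies the struct into a managed byte array using the marshaller,
    // so the bytes follow the native field order and machine byte order.
    public byte[] ToBytes()
    {
        int size = Marshal.SizeOf(typeof(TcpKeepAlive));
        byte[] buffer = new byte[size];
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(this, ptr, false);
            Marshal.Copy(ptr, buffer, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
        return buffer;
    }
}
```

This produces the same 12 bytes as filling the array by hand with `BitConverter`; which form reads better is a matter of taste.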

In C#, we simply pass a Byte array to the function:

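// Requires: using System; using System.Runtime.InteropServices;
uint dummy = 0;
// Three 32-bit fields: onoff, keepalivetime, keepaliveinterval
byte[] inOptionValues = new byte[Marshal.SizeOf(dummy) * 3];
BitConverter.GetBytes((uint)1).CopyTo(inOptionValues, 0); // enable keep-alive
BitConverter.GetBytes((uint)5000).CopyTo(inOptionValues, Marshal.SizeOf(dummy)); // idle time before the first probe (ms)
BitConverter.GetBytes((uint)5000).CopyTo(inOptionValues, Marshal.SizeOf(dummy) * 2); // interval between probes (ms)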
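Putting the pieces together, the following is a minimal sketch of a helper that builds the buffer and applies it to a socket. The `KeepAliveHelper` class and its method names are hypothetical and were chosen here for illustration. It also assumes a Windows host, since to my knowledge `IOControlCode.KeepAliveValues` maps to a Winsock-specific control code and may not be supported on other platforms.

```csharp
using System;
using System.Net.Sockets;

public static class KeepAliveHelper
{
    // Builds the 12-byte input buffer: onoff, keepalivetime, keepaliveinterval,
    // each a 32-bit unsigned integer in machine byte order.
    public static byte[] BuildKeepAliveValues(bool enabled, uint timeMs, uint intervalMs)
    {
        byte[] values = new byte[sizeof(uint) * 3];
        BitConverter.GetBytes(enabled ? 1u : 0u).CopyTo(values, 0);
        BitConverter.GetBytes(timeMs).CopyTo(values, sizeof(uint));
        BitConverter.GetBytes(intervalMs).CopyTo(values, sizeof(uint) * 2);
        return values;
    }

    // Enables TCP keep-alive probing on the given socket.
    // Assumption: running on Windows, where this control code is supported.
    public static void Enable(Socket socket, uint timeMs, uint intervalMs)
    {
        byte[] inOptionValues = BuildKeepAliveValues(true, timeMs, intervalMs);
        socket.IOControl(IOControlCode.KeepAliveValues, inOptionValues, null);
    }
}
```

A server would call something like `KeepAliveHelper.Enable(acceptedSocket, 5000, 5000);` right after accepting a connection. Once the probes go unanswered, a pending read on that socket should complete with an error instead of waiting indefinitely.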

 

Posted on 2016-03-04 10:22 by PonyTan