IOCP Pitfalls
1. AcceptEx 10061
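A client connects in a loop without sending any data. After a certain number of connections, connecting fails and WSAGetLastError returns 10061 (WSAECONNREFUSED). From then on, no further connection succeeds.
The cause is one of the parameters of AcceptEx. For detailed usage, see the post "IOCP Input/Output Completion Port".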
BOOL AcceptEx(
SOCKET sListenSocket,
SOCKET sAcceptSocket,
PVOID lpOutputBuffer,
DWORD dwReceiveDataLength,
DWORD dwLocalAddressLength,
DWORD dwRemoteAddressLength,
LPDWORD lpdwBytesReceived,
LPOVERLAPPED lpOverlapped
);
dwReceiveDataLength
The number of bytes in lpOutputBuffer that will be used for actual receive data at the beginning of the buffer. This size should not include the size of the local address of the server, nor the remote address of the client; they are appended to the output buffer. If dwReceiveDataLength is zero, accepting the connection will not result in a receive operation. Instead, AcceptEx completes as soon as a connection arrives, without waiting for any data.
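This means that if a non-zero length is set, AcceptEx completes only once the connection has been established and the first block of data has been received. That opens an attack vector: if clients connect but never send data, the sockets the server has posted for accept are used up. In addition, when such a client disconnects, GetQueuedCompletionStatus does not return and there is no notification at all, unless you actively query the connected sockets yourself. Once this happens, clients can no longer connect and get error 10061.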
Solution 1
Following the official Microsoft documentation, set dwReceiveDataLength to 0.
Notes
void GetAcceptExSockaddrs(
PVOID lpOutputBuffer,
DWORD dwReceiveDataLength,
DWORD dwLocalAddressLength,
DWORD dwRemoteAddressLength,
sockaddr **LocalSockaddr,
LPINT LocalSockaddrLength,
sockaddr **RemoteSockaddr,
LPINT RemoteSockaddrLength
);
dwReceiveDataLength
The number of bytes in the buffer used for receiving the first data. This value must be equal to the dwReceiveDataLength parameter that was passed to the AcceptEx function.
dwReceiveDataLength must match the value passed to AcceptEx, so it must also be 0 here.
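// Post the accept. dwReceiveDataLength is 0, so it completes as soon as a connection arrives.
if (!acceptExFunc(
        listensocket,
        iosocket,
        wsabuf.buf,
        0,
        sizeof(SOCKADDR_IN) + 16,
        sizeof(SOCKADDR_IN) + 16,
        &dwbytes,
        &overlapped))
{
    // WSA_IO_PENDING only means the accept is still in progress; anything else is a real error.
    if (WSA_IO_PENDING != WSAGetLastError())
    {
        ret = false;
    }
}

// Later, once the accept has completed, parse the addresses out of the same buffer.
// The second argument must match the dwReceiveDataLength passed to AcceptEx.
getAcceptExSockFunc(
    wsabuf.buf,
    0,
    sizeof(SOCKADDR_IN) + 16,
    sizeof(SOCKADDR_IN) + 16,
    (LPSOCKADDR*)&localaddr,
    &localaddrlen,
    (LPSOCKADDR*)&clientaddr,
    &clientaddrlen);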
This still yields the address information, and it also avoids the attack in which connections never send data.
Solution 2
If a receive buffer is provided, the overlapped operation will not complete until a connection is accepted and data is read. Use the getsockopt function with the SO_CONNECT_TIME option to check whether a connection has been accepted. If it has been accepted, you can determine how long the connection has been established. The return value is the number of seconds that the socket has been connected. If the socket is not connected, the getsockopt returns 0xFFFFFFFF. Applications that check whether the overlapped operation has completed, in combination with the SO_CONNECT_TIME option, can determine that a connection has been accepted but no data has been received. Scrutinizing a connection in this manner enables an application to determine whether connections that have been established for a while have received no data. It is recommended such connections be terminated by closing the accepted socket, which forces the AcceptEx function call to complete with an error.
int getsockopt( SOCKET s, int level, int optname, char *optval, int *optlen );
Parameters
s
A descriptor identifying a socket.
The socket handle.
level
The level at which the option is defined. Example: SOL_SOCKET.
The level (category) that the requested option belongs to.
optname
The socket option for which the value is to be retrieved. Example: SO_ACCEPTCONN. The optname value must be a socket option defined within the specified level, or behavior is undefined.
The socket option to query.
optval
A pointer to the buffer in which the value for the requested option is to be returned.
Pointer to the buffer that receives the option value.
optlen
A pointer to the size, in bytes, of the optval buffer.
The length of the returned data. As this shows, the caller must allocate the buffer above and pass it in.
SO_CONNECT_TIME belongs to the SOL_SOCKET level.
SO_CONNECT_TIME | DWORD | Returns the number of seconds a socket has been connected. This socket option is valid for connection oriented protocols only.
INT seconds;
INT bytes = sizeof(seconds);
int iResult = 0;
iResult = getsockopt(sAcceptSocket, SOL_SOCKET, SO_CONNECT_TIME, (char *)&seconds, (PINT)&bytes);
if (iResult != NO_ERROR) {
    printf("getsockopt(SO_CONNECT_TIME) failed: %u\n", WSAGetLastError());
    exit(1);
}
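The other solution is to keep a non-zero receive length in AcceptEx, periodically call getsockopt to find out how long each socket has been connected, and close the sockets that have sent no data and have timed out. However, this option only has one-second resolution and the design is more complex, so it is not recommended.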
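For reference, the periodic check could look roughly like the sketch below. It is not from the original code: accept_is_stale, sweep_pending_accepts, the pending array, and max_idle_seconds are hypothetical names. The Windows-only part assumes the caller keeps a list of sockets whose AcceptEx is still outstanding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* getsockopt(SO_CONNECT_TIME) reports this value for a socket that is not connected. */
#define CONNECT_TIME_NOT_CONNECTED 0xFFFFFFFFu

/* Hypothetical policy helper: should a socket with a pending AcceptEx be closed? */
static bool accept_is_stale(uint32_t connect_seconds, uint32_t max_idle_seconds)
{
    assert(max_idle_seconds > 0);   /* 0 would drop every client the moment it connects */
    if (connect_seconds == CONNECT_TIME_NOT_CONNECTED)
        return false;               /* no client yet: keep waiting */
    return connect_seconds >= max_idle_seconds;
}

#ifdef _WIN32
#include <winsock2.h>
#include <mswsock.h>                /* SO_CONNECT_TIME */

/* Call from a timer. pending[] is assumed to hold only sockets whose AcceptEx
   has NOT completed, so "connected" here means "connected, but no data yet". */
static void sweep_pending_accepts(SOCKET *pending, int count, uint32_t max_idle_seconds)
{
    for (int i = 0; i < count; ++i) {
        DWORD seconds = 0;
        int len = (int)sizeof(seconds);

        if (pending[i] == INVALID_SOCKET)
            continue;
        if (getsockopt(pending[i], SOL_SOCKET, SO_CONNECT_TIME,
                       (char *)&seconds, &len) == SOCKET_ERROR)
            continue;               /* could not query: leave the socket alone */

        if (accept_is_stale((uint32_t)seconds, max_idle_seconds)) {
            /* Closing the accept socket forces the pending AcceptEx to
               complete with an error, as the documentation above describes. */
            closesocket(pending[i]);
            pending[i] = INVALID_SOCKET;
        }
    }
}
#endif
```

After a socket is closed this way, the failed AcceptEx completion still arrives on the completion port. The handler for it would presumably post a fresh AcceptEx with a new socket so that the pool of pending accepts does not shrink.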
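2. Handling WSARecv/WSASend returning immediately
Reference: https://tboox.org/cn/2018/08/16/coroutine-iocp-some-issues/
During development I had read that these two functions may complete immediately, but I could not find how to handle that case, and the official documentation does not call it out, so I did not pay much attention to it. Still, I kept wondering how an immediate return should be handled, and whether the result then no longer needs to go through the worker thread.
Whether or not the immediate result is handled at the call site, the system still delivers the completion through GetQueuedCompletionStatus. So either do not handle it at the call site at all, or do what the author of the referenced article describes: set a flag, and have the worker thread ignore completions that were already handled.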
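The flag approach can be sketched as follows. This is only an illustration, not the referenced author's code: io_context, on_immediate_success, on_completion_packet, and consume are hypothetical names, and consume stands in for the real application logic. It assumes the thread that posts the I/O is also the thread that dequeues completions, as in a single-threaded scheduler. With separate worker threads the flag would need synchronization, because the completion packet may be dequeued before the flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef _WIN32
#include <winsock2.h>
#else
/* Stand-in so the sketch also compiles outside Windows. */
typedef struct { int unused; } WSAOVERLAPPED;
#endif

/* Hypothetical per-request context handed to WSARecv/WSASend. */
typedef struct io_context {
    WSAOVERLAPPED ov;          /* kept first so the OVERLAPPED pointer can be cast back */
    bool          handled_inline;
    unsigned long bytes_seen;  /* stand-in for real application state */
} io_context;

/* Stand-in for the application's handling of a finished transfer. */
static void consume(io_context *ctx, unsigned long bytes)
{
    ctx->bytes_seen += bytes;
}

/* Call right after WSARecv/WSASend returned 0, i.e. completed immediately. */
static void on_immediate_success(io_context *ctx, unsigned long bytes)
{
    assert(ctx != NULL);
    consume(ctx, bytes);
    ctx->handled_inline = true;   /* the queued completion packet must be skipped */
}

/* Call for every packet dequeued with GetQueuedCompletionStatus.
   Returns true if the packet was processed here, false if it was skipped. */
static bool on_completion_packet(io_context *ctx, unsigned long bytes)
{
    assert(ctx != NULL);
    if (ctx->handled_inline) {
        ctx->handled_inline = false;   /* reset so the context can be reused */
        return false;
    }
    consume(ctx, bytes);
    return true;
}
```

The design choice here is to keep the flag inside the per-request context, so the worker can recover it from the OVERLAPPED pointer it receives without any extra lookup table.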
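3. GetQueuedCompletionStatusEx is slow for single requests
Reference: https://tboox.org/cn/2018/08/16/coroutine-iocp-some-issues/
GetQueuedCompletionStatusEx can dequeue several completions in one call, which reduces the number of calls and thread switches and improves efficiency. However, in the author's tests it was actually slower than GetQueuedCompletionStatus when there is only a single I/O request. The exact cause is unclear. Following the author's approach, choose which function to call dynamically, based on the recent I/O requests.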
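One possible way to make that choice is sketched below. The heuristic is an assumption, not necessarily what the referenced author does: it counts outstanding requests and uses the batch call only when more than one is in flight. wait_policy, policy_on_post, policy_on_complete, policy_use_batch, and wait_once are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping: overlapped requests posted but not yet dequeued. */
typedef struct wait_policy {
    unsigned outstanding;
} wait_policy;

/* Call after each successfully posted WSARecv/WSASend/AcceptEx. */
static void policy_on_post(wait_policy *p)
{
    p->outstanding++;
}

/* Call after dequeuing n completions; clamps so stray packets cannot underflow. */
static void policy_on_complete(wait_policy *p, unsigned n)
{
    p->outstanding = (n > p->outstanding) ? 0 : p->outstanding - n;
}

/* Assumed heuristic: batching only pays off with more than one request in flight. */
static bool policy_use_batch(const wait_policy *p)
{
    return p->outstanding > 1;
}

#ifdef _WIN32
#include <windows.h>

/* Waits once and returns the number of entries filled in (0 on timeout or error). */
static unsigned wait_once(HANDLE port, wait_policy *p,
                          OVERLAPPED_ENTRY *entries, ULONG capacity, DWORD timeout_ms)
{
    ULONG n = 0;
    assert(capacity >= 1);

    if (policy_use_batch(p)) {
        if (!GetQueuedCompletionStatusEx(port, entries, capacity, &n, timeout_ms, FALSE))
            return 0;
    } else {
        DWORD bytes = 0;
        ULONG_PTR key = 0;
        LPOVERLAPPED ov = NULL;
        BOOL ok = GetQueuedCompletionStatus(port, &bytes, &key, &ov, timeout_ms);
        if (!ok && ov == NULL)
            return 0;               /* timeout or port error: nothing was dequeued */
        /* Present the single result in the same shape as the batch call. */
        entries[0].lpCompletionKey = key;
        entries[0].lpOverlapped = ov;
        entries[0].Internal = ov ? ov->Internal : 0;
        entries[0].dwNumberOfBytesTransferred = bytes;
        n = 1;
    }
    policy_on_complete(p, (unsigned)n);
    return (unsigned)n;
}
#endif
```

Folding the single result into an OVERLAPPED_ENTRY keeps the caller's dispatch loop identical for both paths. The threshold of one outstanding request is a starting point to be tuned by measurement.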
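4. Cancelling I/O requests issued by other threads
Reference: https://tboox.org/cn/2018/08/16/coroutine-iocp-some-issues/
CancelIo can only cancel I/O requests that were issued by the calling thread. To cancel I/O requests issued by other threads, use CancelIoEx.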