WCF Reliable Sessions Puzzle (Solved: System.ServiceModel.Channels.ServiceChannel)
As I mentioned in my previous post, I have spent a few days very puzzled by a behaviour in WCF reliable messaging/sessions.
The problem starts because, as documented here, creating WCF clients (the classes generated by svcutil.exe) is quite expensive due to the rich set of functionality implemented in WCF. You can learn more about the client model here. To overcome the problem, there are basically two options:
- Create a ChannelFactory instance and keep it somewhere in the application context, so you don't have to initialize the factory every time you need a client. You then use this factory to create client channels directly (see the sketch just after this list);
- Create a client instance and cache that to be reused for multiple calls.
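To illustrate option 1, a cached factory could look something like the sketch below. This isn't code from my application, just the general shape of the idea, using the IServiceContract contract and the client endpoint configuration name that appear later in this post:

using System.ServiceModel;

// Option 1 sketch: pay the ChannelFactory construction cost once and keep
// the factory around; channels created from it afterwards are cheap.
public static class ServiceClientFactory
{
    // Built once for the lifetime of the application, using the client
    // endpoint configuration name from the config file shown further down.
    private static readonly ChannelFactory<IServiceContract> factory =
        new ChannelFactory<IServiceContract>("NetTcpBinding_IServiceContract");

    public static IServiceContract CreateChannel()
    {
        // Each call hands back a new, lightweight channel to the service.
        return factory.CreateChannel();
    }
}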
Ultimately, option 2 offers the best performance, as you go through the expense of instantiating the client only once. I then wrote a pool class to manage a number of client instances, and all was nice and cool until the clients' connections started to be closed by the WCF host due to inactivity (I'm using the NetTcp binding). As my application is the only one to use my WCF service and I can control the number of client connections open to the host at any one time, I just wanted to keep these connections alive for as long as the client application lived, so I thought about using reliable sessions for that. Basically, as documented here, reliable sessions are kept alive by the WCF infrastructure indefinitely, so that's good for what I need.
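The pool class I mentioned is more elaborate than this, but just to give an idea of what I mean, here is a simplified sketch with made-up names, using the svcutil-generated ServiceContractClient that shows up later in the post:

using System.Collections.Generic;
using System.ServiceModel;

// Option 2 sketch: hand out cached client instances and take them back when
// the caller is done, so the expensive client creation happens only once.
public class ClientPool
{
    private readonly Queue<ServiceContractClient> clients = new Queue<ServiceContractClient>();
    private readonly object padlock = new object();

    public ServiceContractClient Acquire()
    {
        lock (padlock)
        {
            // Reuse a pooled client if one is available; otherwise create one.
            return clients.Count > 0 ? clients.Dequeue() : new ServiceContractClient();
        }
    }

    public void Release(ServiceContractClient client)
    {
        lock (padlock)
        {
            // Only put the client back if its channel is still usable.
            if (client.State == CommunicationState.Opened)
                clients.Enqueue(client);
            else
                client.Abort();
        }
    }
}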
The weird thing was that after enabling reliable sessions in my app, the behaviour of the host didn't change: it would still drop my connections after 10 minutes and everything would go bad. I then wrote some simple code to try to replicate the problem, which can be seen below:
using System;
using System.ServiceModel;

namespace Service
{
    [ServiceContract(SessionMode = SessionMode.Allowed,
        Namespace = "http://tempuri.org/Services/IServiceContract")]
    public interface IServiceContract
    {
        [OperationContract]
        string DoSomething(string arg);
    }

    [ServiceBehavior(Namespace = "http://tempuri.org/Services/ServiceImplementation")]
    public class ServiceImplementation : IServiceContract
    {
        #region IServiceContract Members

        public string DoSomething(string arg)
        {
            return string.Format("The string passed in was '{0}'.", arg);
        }

        #endregion
    }

    class Program
    {
        static void Main(string[] args)
        {
            using (ServiceHost serviceHost = new ServiceHost(typeof(ServiceImplementation)))
            {
                serviceHost.Open();
                Console.WriteLine("Service running. Press any key to exit.");
                Console.Read();
                serviceHost.Close();
            }
        }
    }
}
The configuration file for the service looks like the following:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <system.serviceModel>
    <bindings>
      <netTcpBinding>
        <binding name="NetTcp_Reliable">
          <reliableSession ordered="false" inactivityTimeout="00:10:00"
                           enabled="true" />
        </binding>
      </netTcpBinding>
    </bindings>
    <behaviors>
      <serviceBehaviors>
        <behavior name="MetadataBehavior">
          <serviceMetadata httpGetEnabled="true" httpGetUrl="http://localhost:2526/Service" />
        </behavior>
      </serviceBehaviors>
    </behaviors>
    <services>
      <service behaviorConfiguration="MetadataBehavior" name="Service.ServiceImplementation">
        <endpoint address="net.tcp://localhost:2525/Service" binding="netTcpBinding"
                  bindingConfiguration="NetTcp_Reliable" contract="Service.IServiceContract" />
      </service>
    </services>
  </system.serviceModel>
</configuration>
For the client application I used svcutil.exe to generate the WCF client and the configuration file. The client is pretty ordinary and I haven't changed anything, so I will spare you the boredom of reading it, but the config file looked like this:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.serviceModel>
    <bindings>
      <netTcpBinding>
        <binding name="NetTcpBinding_IServiceContract" closeTimeout="00:01:00"
                 openTimeout="00:01:00" receiveTimeout="00:10:00" sendTimeout="00:01:00"
                 transactionFlow="false" transferMode="Buffered" transactionProtocol="OleTransactions"
                 hostNameComparisonMode="StrongWildcard" listenBacklog="10"
                 maxBufferPoolSize="524288" maxBufferSize="65536" maxConnections="10"
                 maxReceivedMessageSize="65536">
          <readerQuotas maxDepth="32" maxStringContentLength="8192" maxArrayLength="16384"
                        maxBytesPerRead="4096" maxNameTableCharCount="16384" />
          <reliableSession ordered="false" inactivityTimeout="00:10:00"
                           enabled="true" />
          <security mode="Transport">
            <transport clientCredentialType="Windows" protectionLevel="EncryptAndSign" />
            <message clientCredentialType="Windows" />
          </security>
        </binding>
      </netTcpBinding>
    </bindings>
    <client>
      <endpoint address="net.tcp://localhost:2525/Service" binding="netTcpBinding"
                bindingConfiguration="NetTcpBinding_IServiceContract" contract="IServiceContract"
                name="NetTcpBinding_IServiceContract">
      </endpoint>
    </client>
  </system.serviceModel>
</configuration>
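(For reference, the proxy class and the configuration above were generated by pointing svcutil.exe at the metadata address configured on the service; the output file names below are just placeholders:)

svcutil.exe http://localhost:2526/Service?wsdl /out:ServiceContractClient.cs /config:App.config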
These settings are pretty much the defaults for the NetTcp binding, except for the reliableSession element, which enables reliable messaging (RM) and disables message ordering. The console application that uses the WCF client looks like the following:
using System;

namespace Client
{
    class Program
    {
        static void Main(string[] args)
        {
            ServiceContractClient client = new ServiceContractClient();
            client.InnerChannel.Faulted += new EventHandler(InnerChannel_Faulted);

            Console.WriteLine("The call was made at {0} and the response was '{1}'", DateTime.Now, client.DoSomething("First Call"));
            Console.WriteLine("Press any key to call the server again!");
            Console.Read();

            Console.WriteLine("The call was made at {0} and the response was '{1}'", DateTime.Now, client.DoSomething("Second Call"));
            Console.Read();
        }

        static void InnerChannel_Faulted(object sender, EventArgs e)
        {
            Console.WriteLine("The channel faulted!");
        }
    }
}
The above console application let me hit the service once, leave it idle for as long as I wanted, and then hit the service again by pressing a key. It also handles the Faulted event on the client's inner channel. The behaviour I was expecting was that regardless of how long I left the application idle after the first call to the service, when I pressed a key I would still be able to call the service again. What happened instead is that after 10 minutes of the application sitting idle, the Faulted event of the channel would be raised, telling me that the connection was gone.
My first thought was to increase the InactivityTimeout on the reliable session configuration, on both client and server, to 20 minutes instead of 10 and see what happened. When running the application, the channel would still fault after 10 minutes.
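In configuration terms, that attempt amounted to nothing more than this change, on both sides:

<reliableSession ordered="false" inactivityTimeout="00:20:00" enabled="true" />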
After posting in the WCF forums I was pointed to the ReceiveTimeout in the binding configuration. After reading the extensive (NOT) documentation available for this property, I was led to believe that it was the time allowed for a receive operation (i.e. the time the client takes to send a full message stream) to complete, but in reality this property is the time the service allows between client calls (poor, poor documentation, Microsoft). Incidentally, the default ReceiveTimeout for the NetTcp binding is 10 minutes, which matches the 10-minute fault exactly. Even so, my understanding was that the infrastructure messages exchanged by RM to keep the session alive would reset this receive timeout and keep my connection alive, but that evidently wasn't happening.
We then contacted Microsoft PSS, and after a while they confirmed that the intended behaviour is indeed that the infrastructure messages keep the connection alive, but something in the receive timeout behaviour was changed close to RTM, and the end result is that this setting overrides everything else (all other timeouts and session settings) in WCF. It is recognised within Microsoft as a bug, which is good: they know about it... The bad news is that they don't plan to fix it before WCF version 2.0, as they believe there is a viable workaround. The workaround is to set the receive timeout to an extremely long time span (such as System.TimeSpan.MaxValue) or to the infinite keyword (which behind the scenes uses System.TimeSpan.MaxValue), and that produces the expected behaviour.
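Applied to the service binding above (and the same goes for the client binding), the workaround looks something like this; setting binding.ReceiveTimeout = TimeSpan.MaxValue in code achieves the same thing:

<binding name="NetTcp_Reliable" receiveTimeout="infinite">
  <!-- "infinite" maps to TimeSpan.MaxValue behind the scenes; an explicit
       value such as "10675199.02:48:05.4775807" (TimeSpan.MaxValue) also works. -->
  <reliableSession ordered="false" inactivityTimeout="00:10:00" enabled="true" />
</binding>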
Moral of the story: if you ever use WCF reliable messaging and need the session to be kept alive indefinitely, you know what to do: set the receive timeout to something very, very big.