DotNet Core Threadpool
Jai Rathore
https://medium.com/@jaiadityarathore/dotnet-core-threadpool-bef2f5a37888
If you are reading this, chances are your .NET Core app is running into performance problems as it scales, and you suspect thread starvation is one of the causes. Let's look at how the .NET thread pool works.
The thread pool is the .NET runtime's built-in, software-level thread management system. It is also the queuing mechanism for all incoming requests: .NET Core has no separate request queue besides the thread pool.
At the hardware level, a CPU typically runs one thread per core at a time. For most of our applications, and for the rest of this post, we will assume a 16-core processor, so 16 threads can run truly in parallel. Ideally, the number of threads our app spawns stays at or around that number. This may be a little confusing, but stay with me.
In a fully asynchronous application, the .NET thread pool works roughly like this: an incoming request is handed to a pool thread, which runs the code up to the first async call, saves the current context, and returns itself to the global pool as an available thread. When that async call completes, or when a new request comes in, the thread pool checks whether all 16 threads are already busy. If they are, it spawns a new thread; if not, it picks one of the available threads from the pool.
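To make the flow above concrete, here is a minimal sketch that simulates 50 "requests", each awaiting a 100 ms delay in place of a real async call. `HandleRequestAsync` and the request count are made up for illustration, and the code is written as C# 9 top-level statements (on .NET Core 3.x, wrap it in an `async Task Main`).

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical request handler: a little work, then a non-blocking wait.
async Task<(int Before, int After)> HandleRequestAsync()
{
    int before = Environment.CurrentManagedThreadId; // thread that starts the request
    await Task.Delay(100);                           // stands in for an async DB or HTTP call
    int after = Environment.CurrentManagedThreadId;  // thread that resumes it (may differ)
    return (before, after);
}

const int RequestCount = 50;

// Start all "requests" on the thread pool, as a web server would.
var requests = Enumerable.Range(0, RequestCount)
    .Select(_ => Task.Run(() => HandleRequestAsync()))
    .ToArray();

var results = await Task.WhenAll(requests);

// Count how many different threads were involved in serving all requests.
int distinctThreads = results
    .SelectMany(r => new[] { r.Before, r.After })
    .Distinct()
    .Count();

Console.WriteLine($"{RequestCount} requests were served by {distinctThreads} distinct threads.");
```

Because each thread goes back to the pool at the `await`, the number of distinct threads should normally come out well below the number of requests, although the exact figure depends on the machine.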
Synchronous vs Asynchronous request processing
However, if some of the calls in the application are blocking (not asynchronous), the thread that ran the first part of the operation can neither return to the global pool nor be retired. It stays blocked until the entire operation completes. If all 16 threads get stuck this way at the same moment, the thread pool has to spawn a new thread for every incoming request.
If a burst of 1000 requests arrives at this point, the thread pool would have to spawn a thread for every one of them, which adds up to a lot of threads. But the CPU can still only run 16 threads at any given moment. Creating, destroying, and context-switching threads is expensive, and every thread consumes memory just by existing. Enough of them can bring a system to a halt. We do not want that many threads running.
To avoid this, the .NET thread pool has a throttling mechanism: it spawns new threads at a rate of about one every 0.5 seconds. So in the worst case, when the application has many blocking calls, every thread is stuck on one of them, and a burst of 1000 requests arrives, a new or half-processed request can wait up to 1000 * 0.5 = 500 seconds. For example, an operation that reads from the database may already have its data back, yet wait up to 500 seconds for a thread to hand that data to the application. At the application level this shows up as a slow query, even though the query itself ran quickly; thread availability was the real issue. This problem is commonly called thread starvation.
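The effect can be observed with a small experiment. The sketch below queues a few more blocking work items than the pool's minimum thread count and records how long each one sat in the queue before a thread picked it up. The 2-second `Thread.Sleep` and the "minimum + 4" count are arbitrary choices for the demo, and the measured delays will vary by machine and runtime version. It is written as C# 9 top-level statements.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

ThreadPool.GetMinThreads(out int minWorkers, out _);

// Queue a few more blocking work items than the pool's minimum thread count.
int workItems = minWorkers + 4;
long start = Stopwatch.GetTimestamp();

var tasks = new Task<double>[workItems];
for (int i = 0; i < workItems; i++)
{
    tasks[i] = Task.Run(() =>
    {
        // Seconds this item spent queued before a thread picked it up.
        double queuedFor = (Stopwatch.GetTimestamp() - start) / (double)Stopwatch.Frequency;
        Thread.Sleep(2000); // blocking call: the thread is held, not returned to the pool
        return queuedFor;
    });
}

double[] queueDelays = await Task.WhenAll(tasks);

Console.WriteLine($"Minimum worker threads: {minWorkers}");
Console.WriteLine($"Longest wait before starting: {queueDelays.Max():F2} s");
```

The first `minWorkers` items usually start almost immediately; the extra ones are the ones that wait for the pool to add threads or for a blocked thread to finish.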
ThreadPool Default behavior
Thread Pool with Custom Minimum Value
The .NET thread pool provides two settings, a minimum value and a maximum value (set through ThreadPool.SetMinThreads and ThreadPool.SetMaxThreads):
Maximum value: the total number of threads that can be spawned, typically 32,767 by default.
Minimum value: this does not mean the number of threads that are always present. The application does not boot with that many threads; it still starts with a number of threads equal to the number of CPU cores. The minimum value is the number of threads that can be spawned before .NET starts throttling thread creation, so it is better thought of as a threshold. By default, .NET sets it to the number of hardware threads available, which is 16 in our example. Suppose all 16 threads are in use and 5 new requests come in. For each request, the thread pool waits 0.5 seconds and then checks whether a thread has become free. If so, it assigns that thread to the request; if not, it spawns a new one. If every thread is stuck on a blocking operation, the 5th request may have to wait 5 * 0.5 = 2.5 seconds before it is processed.
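To see what these two settings are on a given machine, the current values can be read back from the runtime. A minimal sketch, written as C# 9 top-level statements; the numbers printed depend on the hardware and on any runtime configuration in effect:

```csharp
using System;
using System.Threading;

// Read the current thresholds for worker threads and I/O completion (IOCP) threads.
ThreadPool.GetMinThreads(out int minWorkers, out int minIocp);
ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIocp);

Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
Console.WriteLine($"Min threads (worker / IOCP): {minWorkers} / {minIocp}");
Console.WriteLine($"Max threads (worker / IOCP): {maxWorkers} / {maxIocp}");
```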
Suppose we raise the minimum value to 100. If a sudden burst of 100+ requests arrives while all current threads are busy, the pool immediately spawns threads for those requests until it reaches 100, and only then starts throttling. That leaves around 100 threads competing for the memory and CPU of a machine that is ideally suited to running only 16 at a time. If the burst is bigger, or continues for longer, the system can soon become unresponsive and need a reboot.
We can raise the minimum value as a temporary patch, but Microsoft does not recommend it. A minimum of 100 should still be relatively safe, provided the bursts of incoming requests are not constant or long-lasting. Their recommendation was that, if the blocking calls cannot be avoided, the setting should be paired with some sort of concurrency limiter that stops accepting incoming requests past a certain limit (returning 503), so the system does not become unresponsive.
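The idea of such a limiter can be sketched with a `SemaphoreSlim`. This is an illustration of the technique only, not Microsoft's implementation: `HandleRequest`, the limit of 2, and the plain integer status codes are all made up for the demo. In an ASP.NET Core app the same check would sit in middleware in front of the blocking code. It is written as C# 9 top-level statements.

```csharp
using System;
using System.Threading;

const int MaxConcurrentRequests = 2; // hypothetical limit, chosen small for the demo
var gate = new SemaphoreSlim(MaxConcurrentRequests, MaxConcurrentRequests);

// Runs the work if a slot is free; otherwise rejects immediately.
// Returns an HTTP-style status code: 200 on success, 503 when over the limit.
int HandleRequest(Action blockingWork)
{
    if (!gate.Wait(0)) // try to take a slot without waiting
    {
        return 503;
    }
    try
    {
        blockingWork();
        return 200;
    }
    finally
    {
        gate.Release(); // free the slot even if the work throws
    }
}

// Nest three requests so that both slots are held when the third arrives.
int second = 0, third = 0;
int first = HandleRequest(() =>
{
    second = HandleRequest(() =>
    {
        third = HandleRequest(() => { });
    });
});
int afterBurst = HandleRequest(() => { });

Console.WriteLine($"{first} {second} {third} {afterBurst}"); // prints "200 200 503 200"
```

Rejecting the third request right away, instead of queuing it, is what keeps the number of blocked threads bounded.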
To set a custom minimum value for the thread pool, we only need a small change. For example, to set the minimum to 100 for both worker threads and IOCP threads, add this line to ConfigureServices in your Startup.cs:
ThreadPool.SetMinThreads(100, 100);
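`SetMinThreads` returns a `bool`, so it is worth checking that the change was actually applied. A minimal sketch, written as C# 9 top-level statements so it can run outside a web app:

```csharp
using System;
using System.Threading;

ThreadPool.GetMinThreads(out int oldWorkers, out int oldIocp);

// SetMinThreads returns false (and changes nothing) if a value is invalid,
// for example negative or larger than the current maximum.
bool applied = ThreadPool.SetMinThreads(100, 100);

ThreadPool.GetMinThreads(out int newWorkers, out int newIocp);
Console.WriteLine($"Applied: {applied}");
Console.WriteLine($"Worker min: {oldWorkers} -> {newWorkers}, IOCP min: {oldIocp} -> {newIocp}");
```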
The real problem is the blocking (non-async) calls across our applications. We should clean up every place where we block with .Wait(), .Result, or .GetAwaiter().GetResult(), and use await instead.
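The difference is easiest to see side by side. In this sketch, `LoadGreetingAsync` is a made-up stand-in for any async database or HTTP call; the two wrappers return the same value, but only the first holds its thread while waiting. It is written as C# 9 top-level statements.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical async operation standing in for a database or HTTP call.
async Task<string> LoadGreetingAsync()
{
    await Task.Delay(50);
    return "hello";
}

// Blocking version: the calling thread is held until the task finishes.
// .Wait() and .GetAwaiter().GetResult() block in the same way.
string GetGreetingBlocking()
{
    return LoadGreetingAsync().Result;
}

// Async version: the calling thread is released at the await.
async Task<string> GetGreetingAsync()
{
    return await LoadGreetingAsync();
}

string blocking = GetGreetingBlocking();
string awaited = await GetGreetingAsync();
Console.WriteLine($"{blocking} {awaited}"); // prints "hello hello"
```

Switching to `await` usually means making the caller `async` too, and its caller in turn, which is why this cleanup tends to spread up the whole call chain.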
Use ReadFormAsync instead of Request.Form.
If you are running on .NET Core 3.0 or later, use IAsyncEnumerable when returning a sequence from an action.
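As a rough illustration of what `IAsyncEnumerable` looks like in code, the sketch below produces and consumes an async stream in a plain console program rather than a controller action. `ReadRowsAsync` and its delay are invented for the example. It is written as C# 9 top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical data source: yields values one at a time as each becomes available.
async IAsyncEnumerable<int> ReadRowsAsync(int count)
{
    for (int value = 1; value <= count; value++)
    {
        await Task.Delay(10); // stands in for an async read of the next row
        yield return value;
    }
}

int sum = 0;
await foreach (int item in ReadRowsAsync(3))
{
    sum += item; // no thread is blocked while waiting for the next item
}
Console.WriteLine(sum); // prints 6
```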
Before setting custom values for the thread pool, we should measure the app's current metrics, such as the current number of threads in the thread pool. This can be done with a package like AppMetrics DotnetRuntime. (I will try to cover setting that up in another post.) We should experiment with different minimum values until we find the sweet spot. Also look at these recommendations: https://docs.microsoft.com/en-us/aspnet/core/performance/performance-best-practices?view=aspnetcore-5.0
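For a quick look without any extra package, the runtime itself exposes a few counters. The sketch below is not the AppMetrics setup mentioned above, just a minimal built-in alternative; `LogThreadPoolAsync`, the sample count, and the interval are illustrative choices. It is written as C# 9 top-level statements.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Print a few thread pool counters once per interval.
// ThreadCount, PendingWorkItemCount and CompletedWorkItemCount require .NET Core 3.0 or later.
async Task LogThreadPoolAsync(int samples, TimeSpan interval)
{
    for (int i = 0; i < samples; i++)
    {
        Console.WriteLine(
            $"threads={ThreadPool.ThreadCount} " +
            $"queued={ThreadPool.PendingWorkItemCount} " +
            $"completed={ThreadPool.CompletedWorkItemCount}");
        await Task.Delay(interval);
    }
}

await LogThreadPoolAsync(samples: 3, interval: TimeSpan.FromMilliseconds(200));
```

A thread count that keeps climbing while the queued count stays above zero is the usual sign of the starvation described earlier.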
However, it is best to leave the thread pool alone. Once the expected burst of requests has passed, the first step should be to remove all the blocking calls from your apps and make everything asynchronous as soon as possible.