引文:线程模型(Threading Model)默认从进程域 (M:N 模型 ) 改为系统全局域 (1:1 模型 )
在 AIX 5L 中,pthread 线程的默认模型是 m:n 方式,而从 AIX 6.1 开始,默认改为了 1:1 方式。这两种方式在系统中通过 AIXTHREAD_SCOPE 环境变量来进行控制。如果设置 AIXTHREAD_SCOPE=P,则线程模型为进程域(M:N 模型),设置 AIXTHREAD_SCOPE=S 则为系统域(1:1 模型)。
1:1 模型下,每个用户空间的线程都对应于内核中的一个线程,线程的调度由内核在系统全局范围进行;而 M:N 模型下,多个用户线程对应于内核中的多个内核线程,用户线程调度仅限于在本进程范围内进行,而对应的内核线程则交由内核进行调度。许多应用程序例如数据库 和 Java 应用要求设置为 1:1 方式以提供更好的性能,在 AIX 5L 中这些应用程序会要求配置 AIXTHREAD_SCOPE 环境变量,而在 AIX 6.1 中默认即为为 1:1 方式,不再需要进行配置。
原文:
Thread tuning
User threads provide independent flow of control within a process.
- 1:1 Thread Model
- The 1:1 model indicates that each user thread will have exactly one kernel thread mapped to it. This is the default model on AIX® 4.1, AIX 4.2, and AIX 4.3. In this model, each user thread is bound to a VP and linked to exactly one kernel thread. The VP is not necessarily bound to a real CPU (unless binding to a processor was done). A thread which is bound to a VP is said to have system scope because it is directly scheduled with all the other user threads by the kernel scheduler.
- M:N Thread Model
- The M:N model was implemented in AIX 4.3.1 and is also now the default model. In this model, several user threads can share the same virtual processor or the same pool of VPs. Each VP can be thought of as a virtual CPU available for executing user code and system calls. A thread which is not bound to a VP is said to be a local or process scope because it is not directly scheduled with all the other threads by the kernel scheduler. The pthreads library will handle the scheduling of user threads to the VP and then the kernel will schedule the associated kernel thread. As of AIX 4.3.2, the default is to have one kernel thread mapped to eight user threads. This is tunable from within the application or through an environment variable.
Depending on the type of application, the administrator can choose to use a different thread model. Tests on AIX 4.3.2 have shown that certain applications can perform much better with the 1:1 model. This is an important point because the default as of AIX 4.3.1 is M:N. By simply setting the environment variable AIXTHREAD_SCOPE=S for that process, we can set the thread model to 1:1 and then compare the performance to its previous performance when the thread model was M:N.
If you see an application creating and deleting threads, it could be the kernel threads are being harvested because of the 8:1 default ratio of user threads to kernel threads. This harvesting along with the overhead of the library scheduling can affect the performance. On the other hand, when thousands of user threads exist, there may be less overhead to schedule them in user space in the library rather than manage thousands of kernel threads. You should always try changing the scope if you encounter a performance problem when using pthreads; in many cases, the system scope can provide better performance.
If an application is running on an SMP system, then if a user thread cannot acquire a mutex, it will attempt to spin for up to 40 times. It could easily be the case that the mutex was available within a short amount of time, so it may be worthwhile to spin for a longer period of time. As you add more CPUs, if the performance goes down, this usually indicates a locking problem. You might want to increase the spin time by setting the environment variable SPINLOOPTIME=n, where n is the number of spins. It is not unusual to set the value as high as in the thousands depending on the speed of the CPUs and the number of CPUs. Once the spin count has been exhausted, the thread can go to sleep waiting for the mutex to become available or it can issue the yield() system call and simply give up the CPU but stay in an executable state rather than going to sleep. By default, it will go to sleep, but by setting the YIELDLOOPTIME environment variable to a number, it will yield up to that many times before going to sleep. Each time it gets the CPU after it yields, it can try to acquire the mutex.
Certain multithreaded user processes that use the malloc subsystem heavily may obtain better performance by exporting the environment variable MALLOCMULTIHEAP=1 before starting the application. The potential performance improvement is particularly likely for multithreaded C++ programs, because these may make use of the malloc subsystem whenever a constructor or destructor is called. Any available performance improvement will be most evident when the multithreaded user process is running on an SMP system, and particularly when system scope threads are used (M:N ratio of 1:1). However, in some cases, improvement may also be evident under other conditions, and on uniprocessors.
作者:张子良
出处:http://www.cnblogs.com/hadoopdev
本文版权归作者所有,欢迎转载,但未经作者同意必须保留此段声明,且在文章页面明显位置给出原文连接,否则保留追究法律责任的权利。
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 如何编写易于单元测试的代码
· 10年+ .NET Coder 心语,封装的思维:从隐藏、稳定开始理解其本质意义
· .NET Core 中如何实现缓存的预热?
· 从 HTTP 原因短语缺失研究 HTTP/2 和 HTTP/3 的设计差异
· AI与.NET技术实操系列:向量存储与相似性搜索在 .NET 中的实现
· 地球OL攻略 —— 某应届生求职总结
· 周边上新:园子的第一款马克杯温暖上架
· Open-Sora 2.0 重磅开源!
· 提示词工程——AI应用必不可少的技术
· .NET周刊【3月第1期 2025-03-02】