Thread-local storage and thread-local data
cpython/Lib/_threading_local.py at 3.12 · python/cpython · GitHub https://github.com/python/cpython/blob/3.12/Lib/_threading_local.py
threading --- Thread-based parallelism (Python 3.12.4 documentation) https://docs.python.org/zh-cn/3/library/threading.html
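threading.main_thread()
Returns the main Thread object. Under normal conditions, the main thread is the thread from which the Python interpreter was started.
In CPython, because of the global interpreter lock, only one thread can execute Python code at a time (although certain performance-oriented libraries may overcome this limitation).
The design of this module is based on Java's threading model. However, where Java makes locks and condition variables basic behavior of every object, in Python they are separate objects. Python's Thread class supports only a subset of the behavior of Java's Thread class: currently there are no priorities and no thread groups, and threads cannot be destroyed, stopped, suspended, resumed, or interrupted. The static methods of Java's Thread class, when implemented, are mapped to module-level functions.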
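As a minimal sketch of threading.main_thread(), the snippet below checks from inside a worker thread whether the code is running on the main thread. The helper name running_on_main_thread and the report dictionary are hypothetical names used only for illustration.

```python
import threading

def running_on_main_thread():
    # Compare the calling thread's Thread object with the main Thread object.
    return threading.current_thread() is threading.main_thread()

report = {}

def worker():
    # Runs on a thread created by threading.Thread, so this records False.
    report["worker"] = running_on_main_thread()

t = threading.Thread(target=worker)
t.start()
t.join()
```

A check like this is sometimes used to guard code that must only run on the main thread.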
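Thread-local storage
Thread-local storage (TLS) is a storage duration: an object's storage is allocated when the thread begins and released when the thread ends, and each thread has its own instance of the object. Such an object may have either static or external linkage.
A typical use of TLS is errno, the variable that holds an error number. If errno is an ordinary global variable, concurrent threads can overwrite each other's error codes; a thread-local errno solves this.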
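Windows implementation
Each process has a set of flags, TLS_MINIMUM_AVAILABLE (== 64) in total. Each flag can be set to FREE or INUSE, indicating whether the corresponding TLS slot is in use. Note that this set of flags belongs to the process. When the system creates a thread, it allocates an array of PVOID values (TLS_MINIMUM_AVAILABLE elements) that is associated with, and owned by, that thread; each PVOID in the array can hold an arbitrary value.
The Windows API function TlsAlloc obtains an unused TLS slot index in the process. It changes the corresponding flag from FREE to INUSE and returns the flag's index in the bit array. The index is usually stored in a global variable, because the value is used process-wide rather than per thread.
Calling TlsSetValue(dwTlsIndex, pvTlsValue) stores a PVOID value in the calling thread's array at the position given by dwTlsIndex.
The functions TlsGetValue and TlsSetValue read and write, through a TLS slot index, the address of the memory block that a thread-local variable points to. The function TlsFree releases the TLS slot index.
In the Win32 Thread Information Block, the address of the thread-local storage table is stored at FS:[0x2C].[1] Each thread uses its own copy of the thread-local storage table. TlsAlloc returns an unused index into the table, so each thread can set a thread-local value with TlsSetValue(index) and retrieve it with TlsGetValue(index).
A Windows executable can also define a section that is mapped to a different memory page for each thread of the process. Such a section may only be defined in the main program; dynamic-link libraries (DLLs) should not contain one, because it is not initialized when the DLL is loaded with the LoadLibrary function.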
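On Windows, global and static variables are placed in the ".data" or ".bss" section, but when a thread-private variable is defined with __declspec(thread), the compiler places it in the ".tls" section of the PE file. When the system starts a new thread, it allocates a sufficiently large block from the process heap and copies the contents of the ".tls" section into it, so every thread has its own independent copy of ".tls". Consequently, a given variable defined with __declspec(thread) has a different address in each thread.
A TLS variable may be a C++ global object. In that case, starting a thread involves more than copying the contents of ".tls": the TLS objects must also be initialized by calling their constructors one by one, and when the thread exits they must be destroyed one by one, just as ordinary global objects are constructed and destroyed at process startup and exit.
The PE file format contains a structure called the data directory. It has 16 entries, one of which has the index IMAGE_DIRECTORY_ENTRY_TLS; the address and size stored in this entry are those of the TLS table (an IMAGE_TLS_DIRECTORY structure). The TLS table holds the addresses of the constructors and destructors of all TLS variables, and Windows uses it to construct and destroy TLS variables whenever a thread starts or exits. The TLS table itself usually resides in the ".rdata" section of the PE file.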
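Pthreads implementation
The Pthreads API defines thread-specific data.
The functions pthread_key_create and pthread_key_delete create and delete a key for thread-specific data. The type of the key is pthread_key_t. The key is visible to all threads. In each thread, the key can be associated with thread-specific data using the pthread_setspecific function; the data can later be retrieved with the pthread_getspecific function.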
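Language-specific implementation
C and C++
In C11, the keyword _Thread_local is used to define thread-local variables. The header <threads.h> defines thread_local as a synonym for that keyword. For example: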
#include <threads.h>
thread_local int foo = 0;
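C++11 introduces the thread_local[2] keyword, which can be used for:
- Namespace-level (global) variables
- File static variables
- Function static variables
- Static member variables
In addition, various compilers provide their own ways to declare thread-local variables:
- Solaris Studio C/C++, IBM XL C/C++,[3] GNU C,[4] Clang[5] and Intel C++ Compiler (Linux systems)[6] use the syntax: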
__thread int number;
- Visual C++,[7] Intel C/C++ (Windows systems),[8] C++Builder, and Digital Mars C++ use the syntax:
__declspec(thread) int number;
- C++Builder also supports the syntax:
int __thread number;
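On Windows versions before Vista and Server 2008, __declspec(thread) works in DLLs only when the DLL is statically bound to the executable; for a DLL loaded dynamically with LoadLibrary(), a protection fault or data corruption may occur.[9]
Java
In Java, thread-local variables are represented by objects of the ThreadLocal class. A ThreadLocal holds a variable of type T, which is accessed through its get/set methods. For example, a ThreadLocal holding an Integer value: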
private static final ThreadLocal<Integer> myThreadLocalInteger = new ThreadLocal<Integer>();
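Oracle/OpenJDK does not use native thread-local storage for this, even though it uses OS threads; instead, each Thread object stores a map from ThreadLocal objects to their values, which avoids a performance cost.[10]
.NET languages: C# and Visual Basic .NET
In .NET Framework languages, static fields can be marked with the ThreadStatic attribute: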
class FooBar {
[ThreadStatic] static int foo;
}
In .NET 4.0, System.Threading.ThreadLocal<T> is available for allocating and lazily initializing thread-local variables.
class FooBar {
private static System.Threading.ThreadLocal<int> foo;
}
An API is also available for dynamically allocating thread-local variables.
Python
In Python version 2.4 or later, the local class of the threading module can be used to create thread-local storage:
import threading
mydata = threading.local()
mydata.x = 1
Ruby
Ruby can create and access thread-local variables using the []= and [] methods:
Thread.current[:user_id] = 1
Thread-local storage (TLS) is a computer programming method that uses static or global memory local to a thread.
While the use of global variables is generally discouraged in modern programming, legacy operating systems such as UNIX were designed for uniprocessor hardware and require some additional mechanism to retain the semantics of pre-reentrant APIs. An example is functions that use a global variable to signal an error condition (for example, the global variable errno used by many functions of the C library). If errno were a global variable, a call to a system function on one thread could overwrite the value previously set by a call to a system function on a different thread, possibly before the code that follows on that other thread could check for the error condition. The solution is to have errno be a variable that looks global but in fact exists once per thread; that is, it lives in thread-local storage.
A second use case is multiple threads accumulating information into a global variable. To avoid a race condition, every access to this global variable would have to be protected by a mutex. Alternatively, each thread can accumulate into a thread-local variable, which by definition cannot be read or written by other threads, so no race condition is possible. The threads then only have to synchronise a final accumulation from their own thread-local variable into a single, truly global variable.
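The accumulation pattern can be sketched in Python as follows. This is an illustrative sketch, not a benchmark: the names total, total_lock, local, worker, and chunks are hypothetical, and the work is a plain sum chosen only so the result is easy to check.

```python
import threading

total = 0                      # the single, truly global accumulator
total_lock = threading.Lock()  # protects only the final merge step
local = threading.local()      # each thread sees its own attributes

def worker(values):
    global total
    # Accumulate without locking: local.subtotal is private to this thread.
    local.subtotal = 0
    for v in values:
        local.subtotal += v
    # One synchronised update per thread instead of one per value.
    with total_lock:
        total += local.subtotal

chunks = [range(0, 100), range(100, 200), range(200, 300)]
threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lock is taken three times in total, once per thread, rather than once for each of the 300 additions.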
Many systems impose restrictions on the size of the thread-local memory block, in fact often rather tight limits. On the other hand, if a system can provide at least a memory address (pointer) sized variable thread-local, then this allows the use of arbitrarily sized memory blocks in a thread-local manner, by allocating such a memory block dynamically and storing the memory address of that block in the thread-local variable.
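The same idea carries over to higher-level languages: the thread-local slot holds only a reference, and the large block is allocated dynamically the first time a thread asks for it. The sketch below assumes a 1 MiB buffer per thread; get_buffer, _slot, and buffers are hypothetical names.

```python
import threading

_slot = threading.local()  # holds one reference per thread

def get_buffer(size=1 << 20):
    # Allocate the block lazily and keep only its reference in the slot.
    buf = getattr(_slot, "buffer", None)
    if buf is None:
        buf = bytearray(size)
        _slot.buffer = buf
    return buf

buffers = {}

def worker(name):
    # Keep the buffer object alive so identities can be compared later.
    buffers[name] = get_buffer()

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Repeated calls on one thread return the same block.
same_in_caller = get_buffer() is get_buffer()
```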
Windows implementation
The application programming interface (API) function TlsAlloc can be used to obtain an unused TLS slot index; the TLS slot index will then be considered ‘used’.
The TlsGetValue and TlsSetValue functions are then used to read and write a memory address to a thread-local variable identified by the TLS slot index. TlsSetValue only affects the variable for the current thread. The TlsFree function can be called to release the TLS slot index.
There is a Win32 Thread Information Block for each thread. One of the entries in this block is the thread-local storage table for that thread.[1] TlsAlloc returns an index to this table, unique per address space, for each call. Each thread has its own copy of the thread-local storage table. Hence, each thread can independently use TlsSetValue(index) and obtain the specified value via TlsGetValue(index), because these set and look up an entry in the thread's own table.
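The slot mechanism can be modelled in a few lines of Python. This is a toy model for illustration only and does not call the Win32 API: the class TlsModel and its methods are hypothetical, the slot count of 64 is an assumption that mirrors the Windows constant TLS_MINIMUM_AVAILABLE, and the per-thread tables are keyed by thread identifier instead of living in a Thread Information Block.

```python
import threading

SLOT_COUNT = 64  # assumption: mirrors TLS_MINIMUM_AVAILABLE

class TlsModel:
    """Toy model of slot-based TLS; not the Win32 API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_use = [False] * SLOT_COUNT  # process-wide slot flags
        self._tables = {}                    # thread ident -> slot array

    def alloc(self):
        # Like TlsAlloc: claim the first free slot, return its index.
        with self._lock:
            for index, used in enumerate(self._in_use):
                if not used:
                    self._in_use[index] = True
                    return index
        raise RuntimeError("no free slot")

    def free(self, index):
        # Like TlsFree: release the slot and clear it in every table.
        with self._lock:
            self._in_use[index] = False
            for table in self._tables.values():
                table[index] = None

    def _table(self):
        # Each thread gets its own array, created on first use.
        # Caveat: thread identifiers may be reused after a thread exits.
        with self._lock:
            return self._tables.setdefault(
                threading.get_ident(), [None] * SLOT_COUNT)

    def set_value(self, index, value):
        self._table()[index] = value

    def get_value(self, index):
        return self._table()[index]

model = TlsModel()
slot = model.alloc()  # one index, shared process-wide
seen = {}

def worker(name):
    model.set_value(slot, name)
    seen[name] = model.get_value(slot)

model.set_value(slot, "main")
threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
main_value = model.get_value(slot)  # untouched by the workers
```

The same index names a different cell in each thread's table, which is why the index itself can safely be kept in an ordinary global variable.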
Apart from the TlsXxx function family, Windows executables can define a section which is mapped to a different page for each thread of the executing process. Unlike TlsXxx values, these pages can contain arbitrary and valid addresses. These addresses, however, are different for each executing thread and therefore should not be passed to asynchronous functions (which may execute in a different thread) or otherwise passed to code which assumes that a virtual address is unique within the whole process. TLS sections are managed using memory paging and their size is quantized to a page size (4 KB on x86 machines). Such sections may only be defined inside the main executable of a program; DLLs should not contain such sections, because they are not correctly initialized when loading with LoadLibrary.
Pthreads implementation
In the Pthreads API, memory local to a thread is designated with the term thread-specific data.
The functions pthread_key_create and pthread_key_delete are used respectively to create and delete a key for thread-specific data. The type of the key is explicitly left opaque and is referred to as pthread_key_t. This key can be seen by all threads. In each thread, the key can be associated with thread-specific data via pthread_setspecific. The data can later be retrieved using pthread_getspecific.
In addition, pthread_key_create can optionally accept a destructor function that will automatically be called at thread exit if the thread-specific data is not NULL. The destructor receives the value associated with the key as its parameter so that it can perform cleanup actions (close connections, free memory, etc.). Even when a destructor is specified, the program must still call pthread_key_delete to free the thread-specific data at process level (the destructor only frees the data local to the thread).
Language-specific implementation
Apart from relying on programmers to call the appropriate API functions, it is also possible to extend the programming language to support thread local storage (TLS).
C and C++
In C11, the keyword _Thread_local is used to define thread-local variables. The header <threads.h>, if supported, defines thread_local as a synonym for that keyword. Example usage:
#include <threads.h>
thread_local int foo = 0;
C++11 introduces the thread_local[2] keyword, which can be used in the following cases:
- Namespace level (global) variables
- File static variables
- Function static variables
- Static member variables
Aside from that, various compiler implementations provide specific ways to declare thread-local variables:
- Solaris Studio C/C++, IBM XL C/C++,[3] GNU C,[4] Clang[5] and Intel C++ Compiler (Linux systems)[6] use the syntax:
__thread int number;
- Visual C++,[7] Intel C/C++ (Windows systems),[8] C++Builder, and Digital Mars C++ use the syntax:
__declspec(thread) int number;
- C++Builder also supports the syntax:
int __thread number;
On Windows versions before Vista and Server 2008, __declspec(thread) works in DLLs only when those DLLs are bound to the executable, and will not work for those loaded with LoadLibrary() (a protection fault or data corruption may occur).[9]
Common Lisp (and maybe other dialects)
Common Lisp provides a feature called dynamically scoped variables.
Dynamic variables have a binding which is private to the invocation of a function and all of the children called by that function.
This abstraction naturally maps to thread-specific storage, and Lisp implementations that provide threads do this. Common Lisp has numerous standard dynamic variables, and so threads cannot be sensibly added to an implementation of the language without these variables having thread-local semantics in dynamic binding.
For instance, the standard variable *print-base* determines the default radix in which integers are printed. If this variable is rebound, then all enclosed code will print integers in the alternate radix:
;;; function foo and its children will print
;;; in hexadecimal:
(let ((*print-base* 16)) (foo))
If functions can execute concurrently on different threads, this binding has to be properly thread-local; otherwise threads will fight over who controls a global printing radix.
D
In D version 2, all static and global variables are thread-local by default and are declared with syntax similar to "normal" global and static variables in other languages. Global variables must be explicitly requested using the shared keyword:
int threadLocal; // This is a thread-local variable.
shared int global; // This is a global variable shared with all threads.
The shared keyword works both as the storage class, and as a type qualifier – shared variables are subject to some restrictions which statically enforce data integrity.[10] To declare a "classic" global variable without these restrictions, the unsafe __gshared keyword must be used:[11]
__gshared int global; // This is a plain old global variable.
Java
In Java, thread-local variables are implemented by the ThreadLocal class. A ThreadLocal holds a variable of type T, which is accessible via get/set methods. For example, a ThreadLocal variable holding an Integer value looks like this:
private static final ThreadLocal<Integer> myThreadLocalInteger = new ThreadLocal<Integer>();
At least for Oracle/OpenJDK, this does not use native thread-local storage in spite of OS threads being used for other aspects of Java threading. Instead, each Thread object stores a (non-thread-safe) map of ThreadLocal objects to their values (as opposed to each ThreadLocal having a map of Thread objects to values and incurring a performance overhead).[12]
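The design described above, where each thread owns a map from thread-local objects to values, can be sketched in Python. This is a hypothetical model for illustration, not the OpenJDK implementation: the class SimpleThreadLocal and the attribute name _tl_values are invented here, and details of the real class are omitted.

```python
import threading

class SimpleThreadLocal:
    """Hypothetical sketch: values live in a map owned by each thread."""

    def __init__(self, initial=None):
        self._initial = initial

    def _map(self):
        # The map is stored on the current Thread object and keyed by
        # this SimpleThreadLocal instance. Only the owning thread
        # touches its map, so the map itself needs no lock.
        thread = threading.current_thread()
        return thread.__dict__.setdefault("_tl_values", {})

    def get(self):
        return self._map().get(self, self._initial)

    def set(self, value):
        self._map()[self] = value

counter = SimpleThreadLocal(0)
results = {}

def worker(name, value):
    counter.set(value)
    results[name] = counter.get()

threads = [threading.Thread(target=worker, args=(n, v))
           for n, v in (("a", 1), ("b", 2))]
for t in threads:
    t.start()
for t in threads:
    t.join()
main_value = counter.get()  # the calling thread never called set()
```

Because lookup starts from the current thread, no table shared between threads is consulted on get or set, which is the property the paragraph above attributes to this layout.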
.NET languages: C# and others
In .NET Framework languages such as C#, static fields can be marked with the ThreadStatic attribute:
class FooBar {
[ThreadStatic] static int foo;
}
In .NET 4.0 the System.Threading.ThreadLocal<T> class is available for allocating and lazily loading thread-local variables.
class FooBar {
private static System.Threading.ThreadLocal<int> foo;
}
An API is also available for dynamically allocating thread-local variables.
Object Pascal
In Object Pascal (Delphi) or Free Pascal the threadvar reserved keyword can be used instead of 'var' to declare variables using the thread-local storage.
var
mydata_process: integer;
threadvar
mydata_threadlocal: integer;
Objective-C
In Cocoa, GNUstep, and OpenStep, each NSThread object has a thread-local dictionary that can be accessed through the thread's threadDictionary method.
NSMutableDictionary *dict = [[NSThread currentThread] threadDictionary];
dict[@"A key"] = @"Some data";
Perl
In Perl, threads were added late in the evolution of the language, after a large body of extant code was already present on the Comprehensive Perl Archive Network (CPAN). Thus, threads in Perl by default take their own local storage for all variables, to minimise the impact of threads on extant non-thread-aware code. In Perl, a thread-shared variable can be created using an attribute:
use threads;
use threads::shared;
my $localvar;
my $sharedvar :shared;
Python
In Python version 2.4 or later, the local class in the threading module can be used to create thread-local storage.
import threading
mydata = threading.local()
mydata.x = 1
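A slightly longer sketch shows what "thread-local" means in practice: an attribute set in one thread is not visible in another. The names seen, worker, and main_x are hypothetical.

```python
import threading

mydata = threading.local()
mydata.x = 1  # set in the calling thread only

seen = {}

def worker():
    # The worker starts with an empty namespace: x is not defined here.
    seen["has_x"] = hasattr(mydata, "x")
    mydata.x = 2  # affects only this thread's copy
    seen["worker_x"] = mydata.x

t = threading.Thread(target=worker)
t.start()
t.join()

main_x = mydata.x  # still the value set by the calling thread
```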
Ruby
Ruby can create/access thread-local variables using []=/[] methods:
Thread.current[:user_id] = 1
References
[1] Pietrek, Matt (May 2006). "Under the Hood". MSDN. Retrieved 6 April 2010.
[2] Section 3.7.2 in C++11 standard
[3] IBM XL C/C++: Thread-local storage
[4] GCC 3.3.1: Thread-Local Storage
[5] Clang 2.0: release notes
[6] Intel C++ Compiler 8.1 (linux) release notes: Thread-local Storage
[7] Visual Studio 2003: Thread extended storage-class modifier
[8] Intel C++ Compiler 10.0 (windows): Thread-local storage
[9] "Rules and Limitations for TLS"
[10] Alexandrescu, Andrei (6 July 2010). "Chapter 13 - Concurrency". The D Programming Language. InformIT. p. 3. Retrieved 3 January 2014.
[11] Bright, Walter (12 May 2009). "Migrating to Shared". dlang.org. Retrieved 3 January 2014.
[12] "How is Java's ThreadLocal implemented under the hood?". Stack Overflow. Stack Exchange. Retrieved 27 December 2015.
External links
- ELF Handling For Thread-Local Storage — Document about an implementation in C or C++.
- ACE_TSS< TYPE > Class Template Reference
- RWTThreadLocal<Type> Class Template Documentation
- Article "Use thread-local Storage to Pass Thread Specific Data" by Doug Doedens
- "Thread-Local Storage" by Lawrence Crowl
- Article "It's Not Always Nice To Share" by Walter Bright
- Practical ThreadLocal usage in Java: http://www.captechconsulting.com/blogs/a-persistence-pattern-using-threadlocal-and-ejb-interceptors