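The Python interpreter as a single-threaded application: I/O-bound, CPU-bound, GIL (global interpreter lock), thread state and the global interpreter lock, reference counting, memory management, thread safety, atomics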

Frequently Asked Questions — PyPy documentation https://doc.pypy.org/en/latest/faq.html#does-pypy-have-a-gil-why

PEP 703 – Making the Global Interpreter Lock Optional in CPython | peps.python.org https://peps.python.org/pep-0703/

Overview of CPython Changes

Removing the global interpreter lock requires substantial changes to CPython internals, but relatively few changes to the public Python and C APIs. This section describes the required changes to the CPython implementation followed by the proposed API changes.

The implementation changes can be grouped into the following four categories:

  • Reference counting
  • Memory management
  • Container thread-safety
  • Locking and atomic APIs
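
To see whether a given interpreter is one of the builds PEP 703 describes, a program can ask at runtime. The sketch below is a minimal check, assuming CPython 3.13 or later for `sys._is_gil_enabled()` and the `Py_GIL_DISABLED` build variable; on older versions it simply reports that the GIL is enabled. The helper name `gil_status` is hypothetical.

```python
import sys
import sysconfig


def gil_status():
    """Return (free_threaded_build, gil_enabled) for the running interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded builds (CPython 3.13+);
    # on other builds the variable is 0 or missing (None).
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() exists on CPython 3.13+. Assumption: if it is
    # absent, we treat the interpreter as having a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return free_threaded_build, gil_enabled


if __name__ == "__main__":
    build, enabled = gil_status()
    print(f"free-threaded build: {build}, GIL enabled: {enabled}")
```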

GlobalInterpreterLock - Python Wiki https://wiki.python.org/moin/GlobalInterpreterLock

In CPython, the global interpreter lock, or GIL, is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once. The GIL prevents race conditions and ensures thread safety. (The wiki page links to a longer explanation of how the GIL helps in these areas.) In short, this mutex is necessary mainly because CPython's memory management is not thread-safe.

In hindsight, the GIL is not ideal, since it prevents multithreaded CPython programs from taking full advantage of multiprocessor systems in certain situations. Luckily, many potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck.
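
A small experiment can illustrate the point about blocking operations running outside the GIL. The sketch below uses `time.sleep()` as a stand-in for blocking I/O (it releases the GIL while waiting); the function names are hypothetical. If the waits overlap, four threads that each block for 0.2 s should finish in roughly 0.2 s in total rather than 0.8 s.

```python
import threading
import time


def blocking_task(seconds):
    # time.sleep() releases the GIL while waiting, so it stands in
    # for a blocking I/O call here.
    time.sleep(seconds)


def run_in_threads(n_threads, seconds):
    """Start n_threads blocking tasks and return the total wall-clock time."""
    threads = [
        threading.Thread(target=blocking_task, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_in_threads(4, 0.2)
    print(f"4 threads x 0.2 s of blocking took {elapsed:.2f} s")
```

Replacing the sleep with a pure-Python loop would be expected to show no such overlap on a GIL build, since the threads would then spend their time interpreting bytecode.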

Unfortunately, since the GIL exists, other features have grown to depend on the guarantees that it enforces. This makes it hard to remove the GIL without breaking many official and unofficial Python packages and modules.

The GIL can degrade performance even when it is not a bottleneck. Summarizing the linked slides: The system call overhead is significant, especially on multicore hardware. Two threads calling a function may take twice as much time as a single thread calling the function twice. The GIL can cause I/O-bound threads to be scheduled ahead of CPU-bound threads, and it prevents signals from being delivered.

CPython extensions must be GIL-aware in order to avoid defeating threads. For an explanation, see Global interpreter lock.

Non-CPython implementations

  • Jython and IronPython have no GIL and can fully exploit multiprocessor systems

  • PyPy currently has a GIL like CPython

  • In Cython the GIL exists, but can be released temporarily using a "with nogil" block

 

 Initialization, Finalization, and Threads — Python 3.11.3 documentation https://docs.python.org/3/c-api/init.html#thread-state-and-the-global-interpreter-lock

Thread State and the Global Interpreter Lock

The Python interpreter is not fully thread-safe. In order to support multi-threaded Python programs, there’s a global lock, called the global interpreter lock or GIL, that must be held by the current thread before it can safely access Python objects. Without the lock, even the simplest operations could cause problems in a multi-threaded program: for example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice.
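
The lost increment described above can be shown without real threads. The sketch below is only an illustration: `lost_update_simulation` replays, step by step, the interleaving in which both threads read the count before either writes it back, and `refcount_delta` uses `sys.getrefcount()` to show that creating one extra reference raises the count by one on a standard CPython build. Both function names are hypothetical.

```python
import sys


def lost_update_simulation():
    """Replay an unlucky interleaving of two unsynchronized increments."""
    refcount = 1
    # Both "threads" read the current value before either writes.
    seen_by_a = refcount
    seen_by_b = refcount
    # Each writes back its own stale value plus one.
    refcount = seen_by_a + 1
    refcount = seen_by_b + 1
    return refcount  # one increment was lost: 2 instead of 3


def refcount_delta():
    """Return how much one new reference changes an object's refcount."""
    obj = []                      # a fresh object nobody else refers to
    before = sys.getrefcount(obj)
    alias = obj                   # one additional reference
    after = sys.getrefcount(obj)
    del alias
    return after - before


if __name__ == "__main__":
    print("count after two racing increments:", lost_update_simulation())
    print("refcount change from one new reference:", refcount_delta())
```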

Therefore, the rule exists that only the thread that has acquired the GIL may operate on Python objects or call Python/C API functions. In order to emulate concurrency of execution, the interpreter regularly tries to switch threads (see sys.setswitchinterval()). The lock is also released around potentially blocking I/O operations like reading or writing a file, so that other Python threads can run in the meantime.
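
The switch interval mentioned above can be read and changed from Python. The following is a minimal sketch; the helper `with_switch_interval` is hypothetical and just wraps `sys.getswitchinterval()` and `sys.setswitchinterval()` so that the previous value is always restored.

```python
import sys


def with_switch_interval(seconds, func):
    """Run func() with a temporary thread switch interval, then restore it."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        # Restore the old interval even if func() raises.
        sys.setswitchinterval(previous)


if __name__ == "__main__":
    print("default interval:", sys.getswitchinterval())
    inside = with_switch_interval(0.001, sys.getswitchinterval)
    print("interval inside the helper:", inside)
    print("interval afterwards:", sys.getswitchinterval())
```

A shorter interval makes the interpreter offer thread switches more often, at the cost of more switching overhead; whether that helps depends on the workload.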

The Python interpreter keeps some thread-specific bookkeeping information inside a data structure called PyThreadState. There’s also one global variable pointing to the current PyThreadState: it can be retrieved using PyThreadState_Get().

 

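hashlib --- Secure hashes and message digests - Python 3.11.3 documentation https://docs.python.org/zh-cn/3.11/library/hashlib.html

hash.update(data)

Update the hash object with the bytes-like object. Repeated calls are equivalent to a single call with the concatenation of all the arguments: m.update(a); m.update(b) is equivalent to m.update(a+b).

Changed in version 3.1: The Python GIL is released to allow other threads to run while hash updates on data larger than 2047 bytes are taking place when using hash algorithms supplied by OpenSSL.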
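
Both points can be tried out in a few lines. The sketch below assumes SHA-256 as the algorithm (the documentation excerpt does not name one), and the helper names `digest_in_chunks` and `hash_many` are hypothetical. The first helper checks the concatenation rule for `update()`; the second hashes several large buffers in a thread pool, which is the situation where releasing the GIL during `update()` can let threads overlap.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest_in_chunks(chunks):
    """Feed the chunks to one hash object and return the hex digest."""
    m = hashlib.sha256()
    for chunk in chunks:
        m.update(chunk)
    return m.hexdigest()


def hash_one(buffer):
    m = hashlib.sha256()
    # Per the documentation, the GIL is released for updates on data larger
    # than 2047 bytes when the algorithm is supplied by OpenSSL.
    m.update(buffer)
    return m.hexdigest()


def hash_many(buffers):
    """Hash each buffer in a worker thread; results keep the input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(hash_one, buffers))


if __name__ == "__main__":
    a, b = b"hello ", b"world"
    print(digest_in_chunks([a, b]) == hashlib.sha256(a + b).hexdigest())

    big = [bytes([i]) * 1_000_000 for i in range(4)]
    print(hash_many(big) == [hash_one(buf) for buf in big])
```

Whether the threaded version is actually faster depends on the build, the hash backend, and the number of cores; the example only checks that the results agree.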

 

posted @ 2017-10-10 00:42  papering