Multiprocessing Start Methods (Reposted)
Original: https://superfastpython.com/multiprocessing-start-method/
Author: Jason Brownlee
Need to Change Start Method
A process is a running instance of a computer program.
Every Python program is executed in a Process, which is a new instance of the Python interpreter. This process has the name MainProcess and has one thread used to execute the program instructions called the MainThread. Both processes and threads are created and managed by the underlying operating system.
Sometimes we may need to create new child processes in our program in order to execute code concurrently.
Python provides the ability to create and manage new processes via the multiprocessing.Process class.
In multiprocessing programming, we may need to change the technique used to start child processes.
This is called the start method.
What is a start method and how can we configure it in Python?
What is a Start Method
A start method is the technique used to start child processes in Python.
There are three start methods:
- spawn: start a fresh Python interpreter process.
- fork: copy an existing Python process.
- forkserver: start a server process from which future child processes are forked.
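The practical difference between these methods shows up in what the child process inherits. The sketch below is an illustration under assumptions rather than code from the original article: `counter`, `set_counter`, `read_counter`, and `child_view` are hypothetical names. It changes a module-level variable in the parent after import, then asks a child started with each supported start method what value it sees.

```python
import multiprocessing as mp

# module-level state (hypothetical name, used only for this illustration)
counter = 0

def set_counter(value):
    # change the global in the current process only
    global counter
    counter = value

def read_counter(queue):
    # runs in the child: send back the global as the child sees it
    queue.put(counter)

def child_view(method):
    # start one child with the given start method and return what it saw
    ctx = mp.get_context(method)
    queue = ctx.Queue()
    process = ctx.Process(target=read_counter, args=(queue,))
    process.start()
    value = queue.get(timeout=30)
    process.join()
    return value

# protect entry point
if __name__ == '__main__':
    set_counter(42)  # changed in the parent after the module was imported
    for method in mp.get_all_start_methods():
        print(method, child_view(method))
```

A forked child is a copy of the parent, so it would be expected to report the changed value. A spawned child starts a fresh interpreter and re-imports the module, so it would be expected to report the initial value instead.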
Default Start Methods
Each platform has a default start method.
The following lists the major platforms and the default start methods.
- Windows (win32): spawn
- macOS (darwin): spawn
- Linux (unix): fork (forkserver as of Python 3.14)
Supported Start Methods
Not all platforms support all start methods.
The following lists the major platforms and the start methods that are supported.
- Windows (win32): spawn
- macOS (darwin): spawn, fork, forkserver
- Linux (unix): spawn, fork, forkserver
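Because support differs by platform, portable code may want to check what is available before choosing. A minimal sketch, assuming a hypothetical helper named `pick_start_method` that takes a list of preferred names; it relies on the documented behavior that multiprocessing.get_all_start_methods() lists the platform default first.

```python
import multiprocessing as mp

def pick_start_method(preferred):
    # return the first preferred name that this platform supports
    supported = mp.get_all_start_methods()
    for name in preferred:
        if name in supported:
            return name
    # nothing preferred is available: fall back to the platform default,
    # which is listed first
    return supported[0]

# protect entry point
if __name__ == '__main__':
    print(pick_start_method(['forkserver', 'spawn']))
```

The returned name can then be passed to set_start_method() or get_context().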
Generally, the fork start method is considered unsafe on macOS.
How to Change The Start Method
The multiprocessing package provides functions for getting and setting the start method for creating child processes.
The start method API includes the following functions:
- multiprocessing.get_all_start_methods()
- multiprocessing.get_start_method()
- multiprocessing.set_start_method()
- multiprocessing.get_context()
Let’s take a closer look at each in turn.
How to Get Supported Start Methods
The list of supported start methods can be retrieved via the multiprocessing.get_all_start_methods() function.
The function returns a list of string values, each representing a supported start method.
For example:
...
# get supported start methods
methods = multiprocessing.get_all_start_methods()
How to Get The Current Start Method
The current start method can be retrieved via the multiprocessing.get_start_method() function.
The function returns a string value for the currently configured start method.
For example:
...
# get the current start method
method = multiprocessing.get_start_method()
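One detail worth knowing: per the standard library documentation, calling get_start_method() with no arguments fixes the platform default as the start method if none has been set yet, so a later call to set_start_method() can fail. A small sketch of checking the setting without that side effect, using the allow_none argument; `peek_start_method` is a hypothetical helper name.

```python
import multiprocessing as mp

def peek_start_method():
    # allow_none=True returns None when no start method has been fixed yet,
    # instead of fixing the platform default as a side effect
    return mp.get_start_method(allow_none=True)

# protect entry point
if __name__ == '__main__':
    print(peek_start_method())
```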
How to Set Start Method
The start method can be set via the multiprocessing.set_start_method() function.
The function takes a string argument indicating the start method to use.
This must be one of the methods returned from multiprocessing.get_all_start_methods() for your platform.
For example:
...
# set the start method
multiprocessing.set_start_method('spawn')
It is a best practice, and required on most platforms, that the start method be set first, prior to any other code, and within an if __name__ == '__main__' check, called a protected entry point or top-level code environment.
For example:
...
# protect the entry point
if __name__ == '__main__':
    # set the start method
    multiprocessing.set_start_method('spawn')
If the start method is not set within a protected entry point, it is possible to get a RuntimeError such as:
RuntimeError: context has already been set
It is also good practice to set the start method only once: calling set_start_method() a second time raises a RuntimeError unless force=True is passed.
Best Practices For Setting the Start Method
In summary, the rules for setting the start method are as follows:
- Set the start method first prior to all other code.
- Set the start method only once in a program.
- Set the start method within a protected entry point.
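The "only once" rule can be seen directly. A minimal sketch, assuming a hypothetical helper named `try_set_start_method` that reports whether the call succeeded instead of letting the RuntimeError propagate.

```python
import multiprocessing as mp

def try_set_start_method(method):
    # return True if the start method was set, False if it was already fixed
    try:
        mp.set_start_method(method)
        return True
    except RuntimeError:
        return False

# protect entry point
if __name__ == '__main__':
    # the first call sets the method, the second call is rejected
    print(try_set_start_method('spawn'))
    print(try_set_start_method('spawn'))
```

Note that the second call fails even though it asks for the same start method; what matters is that a method has already been fixed, not which one.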
How to Set Start Method Via Context
A multiprocessing context configured with a given start method can be retrieved via the multiprocessing.get_context() function.
This function takes the name of the start method as an argument, then returns a multiprocessing context that can be used to create new child processes.
For example:
...
# get a context configured with a start method
context = multiprocessing.get_context('fork')
The context can then be used to create a child process, for example:
...
# create a child process via a context
process = context.Process(...)
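A context carries its own start method and does not touch the program-wide setting, which is why the standard library documentation suggests that libraries use get_context() rather than set_start_method(). A small sketch that checks this; `context_leaves_default_alone` is a hypothetical helper name.

```python
import multiprocessing as mp

def context_leaves_default_alone(method):
    # record the program-wide setting without fixing it
    before = mp.get_start_method(allow_none=True)
    # a context carries its own start method
    ctx = mp.get_context(method)
    after = mp.get_start_method(allow_none=True)
    # report the context's method and whether the global setting is unchanged
    return ctx.get_start_method(), before == after

# protect entry point
if __name__ == '__main__':
    print(context_leaves_default_alone('spawn'))
```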
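It is also possible to force a change of start method after one has already been set.
This can be achieved via the "force" argument of the multiprocessing.set_start_method() function. Note that it must be called on the module rather than on a context returned from get_context(), because such a context has a fixed start method.
For example:
...
# force the start method, even if one was already set
multiprocessing.set_start_method('spawn', force=True)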
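A minimal sketch of forcing the start method through the module-level function. `force_start_method` is a hypothetical helper name; it returns the previous setting so the caller can see what was overridden. Per the documentation, passing None together with force=True clears the setting again.

```python
import multiprocessing as mp

def force_start_method(method):
    # override whatever was set before and return the previous setting
    previous = mp.get_start_method(allow_none=True)
    mp.set_start_method(method, force=True)
    return previous

# protect entry point
if __name__ == '__main__':
    mp.set_start_method('spawn')
    # a plain second call would raise a RuntimeError; force=True does not
    previous = force_start_method('spawn')
    print(previous, mp.get_start_method())
    # clear the setting so the platform default applies again
    mp.set_start_method(None, force=True)
```

Forcing a change while child processes, pools, or queues created under the old setting are still in use is best avoided, which is one reason a context is usually the safer tool.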
Now that we know how to configure the start method, let’s look at some worked examples.
Example of Getting Supported Start Methods
We can get a list of all supported start methods for the current platform.
This can be achieved via the multiprocessing.get_all_start_methods() function.
The example below gets the list of supported start methods for the current platform, then reports the result.
# SuperFastPython.com
# example of getting the supported start methods
from multiprocessing import get_all_start_methods
# protect entry point
if __name__ == '__main__':
    # get the supported start methods
    methods = get_all_start_methods()
    print(methods)
Running the example gets the list of supported start methods for the current platform.
In this case, we can see the current platform supports all three start methods.
Your specific results may differ, depending on your platform.
['spawn', 'fork', 'forkserver']
Example of Getting the Current Start Method
We can get the name of the currently configured start method.
This can be achieved via the multiprocessing.get_start_method() function.
The example below gets the currently configured start method and reports it.
# SuperFastPython.com
# example of getting the start method
from multiprocessing import get_start_method
# protect entry point
if __name__ == '__main__':
    # get the start method
    method = get_start_method()
    print(method)
Running the example retrieves the currently configured start method.
In this case, the 'spawn' method is the default start method.
Your specific result may differ, depending on your platform.
spawn
Example of Setting the Start Method
We can configure the start method using the multiprocessing.set_start_method() function.
In this section we will explore examples of setting each start method and then using it to start a new child process.
Specifically, we will look at starting a child process using the spawn, fork, and forkserver start methods.
Let’s take a closer look at each.
Example of Spawn Start Method
We can start a child process using the 'spawn' start method.
...
# set the spawn method
set_start_method('spawn')
Once configured, we can confirm that the start method was changed.
...
# report the start method
print(f'Start Method: {get_start_method()}')
We can then create a new child process using the start method to execute a custom function.
...
# start a child process
process = Process(target=task)
process.start()
process.join()
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of setting the spawn start method
from multiprocessing import set_start_method
from multiprocessing import get_start_method
from multiprocessing import Process
# function executed in a new process
def task():
    print('Hello from new process', flush=True)
# protect entry point
if __name__ == '__main__':
    # set the spawn method
    set_start_method('spawn')
    # report the start method
    print(f'Start Method: {get_start_method()}')
    # start a child process
    process = Process(target=task)
    process.start()
    process.join()
Running the example first configures the start method as 'spawn'.
The current start method is retrieved and reported, confirming that the change has taken effect.
A new child process is then created using the start method.
Start Method: spawn
Hello from new process
Example of Fork Start Method
We can start a child process using the 'fork' start method.
...
# set the start method
set_start_method('fork')
Once configured, we can confirm that the start method was changed.
...
# report the start method
print(f'Start Method: {get_start_method()}')
We can then create a new child process using the start method to execute a custom function.
...
# start a child process
process = Process(target=task)
process.start()
process.join()
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of setting the fork start method
from multiprocessing import set_start_method
from multiprocessing import get_start_method
from multiprocessing import Process
# function executed in a new process
def task():
    print('Hello from new process', flush=True)
# protect entry point
if __name__ == '__main__':
    # set the start method
    set_start_method('fork')
    # report the start method
    print(f'Start Method: {get_start_method()}')
    # start a child process
    process = Process(target=task)
    process.start()
    process.join()
Running the example first configures the start method as 'fork'.
The current start method is retrieved and reported, confirming that the change has taken effect.
A new child process is then created using the start method.
Start Method: fork
Hello from new process
Example of ForkServer Start Method
We can start a child process using the 'forkserver' start method.
Recall that the forkserver method will start a new server process and use a copy of this process each time a child process is created.
...
# set the start method
set_start_method('forkserver')
Once configured, we can confirm that the start method was changed.
...
# report the start method
print(f'Start Method: {get_start_method()}')
We can then create a new child process using the start method to execute a custom function.
...
# start a child process
process = Process(target=task)
process.start()
process.join()
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of setting the forkserver start method
from multiprocessing import set_start_method
from multiprocessing import get_start_method
from multiprocessing import Process
# function executed in a new process
def task():
    print('Hello from new process', flush=True)
# protect entry point
if __name__ == '__main__':
    # set the start method
    set_start_method('forkserver')
    # report the start method
    print(f'Start Method: {get_start_method()}')
    # start a child process
    process = Process(target=task)
    process.start()
    process.join()
Running the example first configures the start method as 'forkserver'.
The current start method is retrieved and reported, confirming that the change has taken effect.
A new child process is then created using the start method.
Start Method: forkserver
Hello from new process
Example of Multiple Start Methods
We can use multiple start methods within our program via a multiprocessing context.
In this example we will set the baseline start method to 'fork' and use it to start a new child process. We will then create a new multiprocessing context with the 'spawn' start method, then use it to start a child process.
First, we can set the baseline start method to 'fork', confirm it has been set, then start a new process using the method.
...
# set the start method
set_start_method('fork')
# report the start method
print(f'Start Method: {get_start_method()}')
# start a child process
process = Process(target=task)
process.start()
process.join()
Next, we can create a new multiprocessing context using the 'spawn' start method.
...
# change start method
context = get_context('spawn')
We can then confirm that the context uses the configured start method.
...
# report the start method
print(f'Start Method: {context.get_start_method()}')
Finally, we can create a new child process using the context with the different start method.
...
# start a child process
process = context.Process(target=task)
process.start()
process.join()
Tying this together, the complete example is listed below.
# SuperFastPython.com
# example of using multiple start methods via a context
from multiprocessing import get_context
from multiprocessing import set_start_method
from multiprocessing import get_start_method
from multiprocessing import Process
# function executed in a new process
def task():
    print('Hello from new process', flush=True)
# protect entry point
if __name__ == '__main__':
    # set the start method
    set_start_method('fork')
    # report the start method
    print(f'Start Method: {get_start_method()}')
    # start a child process
    process = Process(target=task)
    process.start()
    process.join()
    # get a context that uses a different start method
    context = get_context('spawn')
    # report the start method
    print(f'Start Method: {context.get_start_method()}')
    # start a child process
    process = context.Process(target=task)
    process.start()
    process.join()
Running the example first sets the baseline start method to 'fork'.
It then confirms that the start method was changed as expected and then creates and starts a new child process using the start method.
Next, a new multiprocessing context is created with a different start method, 'spawn' in this case.
The start method of the new context is then confirmed and a new child process is created using the start method in the created context, different from the baseline start method.
Start Method: fork
Hello from new process
Start Method: spawn
Hello from new process
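A context exposes the same API as the multiprocessing module itself, so it can create process pools as well as individual processes. A minimal sketch, assuming a hypothetical helper named `map_abs`; the built-in abs() is used as the task only to keep the sketch short.

```python
import multiprocessing as mp

def map_abs(method, values):
    # create a pool whose workers are started with the given start method
    ctx = mp.get_context(method)
    with ctx.Pool(processes=2) as pool:
        # apply the built-in abs() to each value in the worker processes
        return pool.map(abs, values)

# protect entry point
if __name__ == '__main__':
    print(map_abs('spawn', [-1, -2, 3]))
```

Because the pool comes from a context, the program-wide start method is left untouched, so pools with different start methods can coexist in one program.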
Example of RuntimeError Setting Start Method
We may get a RuntimeError when setting the start method.
This can happen when setting a specific start method, e.g. 'spawn', outside of a protected entry point.
We can demonstrate this with a worked example.
The example below attempts to set the start method to 'spawn' first, directly after the import statements and outside of the protected entry point.
The complete example is listed below.
# SuperFastPython.com
# example of getting a runtime error when setting the start method
from multiprocessing import set_start_method
from multiprocessing import get_start_method
from multiprocessing import Process
# set the start method
set_start_method('spawn') # raises a RuntimeError
# function executed in a new process
def task():
    print('Hello from new process', flush=True)
# protect entry point
if __name__ == '__main__':
    # report the start method
    print(f'Start Method: {get_start_method()}')
    # start a child process
    process = Process(target=task)
    process.start()
    process.join()
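Running the example results in a RuntimeError indicating that the start method has already been set.
The parent process sets the start method and reports it without a problem. The error is raised in the child process: under 'spawn' the child re-imports the main module, so the module-level call to set_start_method() runs again after the start method has already been configured in the child.
Start Method: spawn
Traceback (most recent call last):
...
RuntimeError: context has already been set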
Interestingly, we don’t get the error if we attempt to set the start method to 'fork' outside of the protected entry point, because a forked child is a copy of the parent and does not re-import the main module.
The fix involves moving the call to set_start_method() to be the first line within the protected entry point.
For example:
...
# protect entry point
if __name__ == '__main__':
    # set the start method
    set_start_method('spawn')
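The reason the protected entry point helps can be observed by asking a child for the name under which it sees the main module. The sketch below is an illustration under assumptions, not code from the original article: `parent_name`, `report_name`, and `name_seen_by_child` are hypothetical names. With the 'spawn' start method the child would be expected to report `__mp_main__` rather than `__main__`, so code inside the `if __name__ == '__main__'` check is skipped when the child re-imports the module.

```python
import multiprocessing as mp

def parent_name():
    # the name of this module in the current process
    return __name__

def report_name(queue):
    # runs in the child: send back the module name as the child sees it
    queue.put(__name__)

def name_seen_by_child(method):
    # start one child with the given start method and return the name it saw
    ctx = mp.get_context(method)
    queue = ctx.Queue()
    process = ctx.Process(target=report_name, args=(queue,))
    process.start()
    name = queue.get(timeout=30)
    process.join()
    return name

# protect entry point
if __name__ == '__main__':
    print('parent:', parent_name())
    print('spawn child:', name_seen_by_child('spawn'))
```

A forked child, by contrast, is a copy of the parent and sees the same module name as the parent, which is consistent with the 'fork' case not raising the error.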
Takeaways
You now know how to configure the start method for processes in Python.