Go Memory Model: memory model, synchronization, goroutines
Summary:

1. Serialized access

Programs that modify data being simultaneously accessed by multiple goroutines must serialize such access. To serialize access, protect the data with channel operations or other synchronization primitives such as those in the sync and sync/atomic packages.
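As a minimal sketch of the two options mentioned here (mine, not from the notes; the names counter, incr, and done are illustrative), the same counter can be serialized either with a sync.Mutex or by confining it to a single goroutine reached over a channel:

package main

import (
	"fmt"
	"sync"
)

func main() {
	// Option 1: serialize access with a mutex.
	var mu sync.Mutex
	counter := 0
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++ // only one goroutine touches counter at a time
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println("mutex counter:", counter)

	// Option 2: confine the data to one goroutine and talk to it over a channel.
	incr := make(chan int)
	done := make(chan int)
	go func() {
		total := 0
		for d := range incr {
			total += d // only this goroutine ever touches total
		}
		done <- total
	}()
	for i := 0; i < 100; i++ {
		incr <- 1
	}
	close(incr)
	fmt.Println("channel counter:", <-done)
}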
2. On detecting a data race, an implementation may report it and halt the program

Any implementation can, upon detecting a data race, report the race and halt execution of the program. Implementations using ThreadSanitizer (accessed with "go build -race") do exactly this.
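As a hedged illustration of the -race flag mentioned above (the file name main.go is hypothetical), running a racy program like the one below with "go run -race main.go" will typically report a DATA RACE and exit with a non-zero status:

package main

import (
	"fmt"
	"time"
)

var a string

func main() {
	go func() { a = "hello" }() // unsynchronized write
	time.Sleep(10 * time.Millisecond)
	fmt.Println(a) // unsynchronized read: the race detector reports this pair
}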
3. Synchronization

1) Initialization

Program initialization runs in a single goroutine, but that goroutine may create other goroutines, which run concurrently.
If a package p imports a package q, the completion of q's init functions happens before the start of any of p's.
The completion of all init functions is synchronized before the start of the function main.main.
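A small single-package sketch (mine, not from the notes) showing the init guarantee: the package-level variable and both init functions are fully set up before main.main starts. The cross-package rule works the same way, with an imported package q finishing all of its init functions before any init of the importing package p begins.

package main

import "fmt"

var words = []string{"hello"} // package-level variables are initialized first

func init() {
	words = append(words, "world") // runs in the single init goroutine
}

func init() {
	words = append(words, "!") // multiple init functions run in source order
}

func main() {
	// All init functions have completed before main.main starts.
	fmt.Println(words) // [hello world !]
}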
2) Goroutine creation

The go statement that starts a new goroutine is synchronized before the start of the goroutine's execution.

var a string

func f() {
	print(a)
}

func hello() {
	a = "hello, world"
	go f()
}

Calling hello will print "hello, world" at some point in the future (perhaps after hello has returned).
Goroutine destruction

The exit of a goroutine is not guaranteed to be synchronized before any event in the program. For example, in this program:

var a string

func hello() {
	go func() { a = "hello" }()
	print(a)
}

the assignment to a is not followed by any synchronization event, so it is not guaranteed to be observed by any other goroutine. In fact, an aggressive compiler might delete the entire go statement.

If the effects of a goroutine must be observed by another goroutine, use a synchronization mechanism such as a lock or channel communication to establish a relative ordering.
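One way to apply that advice to the example above (a sketch of one possible fix, not the only one) is to have the goroutine signal on a channel, so the assignment to a is synchronized before the print:

var a string

func hello() {
	done := make(chan struct{})
	go func() {
		a = "hello"
		done <- struct{}{} // the send is synchronized before the receive below
	}()
	<-done
	print(a) // guaranteed to print "hello"
}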
3) Channel communication

Channel communication is the main method of synchronization between goroutines. Each send on a particular channel is matched to a corresponding receive from that channel, usually in a different goroutine.

A send on a channel is synchronized before the completion of the corresponding receive from that channel.

var c = make(chan int, 10)
var a string

func f() {
	a = "hello, world"
	c <- 0
}

func main() {
	go f()
	<-c
	print(a)
}

(Buffered channel: guaranteed to print "hello, world".)

The write to a is sequenced before the send on c, which is synchronized before the corresponding receive on c completes, which is sequenced before the print.
var c = make(chan int)
var a string

func f() {
	a = "hello, world"
	<-c
}

func main() {
	go f()
	c <- 0
	print(a)
}

(Unbuffered channel: guaranteed to print "hello, world".)

var c = make(chan int, 10)
var a string

func f() {
	a = "hello, world"
	<-c
}

func main() {
	go f()
	c <- 0
	print(a)
}

(Changed to a buffered channel: the print is no longer guaranteed.)

Note the contrast among the three programs above.

If the channel were buffered (e.g., c = make(chan int, 1)) then the program would not be guaranteed to print "hello, world". (It might print the empty string, crash, or do something else.)
The kth receive on a channel with capacity C is synchronized before the completion of the k+Cth send from that channel completes.

(In other words, once the buffer is full, a further send cannot complete until a receive has freed a slot.)

This rule generalizes the previous rule to buffered channels. It allows a counting semaphore to be modeled by a buffered channel: the number of items in the channel corresponds to the number of active uses, the capacity of the channel corresponds to the maximum number of simultaneous uses, sending an item acquires the semaphore, and receiving an item releases the semaphore. This is a common idiom for limiting concurrency.
Limiting the number of running tasks

This program starts a goroutine for every entry in the work list, but the goroutines coordinate using the limit channel to ensure that at most three are running work functions at a time.

var limit = make(chan int, 3)

func main() {
	// work is assumed to be a []func() defined elsewhere.
	for _, w := range work {
		go func(w func()) {
			limit <- 1
			w()
			<-limit
		}(w)
	}
	select {}
}
Compare the two variants below. In the first, the semaphore send happens inside each goroutine, so all 16 goroutines are started immediately and "End" is logged right away; at most 4 tasks run at a time, and main must block (here with select {}) or the program would exit before they finish. In the second, the send happens before the go statement, so the loop itself blocks once 4 tasks are in flight; "End" is logged only after the loop has managed to start all 16 goroutines, and when main then returns, any tasks still running are killed.

func task(i int) {
	log.Println(" task,i ", i)
	time.Sleep(2 * time.Second)
}

func main() {
	log.Println("Start")
	ch := make(chan struct{}, 4)
	for i := 0; i < 16; i++ {
		go func(i int) {
			ch <- struct{}{} // acquire the semaphore inside the goroutine
			defer func() { <-ch }()
			task(i)
		}(i)
	}
	log.Println("End")
	select {} // block forever so the tasks can finish
}

func task(i int) {
	log.Println(" task,i ", i)
	time.Sleep(2 * time.Second)
}

func main() {
	log.Println("Start")
	ch := make(chan struct{}, 4)
	for i := 0; i < 16; i++ {
		ch <- struct{}{} // acquire the semaphore before starting the goroutine
		go func(i int) {
			defer func() { <-ch }()
			task(i)
		}(i)
	}
	log.Println("End")
}
Practice: controlling execution order / synchronization
2022/09/29 14:57:03 in
2022/09/29 14:57:03 in f
2022/09/29 14:57:11 out f
2022/09/29 14:57:11 out
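The code that produced this log is not included in the notes; the following is only a guess at its shape (function and message names are assumptions): main blocks on an unbuffered channel until f finishes, so the lines always appear in the order in, in f, out f, out.

package main

import (
	"log"
	"time"
)

var done = make(chan struct{})

func f() {
	log.Println("in f")
	time.Sleep(8 * time.Second) // simulate work
	log.Println("out f")
	done <- struct{}{} // the send is synchronized before the receive in main
}

func main() {
	log.Println("in")
	go f()
	<-done // wait until f has finished
	log.Println("out")
}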
Waiting
2022/09/30 16:17:44 out
2022/09/30 16:17:44 main-0
2022/09/30 16:17:48 in
2022/09/30 16:17:50 main-1
4) Locks

import "sync"

var l sync.Mutex
var a string

func f() {
	a = "hello, world"
	l.Unlock()
}

func main() {
	l.Lock()
	go f()
	l.Lock()
	print(a)
}

This is guaranteed to print "hello, world". For any sync.Mutex or sync.RWMutex variable l and n < m, call n of l.Unlock() is synchronized before call m of l.Lock() returns. Here the first call to l.Unlock() (in f) is synchronized before the second call to l.Lock() (in main) returns, which is sequenced before the print.
For any call to l.RLock on a sync.RWMutex variable l, there is an n such that the nth call to l.Unlock is synchronized before the return from l.RLock, and the matching call to l.RUnlock is synchronized before the return from call n+1 to l.Lock.
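A short sketch of the read-write lock in practice (mine, not from the notes): once a writer holding l.Lock calls l.Unlock, any reader whose l.RLock returns after that is guaranteed to observe the write.

import "sync"

var (
	l      sync.RWMutex
	config string
)

func update(v string) {
	l.Lock()
	config = v
	l.Unlock() // this Unlock is synchronized before later RLock calls return
}

func read() string {
	l.RLock() // readers may hold the read lock concurrently
	defer l.RUnlock()
	return config
}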
5) Once

The sync package provides a safe mechanism for initialization in the presence of multiple goroutines through the use of the Once type. The completion of a single call of f() from once.Do(f) is synchronized before the return of any call of once.Do(f).
import (
	"sync"
	"time"
)

var a string
var i int
var once sync.Once

func setup() {
	a = "hello, world"
	i += 1
}

func doprint() {
	once.Do(setup)
	println(a, i)
}

func twoprint() {
	go doprint()
	go doprint()
}

func main() {
	twoprint()
	time.Sleep(time.Second)
}

Both goroutines print "hello, world" and i == 1: setup runs exactly once, and its completion is synchronized before either once.Do(setup) call returns.
6) Atomic Values

The APIs in the sync/atomic package are collectively "atomic operations" that can be used to synchronize the execution of different goroutines. If the effect of an atomic operation A is observed by atomic operation B, then A is synchronized before B. All the atomic operations executed in a program behave as though executed in some sequentially consistent order.

The preceding definition has the same semantics as C++'s sequentially consistent atomics and Java's volatile variables.
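A minimal sketch (not in the original notes) using sync/atomic to publish a flag: because the atomic store is synchronized before the atomic load that observes it, the write to a is guaranteed to be visible.

package main

import (
	"sync/atomic"
	"time"
)

var a string
var done int32

func setup() {
	a = "hello, world"
	atomic.StoreInt32(&done, 1) // atomic store: synchronized before any load that observes it
}

func main() {
	go setup()
	for atomic.LoadInt32(&done) == 0 {
		time.Sleep(time.Millisecond) // avoid a tight busy loop
	}
	print(a) // guaranteed to print "hello, world"
}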
7) Finalizers

The runtime package provides a SetFinalizer function that adds a finalizer to be called when a particular object is no longer reachable by the program. A call to SetFinalizer(x, f) is synchronized before the finalization call f(x).
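A hedged sketch of runtime.SetFinalizer (the type name T is illustrative); note that finalizers run at the garbage collector's discretion, so the runtime.GC call and the sleep are only hints for the demo:

package main

import (
	"fmt"
	"runtime"
	"time"
)

type T struct{ name string }

func main() {
	t := &T{name: "demo"}
	runtime.SetFinalizer(t, func(t *T) {
		// The call to SetFinalizer is synchronized before this call.
		fmt.Println("finalizing", t.name)
	})
	t = nil      // drop the only reference to the object
	runtime.GC() // encourage collection (no guarantee the finalizer runs now)
	time.Sleep(100 * time.Millisecond)
}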
8) Additional Mechanisms

The sync package provides additional synchronization abstractions, including condition variables, lock-free maps, allocation pools, and wait groups. The documentation for each of these specifies the guarantees it makes concerning synchronization.

Other packages that provide synchronization abstractions should document the guarantees they make too.
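As one example of these abstractions (a sketch of mine, not from the notes), sync.WaitGroup follows the same pattern of guarantees: each wg.Done is synchronized before the matching wg.Wait returns, so main observes all the goroutines' writes.

package main

import (
	"fmt"
	"sync"
)

func main() {
	results := make([]int, 4)
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()    // Done is synchronized before Wait returns
			results[i] = i * i // each goroutine writes a distinct element
		}(i)
	}
	wg.Wait() // after this, all writes to results are visible
	fmt.Println(results)
}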
4. Incorrect synchronization

Programs with races are incorrect and can exhibit non-sequentially consistent executions. In particular, note that a read r may observe the value written by any write w that executes concurrently with r. Even if this occurs, it does not imply that reads happening after r will observe writes that happened before w.

In this program:

var a, b int

func f() {
	a = 1
	b = 2
}

func g() {
	print(b)
	print(a)
}

func main() {
	go f()
	g()
}

it can happen that g prints 2 and then 0.

A variant of the same experiment:

var a, b int

func f() {
	a = 1
	b = 2
}

func g() {
	println(a, b, a, b)
}

func main() {
	go f()
	g()
}
This fact invalidates a few common idioms.

Double-checked locking is an attempt to avoid the overhead of synchronization. For example, the twoprint program might be incorrectly written as:

var a string
var done bool
var once sync.Once

func setup() {
	a = "hello, world"
	done = true
}

func doprint() {
	if !done {
		once.Do(setup)
	}
	print(a)
}

func twoprint() {
	go doprint()
	go doprint()
}

but there is no guarantee that, in doprint, observing the write to done implies observing the write to a. This version can (incorrectly) print an empty string instead of "hello, world".
Another incorrect idiom is busy waiting for a value, as in:

var a string
var done bool

func setup() {
	a = "hello, world"
	done = true
}

func main() {
	go setup()
	for !done {
	}
	print(a)
}

As before, there is no guarantee that, in main, observing the write to done implies observing the write to a, so this program could print an empty string too. Worse, there is no guarantee that the write to done will ever be observed by main, since there are no synchronization events between the two threads. The loop in main is not guaranteed to finish.
There are subtler variants on this theme, such as this program.

type T struct {
	msg string
}

var g *T

func setup() {
	t := new(T)
	t.msg = "hello, world"
	g = t
}

func main() {
	go setup()
	for g == nil {
	}
	print(g.msg)
}

Even if main observes g != nil and exits its loop, there is no guarantee that it will observe the initialized value for g.msg.

In all these examples, the solution is the same: use explicit synchronization.
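For instance (a sketch of one possible fix, not the only one), the g *T example above becomes correct when the publication of g goes through a channel:

package main

type T struct {
	msg string
}

var g *T
var done = make(chan struct{})

func setup() {
	t := new(T)
	t.msg = "hello, world"
	g = t
	done <- struct{}{} // the send is synchronized before the receive in main
}

func main() {
	go setup()
	<-done       // wait for setup instead of spinning on g == nil
	print(g.msg) // guaranteed to print "hello, world"
}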
5. Incorrect compilation

Not introducing data races into race-free programs means not moving writes out of conditional statements in which they appear. For example, a compiler must not invert the conditional in this program:

*p = 1
if cond {
	*p = 2
}

That is, the compiler must not rewrite the program into this one:

*p = 2
if !cond {
	*p = 1
}

If cond is false and another goroutine is reading *p, then in the original program, the other goroutine can only observe any prior value of *p and 1. In the rewritten program, the other goroutine can observe 2, which was previously impossible.
Not introducing data races also means not assuming that loops terminate. For example, a compiler must in general not move the accesses to *p or *q ahead of the loop in this program:

n := 0
for e := list; e != nil; e = e.next {
	n++
}
i := *p
*q = 1

If list pointed to a cyclic list, then the original program would never access *p or *q, but the rewritten program would. (Moving *p ahead would be safe if the compiler can prove *p will not panic; moving *q ahead would also require the compiler proving that no other goroutine can access *q.)
Not introducing data races also means not assuming that called functions always return or are free of synchronization operations. For example, a compiler must not move the accesses to *p or *q ahead of the function call in this program (at least not without direct knowledge of the precise behavior of f):

f()
i := *p
*q = 1

If the call never returned, then once again the original program would never access *p or *q, but the rewritten program would. And if the call contained synchronizing operations, then the original program could establish happens before edges preceding the accesses to *p and *q, but the rewritten program would not.
Not allowing a single read to observe multiple values means not reloading local variables from shared memory. For example, a compiler must not discard i and reload it a second time from *p in this program:

i := *p
if i < 0 || i >= len(funcs) {
	panic("invalid function index")
}
... complex code ...
// compiler must NOT reload i = *p here
funcs[i]()

If the complex code needs many registers, a compiler for single-threaded programs could discard i without saving a copy and then reload i = *p just before funcs[i](). A Go compiler must not, because the value of *p may have changed. (Instead, the compiler could spill i to the stack.)

Not allowing a single write to write multiple values also means not using the memory where a local variable will be written as temporary storage before the write. For example, a compiler must not use *p as temporary storage in this program:

*p = i + *p/2
That is, it must not rewrite the program into this one:

*p /= 2
*p += i

If i and *p start equal to 2, the original code does *p = 3, so a racing thread can read only 2 or 3 from *p. The rewritten code does *p = 1 and then *p = 3, allowing a racing thread to read 1 as well.
Note that all these optimizations are permitted in C/C++ compilers: a Go compiler sharing a back end with a C/C++ compiler must take care to disable optimizations that are invalid for Go.

Note that the prohibition on introducing data races does not apply if the compiler can prove that the races do not affect correct execution on the target platform. For example, on essentially all CPUs, it is valid to rewrite
n := 0
for i := 0; i < m; i++ {
	n += *shared
}

into:

n := 0
local := *shared
for i := 0; i < m; i++ {
	n += local
}

provided it can be proved that *shared will not fault on access, because the potential added read will not affect any existing concurrent reads or writes. On the other hand, the rewrite would not be valid in a source-to-source translator.
Go programmers writing data-race-free programs can rely on sequentially consistent execution of those programs, just as in essentially all other modern programming languages.

When it comes to programs with races, both programmers and compilers should remember the advice: don't be clever.
The Go Memory Model - The Go Programming Language https://go.dev/ref/mem