Go Memory Model: synchronization and goroutines

Summary:

1. Serialize access

 

Programs that modify data being simultaneously accessed by multiple goroutines must serialize such access.

To serialize access, protect the data with channel operations or other synchronization primitives such as those in the sync and sync/atomic packages.

2. An implementation may halt the program when it detects a data race

First, any implementation can, upon detecting a data race, report the race and halt execution of the program. Implementations using ThreadSanitizer (accessed with “go build -race”) do exactly this.

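3. Synchronization

1) Initialization

Program initialization runs in a single goroutine, but that goroutine may create other goroutines, which run concurrently.

If a package p imports package q, the completion of q's init functions happens before the start of any of p's.

The completion of all init functions is synchronized before the start of the function main.main.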

2) Goroutine creation

var a string

func f() {
	print(a)
}

func hello() {
	a = "hello, world"
	go f()
}

  

The go statement that starts a new goroutine is synchronized before the start of the goroutine's execution.

Calling hello will print "hello, world" at some point in the future (perhaps after hello has returned).
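The rule can be exercised with a small variation on hello. This is a sketch, not part of the original example: the function startAndCollect and the sync.WaitGroup are additions so that the caller waits for the goroutine and returns what it saw instead of printing at some later time.

```go
package main

import (
	"fmt"
	"sync"
)

// startAndCollect writes msg before the go statement, so the new goroutine
// is guaranteed to observe it. The WaitGroup makes the caller wait for the
// goroutine to finish before reading its result.
func startAndCollect(msg string) string {
	var (
		a    string
		seen string
		wg   sync.WaitGroup
	)
	a = msg // sequenced before the go statement
	wg.Add(1)
	go func() {
		defer wg.Done()
		seen = a // the go statement is synchronized before this read
	}()
	wg.Wait() // the call to Done is synchronized before Wait returns
	return seen
}

func main() {
	fmt.Println(startAndCollect("hello, world"))
}
```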
 
3) Goroutine destruction

The exit of a goroutine is not guaranteed to be synchronized before any event in the program. For example, in this program:

var a string

func hello() {
	go func() { a = "hello" }()
	print(a)
}

the assignment to a is not followed by any synchronization event, so it is not guaranteed to be observed by any other goroutine. In fact, an aggressive compiler might delete the entire go statement.

If the effects of a goroutine must be observed by another goroutine, use a synchronization mechanism such as a lock or channel communication to establish a relative ordering.
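One possible way to establish that ordering is sketched below. The function helloSynced and the done channel are hypothetical additions: the goroutine closes done after its write, and the caller receives from done before reading. The Go memory model states that the closing of a channel is synchronized before a receive that returns because the channel is closed.

```go
package main

import "fmt"

// helloSynced is a synchronized variant of hello: the goroutine closes done
// after writing a, and the caller receives from done before reading a.
func helloSynced() string {
	var a string
	done := make(chan struct{})
	go func() {
		a = "hello"
		close(done) // synchronized before the receive below returns
	}()
	<-done // returns only once the channel has been closed
	return a
}

func main() {
	fmt.Println(helloSynced())
}
```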
 
4) Channel communication

Channel communication is the main method of synchronization between goroutines. Each send on a particular channel is matched to a corresponding receive from that channel, usually in a different goroutine.

A send on a channel is synchronized before the completion of the corresponding receive from that channel.

 

var c = make(chan int, 10)
var a string

func f() {
	a = "hello, world"
	c <- 0
}

func main() {
	go f()
	<-c
	print(a)
}

(Buffered channel: this program is guaranteed to print "hello, world".)

The write to a is sequenced before the send on c, which is synchronized before the corresponding receive on c completes, which is sequenced before the print.

 

var c = make(chan int)
var a string

func f() {
	a = "hello, world"
	<-c
}

func main() {
	go f()
	c <- 0
	print(a)
}

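(Unbuffered channel: this program is also guaranteed to print "hello, world", because a receive from an unbuffered channel is synchronized before the completion of the corresponding send on that channel.)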

 
var c = make(chan int, 10)
var a string

func f() {
	a = "hello, world"
	<-c
}

func main() {
	go f()
	c <- 0
	print(a)
}

(Changed to a buffered channel: the program is no longer guaranteed to print "hello, world".)

Compare the three programs above.

If the channel were buffered (e.g., c = make(chan int, 1)) then the program would not be guaranteed to print "hello, world". (It might print the empty string, crash, or do something else.)

 

The kth receive from a channel with capacity C is synchronized before the completion of the k+Cth send on that channel.

(In other words, once the buffer is full, a further send cannot complete until an earlier receive has freed a slot.)

This rule generalizes the previous rule to buffered channels. It allows a counting semaphore to be modeled by a buffered channel: the number of items in the channel corresponds to the number of active uses, the capacity of the channel corresponds to the maximum number of simultaneous uses, sending an item acquires the semaphore, and receiving an item releases the semaphore. This is a common idiom for limiting concurrency.

Limiting the number of running tasks

This program starts a goroutine for every entry in the work list, but the goroutines coordinate using the limit channel to ensure that at most three are running work functions at a time.

 

var limit = make(chan int, 3)

func main() {
	for _, w := range work {
		go func(w func()) {
			limit <- 1
			w()
			<-limit
		}(w)
	}
	select{}
}

Compare the following two variants, which differ in where the semaphore is acquired:

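package main

import (
	"log"
	"time"
)

func task(i int) {
	log.Println(" task,i ", i)
	time.Sleep(2 * time.Second)
}

// Variant 1: the semaphore is acquired inside each goroutine.
// All 16 goroutines are created at once, but at most 4 run task at a time.
func main() {
	log.Println("Start")
	ch := make(chan struct{}, 4)
	for i := 0; i < 16; i++ {
		go func(i int) {
			ch <- struct{}{}        // acquire
			defer func() { <-ch }() // release
			task(i)
		}(i)
	}
	log.Println("End") // logged immediately: the loop never blocks
	// Block main so the goroutines can run. Once every task has finished,
	// the runtime reports a deadlock because no goroutine can make progress.
	select {}
}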

  

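package main

import (
	"log"
	"time"
)

func task(i int) {
	log.Println(" task,i ", i)
	time.Sleep(2 * time.Second)
}

// Variant 2: main acquires the semaphore before starting each goroutine.
// At most 4 goroutines exist at a time, and the loop blocks while 4 are busy.
func main() {
	log.Println("Start")
	ch := make(chan struct{}, 4)
	for i := 0; i < 16; i++ {
		ch <- struct{}{} // acquire in main; blocks while 4 tasks are running
		go func(i int) {
			defer func() { <-ch }() // release
			task(i)
		}(i)
	}
	log.Println("End")
	// main returns without waiting, so tasks still running at this point
	// (up to the last 4) are cut off when the program exits.
}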

Practice:

Controlling execution order (synchronization)

var blockedWaitIt = make(chan struct{})

func f() {
    log.Println("in f")
    time.Sleep(8 * time.Second)
    log.Println("out f")
    blockedWaitIt <- struct{}{}
}
func main() {
    log.Println("in")
    go f()
    <-blockedWaitIt
    log.Println("out")
}

2022/09/29 14:57:03 in
2022/09/29 14:57:03 in f
2022/09/29 14:57:11 out f
2022/09/29 14:57:11 out

 

Waiting (main sleeps long enough for the goroutine to finish)

func f_go_long(){
    go func(){
        time.Sleep(4*time.Second)
        log.Println("in")
    }()
    log.Println("out")
}
func main() {
    f_go_long()
    log.Println("main-0")
    time.Sleep(6*time.Second)
    log.Println("main-1")
}
 

2022/09/30 16:17:44 out
2022/09/30 16:17:44 main-0
2022/09/30 16:17:48 in
2022/09/30 16:17:50 main-1

 

 
Locks

import "sync"

var l sync.Mutex
var a string

func f() {
	a = "hello, world"
	l.Unlock()
}

func main() {
	l.Lock()
	go f()
	l.Lock()
	print(a)
}

 

For any sync.Mutex or sync.RWMutex variable l and n < m, call n of l.Unlock() is synchronized before call m of l.Lock() returns.

The program above is guaranteed to print "hello, world". The first call to l.Unlock() (in f) is synchronized before the second call to l.Lock() (in main) returns, which is sequenced before the print.

For any call to l.RLock on a sync.RWMutex variable l, there is an n such that the nth call to l.Unlock is synchronized before the return from l.RLock, and the matching call to l.RUnlock is synchronized before the return from call n+1 to l.Lock.
 
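5) Once

The sync package provides a safe mechanism for initialization in the presence of multiple goroutines through the use of the Once type. Multiple threads can execute once.Do(f) for a particular f, but only one will run f(), and the other calls block until f() has returned.

The completion of a single call of f() from once.Do(f) is synchronized before the return of any call of once.Do(f).

import (
	"sync"
	"time"
)
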
var a string
var i int
var once sync.Once

func setup() {
	a = "hello, world"
	i+=1
}

func doprint() {
	once.Do(setup)
	println(a,i)
}

func twoprint() {
	go doprint()
	go doprint()
}

func main() {
	twoprint()
	time.Sleep(time.Second)
}

6) Atomic values

 

The APIs in the sync/atomic package are collectively “atomic operations” that can be used to synchronize the execution of different goroutines. If the effect of an atomic operation A is observed by atomic operation B, then A is synchronized before B. All the atomic operations executed in a program behave as though executed in some sequentially consistent order.

The preceding definition has the same semantics as C++’s sequentially consistent atomics and Java’s volatile variables.
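A minimal sketch of that rule in use: a plain string is published through an atomic flag. The names publishAndRead and ready are assumptions for this example. The polling loop is kept only to make the observation explicit; a channel would usually be the more idiomatic way to wait.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// publishAndRead publishes a plain string through an atomic flag.
func publishAndRead() string {
	var (
		a     string
		ready int32
	)
	go func() {
		a = "hello, world"           // plain write, sequenced before the store
		atomic.StoreInt32(&ready, 1) // atomic operation A
	}()
	// Atomic operation B: once a load observes the store, A is
	// synchronized before B, so the write to a is visible as well.
	for atomic.LoadInt32(&ready) == 0 {
		runtime.Gosched() // yield instead of spinning hot
	}
	return a
}

func main() {
	fmt.Println(publishAndRead())
}
```

If ready were a plain variable read and written without sync/atomic, the loop would be the incorrect busy-wait idiom and nothing would be guaranteed.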

7) Finalizers

The runtime package provides a SetFinalizer function that adds a finalizer to be called when a particular object is no longer reachable by the program. A call to SetFinalizer(x, f) is synchronized before the finalization call f(x).

8) Additional mechanisms

The sync package provides additional synchronization abstractions, including condition variables, lock-free maps, allocation pools, and wait groups. The documentation for each of these specifies the guarantees it makes concerning synchronization.

Other packages that provide synchronization abstractions should document the guarantees they make too.

 

4. Incorrect synchronization

var a, b int

func f() {
	a = 1
	b = 2
}

// g reads a and b without any synchronization, so it races with f.
func g() {
	println(a, b, a, b)
}

func main() {
	go f()
	g()
}

Other variants of g tried in the same experiment:

func g() {
	print(b)
	print(a)
}

func g() {
	print(b)
	print(a)
	println(a, b)
}

Because f and g race, different runs and different variants of g can print different results.


Programs with races are incorrect and can exhibit non-sequentially consistent executions. In particular, note that a read r may observe the value written by any write w that executes concurrently with r. Even if this occurs, it does not imply that reads happening after r will observe writes that happened before w.

In this program:

var a, b int

func f() {
	a = 1
	b = 2
}

func g() {
	print(b)
	print(a)
}

func main() {
	go f()
	g()
}

it can happen that g prints 2 and then 0.

This fact invalidates a few common idioms.

Double-checked locking is an attempt to avoid the overhead of synchronization. For example, the twoprint program might be incorrectly written as:

var a string
var done bool

func setup() {
	a = "hello, world"
	done = true
}

func doprint() {
	if !done {
		once.Do(setup)
	}
	print(a)
}

func twoprint() {
	go doprint()
	go doprint()
}

but there is no guarantee that, in doprint, observing the write to done implies observing the write to a. This version can (incorrectly) print an empty string instead of "hello, world".

Another incorrect idiom is busy waiting for a value, as in:

var a string
var done bool

func setup() {
	a = "hello, world"
	done = true
}

func main() {
	go setup()
	for !done {
	}
	print(a)
}

As before, there is no guarantee that, in main, observing the write to done implies observing the write to a, so this program could print an empty string too. Worse, there is no guarantee that the write to done will ever be observed by main, since there are no synchronization events between the two threads. The loop in main is not guaranteed to finish.

There are subtler variants on this theme, such as this program.

type T struct {
	msg string
}

var g *T

func setup() {
	t := new(T)
	t.msg = "hello, world"
	g = t
}

func main() {
	go setup()
	for g == nil {
	}
	print(g.msg)
}

Even if main observes g != nil and exits its loop, there is no guarantee that it will observe the initialized value for g.msg.

In all these examples, the solution is the same: use explicit synchronization.
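As one possible form of explicit synchronization for the last example, the sketch below hands the pointer over a channel instead of polling a global. The channel parameter and the function waitForSetup are assumptions made for this illustration; a mutex or sync.Once would work as well.

```go
package main

import "fmt"

type T struct {
	msg string
}

// setup builds the value and then sends the pointer over the channel.
// The write to t.msg is sequenced before the send.
func setup(out chan<- *T) {
	t := new(T)
	t.msg = "hello, world"
	out <- t
}

// waitForSetup blocks on the receive instead of busy-waiting on a global.
// The send is synchronized before the receive completes, so the receiver
// is guaranteed to observe the initialized msg field.
func waitForSetup() string {
	ch := make(chan *T)
	go setup(ch)
	g := <-ch
	return g.msg
}

func main() {
	fmt.Println(waitForSetup())
}
```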

5. Incorrect compilation

The Go memory model restricts compiler optimizations as much as it does Go programs. Some compiler optimizations that would be valid in single-threaded programs are not valid in all Go programs. In particular, a compiler must not introduce writes that do not exist in the original program, it must not allow a single read to observe multiple values, and it must not allow a single write to write multiple values.

All the following examples assume that `*p` and `*q` refer to memory locations accessible to multiple goroutines.

Not introducing data races into race-free programs means not moving writes out of conditional statements in which they appear. For example, a compiler must not invert the conditional in this program:

*p = 1
if cond {
	*p = 2
}

That is, the compiler must not rewrite the program into this one:

*p = 2
if !cond {
	*p = 1
}

If cond is false and another goroutine is reading *p, then in the original program, the other goroutine can only observe any prior value of *p and 1. In the rewritten program, the other goroutine can observe 2, which was previously impossible.

Not introducing data races also means not assuming that loops terminate. For example, a compiler must in general not move the accesses to *p or *q ahead of the loop in this program:

n := 0
for e := list; e != nil; e = e.next {
	n++
}
i := *p
*q = 1

If list pointed to a cyclic list, then the original program would never access *p or *q, but the rewritten program would. (Moving `*p` ahead would be safe if the compiler can prove `*p` will not panic; moving `*q` ahead would also require the compiler proving that no other goroutine can access `*q`.)

Not introducing data races also means not assuming that called functions always return or are free of synchronization operations. For example, a compiler must not move the accesses to *p or *q ahead of the function call in this program (at least not without direct knowledge of the precise behavior of f):

f()
i := *p
*q = 1

  If the call never returned, then once again the original program would never access *p or *q, but the rewritten program would. And if the call contained synchronizing operations, then the original program could establish happens before edges preceding the accesses to *p and *q, but the rewritten program would not. 

 
Not allowing a single read to observe multiple values means not reloading local variables from shared memory. For example, a compiler must not discard i and reload it a second time from *p in this program:
i := *p
if i < 0 || i >= len(funcs) {
	panic("invalid function index")
}
... complex code ...
// compiler must NOT reload i = *p here
funcs[i]()

  

If the complex code needs many registers, a compiler for single-threaded programs could discard i without saving a copy and then reload i = *p just before funcs[i](). A Go compiler must not, because the value of *p may have changed. (Instead, the compiler could spill i to the stack.)
 
Not allowing a single write to write multiple values also means not using the memory where a local variable will be written as temporary storage before the write. For example, a compiler must not use *p as temporary storage in this program:

*p = i + *p/2

That is, it must not rewrite the program into this one:

*p /= 2
*p += i

  

If i and *p start equal to 2, the original code does *p = 3, so a racing thread can read only 2 or 3 from *p. The rewritten code does *p = 1 and then *p = 3, allowing a racing thread to read 1 as well.

Note that all these optimizations are permitted in C/C++ compilers: a Go compiler sharing a back end with a C/C++ compiler must take care to disable optimizations that are invalid for Go.

Note that the prohibition on introducing data races does not apply if the compiler can prove that the races do not affect correct execution on the target platform. For example, on essentially all CPUs, it is valid to rewrite

n := 0
for i := 0; i < m; i++ {
	n += *shared
}

 into:

n := 0
local := *shared
for i := 0; i < m; i++ {
	n += local
}

provided it can be proved that *shared will not fault on access, because the potential added read will not affect any existing concurrent reads or writes. On the other hand, the rewrite would not be valid in a source-to-source translator.

 

 

6. Sequentially consistent execution

Go programmers writing data-race-free programs can rely on sequentially consistent execution of those programs, just as in essentially all other modern programming languages.

When it comes to programs with races, both programmers and compilers should remember the advice: don't be clever.

 

The Go Memory Model - The Go Programming Language https://go.dev/ref/mem

 

posted @ 2022-09-04 20:44  papering