Torn writes, aligned-address writes, multi-core, memory barriers

12 | atomic: To guarantee atomic operations, be sure to use these methods https://time.geekbang.org/column/article/304127

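Is an assignment to an address an atomic operation?

This is an interesting question: if assignment is already atomic, what is the atomic package for? The official documentation does not address it specifically, but whenever the question comes up in an issue or on a forum, the advice is always to use the atomic package.

Dave Cheney has written about this question and explained it very well. Here is a summary of his main points, which should make the difference between using atomic and operating on memory directly easier to understand.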

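On today's systems, the address of a write is almost always aligned. For example, with a 32-bit operating system, CPU, and compiler, a write address is always a multiple of 4; on a 64-bit system it is always a multiple of 8 (recall how WaitGroup handles its state1 field differently on 64-bit and 32-bit systems). A write to an aligned address never lets anyone else see half-written data, because the write completes in a single operation. If the address is not aligned, the processor has to split the write into two operations, and if only one of them has completed, other observers see a wrong, half-updated value. This is called a torn write. So you can regard the assignment of an aligned, word-sized value as an atomic operation, where "atomic" means only that the integrity of the data is guaranteed.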
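The alignment rule above can be checked with a few lines of Go. This is a minimal sketch: isAligned is a hypothetical helper (not part of any library), and the raw addresses 0x1004 and 0x1006 are made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// isAligned reports whether addr is a multiple of size.
// size is assumed to be a power of two (4 for a 32-bit value, 8 for a 64-bit value).
func isAligned(addr, size uintptr) bool {
	return addr&(size-1) == 0
}

func main() {
	var x uint32
	addr := uintptr(unsafe.Pointer(&x))
	// The Go compiler places a uint32 variable on a 4-byte boundary.
	fmt.Println(isAligned(addr, unsafe.Sizeof(x)))

	// Hypothetical raw addresses, for illustration only.
	fmt.Println(isAligned(0x1004, 4)) // true
	fmt.Println(isAligned(0x1006, 4)) // false
}
```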

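On modern multiprocessor, multi-core systems, however, caches, instruction reordering, and visibility mean that we ask more of an atomic operation. In a multi-core system, when one core changes the value at an address, the new value sits in several levels of cache before it reaches main memory. During that window different cores can see different data: the other cores may not have seen the update yet and are still using the old value.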

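To deal with this, multiprocessor, multi-core systems use memory barriers (also called memory fences). A write memory barrier tells the processor that it must wait until the outstanding operations in its pipeline, writes in particular, have been flushed to memory before it continues. The operation also invalidates the relevant cache lines held by other processors, so that they fetch the latest value from main memory.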

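The functions in the atomic package provide memory-barrier behavior, so atomic guarantees not only the integrity of the assigned data but also its visibility: once one core updates the value at an address, other processors always read the latest value. Note, however, that because the processors must keep the data consistent among themselves, atomic operations also cost some performance.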
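As an illustration of the visibility guarantee, here is a minimal sketch of a stop flag shared between two goroutines. The function runUntilStopped and its variable names are hypothetical; the only library calls are atomic.LoadInt32, atomic.StoreInt32, and runtime.Gosched.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// runUntilStopped starts a worker that loops until a stop flag is set,
// sets the flag from the calling goroutine, and waits for the worker to exit.
// It returns the number of loop iterations the worker completed.
func runUntilStopped() int {
	var started, stop int32 // 0 = false, 1 = true
	iterations := 0         // written only by the worker, read after <-done
	done := make(chan struct{})

	go func() {
		defer close(done)
		for atomic.LoadInt32(&stop) == 0 {
			iterations++
			atomic.StoreInt32(&started, 1)
			runtime.Gosched() // let other goroutines run
		}
	}()

	// Wait until the worker has completed at least one iteration.
	for atomic.LoadInt32(&started) == 0 {
		runtime.Gosched()
	}
	atomic.StoreInt32(&stop, 1) // the worker observes this store and exits
	<-done                      // closing done orders the worker's writes before this read
	return iterations
}

func main() {
	fmt.Println(runUntilStopped() >= 1)
}
```

Replacing the atomic calls on stop with a plain read and a plain write would be a data race, and the race detector would report it.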

 If aligned memory writes are atomic, why do we need the sync/atomic package? | Dave Cheney https://dave.cheney.net/2018/01/06/if-aligned-memory-writes-are-atomic-why-do-we-need-the-sync-atomic-package

If aligned memory writes are atomic, why do we need the sync/atomic package?

This is a post inspired by a question on the Go Forum. The question, paraphrased, was “If properly aligned writes are guaranteed to be atomic by the processor, why does the race detector complain?”

The answer is, there are two uses of the word atomic in play here. The first, the one the OP references, is a property of most microprocessors that, as long as the address of the write is naturally aligned–if it’s a 32-bit value, say, then it is always written to an address which is a multiple of four–then nothing will observe a half written value.

To explain what that means, consider the opposite, an unaligned write where a 32-bit value is written to an address whose bottom two bits are not zero. In this case the processor has to split the write into two, spanning the boundary. This is known as a torn write as an observer on the bus could see this partially updated value.[1]
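To make the half-written value concrete, the following sketch models a torn write in plain Go. It is a simplified model and not real hardware behavior: the function tornWrite is hypothetical, and it assumes the store is split into two 16-bit halves with the low half written first.

```go
package main

import "fmt"

// tornWrite models an unaligned 32-bit store that is split into two halves.
// partial is what an observer sees after only the low 16 bits have been
// written; final is the value once both halves have landed.
func tornWrite(old, next uint32) (partial, final uint32) {
	partial = (old & 0xFFFF0000) | (next & 0x0000FFFF)
	final = next
	return partial, final
}

func main() {
	partial, final := tornWrite(0x11112222, 0xAAAABBBB)
	fmt.Printf("%#x %#x\n", partial, final) // 0x1111bbbb 0xaaaabbbb
}
```

The partial value 0x1111bbbb is neither the old value nor the new one, which is exactly what an aligned write rules out.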

These words come from a time before multiple processors were common. At that time the observers of a torn read or write would most likely be other agents on the ISA, VESA, or PCI bus like disk controllers or video cards. However, we now live in the multi-core age so we need to talk about caches and visibility.

Since almost the beginning of computing, the CPU has run faster than main memory. That is to say, the performance of a computer is strongly related to the performance of its memory. This is known as the processor/memory gap. To bridge this gap processors have adopted caches which store recently accessed memory in a small, fast store closer to the processor.[2] Because caches also buffer writes back to main memory, while the property that an aligned address will be atomic remains, when that write occurs has become less deterministic.[3] This is the domain of the second use of the word atomic, the one implemented by the sync/atomic package.

In a modern multiprocessor system, a write to main memory will be buffered in multiple levels of caches before hitting main memory. This is done to hide the latency of main memory, but in doing so it means that communicating between processors using main memory is now imprecise; a value read from memory may have already been overwritten by one processor, however the new value has not made its way through the various caches yet.

To solve this ambiguity you need to use a memory fence, also known as a memory barrier. A memory write barrier operation tells the processor that it has to wait until all the outstanding operations in its pipeline, specifically writes, have been flushed to main memory. This operation also invalidates the caches[4] held by other processors, forcing them to retrieve the new value directly from memory. The same is true for reads, you use a memory read barrier to tell the processor to stop and synchronise with any outstanding writes to memory.

In terms of Go, read and write memory barrier operations are handled by the sync/atomic package, specifically the family of atomic.Load and atomic.Store functions respectively.[5]

In answer to the OP’s question: to safely use a value in memory as a communication channel between two goroutines, the race detector will complain unless the sync/atomic package is used.
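A minimal sketch of using a value in memory as a communication channel between two goroutines, built on atomic.StoreUint32 and atomic.LoadUint32. The function exchange and the names payload and ready are hypothetical, chosen only for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// exchange hands value to a second goroutine through ordinary memory.
// The writer fills payload, then raises the ready flag with an atomic store.
// The reader waits on the flag with atomic loads before touching payload.
func exchange(value int) int {
	var payload int  // plain variable, written before the flag is raised
	var ready uint32 // 0 = not ready, 1 = payload is valid
	result := make(chan int)

	go func() {
		for atomic.LoadUint32(&ready) == 0 {
			runtime.Gosched() // yield while waiting
		}
		result <- payload // safe: the atomic load observed the atomic store
	}()

	payload = value
	atomic.StoreUint32(&ready, 1)
	return <-result
}

func main() {
	fmt.Println(exchange(42)) // 42
}
```

Under the Go memory model, an atomic load that observes an atomic store is synchronized after it, so the plain write to payload is visible to the reader. If ready were read and written without sync/atomic, the same program would be reported by go test -race.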

  1. Historically, most microprocessors, but notably not Intel, made unaligned writes illegal, causing a fault if an unaligned read or write was attempted. This simplified the design of the processor at a time when transistors were expensive by removing the requirement to translate unaligned loads and stores into the strictly aligned requirements of the memory sub-system. Today however, almost all microprocessors have evolved to permit unaligned access, at the cost of performance and the loss of the atomic write property.
  2. The first production computer to feature a cache was the IBM System/360 Model 85.
  3. This is a gross oversimplification. At the hardware level ranges of physical addresses are required to be uncached for read, or obey write-through, rather than write-back, semantics. For the discussion of memory visibility between two goroutines in the same virtual address space, these details can be safely ignored.
  4. nitpicker’s note: technically the cache line is invalidated 
  5. Even though most processors allow unaligned reads and writes, atomic operations on memory require the address to be naturally aligned as the communication between processors is handled by the cache, which operates in terms of cache lines which are usually 64 bytes long. An unaligned read or write could therefore span two cache lines, which would be impossible to atomically synchronise across processors.
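The cache-line point in note 5 comes down to simple address arithmetic. A minimal sketch, assuming 64-byte cache lines as the note does; spansCacheLines is a hypothetical helper and the addresses are made up.

```go
package main

import "fmt"

// spansCacheLines reports whether an access of size bytes starting at addr
// touches more than one cache line of lineSize bytes.
func spansCacheLines(addr, size, lineSize uintptr) bool {
	first := addr / lineSize             // line holding the first byte
	last := (addr + size - 1) / lineSize // line holding the last byte
	return first != last
}

func main() {
	fmt.Println(spansCacheLines(56, 8, 64)) // false: bytes 56..63 fit in line 0
	fmt.Println(spansCacheLines(60, 8, 64)) // true: bytes 60..67 cross into line 1
}
```

A naturally aligned 8-byte value can never cross a 64-byte boundary, because 64 is a multiple of 8.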

This entry was posted in Go, Programming and tagged atomic, data race on January 6, 2018.

 

 Go test -race and uint8 variables - Technical Discussion - Go Forum https://forum.golangbridge.org/t/go-test-race-and-uint8-variables/7707

 

 

Torn write prevention (Chinese edition) - Amazon Elastic Compute Cloud https://docs.aws.amazon.com/zh_cn/AWSEC2/latest/UserGuide/storage-twp.html

Torn write prevention - Amazon Elastic Compute Cloud https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/storage-twp.html

Torn write prevention

Torn write prevention is a block storage feature designed by AWS to improve the performance of your I/O-intensive relational database workloads and reduce latency without negatively impacting data resiliency. Relational databases that use InnoDB or XtraDB as the database engine, such as MySQL and MariaDB, will benefit from torn write prevention.

Typically, relational databases that use pages larger than the power fail atomicity of the storage device use data logging mechanisms to protect against torn writes. MariaDB and MySQL use a doublewrite buffer file to log data before writing it to data tables. In the event of incomplete or torn writes, as a result of operating system crashes or power loss during write transactions, the database can recover the data from the doublewrite buffer. The additional I/O overhead associated with writing to the doublewrite buffer impacts database performance and application latency, and it reduces the number of transactions that can be processed per second. For more information about doublewrite buffer, see the MariaDB and MySQL documentation.

With torn write prevention, data is written to storage in all-or-nothing write transactions, which eliminates the need for using the doublewrite buffer. This prevents partial, or torn, data from being written to storage in the event of operating system crashes or power loss during write transactions. The number of transactions processed per second can be increased by up to 30 percent, and write latency can be decreased by up to 50 percent, without compromising the resiliency of your workloads.
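The doublewrite idea described above can be sketched as a toy in-memory model. This is not how InnoDB or XtraDB is implemented: the page and store types, the 16-byte page size, the CRC32 checksum, and the crashAfter parameter are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

const pageSize = 16 // tiny page, for illustration only

// page is a data page together with a checksum of its contents.
type page struct {
	data [pageSize]byte
	sum  uint32
}

// newPage builds a page filled with one byte value and a matching checksum.
func newPage(fill byte) page {
	var p page
	for i := range p.data {
		p.data[i] = fill
	}
	p.sum = crc32.ChecksumIEEE(p.data[:])
	return p
}

// valid reports whether the stored checksum matches the page contents.
func (p *page) valid() bool {
	return crc32.ChecksumIEEE(p.data[:]) == p.sum
}

// store models one doublewrite slot and one slot in the data table.
type store struct {
	doublewrite page
	table       page
}

// write logs p to the doublewrite slot first, then writes it to the table.
// crashAfter (0..pageSize) is the number of bytes that reach the table before
// a simulated crash; pageSize means the write completes.
func (s *store) write(p page, crashAfter int) {
	s.doublewrite = p // assumed to complete before the table write starts
	if crashAfter >= pageSize {
		s.table = p
		return
	}
	// Torn write: only a prefix of the new data lands, and the checksum does not.
	copy(s.table.data[:crashAfter], p.data[:crashAfter])
}

// restore repairs a torn table page from the doublewrite copy.
func (s *store) restore() {
	if !s.table.valid() && s.doublewrite.valid() {
		s.table = s.doublewrite
	}
}

func main() {
	s := &store{table: newPage('A')}
	s.write(newPage('B'), 8)      // crash halfway through the table write
	fmt.Println(s.table.valid())  // false: the page is torn
	s.restore()
	fmt.Println(s.table == newPage('B')) // true: recovered from the doublewrite copy
}
```

Every page is written twice in this model, which is the extra I/O that an all-or-nothing storage write makes unnecessary.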

posted @ 2023-05-14 00:15  papering