《DDIA》读书笔记:SSTable and LSM Trees

本文是第三章SSTable and LSM-Trees部分的读书笔记。

这部分包括的内容为

B-Trees的优点在于读更快, LSM-Trees的优点在于写更快

Hash Index + log

内存中存索引，磁盘上存log

需要面对的问题

log的格式
如何删除记录。方式：插入一条“删除”的记录，merging的时候，该记录会通知merging线程删掉对应的数据。If you want to delete a key and its associated value, you have to append a special deletion record to the data file(sometimes called a tombstone). When log segments are merged, the tombstone tells the merging process to discard any previous values for the deleted key.
crach recovery
Partially written records。写到一半就故障了怎么办，一种方案是引入checksum
并发控制。通常是只有一个writer，有多个reader，不需要锁

一个SSTable等价于一个log segment

SSTable特点：

segment内有序。
segment内key唯一。
合并segment非常的高效(因为segment内部是有序的)
如果使用hash索引，不用在内存中为每个key分配一个index，而是一个index对应很多key-value对，这样就减少了内存中需要维护的数据结构的大小
可以压缩一系列的key-value对后再写入segment，从而节省磁盘空间

hash索引的例子：in-memory index + Sorted segment file(SSTable) on disk

处理写请求

内存中维护一个平衡树结构memtable来处理写请求。When a write comes in, add it to an in-memory balanced tree data structure. This in-memory tree is sometimes called a memtable
内存中的memtable大小超过一定阀值后，写入到磁盘存储为SSTable文件。When the memtable gets bigger than some threshold—typically a few megabytes —write it out to disk as an SSTable file. This can be done efficiently because the tree already maintains the key-value pairs sorted by key. The new SSTable file becomes the most recent segment of the database. While the SSTable is being written out to disk, writes can continue to a new memtable instance.

处理读请求

处理读请求的时候，首先从memtable中找，然后从最新的磁盘上的SSTable文件找，没找到就依次找旧的SSTable文件。In order to serve a read request, first try to find the key in the memtable, then in the most recent on-disk segment, then in the next-older segment, etc.
因为读数据的时候需要一个一个的SSTable找，所以如果数据不存在，就需要找遍所有的segment，这会导致速度极其慢。一个通常的做法是额外维护一个bloom filter

merging和compaction

有个后台线程一直运行，负责合并(merging)和压缩(compaction)，并且负责抛弃重复或被删除的值。From time to time, run a merging and compaction process in the background to combine segment files and to discard overwritten or deleted values.

为了保证服务重启后memtable中的内容不会丢失，额外使用一个WAL日志来记录memtable中的内容，每次memtable被写入到磁盘后，该WAL日志就删除掉

size-tiered. 新的segment一直和旧的segment合并。newer and smaller SSTables are successively merged into older and larger SSTables.
leveled. the key range is split up into smaller SSTables and older data is moved into separate “levels,” which allows the compaction to proceed more incrementally and use less disk space.

参考:
Designing Data-Intensive Applications https://book.douban.com/subject/26197294/

posted @ 2022-02-10 18:12 elimsc 阅读(48) 评论(0) 编辑收藏举报

刷新页面返回顶部