CMU_15445_2023Fall_Project1
LRU-K Algorithm
LRU-K combines LRU (least recently used) with LFU (least frequently used).
LRU-K Page Replacement Algorithm Definition
Assume we are given a set \(N = \{1, 2, \ldots, n\}\) of disk pages, denoted by positive integers, and that the database system under study makes a succession of references to these pages specified by the reference string: \(r_1, r_2, \ldots, r_t, \ldots,\) where \(r_t = p\) (\(p \in N\)) means that term numbered \(t\) in the reference string refers to disk page \(p\). In the following discussion, we will measure all time in terms of counts of successive page accesses in the reference string (\(t\) for \(r_t\)) rather than clock time.
Definition 2.1.
Backward \(K\)-distance \(b_t(p, K)\); LRU-K age of a page. Given a reference string known up to time \(t\), \(r_1, r_2, \ldots, r_t\), the backward \(K\)-distance \(b_t(p, K)\) is the distance backward in subscript from \(t\) to the \(K\)th most recent reference to the page \(p\):
\(b_t(p, K) = g\), if \(r_{t-g}\) has the value \(p\) in the page reference string \(r_1, r_2, \ldots, r_t\) and there have been exactly \(K - 1\) other positions \(i\) with \(t - g < i \leq t\), where \(r_i = p\);
\(b_t(p, K) = \infty\), if \(p\) does not appear at least \(K\) times in \(r_1, r_2, \ldots, r_t\).
We will often refer to \(b_t(p, K)\) as the LRU-K age of the page \(p\).
Definition 2.2.
LRU-K Algorithm. The LRU-K Algorithm specifies a page replacement policy when a buffer is needed for a new page being read in from disk: the page \(p\) to be dropped (i.e., selected as a replacement victim) is the one whose Backward \(K\)-distance, \(b_t(p, K)\), is the maximum of all pages in memory buffers. The only time the choice is ambiguous is when more than one page has \(b_t(p, K) = \infty\). In this case, a subsidiary policy may be used to select a replacement victim among the pages with infinite Backward \(K\)-distance. Pages referenced the smallest number of times would always be chosen first for page replacement, but within each group of pages referenced only \(L\) times, where \(L < K\), LRU-\(L\) could be employed as a subsidiary policy.
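That is the definition of LRU-K, and it is already quite clear. If the current timestamp is \(t\), the \(K\)-distance of a page \(P\) in memory is the difference between \(t\) and the timestamp of the \(K\)th most recent access to \(P\). If \(P\) appears fewer than \(K\) times in the reference string, its \(K\)-distance is infinite. In that case we can shrink \(K\) and fall back to LRU-\(L\), which works the same way; with \(L = 1\) it is plain LRU.
The figure below shows LRU-K worked through on an example.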
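The definition can also be checked in code. The sketch below is a minimal illustration, not the project's replacer: the names History, HistoryMap, RecordAccess, BackwardKDistance and PickVictim are hypothetical, timestamps are plain counters, and ties among frames with infinite distance are broken by the earliest recorded access, which is how I read the project's rule.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <limits>
#include <optional>
#include <unordered_map>

using frame_id_t = int;
constexpr std::size_t kInfinity = std::numeric_limits<std::size_t>::max();

// Hypothetical per-frame history: up to k timestamps, oldest first.
using History = std::deque<std::size_t>;
using HistoryMap = std::unordered_map<frame_id_t, History>;

// Record an access at logical time `now`, keeping only the k most recent stamps.
void RecordAccess(HistoryMap &frames, frame_id_t id, std::size_t k, std::size_t now) {
  assert(k > 0);
  History &h = frames[id];
  h.push_back(now);
  if (h.size() > k) {
    h.pop_front();
  }
}

// b_t(p, K): distance from `now` back to the K-th most recent access,
// or infinity when fewer than k accesses have been recorded.
std::size_t BackwardKDistance(const History &h, std::size_t k, std::size_t now) {
  if (h.size() < k) {
    return kInfinity;
  }
  return now - h[h.size() - k];
}

// Pick the frame with the largest backward K-distance. Among frames with
// equal (for example infinite) distance, prefer the earliest recorded access.
std::optional<frame_id_t> PickVictim(const HistoryMap &frames, std::size_t k, std::size_t now) {
  std::optional<frame_id_t> victim;
  std::size_t best_dist = 0;
  std::size_t best_oldest = kInfinity;
  for (const auto &[id, h] : frames) {
    if (h.empty()) {
      continue;
    }
    const std::size_t dist = BackwardKDistance(h, k, now);
    const std::size_t oldest = h.front();
    if (!victim || dist > best_dist || (dist == best_dist && oldest < best_oldest)) {
      victim = id;
      best_dist = dist;
      best_oldest = oldest;
    }
  }
  return victim;
}
```

With \(K = 2\) and the accesses 1, 2, 1, 3 at times 1 to 4, page 1 has distance 3 while pages 2 and 3 are both at infinity, so page 2 (the older of the two) is chosen first.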
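Notes on Task 1
- What is a frame? A frame is an abstract concept: think of it as an abstraction of one physical page in physical memory.
- LRUKReplacer manages which physical pages in memory can be replaced, and how. For example, marking a frame evictable means the physical page behind that frame can be replaced.
- Exactly when a frame becomes evictable is decided by buffer_pool_manager. A newly created page, for example, must be set to unevictable; once buffer_pool_manager sees that the number of threads using the page in that frame has dropped to 0, it sets the frame to evictable.
- The capacity of LRUKReplacer is the size of buffer_pool_manager, that is replacer_size_, while curr_size_ is the current size of LRUKReplacer.
- The current size of LRUKReplacer is the number of evictable frames; 0 means every frame is in use.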
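The relationship between replacer_size_ and curr_size_ can be sketched as follows. This is a simplified, single-threaded illustration under my own assumptions: EvictableTracker is a hypothetical class, it leaves out access history and locking, and only the two member names are borrowed from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

using frame_id_t = int;

// Hypothetical helper that only tracks which frames are evictable.
class EvictableTracker {
 public:
  explicit EvictableTracker(std::size_t capacity) : replacer_size_(capacity) {}

  // Flip the evictable flag of a frame and keep curr_size_ in sync.
  void SetEvictable(frame_id_t frame_id, bool evictable) {
    assert(frame_id >= 0 && static_cast<std::size_t>(frame_id) < replacer_size_);
    bool &flag = evictable_[frame_id];  // new entries start as false (unevictable)
    if (flag == evictable) {
      return;  // no state change, so the count must not move
    }
    flag = evictable;
    if (evictable) {
      ++curr_size_;
    } else {
      --curr_size_;
    }
  }

  // Number of evictable frames; 0 means every tracked frame is in use.
  std::size_t Size() const { return curr_size_; }

 private:
  std::size_t replacer_size_;  // capacity, equal to the buffer pool size
  std::size_t curr_size_ = 0;  // how many frames are currently evictable
  std::unordered_map<frame_id_t, bool> evictable_;
};
```

The early return matters: setting an already evictable frame to evictable again must leave Size() unchanged.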
Task 2 - Disk Scheduler
This component is responsible for scheduling read and write operations on the DiskManager. You will implement a new class called DiskScheduler in src/include/storage/disk/disk_scheduler.h and its corresponding implementation file in src/storage/disk/disk_scheduler.cpp.
The disk scheduler can be used by other components (in this case, your BufferPoolManager in Task #3) to queue disk requests (the disk scheduler exists to handle disk requests), represented by a DiskRequest struct (already defined in src/include/storage/disk/disk_scheduler.h). The disk scheduler will maintain a background worker thread which is responsible for processing scheduled requests.
The disk scheduler will utilize a shared queue (the shared queue is presumably a critical resource between threads) to schedule and process the DiskRequests. One thread will add a request to the queue, and the disk scheduler's background worker will process the queued requests. We have provided a Channel class in src/include/common/channel.h to facilitate the safe sharing of data between threads (worth reading to see exactly how thread safety is guaranteed here), but feel free to use your own implementation if you find it necessary.
The DiskScheduler constructor and destructor are already implemented and are responsible for creating and joining the background worker thread. You will only need to implement the following methods as defined in the header file (src/include/storage/disk/disk_scheduler.h) and in the source file (src/storage/disk/disk_scheduler.cpp):
Scheduler
Schedule(DiskRequest r): Schedules a request for the DiskManager to execute. The DiskRequest struct specifies whether the request is for a read/write, where the data should be written into/from, and the page ID for the operation. The DiskRequest also includes a std::promise whose value should be set to true once the request is processed.
DiskRequest is a struct that describes one request for data on disk and carries most of the information about that request. It also uses a std::promise to signal back that the request has been processed.
Worker thread
StartWorkerThread(): Start method for the background worker thread which processes the scheduled requests. The worker thread is created in the DiskScheduler constructor and calls this method (the scheduler's constructor starts this thread). This method is responsible for getting queued requests and dispatching them to the DiskManager (the work is handed off to the DiskManager). Remember to set the value on the DiskRequest's callback to signal to the request issuer that the request has been completed. This method should not return until the DiskScheduler's destructor is called.
Lastly, one of the fields of a DiskRequest is a std::promise. If you are unfamiliar with C++ promises and futures, you can check out their documentation. For the purposes of this project, they essentially provide a callback mechanism for a thread to know when their scheduled request is completed. To see an example of how they might be used, check out disk_scheduler_test.cpp.
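For readers new to promises and futures, here is a minimal sketch of the callback pattern described above. RunRequestOnWorker is a hypothetical function of mine and no real disk I/O happens; the point is only that get() blocks until another thread calls set_value().

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for "issue a request and wait for the worker".
bool RunRequestOnWorker() {
  std::promise<bool> done;
  std::future<bool> result = done.get_future();  // must be taken before the promise moves

  // The worker owns the promise and fulfils it once the work is finished.
  std::thread worker([p = std::move(done)]() mutable {
    // ... the disk read or write would happen here ...
    p.set_value(true);
  });

  const bool ok = result.get();  // blocks until set_value() has been called
  assert(ok);
  worker.join();
  return ok;
}
```

A future can be read with get() only once, which fits the one-request, one-notification use here.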
Again, the implementation details are up to you, but you must make sure that your implementation is thread-safe.
Disk Manager
The Disk Manager class (src/include/storage/disk/disk_manager.h) reads and writes the page data from and to the disk. Your disk scheduler will use DiskManager::ReadPage() and DiskManager::WritePage() when it is processing a read or write request.
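How the disk scheduler handles multithreading
This part looks simple and the overall flow is not complicated, but it was my first time dealing with multithreading in C++, so for me it turned out to be the hardest part. The figure below explains my understanding: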
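- The Disk Scheduler owns a worker thread. When a DiskScheduler object is constructed, the constructor starts the background_thread_ worker in the background. The worker's job is to take requests off the request queue, execute them, and then set the promise as a notification.
- Request queue: the disk I/O request queue has type Channel. Channel is built around a condition variable that blocks threads for synchronization. Get() calls cv_.wait(lk, [&]() { return !q_.empty(); }); so a thread that asks for a request while the queue is empty is blocked. Put() calls cv_.notify_all(); after writing a request into the queue, which wakes the threads blocked on the condition variable cv_ so that the waiting thread can continue.
- The benefit of using Channel for the request queue is that once the worker thread in StartWorkerThread() has drained the queue, it blocks automatically, and as soon as a new request is written to the queue it starts working again.
- The DiskScheduler destructor writes an empty request into the queue; when the worker thread reaches that empty request, it exits.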
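- We also need to be comfortable with promise and future. In DiskRequest, the promise acts as a signal that says whether the request has completed. When BufferPoolManager issues a disk request, the promise has not been fulfilled yet; the main thread obtains a std::future from it and blocks at read_result.get(); waiting for the result. The value of the promise is set by the DiskScheduler worker thread once it has processed the request. After that the main thread receives the value and continues.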
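The points above can be put together in a small sketch. SimpleChannel, ToyRequest and ToyScheduler are hypothetical names; the channel mirrors the wait/notify_all shape quoted above, and the use of std::optional with std::nullopt as the "empty request" is my assumption about how the shutdown signal is encoded. The worker doubles an integer instead of touching a disk.

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical blocking queue with the same Put()/Get() shape as described.
template <class T>
class SimpleChannel {
 public:
  void Put(T item) {
    std::unique_lock<std::mutex> lk(m_);
    q_.push(std::move(item));
    lk.unlock();
    cv_.notify_all();  // wake any thread blocked in Get()
  }

  T Get() {
    std::unique_lock<std::mutex> lk(m_);
    cv_.wait(lk, [&]() { return !q_.empty(); });  // sleeps while the queue is empty
    T item = std::move(q_.front());
    q_.pop();
    return item;
  }

 private:
  std::mutex m_;
  std::condition_variable cv_;
  std::queue<T> q_;
};

// Hypothetical request: an input value plus the promise used to report back.
struct ToyRequest {
  int value;
  std::promise<int> done;
};

class ToyScheduler {
 public:
  ToyScheduler() : worker_([this] { WorkerLoop(); }) {}

  ~ToyScheduler() {
    queue_.Put(std::nullopt);  // the empty request tells the worker to exit
    worker_.join();
  }

  std::future<int> Schedule(int value) {
    ToyRequest r{value, std::promise<int>()};
    std::future<int> f = r.done.get_future();
    queue_.Put(std::move(r));
    return f;
  }

 private:
  void WorkerLoop() {
    while (true) {
      std::optional<ToyRequest> r = queue_.Get();  // blocks until work arrives
      if (!r.has_value()) {
        return;
      }
      r->done.set_value(r->value * 2);  // "process" the request, then notify
    }
  }

  SimpleChannel<std::optional<ToyRequest>> queue_;  // declared before worker_ on purpose
  std::thread worker_;
};
```

The queue is declared before the thread so that it is fully constructed by the time the worker starts reading from it.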
Task 3 - Buffer Pool Manager
Next, implement the buffer pool manager (BufferPoolManager). The BufferPoolManager is responsible for fetching database pages from disk with the DiskScheduler and storing them in memory. The BufferPoolManager can also schedule writes of dirty pages out to disk when it is either explicitly instructed to do so or when it needs to evict a page to make space for a new page.
The official description is far better than anything I could write, so I will not repeat it in my own words.
To make sure that your implementation works correctly with the rest of the system, we will provide you with some functions already filled in. You will also not need to implement the code that actually reads and writes data to disk (this is called the DiskManager in our implementation). We will provide that functionality. You do, however, need to implement the DiskScheduler to process disk requests and dispatch them to the DiskManager (this is Task #2).
All in-memory pages in the system are represented by Page objects. The BufferPoolManager does not need to understand the contents of these pages. But it is important for you as the system developer to understand that Page objects are just containers for memory in the buffer pool and thus are not specific to a unique page. That is, each Page object contains a block of memory that the DiskManager will use as a location to copy the contents of a physical page that it reads from disk. The BufferPoolManager will reuse the same Page object to store data as it moves back and forth to disk. This means that the same Page object may contain a different physical page throughout the life of the system. The Page object's identifier (page_id) keeps track of what physical page it contains; if a Page object does not contain a physical page, then its page_id must be set to INVALID_PAGE_ID.
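As said earlier, a frame is an abstraction of a physical page; a Page, in turn, is the abstraction of the smallest unit of storage that BufferPoolManager manages. More concretely, Page is a class whose page_id attribute says which page on disk it holds and whose Page->data attribute gives the address in memory where that page is stored. So the pages managed by BufferPoolManager relate to both the disk and memory. The actual work of moving a page between its location on disk and its location in memory is done by the DiskManager.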
Each Page object also maintains a counter for the number of threads that have "pinned" that page. Your BufferPoolManager is not allowed to free a Page that is pinned. Each Page object also keeps track of whether it is dirty or not. It is your job to record whether a page was modified before it is unpinned. Your BufferPoolManager must write the contents of a dirty Page back to disk before that object can be reused.
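BufferPoolManager also uses this counter to track how many threads are currently using the page, and it maintains the dirty flag, which records whether the page has been modified and therefore whether it must be written back to disk when it is replaced.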
Your BufferPoolManager implementation will use the LRUKReplacer and DiskScheduler classes that you created in the previous steps of this assignment. The LRUKReplacer will keep track of when Page objects are accessed so that it can decide which one to evict when it must free a frame to make room for copying a new physical page from disk. When mapping page_id to frame_id in the BufferPoolManager, again be warned that STL containers are not thread-safe. The DiskScheduler will schedule writes and reads to disk on the DiskManager.
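In effect, BufferPoolManager contains an LRUKReplacer and a DiskScheduler: the former manages page replacement in physical memory, and the latter talks to the disk and handles I/O requests. I drew my understanding as the figure below: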
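- Bringing physical pages under management: when BufferPoolManager is initialized, it allocates one contiguous block, pages_ = new Page[pool_size_];, to hold its page information. The pages_ array is an abstraction of the physical memory that BufferPoolManager uses rather than that memory itself, because each page's data actually lives at the address page->data points to.
- The physical memory BufferPoolManager can actually use is pool_size * BUSTUB_PAGE_SIZE, where BUSTUB_PAGE_SIZE is the real size of one page in physical memory.
- The physical memory managed by BufferPoolManager falls into two parts: used and unused.
- The used part is managed through page_table_: an entry means the page holds a page read from disk or a page that is currently in use. The unused part is represented by free_list_; those frames are available, and once one is used it comes under page_table_.
- My figure shows the relationship between pages and frame_id. A frame_id can be read as the index of a physical page within the part BufferPoolManager uses. Since the size of the pages array equals the number of physical pages BufferPoolManager has, the indices of the pages array and the frame_id values correspond one to one.
- LRUKReplacer only acts on physical pages that are already under management. When we call NewPage() to use and register a page, LRUKReplacer starts tracking that page.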
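The bookkeeping in this list can be sketched in a few lines. ToyPage and ToyPool are hypothetical, single-threaded stand-ins: they keep only a pages_ array indexed by frame_id, a free_list_ of unused frames, and a page_table_ from page_id to frame_id. Eviction, pinning and the data buffers are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <unordered_map>
#include <vector>

using frame_id_t = int;
using page_id_t = int;
constexpr page_id_t INVALID_PAGE_ID = -1;

// Hypothetical page header; a real Page also owns a data buffer.
struct ToyPage {
  page_id_t page_id = INVALID_PAGE_ID;
};

class ToyPool {
 public:
  explicit ToyPool(std::size_t pool_size) : pages_(pool_size) {
    // Initially every frame is unused, so every frame_id sits in the free list.
    for (std::size_t i = 0; i < pool_size; ++i) {
      free_list_.push_back(static_cast<frame_id_t>(i));
    }
  }

  // Move one frame from the free list into the page table for page_id.
  std::optional<frame_id_t> Install(page_id_t page_id) {
    assert(page_table_.count(page_id) == 0);
    if (free_list_.empty()) {
      return std::nullopt;  // a real pool would ask the replacer for a victim here
    }
    const frame_id_t frame_id = free_list_.front();
    free_list_.pop_front();
    pages_[static_cast<std::size_t>(frame_id)].page_id = page_id;  // array index == frame_id
    page_table_[page_id] = frame_id;
    return frame_id;
  }

  // Look a page up through the page table; nullptr when it is not resident.
  ToyPage *Find(page_id_t page_id) {
    auto it = page_table_.find(page_id);
    if (it == page_table_.end()) {
      return nullptr;
    }
    return &pages_[static_cast<std::size_t>(it->second)];
  }

  std::size_t FreeFrames() const { return free_list_.size(); }

 private:
  std::vector<ToyPage> pages_;
  std::list<frame_id_t> free_list_;
  std::unordered_map<page_id_t, frame_id_t> page_table_;
};
```

At any moment each frame_id is either in free_list_ or the target of exactly one page_table_ entry, which is the invariant the two structures maintain between them.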
You will need to implement the following functions defined in the header file (src/include/buffer/buffer_pool_manager.h) and in the source file (src/buffer/buffer_pool_manager.cpp):
The functions below are the core of BufferPoolManager and are what Project 1 asks you to complete.
FetchPage(page_id_t page_id)
UnpinPage(page_id_t page_id, bool is_dirty)
FlushPage(page_id_t page_id)
NewPage(page_id_t* page_id)
DeletePage(page_id_t page_id)
FlushAllPages()
For FetchPage, you should return nullptr if no page is available in the free list and all other pages are currently pinned. FlushPage should flush a page regardless of its pin status.
For UnpinPage, the is_dirty parameter keeps track of whether a page was modified while it was pinned.
The AllocatePage private method provides the BufferPoolManager a unique new page_id when you want to create a new page in NewPage(). On the other hand, the DeallocatePage() method is a no-op that imitates freeing a page on the disk and you should call this in your DeletePage() implementation.
The comments in the code already describe these functions very clearly. There are many details to watch, but with enough care none of them should go wrong.
Some bugs I hit along the way
Project 1 really took me back to writing an operating system: lots of small but fatal bugs.
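- LRU-K: when a frame has been accessed fewer than K times, LRU is used instead, but this LRU compares the earliest recorded access time, not the most recent one.
- In RecordAccess, if the frame has never been seen before (first access), it has to be brought under management by adding it to node_store_.
- In buffer_pool_manager, after deleting a page the page must also be removed from the page table. I was not careful enough here.
- When NewPage replaces a page, page_table_.erase(new_fetch_page->page_id_); is needed whether or not the page is dirty. I had it in the right place in FetchPage but put it in the wrong place in NewPage.
- In FetchPage, if the page is already in memory it is returned directly, but an access to the page still has to be recorded.
- In FetchPage, I put the following check
// If the size of replacer_ is 0, no page in memory can be replaced, so nothing can be read in from disk
if (this->replacer_->Size() == 0) {
  return nullptr;
}
at the very beginning. If the page is already in memory at that point, the call fails even though it should succeed.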
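- In FetchPage and NewPage, in the following code:
if (!free_list_.empty()) {
  // Pop the first element of the list
  auto frame_id = free_list_.front();
  free_list_.pop_front();
  // Assign this frame to the page; the frame is not evictable, so pin it
  fetch_page = pages_ + frame_id;
}
I was careless: frame_id had already been declared earlier, and I accidentally declared it again here, so the outer variable never received the value. It took me ages to track this bug down.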
- In UnpinPage, if a page's dirty flag is already true, it must not be set back to false. Writing unpin_page->is_dirty_ = is_dirty; directly causes a page that was already marked dirty to be reset to false.
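Three of the bugs above are easy to reproduce in miniature. The sketch below uses hypothetical names throughout (FetchPlan, PlanFetch, TakeFrameBuggy, TakeFrameFixed, ToyPage, Unpin) and is not the project code: it shows the order of checks I would expect in a fetch, the shadowed frame_id, and a dirty flag that is only ever turned on by an unpin.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

using frame_id_t = int;

// (1) Order of checks: residency first, "nothing available" last.
enum class FetchPlan { kReturnResident, kUseFreeFrame, kEvictVictim, kFail };

FetchPlan PlanFetch(bool resident, std::size_t free_frames, std::size_t evictable_frames) {
  if (resident) {
    return FetchPlan::kReturnResident;  // never fail for a page already in memory
  }
  if (free_frames > 0) {
    return FetchPlan::kUseFreeFrame;
  }
  if (evictable_frames > 0) {
    return FetchPlan::kEvictVictim;
  }
  return FetchPlan::kFail;
}

// (2) Shadowing: the inner declaration creates a second variable.
frame_id_t TakeFrameBuggy(std::list<frame_id_t> &free_list) {
  frame_id_t frame_id = -1;
  if (!free_list.empty()) {
    auto frame_id = free_list.front();  // BUG: shadows the outer frame_id
    free_list.pop_front();
    assert(frame_id >= 0);  // the inner variable looks fine here...
  }
  return frame_id;  // ...but the outer one is still -1
}

frame_id_t TakeFrameFixed(std::list<frame_id_t> &free_list) {
  frame_id_t frame_id = -1;
  if (!free_list.empty()) {
    frame_id = free_list.front();  // assignment, not a new declaration
    free_list.pop_front();
  }
  return frame_id;
}

// (3) Dirty flag: an unpin may set it, but must never clear it.
struct ToyPage {
  int pin_count = 0;
  bool is_dirty = false;
};

bool Unpin(ToyPage &page, bool is_dirty) {
  if (page.pin_count <= 0) {
    return false;  // nothing to unpin
  }
  --page.pin_count;
  page.is_dirty = page.is_dirty || is_dirty;
  return true;
}
```

GCC and Clang can report the second problem at compile time when -Wshadow is enabled, which would have saved me the debugging session.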
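Summary
- Project 0 got me familiar with some C++ syntax, but clearly not enough. Project 1 uses far more of the language, especially how multithreading is implemented and handled. I learned quite a lot.
- Having studied operating systems before, I knew page-table mechanisms and the implementation of MIT's small JOS operating system, so the operations in BufferPoolManager felt fairly familiar. Even so I underestimated it. What matters most is being careful, really careful.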