PAT 2020 Spring 7-4 Replacement Selection (30 points)
When the input is much too large to fit into memory, we have to do external sorting instead of internal sorting. One of the key steps in external sorting is to generate sets of sorted records (also called runs) with limited internal memory. The simplest method is to read as many records as possible into the memory, and sort them internally, then write the resulting run back to some tape. The size of each run is the same as the capacity of the internal memory.
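The simplest method described above can be sketched as follows. This is only an illustration: the function name `naiveRuns` is hypothetical, and a `std::vector` stands in for both the input tape and the output tape.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the problem statement): cut the input into
// consecutive chunks of at most m records and sort each chunk internally.
std::vector<std::vector<int>> naiveRuns(const std::vector<int>& records, std::size_t m) {
    assert(m > 0);  // memory must hold at least one record
    std::vector<std::vector<int>> runs;
    for (std::size_t start = 0; start < records.size(); start += m) {
        std::size_t end = std::min(start + m, records.size());
        // "Read" one memory-load of records, then sort it.
        std::vector<int> run(records.begin() + start, records.begin() + end);
        std::sort(run.begin(), run.end());
        runs.push_back(run);  // "write" the sorted run out
    }
    return runs;
}
```

With this approach every run except possibly the last holds exactly m records, which is the baseline that replacement selection tries to improve on.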
Replacement Selection sorting algorithm was described in 1965 by Donald Knuth. Notice that as soon as the first record is written to an output tape, the memory it used becomes available for another record. Assume that we are sorting in ascending order, if the next record is not smaller than the record we have just output, then it can be included in the run.
For example, suppose that we have a set of input { 81, 94, 11, 96, 12, 99, 35 }, and our memory can sort 3 records only. By the simplest method we will obtain three runs: { 11, 81, 94 }, { 12, 96, 99 } and { 35 }. According to the replacement selection algorithm, we would read and sort the first 3 records { 81, 94, 11 } and output 11 as the smallest one. Then one space is available so 96 is read in and will join the first run since it is larger than 11. Now we have { 81, 94, 96 }. After 81 is out, 12 comes in but it must belong to the next run since it is smaller than 81. Hence we have { 94, 96, 12 } where 12 will stay since it belongs to the next run. When 94 is out and 99 is in, since 99 is larger than 94, it must belong to the first run. Eventually we will obtain two runs: the first one contains { 11, 81, 94, 96, 99 } and the second one contains { 12, 35 }.
Your job is to implement this replacement selection algorithm.
Input Specification:
Each input file contains several test cases. The first line gives two positive integers N (≤10^5) and M (<N/2), which are the total number of records to be sorted, and the capacity of the internal memory. Then N numbers are given in the next line, all in the range of int. All the numbers in a line are separated by a space.
Output Specification:
For each test case, print in each line a run (in ascending order) generated by the replacement selection algorithm. All the numbers in a line must be separated by exactly 1 space, and there must be no extra space at the beginning or the end of the line.
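One way to satisfy the spacing rule is to build each output line before printing it. The helper below is a small sketch; the name `formatRun` is made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: join one run with single spaces,
// with no space before the first number or after the last one.
std::string formatRun(const std::vector<int>& run) {
    std::string line;
    for (std::size_t i = 0; i < run.size(); ++i) {
        if (i > 0) line += ' ';  // separator goes only between numbers
        line += std::to_string(run[i]);
    }
    return line;
}
```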
Sample Input:
13 3
81 94 11 96 12 99 17 35 28 58 41 75 15
Sample Output:
11 81 94 96 99
12 17 28 35 41 58 75
15
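Approach:
This problem is a simulation of replacement selection. Let's first go over what the problem asks, using the sample input as an example.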
81 94 11 96 12 99 17 35 28 58 41 75 15
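There are 13 records to sort, but memory can hold only 3 records at a time, so the data has to be sorted externally. Each round of the process produces one sorted run (an initial merge segment).
Memory first holds 81 94 11. The minimum is 11, so it goes straight to the output sequence, and 96 is read into the freed slot; the minimum in memory is now 81. The next step is the key one: after 81 is output, 12 is read in. The last value in the output sequence is 81 and 12 is smaller than 81, so 12 cannot join the current run. It belongs to the second run and simply stays in memory. Memory now holds {94, 96, 12}. Since 12 may not be output yet, the smallest record other than 12, namely 94, is output, and 99 is read in.
This continues until memory holds {12, 17, 35}. At that point every record in memory is smaller than the last value output (99), so none of them can extend the current run. The current output sequence is printed as the first run, the records in memory become the start of the second run, and the same procedure repeats.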
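The walk-through can be mirrored almost literally in code. The sketch below is not the accepted solution: it uses a plain array of memory slots with a "frozen" flag and a linear scan for the minimum, which is easy to follow but takes O(N·M) time. The names `Slot` and `traceRuns` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One memory slot: the record it holds and whether that record is frozen,
// i.e. reserved for the next run.
struct Slot {
    int value;
    bool frozen;
};

std::vector<std::vector<int>> traceRuns(const std::vector<int>& in, std::size_t m) {
    assert(m > 0);
    std::vector<Slot> mem;
    std::size_t next = 0;
    while (next < in.size() && mem.size() < m) mem.push_back(Slot{in[next++], false});

    std::vector<std::vector<int>> runs;
    std::vector<int> cur;  // the run currently being written
    while (!mem.empty()) {
        // Find the smallest record that may still join the current run.
        std::size_t best = mem.size();  // mem.size() means "none found"
        for (std::size_t i = 0; i < mem.size(); ++i)
            if (!mem[i].frozen && (best == mem.size() || mem[i].value < mem[best].value))
                best = i;
        if (best == mem.size()) {
            // Every record in memory is frozen: close this run, start the next.
            runs.push_back(cur);
            cur.clear();
            for (Slot& s : mem) s.frozen = false;
            continue;
        }
        int out = mem[best].value;
        cur.push_back(out);
        if (next < in.size()) {
            // Refill the freed slot; freeze the newcomer if it is smaller than out.
            int v = in[next++];
            mem[best] = Slot{v, v < out};
        } else {
            // Input exhausted: the slot simply disappears.
            mem[best] = mem.back();
            mem.pop_back();
        }
    }
    if (!cur.empty()) runs.push_back(cur);
    return runs;
}
```

Calling `traceRuns` on the 13 sample records with m = 3 returns the three runs shown in the sample output.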
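Concretely, a priority queue serves as the memory buffer. It is a min-heap, so each pop yields the smallest record. The first M records are pushed in, and then records are popped in a loop. After each pop, the next unread record (starting from the (M+1)-th) is examined: if it is not smaller than the value just output, it is pushed into the priority queue and will join the current run; otherwise it is saved in a holding array for the next run. When the queue becomes empty, one of two things has happened: either all records have been processed, or memory is full of held records, none of which can join the current run. In both cases the current run is printed, and the held records are moved into the priority queue to start the next run.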
Accepted code:
#include <cstdio>
#include <functional>
#include <queue>
#include <vector>
using namespace std;

int main() {
    int n, m;
    if (scanf("%d %d", &n, &m) != 2) return 0;
    vector<int> sq(n), temp;  // sq: all input records; temp: records held for the next run
    for (int i = 0; i < n; i++) scanf("%d", &sq[i]);
    // Min-heap acting as the internal memory buffer.
    priority_queue<int, vector<int>, greater<int>> q;
    int cnt = 0, idx = 0;
    while (idx < m && idx < n) q.push(sq[idx++]);  // fill memory with the first m records
    vector<int> out;  // the run currently being built
    while (cnt != n) {
        int minV = q.top();  // smallest record in memory
        q.pop();
        out.push_back(minV);
        cnt++;
        if (idx < n) {
            // A record not smaller than the one just output can join the current run.
            if (sq[idx] >= minV) q.push(sq[idx++]);
            else temp.push_back(sq[idx++]);  // otherwise hold it for the next run
        }
        if (q.empty()) {
            // Nothing left for the current run: print it, then start the next
            // run from the held records.
            for (size_t i = 0; i < out.size(); i++) {
                if (i) printf(" ");
                printf("%d", out[i]);
            }
            printf("\n");
            out.clear();
            for (int x : temp) q.push(x);
            temp.clear();
        }
    }
    return 0;
}
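As an alternative to keeping held records in a separate array, a common variant tags each record with the index of the run it belongs to and stores (run index, value) pairs in a single min-heap. Records of the current run then always come out before records of the next one. The sketch below is an illustration of that variant, not the code submitted above; the function name `replacementSelection` is an assumption, and it returns the runs instead of printing them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

std::vector<std::vector<int>> replacementSelection(const std::vector<int>& in, std::size_t m) {
    assert(m > 0);
    using Item = std::pair<int, int>;  // (run index, value), compared in that order
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;
    std::size_t next = 0;
    while (next < in.size() && next < m) heap.push(Item(0, in[next++]));

    std::vector<std::vector<int>> runs;
    while (!heap.empty()) {
        Item top = heap.top();
        heap.pop();
        int run = top.first, value = top.second;
        // The first record popped with a new run index opens that run.
        if (static_cast<std::size_t>(run) == runs.size()) runs.emplace_back();
        runs[run].push_back(value);
        if (next < in.size()) {
            int v = in[next++];
            // Not smaller than the record just output: same run; otherwise next run.
            heap.push(Item(v >= value ? run : run + 1, v));
        }
    }
    return runs;
}
```

Because the pair comparison looks at the run index first, no explicit "queue is empty" check is needed to detect the end of a run; a run ends when the first record tagged with the next index reaches the top of the heap.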