利用堆实现堆排序&优先队列

数据结构之（二叉）堆一文在末尾提到“利用堆能够实现：堆排序、优先队列。”。本文代码实现之。

1、堆排序

如果要实现非递减排序。则须要用要大顶堆。

此处设计到三个大顶堆的操作：（1）自顶向下调整操作：MaxHeapify（相应堆的SiftDown操作）、（2）利用数组建立大顶堆：BuildMaxHeap、（3）不断交换堆顶元素（堆的最大元素）和堆的末尾元素，实现非递减排序。

以下是详细的实现代码：

//已知L[i,...,n)除L[i]之外均满足大顶堆的定义,本函数向下调整L[i]
//使得在具有n个结点的堆中,以i为下标的根节点的子树又一次遵循最大堆的性质
//n为节点总数，从0開始计算。i节点的子节点为 2*i+1, 2*i+2
void MaxHeapify(int L[], int i, int n)
{
	int j, tmp;
	j = 2 * i + 1;	//父节点i的左孩子结点
	tmp = L[i];
	while (j < n)
	{
		if (j + 1 < n && L[j + 1] > L[j])//找左右孩子中最大的
			j++;
		if (L[j] <= tmp)   //	父结点tmp=L[i]比孩子结点L[j]大
			break;
		L[i] = L[j];	//把较大的子结点往上移动,替换它的父结点
		i = j;
		j = 2 * i + 1;
	}
	L[i] = tmp;
}

//子数组L[n/2+1,..,n)中的元素都是树的叶子结点,每一个叶节点都能够看成仅仅包括一个元素的堆。
//该函数对树中的其余结点都调用一次MaxHeapify()
void BuildMaxHeap(int L[], int n)
{
	for (int i = n / 2 - 1; i >= 0; i--) //注意是从最后一个结点(n-1)的父节点((n-1)-1)/2 = (n-2)/2 = n/2-1 開始
		MaxHeapify(L, i, n);
}

void MaxHeapSort(int L[], int n)
{
	BuildMaxHeap(L, n); //先建立大顶堆
	for (int i = n - 1 ; i > 0; i--)   //从最后一个元素起
	{
		//L[0]永远是大顶堆堆L[0,i]的堆顶元素,最大值
		std::swap(L[0], L[i]);	  

		//将L[0]和最末尾元素交换后：最大的元素如今位于数组末尾,堆的结点个数减一
		//但L[0..i)由于改动了L[0]可能会破坏堆的性质,因此须要调整L[0,...,i)使之依旧满足最大堆：
		//已知L[0,...,i)除L[0]之外均满足大顶堆的定义,本函数向下调整L[0]
		//使得在具有i个结点的堆中,以0为下标的根节点的子树又一次遵循最大堆的性质
		MaxHeapify(L, 0, i);
	}
}

2、利用堆实现优先队列

优先队列分为最大优先队列和最小优先队列，分别借助于大顶堆和小顶堆。

优先队列有以下基本操作：（1）提取队列中的最大（小）元素；（2）提取队列中的最大（小）元素并从队列中删除；（3）将队列中元素为x的keyword降低（增大）到k，这里如果k的值不大（小）于x的原关键值。

其它的还包含如插入、删除操作。这些操作大多调用SiftDown、SiftUp操作实现。以下是详细实现代码：

（1）最大优先队列

#pragma once
#include "maxHeap.h"

template<class T>
class MaxPriQueue : public MaxHeap<T>
{
public:
	MaxPriQueue(const int nmax = 20) : MaxHeap(nmax) {}
	~MaxPriQueue(){}

	T maximum() const;	//返回具有最大键值的元素
	T ExtractMax();	//去掉并返回最大键值的元素
	void InCreaseKey(const T &x, const T &k);	//将元素x的keyword添加到k,这里如果k的值不小于x的原关键值
	virtual void Insert(const T &key);//算法导论第三版p92的方法;也能够调用继承来的Insert方法
};

template<class T>
T MaxPriQueue<T>::maximum()const
{
	if (size > 0)
		return arr[0];
	return T(0);
}

template<class T>
T MaxPriQueue<T>::ExtractMax()
{
	if (size  < 0)
		return T(0);
	T max  = arr[0];
	arr[0] = arr[--size];//最末尾的值补上去,同一时候size减1
	SiftDown(0);	  //向下调整
	return max;
}

template<class T>
void MaxPriQueue<T>::InCreaseKey(const T &x, const T &k)
{
	if (k < x)		//如果k的值不小于x的原关键值
		return;
	int pos = Search(x);
	if (pos == -1)
		return;
	arr[pos] = k;
	SiftUp(pos);	//向上调整
}
template<class T>
void MaxPriQueue<T>::Insert(const T &key)
{
	arr[size++] =  -INFINITY;    //首先添加一个-∞的叶节点
	InCreaseKey(-INFINITY, key); //将该叶节点的值增大至key
}

（2）最小优先队列

#pragma once
#include "minHeap.h"

template<class T>
class MinPriQueue : public MinHeap<T>
{
public:
	MinPriQueue(const int nmax = 20) : MinHeap(nmax) {}
	~MinPriQueue(){}

	T minimum() const;	//返回具有最小键值的元素
	T ExtractMin();	//去掉并返回最小键值的元素
	void DeCreaseKey(const T &x, const T &k);	//将元素x的keyword降低到k,这里如果k的值不大于x的原关键值
	virtual void Insert(const T &key);//算法导论第三版p92的方法;也能够调用继承来的Insert方法
};

template<class T>
T MinPriQueue<T>::minimum()const
{
	if (size > 0)
		return arr[0];
	return T(0);
}

template<class T>
T MinPriQueue<T>::ExtractMin()
{
	if (size  < 0)
		return T(0);
	T min  = arr[0];
	arr[0] = arr[--size];//最末尾的值补上去,同一时候size减1
	SiftDown(0);	  //向下调整
	return min;
}

template<class T>
void MinPriQueue<T>::DeCreaseKey(const T &x, const T &k)
{
	if (k >= x)		//如果k的值不大于x的原关键值
		return;
	int pos = Search(x);
	if (pos == -1)
		return;
	arr[pos] = k;
	SiftUp(pos);	//向上调整
}
template<class T>
void MinPriQueue<T>::Insert(const T &key)
{
	arr[size++] =  INFINITY;    //首先添加一个∞的叶节点
	DeCreaseKey(INFINITY, key); //将该叶节点的值降低至key
}

測试代码：

#include "max_priqueue.h"
#include "min_priqueue.h"
#include <iostream>
using namespace std;

int main()
{
	int a[12] = {15, 13, 9, 5, 12, 8, 7, 4, 0, 6, 2, 1};
	int b[12]; memcpy(b, a, sizeof(a));

	cout << "原始数组元素: ";
	for (int i = 0; i < 12; i++)
		cout << a[i]  << " ";
	cout << endl;

	MaxPriQueue<int> maxpriqueue(20);
	MinPriQueue<int> minpriqueue(20);
	for (int i = 0; i < 12; i++)
		maxpriqueue.Insert(a[i]);	 //向上调整

	cout << "\n最大优先队列元素为: "; maxpriqueue.Print();
	cout << "max = " << maxpriqueue.maximum() << endl;
	cout << "删掉最大元素 " << maxpriqueue.ExtractMax() << "后 :" ; maxpriqueue.Print();
	cout << "将元素 12 添加到 23 后："; maxpriqueue.InCreaseKey(12, 23);maxpriqueue.Print();
	cout << "删掉元素4后： "; maxpriqueue.Delete(4); maxpriqueue.Print();


	for (int i = 0; i < 12; i++)
		minpriqueue.Insert(a[i]);	 //向上调整
	cout << "\n最小优先队列元素为: "; minpriqueue.Print();
	cout << "min = " << minpriqueue.minimum() << endl;
	cout << "删掉最小元素 " << minpriqueue.ExtractMin() << "后 :" ; minpriqueue.Print();
	cout << "将元素 12 减小到 3 后："; minpriqueue.DeCreaseKey(12, 23);minpriqueue.Print();
	cout << "删掉元素4后： "; minpriqueue.Delete(4); maxpriqueue.Print();

	getchar();
	return 0;
}

执行截图：

ps：代码中的继承的类的代码能够在数据结构之（二叉）堆中查看；

向量排序算法：不断将无序数组中的元素插入最小优先队列中，构建完成后。再调用n次ExtractMin()操作。也能够实现非递减排序，可是要花费O(n)的空间复杂度。（编程珠玑：p148）

3、STL中的堆（heap）

STL中与堆相关的4个函数——建立堆make_heap()、在堆中加入数据push_heap()、在堆中删除数据pop_heap()和堆排序sort_heap()。

头文件 #include <algorithm>

以下的_First与_Last为能够随机訪问的迭代器（指针），_Comp为比較函数（仿函数）。其规则——假设函数的第一个參数小于第二个參数应返回true，否则返回false。

（1）建立堆

make_heap(_First, _Last, _Comp)

默认是建立大顶堆的（less<T>()）。对int类型。能够在第三个參数传入greater<int>()得到最小堆。最小堆的时候的push_heap()、pop_heap()、sort_heap()均要显示地将greater<int>()作为第三个參数传入。

（2）在堆中加入数据

push_heap (_First, _Last)

要先在容器中加入数据：push_back(x)操作；再调用push_heap ()：将最末尾新加入的元素向上调整使之整个容器数组符合堆性质。

（3）在堆中删除数据

pop_heap(_First, _Last)

要先调用pop_heap()操作：将堆顶元素和堆的最末尾元素交换，堆大小减去1。并向下调整堆顶元素使得新的堆满足最大堆的性质；再在容器中删除数据：pop_back()操作。每次删除的是堆顶的元素

（4）堆排序

sort_heap(_First, _Last)

排序之后就不再是一个合法的heap了。

STL中heap使用演示样例代码：

#include <iostream>
#include <vector>
#include <algorithm>
#include <functional>	 //for function object like greater<int>()
#include <ctime>
using namespace std;

void print(vector<int> &vet)
{
	for (vector<int>::const_iterator iter = vet.begin(); iter != vet.end(); iter++)
		cout << *iter << " ";
	cout << endl;
}
int main()
{
	const int MAXN = 10;
	int a[MAXN];
	srand((unsigned)time(0));
	for (int i = 0; i < MAXN; ++i)
		a[i] = rand() % (MAXN * 2);

	//动态申请vector，并对vector建堆
	vector<int> *pvet = new vector<int>(20);
	pvet->assign(a, a + MAXN);

	cout << "\n无序数组:\t\t";print(*pvet);

	//默认建大顶堆
	make_heap(pvet->begin(), pvet->end());
	cout << "大顶堆:\t\t\t";print(*pvet);

	//加入数据
	pvet->push_back(25);	 //先在容器中加入
	cout << "\n容器push_back(25)后:\t";print(*pvet);
	push_heap(pvet->begin(), pvet->end());	//再调用push_heap()
	cout << "堆push_heap()操作后:\t";print(*pvet);

	//删除数据
	pop_heap(pvet->begin(), pvet->end());  // 先调用pop_heap()
	cout << "\n堆pop_heap()操作后:\t";print(*pvet);
	pvet->pop_back();		   //再在容器中删除
	cout << "容器pop_back()后:\t";print(*pvet);
	pop_heap(pvet->begin(), pvet->end());
	cout << "\n堆pop_heap()操作后:\t";print(*pvet);
	pvet->pop_back();
	cout << "容器pop_back()后:\t";print(*pvet);

	//堆排序
	sort_heap(pvet->begin(), pvet->end());
	cout << "\n堆排序结果:\t\t";print(*pvet);

	delete pvet;
	return 0;
}

演示样例截图：

4、利用堆求解top_k

编写算法。从10亿个浮点数其中，选出其中最大的10000个。

//典型的Top K问题，用堆是最典型的思路。

//建10000个数的小顶堆，然后将10亿个数依次读取，大于堆顶，则替换堆顶。做一次堆调整。
//结束之后，小顶堆中存放的数即为所求。
代码例如以下(为了方便，这里直接使用了STL容器)：

void top_k()
{
	const int nmax = 1000000000;	  //10亿个数
	vector<float>bigs(10000, 0);

	//init vector array
	srand((unsigned)time(0));
	for (vector<float>::iterator iter = bigs.begin(); iter != bigs.end(); iter++)
		*iter = (float)rand() / 7;	//random values;
	//cout << bigs.size() << endl;

	make_heap(bigs.begin(), bigs.end(), greater<float>());//little heap, the first one is the smallest one!

	float f;
	for (int i = 0; i < nmax; i++)
	{
		srand((unsigned)time(0));
		f = (float)rand() / 7;
		if (f > bigs.front())//replace the first element?

		{
			//set the smallest one to the end!
			pop_heap(bigs.begin(), bigs.end(), greater<float>());
			//remove the smallest one
			bigs.pop_back();
			//add to the last one
			bigs.push_back(f);
			//make the heap again, the first element is still the smallest one
			push_heap(bigs.begin(), bigs.end(), greater<float>());
		}
	}
	//sort by ascent
	sort_heap(bigs.begin(), bigs.end(), greater<float>());
}

參考资料：数据结构p297-p283、算法导论第三版：p90-p92、编程珠玑：p145-p148

STL系列之四 heap 堆

posted @ 2017-08-20 12:37 wzjhoutai 阅读(263) 评论(0) 编辑收藏举报

刷新页面返回顶部

利用堆实现堆排序&amp;优先队列