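Implementing a Disjoint-Set (Union-Find)

1. Overview

A disjoint-set (also called a union-find set) is a tree-based data structure, commonly used to handle merge and query problems on disjoint sets.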

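2. Basic operations

The disjoint-set is a very simple data structure. It exists mainly to support the following two frequent operations:

A. Merge two disjoint sets

B. Determine whether two elements belong to the same set (performed frequently)

(1) Merging two disjoint sets

Merging is simple. First set up an array Father[x] that stores the index of x's "father". To merge two disjoint sets, find the oldest ancestor (the root) of one set, then make it the father of the other set's oldest ancestor.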

The figure above shows two disjoint sets; figure (b) shows the result after merging, with Father(b) := Father(g).

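(2) Determining whether two elements belong to the same set

This operation reduces to checking whether the two elements have the same oldest ancestor, which can be found recursively.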
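The two basic operations described above can be sketched as follows, with no optimization applied yet. This is a minimal illustration, not the program from this post: the struct name BasicDisjointSet and its members find_root, same_set and merge are assumed names chosen for the example.

```cpp
#include <vector>

// Unoptimized sketch of the two basic operations.
// All names here are illustrative assumptions.
struct BasicDisjointSet {
    // father[x] == x means x is a root (the oldest ancestor of its set).
    std::vector<int> father;

    explicit BasicDisjointSet(int n) : father(n) {
        for (int i = 0; i < n; ++i) father[i] = i;  // every element starts alone
    }

    // Follow father[] recursively until a root is reached.
    int find_root(int x) const {
        return father[x] == x ? x : find_root(father[x]);
    }

    // Operation B: two elements share a set iff they share a root.
    bool same_set(int x, int y) const {
        return find_root(x) == find_root(y);
    }

    // Operation A: make one root the father of the other root.
    void merge(int x, int y) {
        int rx = find_root(x);
        int ry = find_root(y);
        if (rx != ry) father[ry] = rx;
    }
};
```

Note that merge always links roots, never the elements themselves; linking x and y directly would detach y from the rest of its old set.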

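3. Optimizations

(1) Path compression

When looking for an ancestor we usually search recursively, but when there are many elements, or the whole tree degenerates into a chain, each Find_Set(x) costs O(n). To avoid this we compress the path: once the recursion has reached the ancestor node, on the way back we point every node visited along the path directly at the ancestor. A later Find_Set(x) on any of those nodes then takes O(1), as long as that ancestor has not since been attached under another root, as the figure below shows. Path compression thus makes later lookups cheaper.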
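A small sketch of this idea, assuming the parent table is a std::vector<int> in which a root is its own father. The function names find_compress and depth are illustrative; depth exists only to make the effect of compression measurable.

```cpp
#include <vector>

// Recursive find with path compression: while the recursion unwinds,
// every node on the path from x to the root is re-pointed at the root.
int find_compress(std::vector<int>& father, int x) {
    if (father[x] != x) {
        father[x] = find_compress(father, father[x]);
    }
    return father[x];
}

// Number of father[] links between x and its root (does not modify anything).
int depth(const std::vector<int>& father, int x) {
    int d = 0;
    while (father[x] != x) {
        x = father[x];
        ++d;
    }
    return d;
}
```

For a chain 4 -> 3 -> 2 -> 1 -> 0, depth reports 4 for element 4 before the call to find_compress and 1 afterwards, and the same holds for every other node that was on that path.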

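(2) Union by rank when merging

That is, when merging, the set with fewer elements is merged into the set with more elements, so the resulting tree tends to be shorter. Strictly speaking, comparing element counts is usually called union by size, while union by rank compares an upper bound on tree height; both keep the trees shallow.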
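The rule of merging the smaller set into the larger one might look like the sketch below. It tracks element counts, so it is the union-by-size variant; the struct name SizedDisjointSet and the member set_size are assumptions made for this example, and path compression is left out to keep the focus on the merge rule.

```cpp
#include <utility>
#include <vector>

// Sketch of merging by element count (union by size).
// Names are illustrative assumptions.
struct SizedDisjointSet {
    std::vector<int> father;
    std::vector<int> set_size;  // set_size[r] is meaningful only when r is a root

    explicit SizedDisjointSet(int n) : father(n), set_size(n, 1) {
        for (int i = 0; i < n; ++i) father[i] = i;
    }

    int find_root(int x) const {
        while (father[x] != x) x = father[x];
        return x;
    }

    // Attach the root of the smaller set under the root of the larger set.
    // Returns the root of the merged set.
    int merge(int x, int y) {
        int rx = find_root(x);
        int ry = find_root(y);
        if (rx == ry) return rx;
        if (set_size[rx] < set_size[ry]) std::swap(rx, ry);  // rx is now the larger set
        father[ry] = rx;
        set_size[rx] += set_size[ry];
        return rx;
    }
};
```

With this rule, a node's depth only grows when its set is absorbed into one at least as large, so the combined set at least doubles each time; that bounds the tree height by about log2(n).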

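4. Source code

The code is as follows:

The program reads data from an input file, builds the disjoint-set, and writes the output to a target file. It uses the Boost libraries, so the corresponding library files must be added in order to build and run it.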

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

#include "boost/algorithm/string.hpp"
#include "boost/regex.hpp"

using namespace std;

// Valid ids are 0 .. MAX_ID - 1. With 4-byte ints the table below
// occupies roughly 800 MB.
const int MAX_ID = 200000001;

int father[MAX_ID];

// Initially every element is its own ancestor, i.e. a one-element set.
void init_father()
{
    for (int i = 0; i < MAX_ID; i++)
    {
        father[i] = i;
    }
}

// Returns the oldest ancestor of id and compresses the path on the way back
// (optimization 1). count is incremented once per visited node, so it
// measures the depth of id before compression.
int find_ancestry(int id, int& count)
{
    count++;
    if (id != father[id])
    {
        father[id] = find_ancestry(father[id], count);
    }
    return father[id];
}

// Handles one instruction: "Q" asks whether id_x and id_y are in the same
// set, anything else ("P") merges their sets.
int process(const string& instr, int id_x, int id_y, fstream& out_file)
{
    if (instr.empty())
    {
        return -1;
    }
    int deep_x = 0;
    int deep_y = 0;
    int temp_y = find_ancestry(id_y, deep_y);
    int temp_x = find_ancestry(id_x, deep_x);

    if (temp_x != temp_y)
    {
        // Different ancestors.
        if (instr == "Q")
        {
            out_file << "N\n";
        }
        else
        {
            // P: merge (optimization 2). The ancestor reached through the
            // shorter path is attached under the other one.
            if (deep_x >= deep_y)
            {
                father[temp_y] = temp_x; // y -> x
            }
            else
            {
                father[temp_x] = temp_y; // x -> y
            }
        }
    }
    else if (instr == "Q")
    {
        out_file << "Y\n";
    }
    return 0;
}

// Opens file_name for reading (read == 1) or writing (read == 0) and aborts
// on failure. Returning an fstream by value requires C++11.
fstream FOpen(const string& file_name, int read)
{
    if (file_name.empty() || (read != 0 && read != 1))
    {
        cout << "can't get the file name." << endl;
        abort();
    }
    fstream File;
    if (read)
    {
        File.open(file_name.c_str(), ios::in);
    }
    else
    {
        File.open(file_name.c_str(), ios::out);
    }
    if (!File)
    {
        cout << file_name << " cannot be opened." << endl;
        abort();
    }
    return File;
}

// Returns 0 if the line has the form "P a b" or "Q a b", where a and b are
// non-negative integers of at most 9 digits separated by single spaces;
// returns -1 otherwise.
int checkError(const string& line_temp)
{
    static const boost::regex expr("(P|Q)( \\d{1,9}){2}");
    return boost::regex_match(line_temp, expr) ? 0 : -1;
}

// Reads one line; returns -1 once nothing more can be read.
int read_line(fstream& cusu_file, string& line_temp)
{
    return getline(cusu_file, line_temp) ? 0 : -1;
}

int main()
{
    init_father();
    vector<string> pro_data;
    string temp_data;
    fstream in_file = FOpen("division.in", 1);
    fstream out_file = FOpen("division.out", 0);
    while (!read_line(in_file, temp_data))
    {
        if (checkError(temp_data))
        {
            out_file << "SORRY\n";
            break;
        }
        // checkError guarantees exactly three space-separated fields.
        boost::algorithm::split(pro_data, temp_data, boost::algorithm::is_any_of(" "));
        int id_x = atoi(pro_data[1].c_str());
        int id_y = atoi(pro_data[2].c_str());
        if (id_x >= MAX_ID || id_y >= MAX_ID)
        {
            // The id does not fit in the father table.
            out_file << "SORRY\n";
            break;
        }
        process(pro_data[0], id_x, id_y, out_file);
    }
    out_file.close();
    in_file.close();
    return 0;
}

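Data file:

Output file (nonstandard input such as "Hi?" is reported as an error):

5. Summary

A disjoint-set answers queries quickly even on large data sets. In this program the array could be replaced with a linked list, and distributed storage could be used to cope with massive data. In addition, the program's two optimizations noticeably improve its efficiency.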
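As one illustration of replacing the fixed array, the sketch below stores fathers in a hash map so that memory grows with the number of ids actually seen rather than with the largest possible id. This swaps in a std::unordered_map rather than the linked list mentioned above, and it is only an assumption about how such a replacement could look; the name SparseDisjointSet and its members are hypothetical.

```cpp
#include <unordered_map>

// Sketch: a disjoint-set whose father table is a hash map instead of a
// fixed-size array. Names are illustrative assumptions.
struct SparseDisjointSet {
    std::unordered_map<int, int> father;

    // Returns the root of x, registering x as a one-element set on first use.
    int find_root(int x) {
        auto it = father.find(x);
        if (it == father.end()) {
            father[x] = x;
            return x;
        }
        if (it->second == x) return x;
        int root = find_root(it->second);
        father[x] = root;  // path compression
        return root;
    }

    void merge(int x, int y) {
        int rx = find_root(x);
        int ry = find_root(y);
        if (rx != ry) father[ry] = rx;
    }
};
```

The trade-off is a slower lookup per step than plain array indexing, in exchange for not reserving a slot for every possible id up front.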

posted @ 2012-11-28 15:47 MichaelGD
