The Closest M Points // k-d tree + priority queue

Problem

The Closest M Points
Time Limit: 16000/8000 MS (Java/Others) Memory Limit: 98304/98304 K (Java/Others)
Total Submission(s): 7570 Accepted Submission(s): 2378

Problem Description
The course of Software Design and Development Practice is objectionable. ZLC is facing a serious problem. There are many points in K-dimensional space. Given a point, ZLC needs to find the closest m points. Euclidean distance is used as the distance metric between two points. The Euclidean distance between points p and q is the length of the line segment connecting them. In Cartesian coordinates, if p = (p1, p2, ..., pn) and q = (q1, q2, ..., qn) are two points in Euclidean n-space, then the distance from p to q, or from q to p, is given by:

d(p, q) = \sqrt{(q1 - p1)^2 + (q2 - p2)^2 + ... + (qn - pn)^2}

Can you help him solve this problem?

Input
The first line contains two non-negative integers n and K: the number of points, 1 <= n <= 50000, and the number of dimensions, 1 <= K <= 5. Each of the following n lines contains K integers, the coordinates of a point. This is followed by a line with one positive integer t, the number of queries, 1 <= t <= 10000. Each query consists of two lines. The K integers on the first line are the coordinates of the query point. The second line contains one integer m, the number of closest points to find, 1 <= m <= 10. The absolute value of every coordinate is at most 10000.
There are multiple test cases. Process to end of file.

Output
For each query, output m+1 lines:
The first line reads "the closest m points are:", where m is the number of points.
The following m lines list the m points, ordered from nearest to farthest.
It is guaranteed that the answer is unique: the distances from the given point to the nearest m+1 points are all different. That means input like this:
2 2
1 1
3 3
1
2 2
1
will not exist.

Sample Input

3 2
1 1
1 3
3 4
2
2 3
2
2 3
1

Sample Output

the closest 2 points are:
1 3
3 4
the closest 1 points are:
1 3

Approach

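First build the k-d tree: split the data by cycling through the dimensions, one per level, and mark the boundary (empty) nodes along the way so that the search knows where to stop. A query first descends toward the leaf nearest the query point and then backtracks. While backtracking, the current node is pushed directly if the queue holds fewer than m points; otherwise it replaces the top of the queue if it is closer than the top. The other side of the current node must also be searched in two cases: when the queue is not yet full, or when the squared distance from the query point to the splitting plane along the current dimension is smaller than the squared distance stored at the top of the queue, because the candidate region may then overlap the other side. Also watch the output format: there must be no trailing space after the last coordinate on a line. Because the program runs many queries and test cases, pop the queue empty before each query.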
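
The two decisions made while backtracking can be looked at in isolation. The following is a minimal sketch, not the code used in the solution below: the names MaxHeap, offer and mustVisitFarSide are hypothetical, and it assumes squared distances are kept as integers in a max-heap so that the top is always the worst candidate kept so far.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical helper type: a max-heap of squared distances, so top() is the
// largest (worst) distance among the candidates kept so far.
using MaxHeap = std::priority_queue<long long>;

// Offer one candidate distance while keeping at most m entries.
// Returns true if the candidate was kept.
bool offer(MaxHeap& heap, std::size_t m, long long dist2) {
    assert(m > 0);
    if (heap.size() < m) {       // heap not full yet: always keep the candidate
        heap.push(dist2);
        return true;
    }
    if (dist2 < heap.top()) {    // closer than the current worst: replace it
        heap.pop();
        heap.push(dist2);
        return true;
    }
    return false;
}

// The far side of a splitting plane has to be searched if the heap is not full,
// or if the plane is closer to the query than the current worst candidate.
bool mustVisitFarSide(const MaxHeap& heap, std::size_t m, long long planeDist2) {
    return heap.size() < m || planeDist2 < heap.top();
}
```

With m = 2, offering the squared distances 9, 4, 16 and 1 in that order leaves 1 and 4 in the heap, and the far side of a plane at squared distance 3 would still need a visit while a plane at squared distance 4 would not.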

Code

#include <iostream>
#include <cstdio>
#include <queue>
#include <cmath>
#include <algorithm>
#define Pow(x) ((x) * (x)) // fully parenthesized so the macro stays correct inside larger expressions
using namespace std;

const int N = 50000+10;
const int K = 5;

int n, k, idx; // n points of k dimensions; idx is the dimension currently used for comparison

struct node 
{
	int feature[K];
	bool operator < (const node & u) const 
	{
		return feature[idx] < u.feature[idx];
	}
};

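struct node data[N]; // the original input points
priority_queue<pair<double, node>> Q; // max-heap of (squared distance, point); the top is the farthest candidate kept
struct node kdtree[4*N]; // the k-d tree, stored implicitly (children of p are 2p and 2p+1)
int flag[4*N]; // marks boundary nodes (-1) so the search knows where to stop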

void Build(int l, int r, int p, int depth)
{
	if (l > r)
		return;
	flag[p] = 1;
	flag[p*2] = flag[p*2+1] = -1;
	idx = depth % k;
	int mid = (l + r) / 2;
	nth_element(data+l, data+mid, data+r+1); // place the median along dimension idx at mid and split the data around it
	kdtree[p] = data[mid];
	Build(l, mid-1, p*2, depth+1);
	Build(mid+1, r, p*2+1, depth+1);
}

void Query(node c, int num, int p, int depth)
{
	if(flag[p] == -1) // reached a boundary node: return
		return;
	pair<double, node> cur(0, kdtree[p]);
	for(int i = 0; i < k; i++)
	{
		cur.first += Pow(cur.second.feature[i] - c.feature[i]);
	}	
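	int dim = depth % k; // the global idx is not used here: the search recurses many times and a global could be changed along the way, causing errors
	bool fg = 0; // whether the other branch still has to be searched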

	int x = p*2;
	int y = p*2+1;
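	if(c.feature[dim] >= kdtree[p].feature[dim]) // search the side closer to the query point first
		swap(x,y);
	if(~flag[x]) // not a boundary node
		Query(c,num,x,depth+1);
	// after reaching the bottom, process nodes while the recursion unwinds
	if((int)Q.size() < num) // queue not full: push directly, but set the mark, since the entry may still be replaced later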
	{
		Q.push(cur);
		fg = 1;
	}
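	else // queue is full
	{
		if(cur.first < Q.top().first) // current point is closer than the top of the queue: replace the top
		{
			Q.pop();
			Q.push(cur);
		}
		if(Pow(c.feature[dim] - kdtree[p].feature[dim]) < Q.top().first) // 2D intuition: any point on the other side of the splitting line is at least as far from query point A as the line itself
		{
			fg = 1;
		}
	}
	if(~flag[y] && fg) // search the other branch of the current node, since the candidate region may overlap it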
		Query(c,num,y,depth+1);
}

int main()
{
	int i, j, m, q;
	
	while(scanf("%d%d", &n, &k) != EOF)
	{
		for(i = 0; i < n; i++)
		{
			for(j = 0; j < k; j++)
			{
				scanf("%d", &data[i].feature[j]);
			}
		}
		Build(0,n-1,1,0);
		int t; // number of queries
		int num; // how many closest points to report
		struct node temp; // the query point
		scanf("%d", &t);
		for(i = 0; i < t; i++)
		{	
			for(j = 0; j < k; j++)
			{
				scanf("%d", &temp.feature[j]);
			}
			scanf("%d", &num);
			while(!Q.empty()) // important: empty the queue before each query
				Q.pop();
			Query(temp,num,1,0);
			struct node ans[20];
			for(m = 0; !Q.empty(); m++)
			{
				ans[m] = Q.top().second;
				Q.pop();
			}
			printf("the closest %d points are:\n", num);
			for(m = num - 1; m >= 0; m--)
			{
				for(q = 0; q < k; q++)
				{
					printf("%d%c", ans[m].feature[q], q == k-1 ? '\n' : ' '); // note: no trailing space after the last coordinate
				}
			}
		}
	}
}
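
One way to gain confidence in the pruning logic above is to compare its answers with a brute-force scan on small random inputs. The sketch below is an assumption-laden helper and not part of the submitted solution: Point, squaredDistance and closestBruteForce are hypothetical names, and it relies on the problem's guarantee that the relevant distances are distinct, so the order of the result is unambiguous.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical point type for the checker: one int per dimension.
using Point = std::vector<int>;

// Squared Euclidean distance; the square root is skipped because it does not
// change the ordering of distances.
long long squaredDistance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    long long sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        long long d = static_cast<long long>(a[i]) - b[i];
        sum += d * d;
    }
    return sum;
}

// Returns the m points closest to query, ordered from nearest to farthest.
// If fewer than m points exist, all of them are returned.
std::vector<Point> closestBruteForce(const std::vector<Point>& points,
                                     const Point& query, std::size_t m) {
    std::vector<std::pair<long long, std::size_t>> order;
    order.reserve(points.size());
    for (std::size_t i = 0; i < points.size(); ++i)
        order.emplace_back(squaredDistance(points[i], query), i);
    m = std::min(m, order.size());
    // Only the first m entries need to be in sorted order.
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(m),
                      order.end());
    std::vector<Point> result;
    for (std::size_t i = 0; i < m; ++i)
        result.push_back(points[order[i].second]);
    return result;
}
```

On the sample input, the points (1, 1), (1, 3) and (3, 4) have squared distances 5, 1 and 2 from the query (2, 3), so the brute-force scan returns (1, 3) and then (3, 4) for m = 2, which matches the sample output.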

Reference
https://blog.csdn.net/javays1/article/details/50369176 (explains the principle very clearly)

posted @ 2018-12-04 15:41 by 樱花色的梦