寻找最小的k个数(update)

类似编程之美中寻找最大的k个数

解法一:

题目没有要求最小的k个数有序,也没要求最后n-k个数有序。既然如此,就没有必要对所有元素进行排序。这时,咱们想到了用选择或交换排序,即:

1、遍历n个数,把最先遍历到的k个数存入到大小为k的数组中,假设它们即是最小的k个数;
2、对这k个数,利用选择或交换排序找到这k个元素中的最大值kmax(找最大值需要遍历这k个数,时间复杂度为 O(k) );
3、继续遍历剩余n-k个数。假设每一次遍历到的新的元素的值为x,把x与kmax比较:如果 x < kmax ,用x替换kmax,并回到第二步重新找出k个元素的数组中最大元素kmax‘;如果 x >= kmax ,则继续遍历不更新数组。

每次遍历,更新或不更新数组的所用的时间为 O(k) O(0) 。故整趟下来,时间复杂度为 n*O(k)=O(n*k)

 

解法二:快速选择算法

Quicksort can be modified to solve the selection problem, which we have seen in chapters 1 and 6. Recall that by using a priority queue, we can find the kth largest (or smallest) element in O(n + k log n). For the special case of finding the median, this gives an O(n log n) algorithm.

    Since we can sort the file in O(nlog n) time, one might expect to obtain a better time bound for selection. The algorithm we present to find the kth smallest element in a set S is almost identical to quicksort. In fact, the first three steps are the same. We will call     this algorithm quickselect(叫做快速选择). Let |Si| denote the number of elements in Si(令|Si|为Si中元素的个数). The steps of quickselect are(快速选择,即上述编程之美一书上的,思路4,步骤如下):

    1. If |S| = 1, then k = 1 and return the elements in S as the answer. If a cutoff for small files is being used and |S| <=CUTOFF, then sort S and return the kth smallest element.
    2. Pick a pivot element, v (- S.(选取一个枢纽元v属于S)
    3. Partition S - {v} into S1 and S2, as was done with quicksort.
(将集合S-{v}分割成S1和S2,就像我们在快速排序中所作的那样)

    4. If k <= |S1|, then the kth smallest element must be in S1. In this case, return quickselect (S1, k). If k = 1 + |S1|, then the pivot is the kth smallest element and we can return it as the answer. Otherwise, the kth smallest element lies in S2, and it is the (k - |S1| - 1)st smallest element in S2. We make a recursive call and return quickselect (S2, k - |S1| - 1).
(如果k<=|S1|,那么第k个最小元素必然在S1中。在这种情况下, 返回quickselect(S1,k)。如果k=1+|S1|,那么枢纽元素就是第k个最小元素,即找到,直接返回它。否则,这第k个最小元素就在S2 中,即S2中的第(k-|S1|-1)个最小元素,我们递归调用并返回quickselect(S2,k-|S1|-1))。

    In contrast to quicksort, quickselect makes only one recursive call instead of two. The worst case of quickselect is identical to that of quicksort and is O(n2). Intuitively, this is because quicksort's worst case is when one of S1 and S2 is empty; thus, quickselect(快速选择) is not really saving a recursive call. The average running time, however, is O(n)(不过,其平均运行时间为O(N)。看到了没,就是平均复杂度为O(N)这句话). The analysis is similar to quicksort's and is left as an exercise.

    The implementation of quickselect is even simpler than the abstract description might imply. The code to do this shown in Figure 7.16. When the algorithm terminates, the kth smallest element is in position k. This destroys the original ordering; if this is not desirable, then a copy must be made.

  • 选取S中一个元素作为枢纽元v,将集合S-{v}分割成S1和S2,就像快速排序那样
    • 如果k <= |S1|,那么第k个最小元素必然在S1中。在这种情况下,返回QuickSelect(S1, k)。
    • 如果k = 1 + |S1|,那么枢纽元素就是第k个最小元素,即找到,直接返回它。
    • 否则,这第k个最小元素就在S2中,即S2中的第(k - |S1| - 1)个最小元素,我们递归调用并返回QuickSelect(S2, k - |S1| - 1)。

此算法的平均运行时间为O(n)。

示例代码如下:

 1 //QuickSelect 将第k小的元素放在 a[k-1]  
 2 void QuickSelect( int a[], int k, int left, int right )
 3 {
 4     int i, j;
 5     int pivot;
 6 
 7     if( left + cutoff <= right )
 8     {
 9         pivot = median3( a, left, right );
10         //取三数中值作为枢纽元,可以很大程度上避免最坏情况
11         i = left; j = right - 1;
12         for( ; ; )
13         {
14             while( a[ ++i ] < pivot ){ }
15             while( a[ --j ] > pivot ){ }
16             if( i < j )
17                 swap( &a[ i ], &a[ j ] );
18             else
19                 break;
20         }
21         //重置枢纽元
22         swap( &a[ i ], &a[ right - 1 ] );  
23 
24         if( k <= i )
25             QuickSelect( a, k, left, i - 1 );
26         else if( k > i + 1 )
27             QuickSelect( a, k, i + 1, right );
28     }
29     else  
30         InsertSort( a + left, right - left + 1 );
31 }

 

 

posted @ 2015-08-11 15:23  尾巴草  阅读(190)  评论(0编辑  收藏  举报