Median of Two Sorted Arrays

A quick way to test whether an integer is odd or even is to check its lowest bit:

total & 0x1 // nonzero (true) if total is odd
total & 0x1 // zero (false) if total is even
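
The bit test above can be wrapped in a small helper. This is only a sketch for illustration: the name `isOdd` is hypothetical and does not appear in the solution below, and the comment assumes a non-negative `total`, which always holds for an array length.

```cpp
#include <cassert>

// Hypothetical helper (not part of the original solution):
// the lowest bit of a non-negative integer is 1 exactly when the value is odd.
constexpr bool isOdd(int total) {
    return (total & 0x1) != 0;
}

// Checked at compile time.
static_assert(isOdd(7), "7 is odd");
static_assert(!isOdd(10), "10 is even");
```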

  


 

Problem Statement

There are two sorted arrays A and B of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be $O(\log (m+n))$.

To find the median, we can solve a more general problem: finding the $k$-th smallest element (counting from 1) of the two arrays combined. Once that is solved, the median follows from:

int total = m + n;
if(total & 0x1)
    return findKthSortedArrays(A, m, B, n, total/2+1);
else
    return (0.0 + findKthSortedArrays(A, m, B, n, total/2) + findKthSortedArrays(A, m, B, n, total/2+1)) / 2;  // 0.0 + ... forces floating-point arithmetic, so the average is not truncated by integer division.
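
For checking results on small inputs, the same odd/even selection can be applied to a fully merged copy of the two arrays. The sketch below is an assumption-laden reference, not the method of this post: the name `medianByMerging` is hypothetical, it takes `std::vector` instead of raw arrays, and it runs in $O(m+n)$ time, so it does not meet the required complexity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical reference implementation: merge both sorted inputs, then pick
// the middle element(s). O(m + n) time and extra space.
double medianByMerging(const std::vector<int>& a, const std::vector<int>& b) {
    assert(!a.empty() || !b.empty());  // the median of nothing is undefined

    std::vector<int> merged;
    merged.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(merged));

    const std::size_t total = merged.size();
    if (total & 0x1)
        return merged[total / 2];  // k = total/2 + 1 in 1-based terms
    // k = total/2 and k = total/2 + 1 in 1-based terms
    return (0.0 + merged[total / 2 - 1] + merged[total / 2]) / 2;
}
```

For example, merging {1, 3} with {2} gives three elements, so the median is the 2nd smallest; merging {1, 2} with {3, 4} gives four elements, so the median is the average of the 2nd and 3rd smallest.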

 

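We borrow the idea of binary search: at each step we discard about $\frac{k}{2}$ elements that are known to come before the $k$-th smallest element, and then recurse on what remains with $k$ reduced by the number of discarded elements. Since $k$ roughly halves at every step, the search takes $O(\log{k})$ time. For the median, $k \approx \frac{m+n}{2}$, which gives the required $O(\log (m+n))$.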

First, some preprocessing. Given two arrays A and B, we can assume that m, the length of A, is not greater than n, the length of B. Otherwise, we swap A and B. The details are as follows:

int findKthSortedArrays(int A[], int m, int B[], int n, int k){
    if(m > n)
        return findKthSortedArrays(B, n, A, m, k);
    ......
}

Next, consider which part can be discarded. We define

$pA = \min(\lfloor \frac{k}{2} \rfloor, m)$

$pB = k - pA$

so that $pA + pB = k$. We then compare the two elements $A[pA-1]$ and $B[pB-1]$.

  • If $A[pA-1] < B[pB-1]$, then all the elements {$A[0], ..., A[pA-1]$} come before the $k^{th}$ smallest element of $A\cup B$, so they can be discarded without affecting the answer. We then recurse on the remaining elements, looking for the (k-pA)-th smallest.

Proof:

  • We argue by contradiction.

  • Suppose not. Then $A[pA-1]$ is at least the $k$-th smallest element of $A\cup B$.

  • On the other hand, since $A[pA-1] < B[pB-1]$, at most (pB-1) elements of B, namely {$B[0], ..., B[pB-2]$}, can come before $A[pA-1]$. Thus at most (pA-1)+(pB-1) elements, {$A[0], ..., A[pA-2], B[0], ..., B[pB-2]$}, come before $A[pA-1]$ (placing $A[pA-1]$ ahead of any later elements equal to it). That means the rank of $A[pA-1]$ is at most $$1+(pA-1)+(pB-1)=pA+pB-1=k-1$$, i.e. it is at most the (k-1)-th smallest element of $A\cup B$.

  • This contradicts the assumption.
  • If $B[pB-1] < A[pA-1]$, the same argument shows that all the elements {$B[0], ..., B[pB-1]$} come before the $k$-th smallest element of $A\cup B$, so they can be discarded without affecting the answer. We then recurse on the remaining elements, looking for the (k-pB)-th smallest.
  • If $B[pB-1] == A[pA-1]$, we have found the $k^{th}$ smallest element, which equals both $B[pB-1]$ and $A[pA-1]$: the $pA+pB=k$ elements {$A[0], ..., A[pA-1], B[0], ..., B[pB-1]$} are all no greater than this value, and every other element is no smaller.
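
To make the three cases concrete, here is a minimal sketch of a single elimination step. The names `Drop`, `Step` and `eliminationStep` are hypothetical and exist only for this illustration; the function assumes $0 < m \le n$ and $2 \le k \le m+n$, which is the situation the case analysis above describes.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical illustration types: the outcome of one elimination step.
enum class Drop { PrefixOfA, PrefixOfB, Found };

struct Step {
    int pA;     // number of elements of A inspected
    int pB;     // number of elements of B inspected, with pA + pB == k
    Drop drop;  // which prefix can be discarded, or Found if the values tie
};

// Assumes both arrays are sorted ascending, 0 < m <= n and 2 <= k <= m + n.
Step eliminationStep(const int A[], int m, const int B[], int n, int k) {
    assert(0 < m && m <= n && 2 <= k && k <= m + n);
    const int pA = std::min(k / 2, m);
    const int pB = k - pA;
    if (A[pA - 1] < B[pB - 1]) return {pA, pB, Drop::PrefixOfA};
    if (A[pA - 1] > B[pB - 1]) return {pA, pB, Drop::PrefixOfB};
    return {pA, pB, Drop::Found};
}
```

With A = {1, 3, 5, 7}, B = {2, 4, 6, 8, 10} and k = 5, we get pA = 2 and pB = 3. Since A[1] = 3 is less than B[2] = 6, the prefix {1, 3} of A is discarded, and the search continues for the 3rd smallest element of {5, 7} and B.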

     

In code, the three cases are:

if(A[pA-1] < B[pB-1])
    return findKthSortedArrays(A+pA, m-pA, B, n, k-pA);
else if(A[pA-1] > B[pB-1])
    return findKthSortedArrays(A, m, B+pB, n-pB, k-pB);
else
    return A[pA-1];

 

We also need boundary conditions to stop the recursion and to reject invalid input.

  • When k < 1 or k > m + n, there is no such element, so fail fast, i.e. assert(1 <= k && k <= m+n);

  • When m == 0, return B[k-1]; because we have ensured that m is not greater than n, A is the empty array and every remaining element is in B.

  • When k == 1, return $min$(A[0], B[0]); the previous condition guarantees m > 0, and n is at least m, so neither A nor B is empty here.

Thus the code could be:

assert(1 <= k && k <= m+n);
if(0 == m)
    return B[k-1];
if(1 == k)
    return min(A[0], B[0]);
 


 

The complete code is:

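#include <cassert>

class Solution {
private:
    int min(int a, int b){
        return a < b ? a : b;
    }
    // Returns the k-th smallest (1-based) element of the sorted arrays A and B combined.
    int findKthSortedArrays(int A[], int m, int B[], int n, int k){
        assert(1 <= k && k <= m+n);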
        if(m > n)
            return findKthSortedArrays(B, n, A, m, k);
        if(0 == m)
            return B[k-1];
        if(1 == k)
            return min(A[0], B[0]);
            
        int pA = min(k/2, m);
        int pB = k - pA;
        
        if(A[pA-1] < B[pB-1])
            return findKthSortedArrays(A+pA, m-pA, B, n, k-pA);
        else if(A[pA-1] > B[pB-1])
            return findKthSortedArrays(A, m, B+pB, n-pB, k-pB);
        else
            return A[pA-1];
    }
public:
    double findMedianSortedArrays(int A[], int m, int B[], int n) {
        int total = m + n;
        
        if(total & 0x1)
            return findKthSortedArrays(A, m, B, n, total/2+1);
        else
            return (0.0 + findKthSortedArrays(A, m, B, n, total/2)+findKthSortedArrays(A, m, B, n, total/2+1))/2;
    }
};
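
Every recursive call in findKthSortedArrays is in tail position, so the same search can also be written as a loop. The following is a minimal sketch of that rewrite, not part of the original solution: the name `kthSmallestIterative` is hypothetical, and it assumes both arrays are sorted in ascending order and that 1 <= k <= m + n.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical iterative variant of the k-th smallest search (k is 1-based).
// Each loop iteration corresponds to one recursive call in the class above.
int kthSmallestIterative(const int* A, int m, const int* B, int n, int k) {
    assert(1 <= k && k <= m + n);
    while (true) {
        // Keep A as the shorter array, mirroring the swap in the recursion.
        if (m > n) {
            std::swap(A, B);
            std::swap(m, n);
        }
        if (m == 0) return B[k - 1];              // only B has elements left
        if (k == 1) return std::min(A[0], B[0]);  // smallest remaining element

        const int pA = std::min(k / 2, m);
        const int pB = k - pA;
        if (A[pA - 1] < B[pB - 1]) {
            // Discard the first pA elements of A.
            A += pA; m -= pA; k -= pA;
        } else if (A[pA - 1] > B[pB - 1]) {
            // Discard the first pB elements of B.
            B += pB; n -= pB; k -= pB;
        } else {
            return A[pA - 1];  // tie: this value is the k-th smallest
        }
    }
}
```

The loop avoids the recursion depth of about $\log k$ stack frames, although compilers may already optimize the tail calls in the recursive version.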

 

posted @ 2014-11-21 19:48  kid551