Sliding Window Median

Given an array of n integers with duplicate numbers and a moving window of size k, slide the window from the start of the array one step at a time and find the median of the numbers inside the window at each position.

Example

For array [1, 2, 7, 7, 8] and moving window size k = 3, return [2, 7, 7].

At first the window is at the start of the array, like this:

[|1, 2, 7|, 7, 8], return the median 2;

then the window moves one step forward.

[1, |2, 7, 7|, 8], return the median 7;

then the window moves one step forward again.

[1, 2, |7, 7, 8|], return the median 7;

Another sliding window problem. Finding the median is very similar to Find Median from Data Stream, so combining a minheap with a maxheap is a natural and effective approach.
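As a reminder of the two-heap idea from Find Median from Data Stream, here is a minimal insert-only sketch built on the standard heapq module. The names RunningMedian and running_medians are illustrative and not part of the original solution, and the sketch assumes the lower-median convention (for an even count, the smaller of the two middle values).

```python
import heapq


class RunningMedian(object):
    """Two-heap running median (insert only); names here are illustrative."""

    def __init__(self):
        self.low = []   # max-heap of the smaller half, stored as negated values
        self.high = []  # min-heap of the larger half

    def add(self, x):
        # Push onto the lower half, then move its largest value to the upper half.
        heapq.heappush(self.low, -x)
        heapq.heappush(self.high, -heapq.heappop(self.low))
        # Rebalance so that low holds the same count as high, or one more.
        if len(self.high) > len(self.low):
            heapq.heappush(self.low, -heapq.heappop(self.high))

    def median(self):
        # Lower median: the largest value of the lower half.
        return -self.low[0]


def running_medians(nums):
    rm = RunningMedian()
    out = []
    for x in nums:
        rm.add(x)
        out.append(rm.median())
    return out
```

Because the lower half always holds the same number of values as the upper half or one more, the median can be read from the top of the lower heap in O(1).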

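Unlike a data stream, however, a sliding window has to remove elements as well as add them, so a HashHeap (a heap paired with a hash table) is the better fit.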
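To see why deletion is the sticking point: Python's heapq module offers no delete-by-value operation, so with a plain heap list the element has to be located by a linear scan. The helper below is a hypothetical sketch of that slow path (remove_value is not from the original code); a hash from value to heap index is what lets a HashHeap skip the scan.

```python
import heapq


def remove_value(heap, value):
    """Remove one occurrence of value from a heapq list in O(len(heap)) time."""
    idx = heap.index(value)  # linear scan: a plain heap keeps no value-to-index map
    heap[idx] = heap[-1]     # overwrite the slot with the last element
    heap.pop()               # drop the now-duplicated last element
    heapq.heapify(heap)      # restore the heap property, again linear time
```

With the index already known from a hash table, the same removal needs only one sift up or sift down, which is O(log k) for a window of size k.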

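Concretely, each move of the window is split into one insertion and one deletion. Since there are two heaps, the key question is which heap to delete from. Given the problem's definition of the median, I keep maxheap holding either the same number of elements as minheap or exactly one more, so the median is always the top of maxheap. The element to delete can then be compared with the median: if it is larger, it is in minheap; if it is smaller or equal, it is deleted from maxheap. In practice this version timed out on lintcode, but after adding one more interface function to HashHeap, contain(now), which checks membership through the hash table, the solution passed. The code is as follows: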

class Solution:
    """
    @param nums: A list of integers.
    @return: The median of element inside the window at each moving.
    """
    def medianSlidingWindow(self, nums, k):
        if k <= 0 or len(nums) < k:  # also covers an empty nums
            return []
        minheap = HashHeap('min')
        maxheap = HashHeap('max')
        for i in range(k):  # build the first window, alternating the insert side
            if i % 2:
                maxheap.add(nums[i])
                minheap.add(maxheap.pop())
            else:
                minheap.add(nums[i])
                maxheap.add(minheap.pop())
        median = []
        for i in range(k, len(nums)):  # nums[i - k] leaves the window, nums[i] enters
            median.append(maxheap.top())
            if minheap.contain(nums[i - k]):
                minheap.delete(nums[i - k])
                maxheap.add(nums[i])
                minheap.add(maxheap.pop())
            else:
                maxheap.delete(nums[i - k])
                minheap.add(nums[i])
                maxheap.add(minheap.pop())
        median.append(maxheap.top())
        return median


class Node(object):
    """
    Record stored in the hash map for each distinct value in the heap:
    its index in the heap array and how many copies of it are present.
    """
    def __init__(self, id, num):
        self.id = id    # index of the value in the heap array
        self.num = num  # number of copies of this value
        
class HashHeap(object):
    def __init__(self, mode):
        self.heap = []
        self.mode = mode
        self.size = 0
        self.hash = {}
    def top(self):
        return self.heap[0] if len(self.heap) > 0 else 0
        
    def contain(self, now):  # O(1) membership test through the hash table
        return now in self.hash
        
    def isempty(self):
        return len(self.heap) == 0
        
    def _comparesmall(self, a, b):  # True if a may sit above b in this mode
        if self.mode == 'min':
            return a <= b
        return a > b
    def _swap(self, idA, idB):  # swap two heap entries and update their indices in the hash
        valA = self.heap[idA]
        valB = self.heap[idB]
        
        numA = self.hash[valA].num
        numB = self.hash[valB].num
        self.hash[valB] = Node(idA, numB)
        self.hash[valA] = Node(idB, numA)
        self.heap[idA], self.heap[idB] = self.heap[idB], self.heap[idA]
    
    def add(self, now):  # now is the value to insert; duplicates only bump the count
        self.size += 1
        if self.hash.get(now):
            hashnow = self.hash[now]
            self.hash[now] = Node(hashnow.id, hashnow.num + 1)
        else:
            self.heap.append(now)
            self.hash[now] = Node(len(self.heap) - 1,1)
            self._siftup(len(self.heap) - 1)
            
    def pop(self):  #pop the top of heap
        self.size -= 1
        now = self.heap[0]
        hashnow = self.hash[now]
        num = hashnow.num
        if num == 1:
            self._swap(0, len(self.heap) - 1)
            self.hash.pop(now)
            self.heap.pop()
            self._siftdown(0)
        else:
            self.hash[now] = Node(0, num - 1)
        return now
        
    def delete(self, now):
        self.size -= 1
        hashnow = self.hash[now]
        id = hashnow.id
        num = hashnow.num
        if num == 1:
            self._swap(id, len(self.heap)-1) #like the common delete operation
            self.hash.pop(now)
            self.heap.pop()
            if len(self.heap) > id:
                self._siftup(id)
                self._siftdown(id)
        else:
            self.hash[now] = Node(id, num - 1)
            
    def parent(self, id):
        if id == 0:
            return -1

        return (id - 1) // 2

    def _siftup(self, id):
        while id > 0:  # iterative version; stop once the root is reached
            parentId = (id - 1) // 2
            if self._comparesmall(self.heap[parentId], self.heap[id]):
                break
            else:
                self._swap(id, parentId)
            id = parentId
                
    def _siftdown(self, id): #iterative version
        while 2*id + 1 < len(self.heap):
            l = 2*id + 1
            r = l + 1
            small = id
            if self._comparesmall(self.heap[l], self.heap[id]):
                small = l
            if r < len(self.heap) and self._comparesmall(self.heap[r], self.heap[small]):
                small = r
            if small != id:
                self._swap(id, small)
            else:
                break
            id = small   

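The approach above runs in O(n log k) time. Because this problem asks for the median, though, there is no particularly good way to reach an O(n) solution the way Sliding Window Maximum allows. That's all.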
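For cross-checking the heap solution on small inputs, a simple reference that keeps the window as a sorted list can help. This is a sketch under the same lower-median assumption as above, using the standard bisect module; the function name median_sliding_window_reference is hypothetical. It costs O(n * k) because list insertion and deletion shift elements, so it is a testing aid rather than a replacement.

```python
import bisect


def median_sliding_window_reference(nums, k):
    """Reference answer: keep the window sorted; O(n * k) because of list shifts."""
    if k <= 0 or len(nums) < k:
        return []
    window = sorted(nums[:k])
    medians = [window[(k - 1) // 2]]  # lower median when k is even
    for i in range(k, len(nums)):
        # Remove the element that leaves the window, then insert the new one.
        del window[bisect.bisect_left(window, nums[i - k])]
        bisect.insort(window, nums[i])
        medians.append(window[(k - 1) // 2])
    return medians
```

Comparing its output with medianSlidingWindow on random arrays containing duplicates is a quick way to exercise the delete path of HashHeap.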

posted on 2016-07-09 11:56  Sheryl Wang
