Median Maintenance: the running median problem
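Problem description:
Given a sequence of numbers i arriving in arbitrary order, report the median (the value that sits in the middle by size) of the numbers seen so far.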
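Algorithm description:
The usual approach is to keep the numbers sorted with insertion sort and read the median at the middle index. Insertion sort takes O(n^2) time on average, and the median still has to be looked up afterwards, so this is not efficient.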
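For comparison, the sorted-list baseline described above can be sketched as follows. This is only an illustration: the function name running_medians_sorted is made up here, and it uses the standard bisect module to keep the list sorted rather than a hand-written insertion sort. Each insertion still shifts up to n elements, so the total cost stays quadratic.

```python
import bisect


def running_medians_sorted(stream):
    """Baseline: keep a sorted list and read the median from its middle.

    Hypothetical helper for illustration; not part of the original post.
    """
    sorted_items = []
    medians = []
    for x in stream:
        # binary search finds the slot, but the list insert is still O(n)
        bisect.insort(sorted_items, x)
        n = len(sorted_items)
        mid = n // 2
        if n % 2 == 1:
            medians.append(sorted_items[mid])
        else:
            # even count: average the two middle values
            medians.append((sorted_items[mid - 1] + sorted_items[mid]) / 2.0)
    return medians


if __name__ == "__main__":
    print(running_medians_sorted([5, 4, 3, 2, 1, 6]))
```

For an even count the sketch averages the two middle values, which is the same convention the heap-based version uses.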
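The approach here uses a heap data structure instead. Each insertion then costs O(log n), and the median can be read from the heap tops in O(1).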
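Two heaps, a max heap and a min heap, hold the two halves of the sequence i. They must satisfy the following conditions:
1. Size condition
The max heap holds either the same number of elements as the min heap or exactly one more; otherwise the heaps are adjusted.
2. Order condition
The max heap holds the lower half, the smaller values.
The min heap holds the upper half, the larger values.
The largest value in the max heap must be less than or equal to the smallest value in the min heap; otherwise the heaps are adjusted.
As a result, the median is the top of the max heap when the total count is odd, and the average of the tops of the max heap and the min heap when it is even.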
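The two conditions can be written down as a small check. The sketch below is an assumption-laden illustration: invariants_hold and median_from_halves are hypothetical names, and the two halves are plain Python lists, with max() and min() standing in for the heap tops, so it shows the rules themselves, not an efficient implementation.

```python
def invariants_hold(low, high):
    """Check the size and order conditions.

    low  : values that would live in the max heap (lower half)
    high : values that would live in the min heap (upper half)
    """
    # size condition: low has the same number of items as high, or one more
    size_ok = len(low) - len(high) in (0, 1)
    # order condition: the largest of low is <= the smallest of high
    order_ok = not low or not high or max(low) <= min(high)
    return size_ok and order_ok


def median_from_halves(low, high):
    """Read the median, assuming both conditions already hold."""
    if not low:
        raise ValueError("no items")
    if len(low) > len(high):
        # odd total: the top of the max heap is the median
        return max(low)
    # even total: average of the two tops
    return (max(low) + min(high)) / 2.0


if __name__ == "__main__":
    print(invariants_hold([1, 2, 3], [4, 5, 6]))
    print(median_from_halves([1, 2, 3], [4, 5, 6]))
```

A check like this can be useful as an assertion while debugging the insertion logic, since every insertion should leave both conditions true.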
The code is as follows:
class MyHeap:
    # heap type
    MAX_HEAP = 1
    MIN_HEAP = 0

    def __init__(self, type=MAX_HEAP, arr=None):
        self.type = type
        # if initialized directly from an array, build the heap bottom-up
        if arr is not None:
            self.data = arr[:]
            # index of the last non-leaf node
            begin = len(arr) // 2 - 1
            for i in range(begin, -1, -1):
                self.__heapify(i)
        else:
            self.data = []

    def __before(self, a, b):
        # True if a must sit above b: larger in a max heap, smaller in a min heap
        return a > b if self.type == self.MAX_HEAP else a < b

    def __heapify(self, i):
        # sift the item at index i down until it is in order with its children
        length = len(self.data)
        while True:
            left = self.__leftChild(i)
            right = self.__rightChild(i)
            extreme = i
            if left < length and self.__before(self.data[left], self.data[extreme]):
                extreme = left
            if right < length and self.__before(self.data[right], self.data[extreme]):
                extreme = right
            if extreme == i:
                break
            self.__swap(i, extreme)
            i = extreme

    def __siftUp(self, i):
        # sift the item at index i up while it must sit above its parent
        while i > 0:
            parent = (i - 1) // 2
            if not self.__before(self.data[i], self.data[parent]):
                break
            self.__swap(i, parent)
            i = parent

    def insert(self, item):
        # append at the end, then restore the heap property upwards;
        # inserting at index 0 would shift every element and break the heap
        self.data.append(item)
        self.__siftUp(len(self.data) - 1)

    def delete(self, index):
        if not 0 <= index < len(self.data):
            raise IndexError("index out of range")
        # move the last item into the hole, then restore order in both directions
        last = self.data.pop()
        if index < len(self.data):
            self.data[index] = last
            self.__heapify(index)
            self.__siftUp(index)

    def pop(self):
        # pop the extreme value, whether it is the max or the min
        self.__swap(0, len(self.data) - 1)
        extreme = self.data.pop()
        self.__heapify(0)
        return extreme

    # overrides the getitem method of the MyHeap class,
    # so you can use [] to get a value by index
    def __getitem__(self, index):
        if len(self.data) == 0:
            raise IndexError("no items")
        return self.data[index]

    # overrides the len method of the MyHeap class,
    # so you can call len(heap) to get the size of the heap
    def __len__(self):
        return len(self.data)

    def __swap(self, i, j):
        self.data[i], self.data[j] = self.data[j], self.data[i]

    # array indices start from zero
    def __leftChild(self, i):
        return 2 * i + 1

    def __rightChild(self, i):
        return 2 * i + 2

    # overrides the repr method of the MyHeap class,
    # so printing a heap shows its contents
    def __repr__(self):
        return str(self.data)


class MedianMaintain:
    def __init__(self):
        self.maxHeap = MyHeap(MyHeap.MAX_HEAP)
        self.minHeap = MyHeap(MyHeap.MIN_HEAP)
        # the total number of items in both heaps
        self.N = 0

    def insert(self, item):
        if self.N % 2 == 0:
            # size rule: both heaps hold the same number of items before the
            # insertion, so the new item goes into the max heap
            self.maxHeap.insert(item)
            self.N += 1
            if len(self.minHeap) == 0:
                return
            # order rule: the largest item in the max heap must be less than
            # or equal to the smallest item in the min heap; if not, swap them
            if self.maxHeap[0] > self.minHeap[0]:
                toMin = self.maxHeap.pop()
                toMax = self.minHeap.pop()
                self.maxHeap.insert(toMax)
                self.minHeap.insert(toMin)
        else:
            # size rule: the max heap already holds one extra item, so insert
            # the new item there and move the max heap's top into the min heap
            self.maxHeap.insert(item)
            toMin = self.maxHeap.pop()
            self.minHeap.insert(toMin)
            self.N += 1

    def getMedian(self):
        # if the total size is even, the median is the average of the roots
        # of the max heap and the min heap
        if self.N % 2 == 0:
            return (self.maxHeap[0] + self.minHeap[0]) / 2.0
        else:
            # if the total size is odd, the median is the root of the max heap
            return self.maxHeap[0]

    def __repr__(self):
        return "max heap: " + str(self.maxHeap) + '\n' + "min heap: " + str(self.minHeap)


if __name__ == "__main__":
    medianMaintain = MedianMaintain()
    medianMaintain.insert(5)
    medianMaintain.insert(4)
    medianMaintain.insert(3)
    medianMaintain.insert(2)
    medianMaintain.insert(1)
    medianMaintain.insert(6)
    print(medianMaintain)
    print(medianMaintain.getMedian())
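As a cross-check, the same two-heap idea can also be sketched on top of Python's standard heapq module instead of a hand-written heap. This is an alternative sketch, not the code above: the class name RunningMedian is hypothetical, heapq only provides a min heap, so the lower half is stored as negated values (which assumes numeric input), and the rebalancing step is written slightly differently from the even/odd branches in the original.

```python
import heapq


class RunningMedian:
    """Two-heap running median built on heapq (illustrative sketch)."""

    def __init__(self):
        self._low = []   # lower half, a max heap stored as negated values
        self._high = []  # upper half, a plain min heap

    def insert(self, item):
        # push into the lower half, then hand its largest value to the upper half
        heapq.heappush(self._low, -item)
        heapq.heappush(self._high, -heapq.heappop(self._low))
        # size condition: the lower half may hold at most one extra item,
        # and never fewer items than the upper half
        if len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if not self._low:
            raise IndexError("no items")
        if len(self._low) > len(self._high):
            # odd total: top of the max heap
            return -self._low[0]
        # even total: average of the two tops
        return (-self._low[0] + self._high[0]) / 2.0


if __name__ == "__main__":
    rm = RunningMedian()
    for x in [5, 4, 3, 2, 1, 6]:
        rm.insert(x)
        print(x, rm.median())
```

Feeding both versions the same input and comparing the medians after every insertion is a cheap way to catch mistakes in a hand-written heap.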
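Author: btchenguang
Copyright of this article is shared by the author and 博客园 (cnblogs). Reposting is welcome, but unless the author agrees otherwise this notice must be kept and a clearly visible link to the original article must be given on the page; otherwise the author reserves the right to pursue legal liability.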