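A simplified implementation of quick sort
Random pivot selection makes little difference
The first method uses a random pivot so that the sequence is split as evenly as possible. In practice, however, the collection to be sorted is usually in no particular order already, so there is no need to generate a random number for the pivot; an element at a fixed position works just as well. This covers the vast majority of cases and simplifies the logic, which gives the second version, the simple quick sort (quick_sort_simple).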
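Judging from the timing results, performance is almost the same with or without a random pivot. Note that quick_sort_simple in the listing below uses the fixed pivot only for its first partition and then recurses into quick_sort, so these timings largely reflect the random-pivot version.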
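As a rough way to check this claim on unordered input, the sketch below makes the pivot choice a parameter and records the deepest recursion level reached. The names quick_sort_with, first_index and random_index are hypothetical helpers introduced for illustration, not part of the original listing, and the shuffled 2000-element input is an assumption.

```python
import random


def quick_sort_with(choose_pivot, lst, depth=1):
    """Sort lst; return (sorted copy, deepest recursion level reached)."""
    if len(lst) <= 1:
        return list(lst), depth
    idx = choose_pivot(lst)
    pivot = lst[idx]
    rest = lst[:idx] + lst[idx + 1:]  # everything except the pivot itself
    lesser, d1 = quick_sort_with(choose_pivot, [x for x in rest if x <= pivot], depth + 1)
    greater, d2 = quick_sort_with(choose_pivot, [x for x in rest if x > pivot], depth + 1)
    return lesser + [pivot] + greater, max(d1, d2)


def first_index(lst):
    """Fixed-position pivot: always the first element."""
    return 0


def random_index(lst):
    """Random pivot: any valid index, chosen uniformly."""
    return random.randrange(len(lst))


random.seed(0)  # assumed seed, only to make repeated runs comparable
data = random.sample(range(2000), 2000)  # shuffled input without duplicates

sorted_fixed, depth_fixed = quick_sort_with(first_index, data)
sorted_random, depth_random = quick_sort_with(random_index, data)
print("fixed pivot depth: %d, random pivot depth: %d" % (depth_fixed, depth_random))
```

On shuffled input the two depths should be of similar size. On already-sorted input the fixed-pivot depth grows linearly with the list length, which is the case the textbooks warn about.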
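With many duplicate elements, time complexity degrades to O(n^2)
Most textbooks say that the time complexity of quicksort degrades to O(n^2) in specific cases such as already-sorted input. In fact, most textbook implementations also degrade to O(n^2) whenever the collection being sorted contains many equal elements. The cause is that most partition implementations put every element equal to the pivot on the same side of the pivot.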
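To make the degradation concrete, the sketch below counts how deep a two-way partition recurses on a list of identical values when elements equal to the pivot go to the lesser side. two_way_depth is a hypothetical helper written for this illustration, and the 300-element input is an assumed example.

```python
def two_way_depth(lst):
    """Return how many nested partitions a `<=` / `>` quick sort needs for lst."""
    if len(lst) == 0:
        return 0
    pivot = lst[0]
    lesser = [x for x in lst[1:] if x <= pivot]   # duplicates of the pivot land here
    greater = [x for x in lst[1:] if x > pivot]
    return 1 + max(two_way_depth(lesser), two_way_depth(greater))


all_equal = [1] * 300
depth = two_way_depth(all_equal)
print(depth)  # 300: each partition removes only the pivot
```

Because every level rescans nearly the whole remaining list, the total work grows quadratically with the number of equal elements.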
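The third implementation, quick_sort_quick, is meant to solve this problem: with many duplicate elements it is tens of times faster (37 ms versus 994 ms in the run below).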
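The idea behind this fix can be sketched as a three-way partition in which every recursive call keeps the duplicates of its pivot out of both sublists. quick_sort_three_way is a hypothetical name, and this is a minimal sketch under that assumption rather than the listing's quick_sort_quick.

```python
def quick_sort_three_way(lst):
    """Return a sorted copy of lst using a three-way (<, ==, >) partition."""
    if len(lst) <= 1:
        return list(lst)
    pivot = lst[0]
    lesser = [x for x in lst if x < pivot]
    equal = [x for x in lst if x == pivot]    # the pivot and all of its duplicates
    greater = [x for x in lst if x > pivot]
    # Only strictly smaller and strictly larger values are sorted recursively,
    # so the recursion depth is bounded by the number of distinct values.
    return quick_sort_three_way(lesser) + equal + quick_sort_three_way(greater)


# The duplicate-heavy sample from the benchmark in this post.
ls2 = [1, 2, 1, 1, 1, 3, 4, 7, 9, 11, 12, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] * 300
result = quick_sort_three_way(ls2)
```

With only eight distinct values in ls2, the recursion stays a handful of levels deep no matter how many copies the list contains.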
====== sort 80*500 numbers ======
quick QS: 2014-11-18 10:08:34.281000
my_quick_sort: 2014-11-18 10:08:35.425000
default sort: 2014-11-18 10:08:36.576000
after sort: 2014-11-18 10:08:36.579000
====== sort 21*300 numbers with many duplicated ======
quick QS: 2014-11-18 10:08:36.579000 37ms # with many duplicate elements, the third method is an order of magnitude faster than the other two
my_quick_sort: 2014-11-18 10:08:36.616000 994ms
default sort: 2014-11-18 10:08:37.610000 1ms
after sort: 2014-11-18 10:08:37.611000
====== compare with random or without random ======
simple QS: 2014-11-18 10:08:37.611000
my_quick_sort: 2014-11-18 10:08:38.758000
default sort: 2014-11-18 10:08:39.908000
after sort: 2014-11-18 10:08:39.909000
====== sort 90000 numbers without duplicated ======
qucik QS: 2014-11-18 10:08:39.916000
simple QS: 2014-11-18 10:08:40.285000
my_quick_sort: 2014-11-18 10:08:40.661000
default sort: 2014-11-18 10:08:41.016000
after sort: 2014-11-18 10:08:41.018000
# Sorting 10,000 unordered numbers:
simple QS: 2014-11-17 16:53:03.450000 38ms
my_quick_sort: 2014-11-17 16:53:03.488000 35ms
default sort: 2014-11-17 16:53:03.523000
after sort: 2014-11-17 16:53:03.524000 2014-11-17 16:53:03.524000
# Sorting 10,000 ordered numbers:
simple QS: 2014-11-17 16:53:03.524000 35ms
my_quick_sort: 2014-11-17 16:53:03.559000 35ms
default sort: 2014-11-17 16:53:03.594000
after sort: 2014-11-17 16:53:03.594000
# Sorting 90,000 ordered numbers:
simple QS: 2014-11-17 16:57:12.885000 367ms
my_quick_sort: 2014-11-17 16:57:13.252000 352ms
default sort: 2014-11-17 16:57:13.604000 1ms
after sort: 2014-11-17 16:57:13.605000
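The elapsed times above have to be worked out by subtracting consecutive timestamps. A small helper, sketched below, could report milliseconds directly. time_ms is a hypothetical name, and passing a copy of the input is an added assumption so that an in-place sort such as list.sort() does not leave later runs with already-sorted data.

```python
from timeit import default_timer


def time_ms(sort_func, data):
    """Run sort_func on a copy of data; return (result, elapsed milliseconds)."""
    sample = list(data)              # copy, so the caller's list is left untouched
    start = default_timer()
    result = sort_func(sample)
    elapsed_ms = (default_timer() - start) * 1000.0
    return result, elapsed_ms


numbers = [5, 3, 8, 1, 9, 2]
result, elapsed_ms = time_ms(sorted, numbers)
print("sorted() took %.3f ms" % elapsed_ms)
```

Any of the quick sort functions in this post can be passed in place of sorted, since each takes a list and returns the sorted list.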
#-------------------------------------------------------------------------------
# Name:        quick sort
# Purpose:
#
# Author:      ScottGu<gu.kai.66@gmail.com, 150316990@qq.com>
#
# Created:     04/09/2013
# Copyright:   (c) ScottGu<gu.kai.66@gmail.com, 150316990@qq.com> 2013
# Licence:     <your licence>
#-------------------------------------------------------------------------------
from datetime import datetime
from random import randint
import sys

sys.setrecursionlimit(10000)


def quick_sort(lst):
    # Random pivot; elements equal to the pivot go to the "lesser" side.
    if len(lst) == 0:
        return lst
    else:
        pivot_idx = randint(0, len(lst) - 1)
        pivot = lst[pivot_idx]
        lesser = quick_sort([val for idx, val in enumerate(lst) if val <= pivot and idx != pivot_idx])
        greater = quick_sort([x for x in lst if x > pivot])
        return lesser + [pivot] + greater


def quick_sort_simple(lst):
    # Fixed pivot: the first element.
    # Note: the sublists are sorted by quick_sort (random pivot), so only this
    # top-level partition uses the fixed pivot.
    if len(lst) == 0:
        return lst
    else:
        pivot = lst[0]
        lesser = quick_sort([x for x in lst[1:] if x <= pivot])
        greater = quick_sort([x for x in lst[1:] if x > pivot])
        return lesser + [pivot] + greater


def quick_sort_quick(lst):
    # Fixed pivot; elements equal to the pivot are kept out of both sublists.
    # Note: the sublists are sorted by quick_sort, so duplicates are split off
    # only for this top-level pivot.
    if len(lst) == 0:
        return lst
    else:
        pivot = lst[0]
        lesser = quick_sort([x for x in lst[1:] if x < pivot])
        greater = quick_sort([x for x in lst[1:] if x > pivot])
        return lesser + [pivot] + [x for x in lst[1:] if x == pivot] + greater


def main():
    # The integers 20..100 (81 values; the banner below says 80), repeated 500 times.
    lst = list(range(20, 101))
    lst = lst * 500
    ls2 = [1, 2, 1, 1, 1, 3, 4, 7, 9, 11, 12, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
    ls2 = ls2 * 300

    print('====== sort 80*500 numbers ====== ')
    print('quick QS: ' + str(datetime.now()))
    quick_sort_quick(lst)
    print('my_quick_sort: ' + str(datetime.now()))
    quick_sort_simple(lst)
    print('default sort: ' + str(datetime.now()))
    lst.sort()  # in place: lst stays sorted for the sections that follow
    print('after sort: ' + str(datetime.now()))

    print('====== sort 21*300 numbers with many duplicated ====== ')
    print('quick QS: ' + str(datetime.now()))
    quick_sort_quick(ls2)
    print('my_quick_sort: ' + str(datetime.now()))
    quick_sort_simple(ls2)
    print('default sort: ' + str(datetime.now()))
    ls2.sort()
    print('after sort: ' + str(datetime.now()))

    print('====== compare with random or without random ====== ')
    print('simple QS: ' + str(datetime.now()))
    quick_sort_simple(lst)
    print('my_quick_sort: ' + str(datetime.now()))
    quick_sort(lst)
    print('default sort: ' + str(datetime.now()))
    lst.sort()
    print('after sort: ' + str(datetime.now()))

    print('====== sort 90000 numbers without duplicated ====== ')
    lst = list(range(90000))  # already in ascending order
    print('qucik QS: ' + str(datetime.now()))
    quick_sort_quick(lst)
    print('simple QS: ' + str(datetime.now()))
    quick_sort_simple(lst)
    print('my_quick_sort: ' + str(datetime.now()))
    quick_sort(lst)
    print('default sort: ' + str(datetime.now()))
    lst.sort()
    print('after sort: ' + str(datetime.now()))


if __name__ == '__main__':
    main()
--
Scott,
Programmer in Beijing
[If you can’t explain it to a six year old, you don’t understand it yourself. —Albert Einstein ]