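Timsort is a hybrid sorting algorithm derived from merge sort and insertion sort. It looks for subsequences of the data that are already ordered, called runs, and uses them to sort the remaining data more efficiently. It does this by merging each newly identified run with the runs found so far, until certain criteria are met.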

Worst-case time: O(n log n)

Best-case time: O(n)

Average time: O(n log n)

Worst-case space: O(n)

Operation

--------------------------------------------------------------------------------------------------------------------

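The main idea of Timsort is to exploit the fact that much real-world data is already partially ordered. Timsort works by finding runs, which are subsequences of at least two elements. A run is either non-descending or strictly descending. A descending run must be strictly descending because it is later reversed in place by swapping elements from both ends toward the middle; with a strictly descending run no two equal elements change their relative order, so the method stays stable. The main flow is:

A special algorithm splits the input array into sub-arrays.
Each sub-array is sorted with a simple insertion sort.
The sorted sub-arrays are combined into one array with merge sort.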

Algorithm Description

===================================================

Definitions

-----------------------------------------------------------------------------------------

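N: the size of the input array.
run: an ordered sub-array of the input, either non-descending or strictly descending, i.e. "a0 <= a1 <= a2 <= ..." or "a0 > a1 > a2 > ...".
minrun: as described above, the first step of the algorithm splits the input array into runs; minrun is the minimum length of such a run. It is computed from N.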
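
To make the definition of a run concrete, here is a minimal sketch in C. The helper name run_kind is hypothetical and does not come from the article; it only classifies a slice as non-descending, strictly descending, or neither. Note how a slice such as {3, 3, 1} is not a run under this definition, because the tie rules out "strictly descending".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the article: classifies a[0..n-1].
 * Returns  1 if the slice is non-descending (a0 <= a1 <= ...),
 *         -1 if it is strictly descending  (a0 >  a1 >  ...),
 *          0 if it is neither, i.e. not a run. */
int run_kind(const int *a, size_t n) {
    assert(n >= 2);                 /* a run has at least two elements */
    int descending = a[1] < a[0];   /* the first pair fixes the direction */
    for (size_t i = 2; i < n; i++) {
        int drops = a[i] < a[i - 1];
        if (descending ? !drops : drops)
            return 0;               /* direction broken: not a run */
    }
    return descending ? -1 : 1;
}
```

For example, run_kind on {1, 2, 2, 5} gives 1, and on {9, 7, 4} it gives -1.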

Step 0 Computation of Minrun

-------------------------------------------------------------------------------------------------------------------------------------

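minrun is determined from N according to the following rules:

It should not be too long, because a run of minrun elements is then processed with insertion sort, which is only efficient on short runs.
It should not be too short, because the shorter the runs are, the more runs have to be merged in the next step.
It is best if N/minrun is a power of 2 (or close to one), because merging is most efficient when the runs have about the same length.

Here the author refers to his own experiments: with minrun > 256 rule 1 is violated, with minrun < 8 rule 2 is violated, and the best values range from 32 to 64. Exception: if N < 64, then minrun = N and Timsort reduces to a plain insertion sort over the whole array. Computing minrun is simple: take the six most significant bits of N and add 1 if any of the remaining bits is set. The code looks like this: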

int GetMinrun(int n) {
    int r = 0; /* becomes 1 if any of the bits shifted out is set */
    while (n >= 64) {
        r |= n & 1;
        n >>= 1;
    }
    return n + r;
}
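
As a quick check of rule 3, the sketch below repeats the GetMinrun logic (so that the block compiles on its own) and adds a hypothetical helper, RunCountAtMinrun, which is not part of the article. It computes how many runs there would be if every run had exactly minrun elements, which is the worst case for random data.

```c
#include <assert.h>

/* Same logic as the article's GetMinrun, repeated so this block compiles alone. */
int GetMinrun(int n) {
    int r = 0;
    while (n >= 64) {
        r |= n & 1;
        n >>= 1;
    }
    return n + r;
}

/* Hypothetical helper: number of runs if every run had exactly minrun
 * elements (the last one may be shorter), i.e. ceil(n / minrun). */
int RunCountAtMinrun(int n) {
    assert(n > 0);
    int minrun = GetMinrun(n);
    return (n + minrun - 1) / minrun;
}
```

For N = 1000 this gives minrun = 63 and 16 runs; for N = 2048 it gives minrun = 32 and 64 runs; for N = 1025 it gives minrun = 33 and 32 runs. In each case the run count is a power of 2.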

Step 1 Splitting into Runs and Their Sorting

----------------------------------------------------------------------------
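At this point there is an input array of size N and a computed minrun. The algorithm proceeds as follows:

Set the base address of the current element to the start of the input array.
Starting from the current element, look for a run (an ordered sub-array) in the input array. By definition the run contains at least the current element and the one after it; how many more it contains is a matter of luck. If the run found is descending, reverse it so that its elements are in non-descending order (swap elements pairwise, moving from both ends toward the middle).
If the current run is shorter than minrun, take the next minrun - size(run) elements after it. The resulting run then has minrun elements or more, part of which (ideally all) is already ordered.
Apply insertion sort to the run. Because the run is small and partly ordered, the insertion sort is fast and efficient.
Move the base address to the element just after the run.
If the end of the input array has not been reached, go back to step 2.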
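
The steps above can be sketched in C roughly as follows. This is a simplified illustration on an int array, not the article's or any library's implementation; the names count_run, insertion_sort_from and split_into_runs are made up for this example. One assumption worth noting: the insertion sort starts at the end of the natural run, since that prefix is already ordered.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the run starting at a[0]; a strictly descending run is
 * reversed in place so the result is always non-descending. */
static size_t count_run(int *a, size_t n) {
    if (n < 2) return n;
    size_t len = 2;
    if (a[1] < a[0]) {                       /* strictly descending */
        while (len < n && a[len] < a[len - 1]) len++;
        for (size_t i = 0, j = len - 1; i < j; i++, j--) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
    } else {                                 /* non-descending */
        while (len < n && !(a[len] < a[len - 1])) len++;
    }
    return len;
}

/* Insertion sort of a[0..n-1], given that a[0..sorted-1] is already sorted. */
static void insertion_sort_from(int *a, size_t n, size_t sorted) {
    for (size_t i = sorted; i < n; i++) {
        int key = a[i];
        size_t j = i;
        while (j > 0 && key < a[j - 1]) {    /* strict <, so equal keys keep their order */
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}

/* Splits a[0..n-1] into sorted runs of at least minrun elements (the last
 * run may be shorter). Writes each run length to run_len[] and returns the
 * number of runs. run_len needs room for n entries in the worst case. */
size_t split_into_runs(int *a, size_t n, size_t minrun, size_t *run_len) {
    assert(minrun >= 1);
    size_t base = 0, count = 0;
    while (base < n) {
        size_t remaining = n - base;
        size_t len = count_run(a + base, remaining);
        if (len < minrun) {                  /* extend a short run to minrun */
            size_t target = remaining < minrun ? remaining : minrun;
            insertion_sort_from(a + base, target, len);
            len = target;
        }
        run_len[count++] = len;
        base += len;                         /* move the base past this run */
    }
    return count;
}
```

With the input {5, 4, 3, 9, 1, 2, 2, 8, 7, 6} and an artificially small minrun of 4, this produces three runs of lengths 4, 4 and 2, and the array becomes {3, 4, 5, 9, 1, 2, 2, 8, 6, 7}.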

Step 2 Merge

-------------------------------------------------------------------------------------------------

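At this point the input array is split into runs. If the input is close to random, the run sizes are close to minrun; if the data contains ordered ranges, the run sizes exceed minrun. The runs now have to be combined to complete the sort, and two requirements must be met:

The runs being combined should be of about the same size, because that is more efficient.
The stability of the algorithm must be preserved, i.e. no unnecessary shifts: for example, two consecutive equal numbers must not exchange positions.

This is achieved as follows:

Create an empty stack of pairs <run base address>-<run size>, and take the first run.
Push the <base address>-<size> data of the current run onto the stack.
Check whether the current run should be merged with the preceding ones. To do this, check the following two conditions, where X, Y and Z are the three runs at the top of the stack: X > Y + Z and Y > Z.
If one of the conditions does not hold, merge Y with the smaller of X and Z. Repeat until both conditions hold or all the data is ordered.
If any run has not been considered yet, take it and go back to step 2.

This yields runs that are well suited to merging, and the merges are more balanced this way.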
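
The stack rule can be illustrated with a small sketch that tracks run lengths only. Everything here (RunStack, MAX_RUNS, merge_at, merge_collapse, push_run) is a hypothetical name chosen for the example, and the fixed capacity is an assumption. A real sort would merge the elements where this sketch merely adds the lengths, and would also merge whatever is left on the stack once the input is exhausted.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RUNS 64   /* assumed capacity for this sketch */

typedef struct {
    size_t len[MAX_RUNS];  /* run lengths; len[count-1] is the top (Z) */
    size_t count;
    size_t merges;         /* how many merges were performed */
} RunStack;

/* Merges the runs at stack positions i and i+1. Only the lengths are
 * combined here; a real sort would also merge the elements. */
static void merge_at(RunStack *s, size_t i) {
    s->len[i] += s->len[i + 1];
    for (size_t k = i + 1; k + 1 < s->count; k++)
        s->len[k] = s->len[k + 1];
    s->count--;
    s->merges++;
}

/* Restores X > Y + Z and Y > Z for the top three runs X, Y, Z. */
static void merge_collapse(RunStack *s) {
    while (s->count > 1) {
        size_t n = s->count;
        size_t z = s->len[n - 1], y = s->len[n - 2];
        if (n >= 3 && s->len[n - 3] <= y + z) {
            /* X > Y + Z fails: merge Y with the smaller of X and Z */
            if (s->len[n - 3] < z) merge_at(s, n - 3);  /* X with Y */
            else                   merge_at(s, n - 2);  /* Y with Z */
        } else if (y <= z) {
            merge_at(s, n - 2);                         /* Y > Z fails */
        } else {
            break;                                      /* both hold */
        }
    }
}

/* Pushes a new run length and re-establishes the two conditions. */
void push_run(RunStack *s, size_t run_len) {
    assert(s->count < MAX_RUNS);
    s->len[s->count++] = run_len;
    merge_collapse(s);
}
```

Pushing lengths 128, 64, 32 leaves three runs on the stack, since 128 > 64 + 32 and 64 > 32. Pushing another 32 then triggers a cascade (32+32, 64+64, 128+128) that ends with a single run of 256 after three merges of equal-sized runs.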

Runs Merging

------------------------------------------------------------------------------------------------------------------------------------------

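The merge procedure needs additional memory; the merged result ends up in the memory occupied by the two original runs.

Create a temporary run whose size is that of the smaller of the two runs being merged.
Copy the shorter run into the temporary run.
Point to the first elements of the larger run and of the temporary run.
At each step, compare the two current elements, move the smaller one into the new sorted array, and advance the element pointer of the run it came from.
Repeat step 4 until one of the runs is exhausted.
Append the remaining elements of the other run to the end of the new array.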
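
A minimal C sketch of this procedure follows. It assumes the left run is the shorter one, so only the left run is copied to temporary memory; the mirrored case (right run shorter, merging from the end) is omitted. The function name merge_left_shorter is made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Merges adjacent sorted runs a[0..na-1] and a[na..na+nb-1] in place,
 * assuming the left run is the shorter one (na <= nb), so only the left
 * run is copied to temporary memory. Returns 0 on success, -1 if the
 * allocation fails. */
int merge_left_shorter(int *a, size_t na, size_t nb) {
    assert(na <= nb);
    if (na == 0) return 0;
    int *tmp = malloc(na * sizeof *tmp);
    if (tmp == NULL) return -1;
    memcpy(tmp, a, na * sizeof *tmp);

    size_t i = 0;        /* next unread element of tmp (left run)  */
    size_t j = na;       /* next unread element of the right run   */
    size_t out = 0;      /* next slot to write in a                */
    while (i < na && j < na + nb) {
        /* Take from the right only when strictly smaller, so equal
         * elements keep their original order (stability). */
        if (a[j] < tmp[i]) a[out++] = a[j++];
        else               a[out++] = tmp[i++];
    }
    /* Leftovers of the left run go to the end; leftovers of the right
     * run are already in their final place. */
    memcpy(a + out, tmp + i, (na - i) * sizeof *tmp);
    free(tmp);
    return 0;
}
```

The write position never overtakes the read position in the right run, because out always equals i + (j - na) and i never exceeds na. That is why the result can be written straight back into the memory of the two original runs.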

Modifications to the Merging Sort

---------------------------------------------------------------------------------------------------------------------------------------------

The basic idea: if one run keeps supplying many elements in a row while the other supplies none, switch to galloping mode and move elements in bulk.

The merge described above looks fine, except for one thing. Imagine merging two arrays like these:

A = {1, 2, 3,..., 9999, 10000}

B = { 20000, 20001, ...., 29999, 30000}

The procedure above works for them too, but every pass through step 4 performs one comparison and one move, for a total of 10000 comparisons and 10000 moves. Timsort offers a modification here called galloping. It works as follows:

Start merging as described above.
Each time an element is moved from the temporary or the larger run to the final array, record which run it came from.
If a number of elements in a row (7 in this description of the algorithm) came from the same run, assume that the next element will come from that run as well. To check this, switch to galloping mode: search the run that is expected to supply the next batch of data for the position of the current element of the other run, using binary search (the run is ordered, so binary search is valid). Binary search is more efficient than a linear scan, so far fewer comparisons are needed.
Finally, once the search stops (an element that does not fit is found, or the end of the run is reached), the data skipped over can be moved in bulk, which can be more efficient than moving elements one by one.

The explanation can be a bit vague, so let’s have a look at the example: A = {1, 2, 3,..., 9999, 10000} B = { 20000, 20001, ...., 29999, 30000}

In the first seven iterations, numbers 1, 2, 3, 4, 5, 6 and 7 from run A are compared with number 20000 and, after 20000 is found to be greater, they are moved from array A to the final one.
Starting with the next iteration, the galloping mode is switched on: number 20000 is compared in sequence with numbers 8, 10, 14, 22, 38, n+2^i, ..., 10000 from run A. As you can see, the number of such comparisons will be far less than 10000.
Once the end of run A is reached, it is known that all of its remaining elements are smaller than the current element of run B (the search could also have stopped somewhere in the middle). That data from run A is moved to the final array in one block, and the procedure is repeated.
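
The search at the heart of galloping can be sketched as follows. This is a simplified illustration, not the exact probe sequence from the example above: it probes offsets 0, 1, 3, 7, 15, ... from the current position and then finishes with a binary search inside the last gap. The function name gallop_count_le and the comparisons counter are hypothetical and exist only to show how few comparisons are needed.

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many leading elements of the sorted array a[0..n-1] are
 * <= key, using an exponential probe followed by a binary search.
 * *comparisons (a counter added for illustration) is incremented once
 * per key comparison. */
size_t gallop_count_le(int key, const int *a, size_t n, size_t *comparisons) {
    size_t lo = 0;      /* a[0..lo-1] are known to be <= key */
    size_t hi = n;      /* a[hi..n-1] are known to be > key  */
    size_t step = 1;

    /* Exponential phase: probe offsets 0, 1, 3, 7, 15, ... */
    for (size_t probe = 0; probe < n; probe = step - 1) {
        (*comparisons)++;
        if (a[probe] > key) { hi = probe; break; }
        lo = probe + 1;
        step *= 2;
    }

    /* Binary phase: narrow down inside a[lo..hi-1]. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        (*comparisons)++;
        if (a[mid] <= key) lo = mid + 1;
        else               hi = mid;
    }
    return lo;
}
```

For A = {1, ..., 10000} and key 20000 (the first element of B), the function reports that all 10000 elements of A precede the key after a few dozen comparisons instead of 10000, so the whole of A can be moved in one block.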

posted @ 2014-10-12 10:41 孤独的小马哥
