源码：Arrays.sort()

Arrays.sort() 调用方法 DualPivotQuicksort.sort()，

下文重点分析该方法。

public static void sort(int[] a) {
    DualPivotQuicksort.sort(a, 0, a.length - 1, null, 0, 0);
}

结论：数组长度 n

[1, 47)：插入
[47, 286)：快排
[286, +∞)：
1. 无结构：快排
2. 有结构：归并

1、sort()

判断数组元素个数 < 快排阈值（QUICKSORT_THRESHOLD）

private static final int QUICKSORT_THRESHOLD = 286;

static void sort(int[] a, int left, int right,
                int[] work, int workBase, int workLen) {
   if (right - left < QUICKSORT_THRESHOLD) {
       ...
   }
   ...
}

是：小数组。调用重载 sort()，排序后返回。
```
sort(a, left, right, true);
return;
```
否：大数组，查看后续代码。

2、小数组：重载 sort()

判断数组元素个数 < 插入排序阈值（INSERTION_SORT_THRESHOLD）

private static final int INSERTION_SORT_THRESHOLD = 47;

private static void sort(int[] a, int left, int right, boolean leftmost) {
    int length = right - left + 1;

    if (length < INSERTION_SORT_THRESHOLD) {
        ...
    }
    ...
}

是：插入排序
否：快排

2.1、插入

if (leftmost) {
    /*
     * Traditional (without sentinel) insertion sort,
     * optimized for server VM, is used in case of
     * the leftmost part.
     */
    for (int i = left, j = i; i < right; j = ++i) {
        int ai = a[i + 1];
        while (ai < a[j]) {
            a[j + 1] = a[j];
            if (j-- == left) {
                break;
            }
        }
        a[j + 1] = ai;
    }
} else {
    /*
     * Skip the longest ascending sequence.
     */
    do {
        if (left >= right) {
            return;
        }
    } while (a[++left] >= a[left - 1]);

    /*
     * Every element from adjoining part plays the role
     * of sentinel, therefore this allows us to avoid the
     * left range check on each iteration. Moreover, we use
     * the more optimized algorithm, so called pair insertion
     * sort, which is faster (in the context of Quicksort)
     * than traditional implementation of insertion sort.
     */
    for (int k = left; ++left <= right; k = ++left) {
        int a1 = a[k], a2 = a[left];

        if (a1 < a2) {
            a2 = a1; a1 = a[left];
        }
        while (a1 < a[--k]) {
            a[k + 2] = a[k];
        }
        a[++k + 1] = a1;

        while (a2 < a[--k]) {
            a[k + 1] = a[k];
        }
        a[k + 1] = a2;
    }
    int last = a[right];

    while (last < a[--right]) {
        a[right + 1] = a[right];
    }
    a[right + 1] = last;
}
return;

2.2、快排

2.2.1、备选 pivot

选择中心元素周围的 5 个等距元素，升序排序得到备选 pivot。

int seventh = (length >> 3) + (length >> 6) + 1;

// 选择：将数组长度无符号右移，得到中心元素索引
int e3 = (left + right) >>> 1; // The midpoint
int e2 = e3 - seventh;
int e1 = e2 - seventh;
int e4 = e3 + seventh;
int e5 = e4 + seventh;

// 排序：比较并交换
if (a[e2] < a[e1]) { 
    int t = a[e2]; a[e2] = a[e1]; a[e1] = t; 
}

if (a[e3] < a[e2]) { 
    int t = a[e3]; a[e3] = a[e2]; a[e2] = t;
    if (t < a[e1]) {
        a[e2] = a[e1]; a[e1] = t; 
    }

}
if (a[e4] < a[e3]) {
    int t = a[e4]; a[e4] = a[e3]; a[e3] = t;
    if (t < a[e2]) { 
        a[e3] = a[e2]; a[e2] = t;
        if (t < a[e1]) {
            a[e2] = a[e1]; a[e1] = t; 
        }
    }
}
if (a[e5] < a[e4]) { 
    int t = a[e5]; a[e5] = a[e4]; a[e4] = t;

    if (t < a[e3]) {
        a[e4] = a[e3]; a[e3] = t;
        if (t < a[e2]) { 
            a[e3] = a[e2]; a[e2] = t;
            if (t < a[e1]) { 
                a[e2] = a[e1]; a[e1] = t; 
            }
        }
    }
}

2.2.2、分区

① 指针

初始化指针 less 和 great，分别指向数组头尾。
选择 e2 和 e4 的值作为 pivot。
将数组首尾元素值分别临时赋值 pivot（e2 和 e4），分区排序完成时会将 pivot 值恢复。

移动指针 less 和 great，确保

less 左边的元素 < pivot1

great 右边的元素 > pivot2

int less  = left;
int great = right;

if (a[e1] != a[e2] && a[e2] != a[e3] && a[e3] != a[e4] && a[e4] != a[e5]) {

    int pivot1 = a[e2];
    int pivot2 = a[e4];

    a[e2] = a[left];
    a[e4] = a[right];

    while (a[++less] < pivot1);
    while (a[--great] > pivot2);
}

② 分区（❗）

指针 less 和 great 将数组分为三部分。

分区：
- left：全小于 pivot1
- center：有大有小
- right：全大于 pivot2

指针：

less：指向 center 的首个元素，即 left 的后一个元素。
great：指向 center 的最后元素，即 great 的前一个元素。

k：在 center 中游走的指针。

  left part           center part                   right part
+--------------------------------------------------------------+
|  < pivot1  |  pivot1 <= && <= pivot2  |    ?    |  > pivot2  |
+--------------------------------------------------------------+
              ^                          ^       ^
              |                          |       |
             less                        k     great

③ 排序

在 center 中使用指针 k 循环，

目的：使 center 中的元素值大小为 [pivot1, pivot2]

outer:
for (int k = less - 1; ++k <= great; ) {
    int ak = a[k];
    ...
}

< pivot1：将 a[k] 移到 left（交换 a[k] 与 a[less]，且 less++）。
```
if (ak < pivot1) {
    a[k] = a[less];

    a[less] = ak;
    ++less;
}
```
> pivot2：将 a[k] 直接或间接移到 right（根据 a[great] 大小决定操作）。
```
else if (ak > pivot2) {
    ...
}
```
1. 缩小 great：
  1. 确保 great 右边的元素 > pivot2，循环结束后 a[great] <= pivot2（需要进一步确认与 pivot1 的大小关系）
  2. 若过程中遇到 k，则退出 outer for 循环。
```
while (a[great] > pivot2) {
    if (great-- == k) {
        break outer;
    }
}
```
2. a[great] < pivot1（间接）：
  1. 将 less 移动到 center，a[great] 作为新的 less，且 less++。
  2. 将 a[k] 赋值到 a[great]，且 great--。
```
if (a[great] < pivot1) {
    a[k] = a[less];
    a[less] = a[great];
    ++less;
}
// 在if-else之后
a[great] = ak;
--great;
```
3. pivot1 <= a[great] <= pivot2（直接）：交换 a[k] 与 a[great]，且 great--。
```
else {
    a[k] = a[great];
}

// 在if-else之后
a[great] = ak;
--great;
```

④ 恢复

通过交换，将 pivot 值恢复到最终位置。

以 pivot1 为例：（pivot2 同理）

分析：
1. 分区操作前，pivot1（即 a[e2]）被临时赋值为 a[left]。
2. 数组中存在两个 a[left] 值，一个在 left，一个随着分区操作被排序。
3. pivot1 最终位置应是 [less - 1]，但不能直接覆盖此值。
操作步骤：
1. 将 a[less - 1] 移动到 a[left]，减少一个重复的 a[left] 值。
2. 将 pivot1 移动到 [less - 1]，pivot1 位置正确。
```
// pivot1
a[left]  = a[less - 1];
a[less - 1] = pivot1;

// pivot2
a[right] = a[great + 1];
a[great + 1] = pivot2;
```

2.3、递归

递归调用重载 sort()

left/right：对 left part 和 right part 进行排序，pivot 除外（位于 less - 1 和 great + 1）
```
sort(a, left, less - 2, leftmost);
sort(a, great + 2, right, false);
```
center：对 center part 进行排序。
```
sort(a, less, great, false);
```

3、大数组

3.1、检查数组

检查数组结构，是否接近排序状态。

本质：分组排序

遍历数组，寻找降序组。
若遇到降序组，对其升序排序，并 count++。
遍历结束，判断 count > MAX_RUN_COUNT（67）
1. 是：认为数组不具备结构，调用重载 sort()。
2. 否：归并排序。

TODO...

Hint：以上是 Arrays.sort() 的主要流程。

具体细节看源码。

posted @ 2022-03-07 01:19 Jaywee 阅读(59) 评论(0) 编辑收藏举报

刷新页面返回顶部

Jaywee