Algorithm | Sort
Bubble sort
Bubble sort, sometimes incorrectly referred to as sinking sort, is a simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top of the list. Because it only uses comparisons to operate on elements, it is a comparison sort. Although the algorithm is simple, most of the other sorting algorithms are more efficient for large lists.
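The repeated passes and the "no swaps means sorted" stopping rule described above can be sketched as follows. This is a minimal illustration, not a reference implementation; the function name `bubbleSortPasses` and its pass counter are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: sorts v in ascending order and returns how many
// passes over the list were needed.
int bubbleSortPasses(std::vector<int>& v) {
    int passes = 0;
    bool swapped = true;
    std::size_t unsortedEnd = v.size();  // elements at or past this index are final
    while (swapped && unsortedEnd > 1) {
        swapped = false;
        ++passes;
        for (std::size_t j = 1; j < unsortedEnd; ++j) {
            if (v[j - 1] > v[j]) {  // adjacent pair in the wrong order
                std::swap(v[j - 1], v[j]);
                swapped = true;
            }
        }
        --unsortedEnd;  // the largest remaining element has reached its final slot
    }
    assert(std::is_sorted(v.begin(), v.end()));  // postcondition
    return passes;
}
```

An already sorted list needs a single pass, since that pass makes no swaps; a reversed list of four elements needs three.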
Selection sort
Selection sort is a sorting algorithm, specifically an in-place comparison sort. It has O(n^2) time complexity, making it inefficient on large lists, and generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity, and it has performance advantages over more complicated algorithms in certain situations, particularly where auxiliary memory is limited.
The algorithm divides the input list into two parts: the sublist of items already sorted, which is built up from left to right at the front (left) of the list, and the sublist of items remaining to be sorted that occupy the rest of the list. Initially, the sorted sublist is empty and the unsorted sublist is the entire input list. The algorithm proceeds by finding the smallest (or largest, depending on sorting order) element in the unsorted sublist, exchanging it with the leftmost unsorted element (putting it in sorted order), and moving the sublist boundaries one element to the right.
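The moving boundary between the sorted and unsorted sublists can be sketched like this. The name `selectionSortSwaps` and the swap counter are assumptions added for illustration; the counter shows that at most n - 1 exchanges are ever made.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: sorts v in ascending order and returns the number
// of exchanges performed.
int selectionSortSwaps(std::vector<int>& v) {
    int swaps = 0;
    // v[0 .. boundary-1] is the sorted sublist; v[boundary ..] is unsorted.
    for (std::size_t boundary = 0; boundary + 1 < v.size(); ++boundary) {
        std::size_t smallest = boundary;
        for (std::size_t j = boundary + 1; j < v.size(); ++j) {
            if (v[j] < v[smallest]) smallest = j;
        }
        if (smallest != boundary) {
            // Exchange with the leftmost unsorted element.
            std::swap(v[boundary], v[smallest]);
            ++swaps;
        }
    }
    assert(std::is_sorted(v.begin(), v.end()));  // postcondition
    return swaps;
}
```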
Heapsort
Heapsort is a comparison-based sorting algorithm. Heapsort is part of the selection sort family; it improves on the basic selection sort by using a logarithmic-time priority queue rather than a linear-time search. Although somewhat slower in practice on most machines than a well-implemented quicksort, it has the advantage of a more favorable worst-case O(n log n) runtime. Heapsort is an in-place algorithm, but it is not a stable sort.
The heapsort algorithm can be divided into two parts.
In the first step, a heap is built out of the data. The heap is often placed in an array with the layout of a complete binary tree. The complete binary tree maps the binary tree structure into the array indices; each array index represents a node; the index of the node's parent, left child branch, or right child branch are simple expressions. For a zero-based array, the root node is stored at index 0; if i is the index of the current node, then
iParent     = floor((i-1) / 2)
iLeftChild  = 2*i + 1
iRightChild = 2*i + 2
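As a small sketch of these index expressions, the helpers below compute the parent and child indices for a zero-based array and use them to check the max-heap property. The function names are assumptions chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index arithmetic for a heap stored in a zero-based array.
std::size_t parentIndex(std::size_t i) {
    assert(i > 0);  // the root (index 0) has no parent
    return (i - 1) / 2;  // integer division acts as floor here
}
std::size_t leftChildIndex(std::size_t i) { return 2 * i + 1; }
std::size_t rightChildIndex(std::size_t i) { return 2 * i + 2; }

// Returns true if every node is greater than or equal to its children.
bool isMaxHeap(const std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        if (a[parentIndex(i)] < a[i]) return false;
    }
    return true;
}
```

For example, the node at index 2 has children at indices 5 and 6, and both of those map back to parent index 2.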
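Building the heap costs O(n), which can be proved as follows.
Build the heap from the second-to-last level upward. Suppose the heap has height h. Level h-1 then has \(2^{h-2}\) nodes, each of which needs at most 1 adjustment. In general, level h-i has \(2^{h - i - 1}\) nodes, each of which needs at most i adjustments. The total cost is therefore:
\(1 \times 2^{h-2} + 2 \times 2^{h-3} + \cdots + (h - 1) \times 2^0\). Splitting this sum into several geometric series and applying the geometric series formula to each gives
\(2^h - h - 1\). Because h = O(lg n) (more precisely, \(2^{h-1} \le n\), so \(2^h \le 2n\)), the total cost is O(n).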
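The closed form can be checked numerically. The sketch below evaluates the sum term by term and compares it with \(2^h - h - 1\); both function names are assumptions made for this example, and the height is restricted so the values fit in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Evaluates 1*2^(h-2) + 2*2^(h-3) + ... + (h-1)*2^0 directly.
std::uint64_t buildHeapCostSum(unsigned h) {
    assert(h >= 1 && h < 63);
    std::uint64_t total = 0;
    for (unsigned i = 1; i < h; ++i) {
        // Level h-i has 2^(h-i-1) nodes, each adjusted at most i times.
        total += static_cast<std::uint64_t>(i) << (h - i - 1);
    }
    return total;
}

// The closed form 2^h - h - 1 from the derivation.
std::uint64_t buildHeapCostClosedForm(unsigned h) {
    assert(h >= 1 && h < 63);
    return (std::uint64_t{1} << h) - h - 1;
}
```

For h = 4 both functions return 11 (4 + 4 + 3). A heap with h full levels has n = 2^h - 1 nodes, so the bound equals n - h, which is less than n.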
In the second step, a sorted array is created by repeatedly removing the largest element from the heap (the root of the heap) and inserting it into the array. The heap is updated after each removal to restore the heap property. Once all objects have been removed from the heap, the result is a sorted array. Heapsort can be performed in place: the array is split into two parts, the heap at the front and the sorted array at the back.
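The two steps can be sketched with the standard library's heap functions, which operate in place on a range. This is only an illustration of the idea using `std::make_heap` and `std::pop_heap`; the wrapper name `heapSortWithStdHeap` is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical wrapper: in-place heapsort built from standard heap operations.
void heapSortWithStdHeap(std::vector<int>& v) {
    // Step 1: arrange the whole array as a max-heap.
    std::make_heap(v.begin(), v.end());
    // Step 2: [begin, heapEnd) is the heap, [heapEnd, end) is the sorted part.
    for (auto heapEnd = v.end(); heapEnd != v.begin(); --heapEnd) {
        // Moves the largest heap element to position heapEnd - 1 and
        // restores the heap property on [begin, heapEnd - 1).
        std::pop_heap(v.begin(), heapEnd);
    }
    assert(std::is_sorted(v.begin(), v.end()));  // postcondition
}
```

Each iteration shrinks the heap by one element and grows the sorted part by one, so no second array is needed.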
Insertion sort
Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. However, insertion sort provides several advantages:
- Simple implementation
- Efficient for (quite) small data sets
- Adaptive (i.e., efficient) for data sets that are already substantially sorted: the time complexity is O(n + d), where d is the number of inversions
- More efficient in practice than most other simple quadratic (i.e., O(n^2)) algorithms such as selection sort or bubble sort; the best case (nearly sorted input) is O(n)
- Stable; i.e., does not change the relative order of elements with equal keys
- In-place; i.e., only requires a constant amount O(1) of additional memory space
- Online; i.e., can sort a list as it receives it
Insertion sort iterates, consuming one input element each repetition, and growing a sorted output list. Each iteration, insertion sort removes one element from the input data, finds the location it belongs within the sorted list, and inserts it there. It repeats until no input elements remain.
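The loop described above, together with the O(n + d) claim, can be sketched by counting element shifts: every shift removes exactly one inversion, so the count equals d. The function name `insertionSortShifts` and the counter are assumptions added for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sorts v in ascending order and returns the number
// of element shifts, which equals the number of inversions in the input.
long long insertionSortShifts(std::vector<int>& v) {
    long long shifts = 0;
    for (std::size_t i = 1; i < v.size(); ++i) {
        int current = v[i];  // next input element to place
        std::size_t j = i;
        // Strict > keeps equal keys in their original order (stable).
        while (j > 0 && v[j - 1] > current) {
            v[j] = v[j - 1];  // shift a larger element one slot right
            --j;
            ++shifts;
        }
        v[j] = current;
    }
    assert(std::is_sorted(v.begin(), v.end()));  // postcondition
    return shifts;
}
```

A sorted input produces zero shifts, while a reversed input of n elements produces n(n-1)/2.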
Quicksort
Quicksort, or partition-exchange sort, is a sorting algorithm developed by Tony Hoare that, on average, makes O(n log n) comparisons to sort n items. In the worst case, it makes O(n^2) comparisons, though this behavior is rare. Quicksort is often faster in practice than other O(n log n) algorithms. Additionally, quicksort's sequential and localized memory references work well with a cache. Quicksort is a comparison sort and, in efficient implementations, is not a stable sort. Quicksort can be implemented with an in-place partitioning algorithm, so the entire sort can be done with only O(log n) additional space used by the stack during the recursion.
Quicksort is a divide and conquer algorithm. Quicksort first divides a large array into two smaller sub-arrays: the low elements and the high elements. Quicksort can then recursively sort the sub-arrays.
The steps are:
- Pick an element, called a pivot, from the array.
- Reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
- Recursively apply the above steps to the sub-array of elements with smaller values and separately to the sub-array of elements with greater values.
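The three steps above can be sketched as follows. This example uses the Lomuto partition scheme with the last element as the pivot, which is one common choice and not necessarily the scheme used elsewhere in this post; the names `partitionLomuto` and `quickSortSketch` are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Partitions v[lo..hi] around the pivot v[hi] (Lomuto scheme) and returns
// the pivot's final index.
int partitionLomuto(std::vector<int>& v, int lo, int hi) {
    assert(0 <= lo && lo <= hi && hi < static_cast<int>(v.size()));
    int pivot = v[hi];
    int store = lo;  // v[lo .. store-1] holds the elements smaller than the pivot
    for (int k = lo; k < hi; ++k) {
        if (v[k] < pivot) {
            std::swap(v[k], v[store]);
            ++store;
        }
    }
    std::swap(v[store], v[hi]);  // the pivot lands in its final position
    return store;
}

// Recursively sorts v[lo..hi] (inclusive bounds).
void quickSortSketch(std::vector<int>& v, int lo, int hi) {
    if (lo >= hi) return;  // zero or one element: already sorted
    int p = partitionLomuto(v, lo, hi);
    quickSortSketch(v, lo, p - 1);  // elements smaller than the pivot
    quickSortSketch(v, p + 1, hi);  // elements greater than or equal to it
}
```

To sort a whole non-empty vector `v`, call `quickSortSketch(v, 0, static_cast<int>(v.size()) - 1)`.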
Merge sort
In computer science, merge sort (also commonly spelled mergesort) is an O(n log n) comparison-based sorting algorithm. Most implementations produce a stable sort, which means that the implementation preserves the input order of equal elements in the sorted output. Mergesort is a divide and conquer algorithm that was invented by John von Neumann in 1945. A detailed description and analysis of bottom-up mergesort appeared in a report by Goldstine and von Neumann as early as 1948.
Conceptually, a merge sort works as follows:
- Divide the unsorted list into n sublists, each containing 1 element (a list of 1 element is considered sorted).
- Repeatedly merge sublists to produce new sorted sublists until there is only 1 sublist remaining. This will be the sorted list.
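The two conceptual steps map directly onto a bottom-up merge sort: runs of width 1 are merged into runs of width 2, then 4, and so on. The sketch below delegates the merging to the standard `std::inplace_merge`; the function name `bottomUpMergeSort` is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: bottom-up merge sort on top of std::inplace_merge.
void bottomUpMergeSort(std::vector<int>& v) {
    const std::size_t n = v.size();
    // Runs of length `width` are already sorted; width 1 is trivially sorted.
    for (std::size_t width = 1; width < n; width *= 2) {
        // Merge neighbouring runs [lo, mid) and [mid, hi). A final run with
        // no right-hand neighbour is left as it is.
        for (std::size_t lo = 0; lo + width < n; lo += 2 * width) {
            std::size_t mid = lo + width;
            std::size_t hi = std::min(lo + 2 * width, n);
            std::inplace_merge(v.begin() + lo, v.begin() + mid, v.begin() + hi);
        }
    }
    assert(std::is_sorted(v.begin(), v.end()));  // postcondition
}
```

`std::inplace_merge` preserves the relative order of equal elements, so this sketch is stable.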
```cpp
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <vector>

using namespace std;

class Sort {
public:
    // Fill `data` with random values. Each value is stored four times so
    // that the input always contains duplicates.
    Sort() {
        srand(time(NULL));
        int n = rand() % 100 + 10;
        for (int i = 0; i < n; ++i) {
            int v = rand() % 1000;
            data.push_back(v);
            data.push_back(v);
            data.push_back(v);
            data.push_back(v);
        }
    }

    void bubbleSort() {
        int n = data.size();
        for (int i = 0; i < n - 1; ++i) {
            bool swapped = false;  // optimization: did this pass swap anything?
            for (int j = 1; j < n - i; ++j) {
                if (data[j] >= data[j - 1]) continue;
                swap(data[j - 1], data[j]);
                swapped = true;
            }
            if (!swapped) return;  // optimization: no swaps means sorted
        }
    }

    void insertSort() {
        int n = data.size();
        for (int i = 1; i < n; ++i) {
            int tmp = data[i];
            int j = i - 1;
            // Strict > leaves equal elements in their original order (stable).
            while (j >= 0 && data[j] > tmp) {
                data[j + 1] = data[j];
                j--;
            }
            data[j + 1] = tmp;
        }
    }

    void selectSort() {
        int n = data.size();
        for (int i = 0; i < n - 1; ++i) {
            int min = i;
            for (int j = i + 1; j < n; ++j) {
                if (data[j] < data[min]) min = j;
            }
            if (min != i) {
                swap(data[min], data[i]);
            }
        }
    }

    void heapSort() {
        int n = data.size();
        // Build a max-heap, starting from the last node that has a child.
        for (int i = n / 2 - 1; i >= 0; --i) {
            adjust(i, n - 1);
        }
        // Move the maximum to the end, then restore the heap on the rest.
        for (int i = 0; i < n - 1; ++i) {
            swap(data[0], data[n - i - 1]);
            adjust(0, n - i - 2);
        }
    }

    void quickSort() {
        qSortHelper(0, static_cast<int>(data.size()) - 1);
    }

    void mergeSort() {
        // mSortHelper(0, static_cast<int>(data.size()) - 1);  // top-down variant
        mSortHelper2();
    }

    void print() const {
        int n = data.size();
        for (int i = 0; i < n; ++i)
            cout << data[i] << " ";
        cout << endl;
    }

    bool isSorted() const {
        int n = data.size();
        for (int i = 1; i < n; ++i)
            if (data[i] < data[i - 1]) return false;
        return true;
    }

    // Returns -1 for an out-of-range index.
    int operator[](int index) {
        if (index < 0 || index >= static_cast<int>(data.size())) return -1;
        else return data[index];
    }

    int size() const { return data.size(); }

private:
    void swap(int& a, int& b) {
        int t = a;
        a = b;
        b = t;
    }

    // Sift data[start] down within the heap data[0..end].
    void adjust(int start, int end) {
        if (start >= end) return;
        int i = start;
        int tmp = data[start];
        while (i * 2 + 1 <= end) {
            int left = i * 2 + 1;
            int right = i * 2 + 2;
            int max = left;
            if (right <= end && data[max] < data[right]) max = right;
            // Compare against tmp (the original data[start]), not data[i],
            // which is overwritten as the hole moves down.
            if (tmp >= data[max]) break;
            data[i] = data[max];
            i = max;
        }
        data[i] = tmp;
    }

    // Sort data[start..end] using data[start] as the pivot.
    void qSortHelper(int start, int end) {
        if (start >= end) return;
        int i = start + 1, j = end;
        while (i < j) {
            while (j > start && data[j] >= data[start]) j--;
            while (i <= end && data[i] <= data[start]) i++;
            if (i >= j) break;
            swap(data[i], data[j]);
        }
        // The check matters for a two-element range, where the loop body
        // never runs and data[j] may be larger than the pivot.
        if (data[j] <= data[start]) swap(data[j], data[start]);
        qSortHelper(start, j - 1);
        qSortHelper(j + 1, end);
    }

    // Merge two consecutive sorted ranges: [l1, r1] and [l2, r2].
    void mergeRange(int l1, int r1, int l2, int r2) {
        vector<int> copy;
        copy.reserve(r2 - l1 + 1);
        int i = l1, j = l2;
        while (i <= r1 && j <= r2) {
            // <= takes from the left range on ties, which keeps the merge stable.
            if (data[i] <= data[j]) {
                copy.push_back(data[i]);
                i++;
            } else {
                copy.push_back(data[j]);
                j++;
            }
        }

        while (i <= r1) {
            copy.push_back(data[i]);
            i++;
        }

        while (j <= r2) {
            copy.push_back(data[j]);
            j++;
        }

        // Copy back; copy[0] corresponds to data[l1].
        for (int k = l1; k <= r2; ++k) {
            data[k] = copy[k - l1];
        }
    }

    // Top-down merge sort.
    void mSortHelper(int start, int end) {
        if (end - start < 1) return;
        if (end - start == 1 && data[start] > data[end]) {
            swap(data[start], data[end]);
            return;
        }
        int mid = (end + start) / 2;
        mSortHelper(start, mid);
        mSortHelper(mid + 1, end);
        mergeRange(start, mid, mid + 1, end);  // combine the sorted halves
    }

    // Bottom-up merge sort.
    void mSortHelper2() {
        int n = data.size();
        for (int width = 1; width < n; width <<= 1) {
            // Stop when no right-hand run is left to merge with.
            for (int i = 0; i + width < n; i += (width << 1)) {
                int end = i + (width << 1) - 1;
                if (end >= n) end = n - 1;  // clamp to the last valid index
                mergeRange(i, i + width - 1, i + width, end);
            }
        }
    }

    vector<int> data;
};
```