164. Maximum Gap
题目:
Given an unsorted array, find the maximum difference between the successive elements in its sorted form.
Try to solve it in linear time/space.
Return 0 if the array contains less than 2 elements.
You may assume all elements in the array are non-negative integers and fit in the 32-bit signed integer range.
链接: http://leetcode.com/problems/maximum-gap/
题解:
给定未排序数组,求两个相邻元素间的差的最大值。想了一会不确定,于是去看了一眼Discuss的标题,果然是用Radix Sort。其实一开始看到题目里面所有数字都可以被表示为32-bit signed integer时,就应该要想到才对。 估计做法就是用Radix Sort,排序完毕后再扫描一边数组。正好加深一下对Radix Sort的理解和书写。要好好研究一下Key-indexed Counting,LSD,MSD和Bucket Sort。这道题用Bucket Sort的话就是把每个元素放入一个bucket中,然后计算非空bucket间的最大距离,原理都差不多。一刷用了LSD,代码大都参考Sedgewick大神,二刷的话要试一试MSD以及bucket sort。注意的一点是LSD处理最高8位bit时, 0 ~ 127为正, 128 ~ 255为负,所以要特殊处理一下accumulated count数组。做完这题以后就发现上过算法2真好...
LSD Radix Sort, Time Complexity - O(n), Space Complexity - O(n)
public class Solution { public int maximumGap(int[] nums) { if(nums == null || nums.length < 2) return 0; lsdSort(nums); int maxDiff = Integer.MIN_VALUE; for(int i = 1; i < nums.length; i++) maxDiff = Math.max(nums[i] - nums[i - 1], maxDiff); //may overflow return maxDiff; } private void lsdSort(int[] nums) { int len = nums.length; int[] aux = new int[nums.length]; int totalBits = 32; int bitsPerByte = 8; int window = totalBits / bitsPerByte; int R = 1 << bitsPerByte; // R is radix int mask = R - 1; // 11111111 for(int d = 0; d < window; d++) { // for each window compared int[] count = new int[R + 1]; for(int i = 0; i < len; i++) { // count least significant 8 bits of each num in nums int c = (nums[i] >> (d * bitsPerByte)) & mask; // use mask to get least significant 8 bits count[c + 1]++; } for(int i = 0; i < R; i++) // count accumulated position for each nums count[i + 1] += count[i]; if(d == window - 1) { // for most significant 8 bits, 0 ~ 127 is pos, 128 ~ 255 is neg, so 128 ~ 255 goes first int shift1 = count[R] - count[R / 2]; int shift2 = count[R / 2]; for(int i = 0; i < R / 2; i++) count[i] += shift1; for(int i = R / 2; i < R; i++) count[i] -= shift2; } for(int i = 0; i < len; i++) { // move data int c = (nums[i] >> (d * bitsPerByte)) & mask; aux[count[c]++] = nums[i]; } for(int i = 0; i < len; i++) // copy back nums[i] = aux[i]; } } }
二刷:
依然使用了LSD radix sort,代码大都来自塞神的index counting。
- 我们把32位的数字分为4个batch,每个batch有8个digit,使用LSD的话我们是从最低一级的batch开始计算。这里我们要使用一个mask = 255,也就是二进制的11111111来找到每个batch。也要做一个auxiliary array - aux[] 用来保存临时结果
- 对于每个batch,我们都要做一次index counting sort,一共四次,从最低8位开始,到最高8位
- 每次确定了batch之后我们就可以创建buckets数组。这里buckets数组 count = new int[R + 1] , 这里的 + 1很重要
- 我们第一次遍历数组,根据最低8位,统计所有数字出现的频率,然后放入相应的bucket里,这里有个小trick很关键,我们虽然将低8位映射到从[0 - 255]这样的256个bucket里,但写入count数组的时候,我们有意把count[0]留下,把数字的count写到 [1 - 256]的bucket中。这样做的好处,要在后面才能看到,就是从经过aggregation后的count数组,将排序后的数字填入到aux[]数组中时。
- 然后,我们对统计好的数字和频率进行aggregation,利用count数组求一个accumulated count。经历过这一步以后,我们的count[i]代表的意思就是,有多少个数字应该被放在 i 之前的buckets里。
- 接下来,我们根据aggregation的结果,我们把这些数字填到aux[]里
- 最后把aux[]中的数字拷贝回nums中,继续计算下一个batch。 因为LSD的操作是stable的,所以我们进行完全部4个batch以后就会得到最终结果
- 这里要注意的一点是,32位数字最高8位digit的这个batch和其他三个不同。在这8位里0 ~ 127为正数,而128 ~ 255为负数。所以在这个batch里,计算完aggregation数组后,我们要做一个很巧妙的shift操作
- 先求出shift1 = count[R] - count[R / 2], 也就是在输入的nums[]中有多少个数字为负
- 再求出shift2 = count[R / 2], 输入的nums[]中有多少个正数
- 对于[0 - 127]的这些数字,他们都应该排在负数前面,所以我们对每一个count[i] 增加 shift1
- 对于[128 - 255]的这些数字,他们都应该排在正数后面,所以我们对每一个count[i] 减少 shift2
- 之后再根据新的count[]数组,把数字放入aux[]数组中,再拷贝回原数组nums[]里。
Java:
Time Complexity - O(n), Space Complexity - O(n)
public class Solution { public int maximumGap(int[] nums) { // LSD LSDradix(nums); int maxGap = 0; for (int i = 1; i < nums.length; i++) maxGap = Math.max(maxGap, nums[i] - nums[i - 1]); return maxGap; } private void LSDradix(int[] nums) { if (nums == null || nums.length == 0) return; int batchSize = 8; int batchNum = 32 / batchSize; int R = 1 << batchSize; // radix, each bucket we consider only [0 - 255] , 256 numbers int mask = R - 1; // 11111111, each batch 8 digits int len = nums.length; int[] aux = new int[len]; for (int d = 0; d < batchNum; d++) { int[] count = new int[R + 1]; for (int num : nums) { int idx = (num >> (d * batchSize)) & mask; count[idx + 1]++; // index counting } for (int k = 1; k < count.length; k++) count[k] += count[k - 1]; // aggregation, find out the buckets boundaries if (d == batchNum - 1) { // for first 8 digits 01111111 is max, 11111111 is min, we shift nums int shift1 = count[R] - count[R / 2]; int shift2 = count[R / 2]; for (int k = 0; k < R / 2; k++) count[k] += shift1; for (int k = R / 2; k < R; k++) count[k] -= shift2; } for (int num : nums) { int idx = (num >> (d * batchSize)) & mask; aux[count[idx]++] = num; // we place each num in each buckets, from the start slot of the bucket } for (int k = 0; k < len; k++) nums[k] = aux[k]; } } }
Reference:
https://www.cs.princeton.edu/~rs/AlgsDS07/18RadixSort.pdf
https://leetcode.com/discuss/18499/bucket-sort-java-solution-with-explanation-o-time-and-space
https://leetcode.com/discuss/18487/i-solved-it-using-radix-sort
https://leetcode.com/discuss/19951/use-average-gap-to-achieve-o-n-run-time-java-solution
https://leetcode.com/discuss/34289/pigeon-hole-principle
https://leetcode.com/discuss/53636/radix-sort-solution-in-java-with-explanation