LeetCode 164. Maximum Gap[翻译]

164. Maximum Gap

164. 最大间隔

Given an unsorted array, find the maximum difference between the successive elements in its sorted form.

Return 0 if the array contains less than 2 elements.

给定一个未排序数组,找出该数组在有序形式时,连续元素之间最大的间隔。

Example 1:

Input: [3,6,9,1]
Output: 3
Explanation: The sorted form of the array is [1,3,6,9], either
             (3,6) or (6,9) has the maximum difference 3.
示例 1:
输入:[3,6,9,1]
输出:3
解释:数组排序后的形式是[1,3,6,9],(3,6) 或者 (6,9)都是最大的距离,即3

Example 2:

Input: [10]
Output: 0
Explanation: The array contains less than 2 elements, therefore return 0.

示例2:
输入:[10]
输出:0
解释:数组所含元素小于2个,因此返回0

Note:

  • You may assume all elements in the array are non-negative integers and fit in the 32-bit signed integer range.
  • Try to solve it in linear time/space.

注意:

  • 你可以假定数组中所有元素都是非负整数并且适用于32位有符号的整数范围。
  • 尝试在线性时间/空间内解决问题。

Solution

方案

Approach 3: Buckets and The Pigeonhole Principle

方式3:桶和鸽巢定理

Intuition

直觉

Sorting an entire array can be costly. At worst, it requires comparing each element with every other element. What if we didn't need to compare all pairs of elements? That would be possible if we could somehow divide the elements into representative groups, or rather, buckets. Then we would only need to compare these buckets.

对整个数组排序代价大。最坏的情况下,这需要让每个元素和其他元素进行比较。假设我们不需要比较所有邻近的元素呢?这是有可能的,如果我们能通过一种方式把元素划分到不同的组,或者说是桶。然后我们只需要对桶和桶进行比较。

Digression: The Pigeonhole Principle The Pigeonhole Principle states that if n items are put into m containers, with n > m, then at least one container must contain more than one item.

题外话:鸽巢原理 鸽巢原理规定,当n个物品被放入m个容器内,并且n>m时,得到至少有一个容器必然包含一个以上的物品。

Suppose for each of the n elements in our array, there was a bucket. Then each element would occupy one bucket. Now what if we reduced, the number of buckets? Some buckets would have to accommodate more than one element.

假设,对于数组中的n个元素的每一个元素,都有一个桶。然后,每个元素占据一个桶。现在假设我们减少,桶的数量?一些桶将不得不容纳超过一个元素。

Now let's talk about the gaps between the elements. Let's take the best case, where all elements of the array are sorted and have a uniform gap between them.This means every adjacent pair of elements differ by the same constant value. So for n elements of the array, there are n-1n1 gaps, each of width, say, tt. It is trivial to deduce that t = (max - min)/(n-1)(where max and min are the minimum and maximum elements of the array). This width is the maximal width/gap between two adjacent elements in the array; precisely the quantity we are looking for!

现在让我们讨论元素间的间隔。让我们考虑最好的情况,即数组所有的元素都排好序并且它们之间有统一的间隔。这意味着每对相邻的元素相差着相同的不变值。所以对于有n个元素的数组,应当有n-1个间隔,它们的宽度,即,t,易于推导出 t = (max - min) / (n - 1) (max 和 min是数组元素的最大值和最小值(根据原文进行了纠正))。这个宽度是数组中相邻元素的最大宽度/间隔。恰恰是我们要找的数。

One can safely argue that this value of tt, is in fact, the smallest value that tt can ever accomplish of any array with the same number of elements (i.e. n) and the same range (i.e. (max - min)). To test this fact, you can start with a uniform width array (as described above) and try to reduce the gap between any two adjacent elements. If you reduce the gap between arr[i-1] and arr[i] to some value t - p, then you will notice that the gap between arr[i] and arr[i+1] would have increased to t + p. Hence the maximum attainable gap would have become t + p from t. Thus the value of the maximum gap t can only increase.

有人可以很有把握的提出这个值,t,是事实上,具有相同元素个数(即,n)和相同取值范围(即,(max - min))能做到的最小的 t。为了测试这个事实,你可以开始一个统一宽度的数组(如上所述),然后尝试减少任一两个相邻元素的间隔。如果你减少了arr[i - 1] 和 arr[i] 的间隔为 t - p, 然后你会注意到 arr[i] 和 arr[i + 1]的间隔增加到了 t + p. 因此,最大可得到的间隔从 t 变成了t + p 

Buckets!

Coming back to our problem, we have already established by application of the Pigeonhole Principle, that if we used buckets instead of individual elements as our base for comparison, the number of comparisons would reduce if we could accommodate more than one element in a single bucket. That does not immediately solve the problem though. What if we had to compare elements within a bucket? We would end up no better.

回到我们的问题,我们已经通过使用鸽巢原理建立起来,即如果我们使用桶代替单独的元素作为我们比较的基准,比较的数量将减少如果我们能使超过一个元素放入单个桶内。尽管这不会立即解决问题。如果我们不得不比较桶内元素怎么办?我们将得到不会更好的结果。

So the current motivation remains: somehow, if we only had to compare among the buckets, and not the elements within the buckets, we would be good. It would also solve our sorting problem: we would just distribute the elements to the right buckets. Since the buckets can be already ordered, and we only compare among buckets, we wouldn't have to compare all elements to sort them!

所以,目前的动机依然是:以某种方式,如果我们仅仅是不得不在桶间进行比较,并且不是比较桶内元素,我们将得到不错的结果。它也将解决我们的排序问题:我们将仅需把元素分配到正确的桶内。由于桶可以是已经有序的,并且我们只在桶间比较,所以,我们将不至于不得不比较所有元素去排序。

But if we only had buckets to compare, we would have to ensure, that the gap between the buckets itself represent the maximal gap in the input array. How do we go about doing that?

但是,如果我们仅对桶进行比较,我们将不得不保证,桶之间的间隔本身代表整个输入数组的最大间隔。我们如何做到这个呢?

We could do that just by setting the buckets to be smaller than t = (max - min)/(n-1) (as described above). Since the gaps (between elements) within the same bucket would only be t, we could deduce that the maximal gap would indeed occur only between two adjacent buckets.

我们可以实现它,通过设置桶大小小于 t = (max - min) / (n - 1) (如上所示). 由于同一桶内元素间的间隔只能 <= t, 我们可以推断出,最大的间隔将的确出现仅出现在两个相邻的桶之间

Hence by setting bucket size b to be 1 < b (maxmin)/(n1), we can ensure that at least one of the gaps between adjacent buckets would serve as the maximal gap.

因此,通过设置桶大小b为 1 < b <= (max - min) / (n - 1). 我们能确保至少有一个相邻的桶之间的间隔将是最大间隔

Clarifications

说明

A few clarifications are in order:

以下是一些说明:

  • Would the buckets be of uniform size? Yes. Each of them would be of the same size b.

  • 桶会是统一的大小吗?是的,任一一个都是同样的大小 b.
  • But, then wouldn't the gap between them be uniform/constant as well? Yes it would be. The gap between them would be 1 integer unit wide. That means a two adjacent buckets of size 3 could hold integers with values, say, 3 - 6 and 7 - 9. We avoid overlapping buckets.

  • 但是,桶之间的间隔也不会是统一/相同的大小吗?是的,它会是(统一的大小)。桶之间的间隔是1个整数单位宽度。这意味着一个两个相邻的大小为3的桶可以装入整数,比如,3-6 和 7-9. 我们避免互相覆盖的桶。
  • Then what are you talking about when you say the gap between two adjacent buckets could be the maximal gap? When we are talking about the size of a bucket, we are talking about its holding capacity. That is the range of values the bucket can represent (or hold). However the actual extent of the bucket are determined by the values of the maximum and minimum element a bucket holds. For example a bucket of size 5 could have a capacity to hold values between 6 - 10. However, if it only holds the elements 7,8 and 9, then its actual extent is only (9 - 7) + 1 = 3 which is not the same as the capacity of the bucket.

  • 然后,当你说两个相邻的桶的间隔是最大间隔的时候是在说什么呢?当我们讨论桶大小时,我们将讨论桶的容量。即桶能表示(容纳)的值的范围。然而,桶的实际长度由桶容纳的元素的最大值和最小值决定的。举例,大小为5的桶拥有的容量能容纳值6-10.然后,如果它值容纳了元素7,8和9,然后它的实际长度仅仅是(9 - 7) + 1 = 3 这和桶的实际容量不一样。
  • Then how do you compare adjacent buckets? We do that by comparing their extents. Thus we compare the minimum element of the next bucket to the maximum element of the current bucket. For example: if we have two buckets of size 5 each, holding elements [1,2,3] and [9, 10] respectively, then the gap between the buckets would essentially refer to the value 9 - 3 = 6 (which is larger than the size of either bucket).

  • 然后,你如何比较相邻的桶?我们通过比较桶的长度去实现它。因此我们比较下一个桶的最小元素和当前桶的最大元素。举例:如果我们有两个大小都是 5 的桶,分别拥有元素[1, 2, 3] 和 [9, 10],然后两个桶之间的间隔必然指向值 9 - 3 = 6 (该值大于两个桶的任一一个桶的大小)
  • But then aren't we comparing elements again?! We are, yes! But only compare about twice the elements as the number of buckets (i.e. the minimum and maximum elements of each bucket). If you followed the above, you would realize that this amount is certainly less than the actual number of elements in the array, given a suitable bucket size was chosen.

  • 但是,然后我们不再比较元素了吗?是的,我们会比较。但是,仅比较两倍于桶数量的次数(即,每个桶的最小和最大元素)。如果你按照上面的做,选一个合适大小的桶大小,你将意识到(比较的)数量当然少于实际数组元素的数量

Algorithm

算法

  • We choose a bucket size b such that 1 < b (maxmin)/(n1). Let's just choose b=(maxmin)/(n1)⌋.

  • 我们选择桶大小 b 满足 1 < b <= (max - min) / (n - 1). 让我们选择 b = ⌊(maxmin)/(n1)⌋.
  • Thus all the n elements would be divided among k=(maxmin)/b⌉ buckets.

  • 因此,所有 n 个元素将被划分到 ⌈(maxmin)/b⌉ 个桶中
  • Hence the i^{th} bucket would hold the range of values:[min+(i1)b, min+ib) (1-based indexing).

  • 因此,第 i 个桶拥有的值范围是: [min+(i1)b, min+ib) (1开始的索引).
  • It is trivial to calculate the index of the bucket to which a particular element belongs. That is given by (nummin)/b⌋ (0-based indexing) where num is the element in question.

  • 易于计算出某个元素所属于的桶。通过  (nummin)/b⌋ (0开始的索引)计算,其中 num 就是某个元素。
  • Once all n elements have been distributed, we compare k-1 adjacent bucket pairs to find the maximum gap.
  • 一旦全部 n 个元素都被分配完,我们比较 k - 1 个相邻的桶对找出最大间隔。
class Bucket {
public:
    bool used = false;
    int minval = numeric_limits<int>::max();        // same as INT_MAX
    int maxval = numeric_limits<int>::min();        // same as INT_MIN
};

int maximumGap(vector<int>& nums)
{
    if (nums.empty() || nums.size() < 2)
        return 0;

    int mini = *min_element(nums.begin(), nums.end()),
        maxi = *max_element(nums.begin(), nums.end());

    int bucketSize = max(1, (maxi - mini) / ((int)nums.size() - 1));        // bucket size or capacity
    int bucketNum = (maxi - mini) / bucketSize + 1;                         // number of buckets
    vector<Bucket> buckets(bucketNum);

    for (auto&& num : nums) {
        int bucketIdx = (num - mini) / bucketSize;                          // locating correct bucket
        buckets[bucketIdx].used = true;
        buckets[bucketIdx].minval = min(num, buckets[bucketIdx].minval);
        buckets[bucketIdx].maxval = max(num, buckets[bucketIdx].maxval);
    }

    int prevBucketMax = mini, maxGap = 0;
    for (auto&& bucket : buckets) {
        if (!bucket.used)
            continue;

        maxGap = max(maxGap, bucket.minval - prevBucketMax);
        prevBucketMax = bucket.maxval;
    }

    return maxGap;
}
//Java version, @author: Daneil Tan
class Bucket {
    public int max = Integer.MIN_VALUE;
    public int min = Integer.MAX_VALUE;
}
class Solution {
        public int maximumGap(int[] nums) {
            if (nums.length < 2) return 0;
            int max = Arrays.stream(nums).max().getAsInt();
            int min = Arrays.stream(nums).min().getAsInt();
            //Bucket size or capacity
            int bucketSize = Math.max(1, (max - min) / (nums.length - 1));
            //Number of buckets 
            int bucketNum = (max - min) / bucketSize + 1;

            Bucket[] buckets = new Bucket[bucketNum];

            for (int num : nums) {
                int bucketIdx = (num - min) / bucketSize;
                if (buckets[bucketIdx] == null) buckets[bucketIdx] = new Bucket();
                buckets[bucketIdx].max = Math.max(buckets[bucketIdx].max, num);
                buckets[bucketIdx].min = Math.min(buckets[bucketIdx].min, num);
            }

            int maxGap = 0, prevBucketMax = min;
            for (int i = 0; i < buckets.length; i++) {
                if (buckets[i] == null) continue;
                if (buckets[i].min - prevBucketMax > maxGap) 
                    maxGap = buckets[i].min - prevBucketMax;
                prevBucketMax = buckets[i].max;
            }
            return maxGap;
        }
}

  

Complexity Analysis

复杂度分析

  • Time complexity: O(n+b)O(n).

  • 时间复杂度:O(n+b)O(n).

  Distributing the elements of the array takes one linear pass (i.e. O(n) complexity). Finding the maximum gap among the buckets takes a linear pass over the bucket storage (i.e. O(b) complexity). Hence overall process takes linear runtime.

  分配数组的元素花费一次线性遍历(即,O(n)复杂度). 找到桶间的最大间隔在桶大小上花费了一次线性遍历(即,O(b)复杂度)。因此,整个过程花费了线性运行时间。

  • Space complexity: O(2b)O(b) extra space.

  • 空间复杂度:O(2b)O(b) 额外空间.

  Each bucket stores a maximum and a minimum element. Hence extra space linear to the number of buckets is required.

  每个桶存储了最大元素和最小元素。因此,和桶大小一样的额外空间是必要的。

 

英文原文链接:https://leetcode.com/problems/maximum-gap/solution/

posted @ 2020-08-02 22:20  tanliwei  阅读(238)  评论(0编辑  收藏  举报