398. Random Pick Index
Given an array of integers with possible duplicates, randomly output the index of a given target number. You can assume that the given target number must exist in the array.
Note:
The array size can be very large. Solutions that use too much extra space will not pass the judge.
Example:
int[] nums = new int[] {1,2,3,3,3};
Solution solution = new Solution(nums);

// pick(3) should return either index 2, 3, or 4 randomly. Each index should have equal probability of returning.
solution.pick(3);

// pick(1) should return 0. Since in the array only nums[0] is equal to 1.
solution.pick(1);
import java.util.ArrayList;
import java.util.Random;

public class Solution {

    int[] nums;
    Random r = new Random();

    public Solution(int[] nums) {
        this.nums = nums;
    }

    public int pick(int target) {
        // Collect every index whose value equals target (extra space grows with the number of matches).
        ArrayList<Integer> idxs = new ArrayList<Integer>();
        for (int i = 0; i < nums.length; i++) {
            if (target == nums[i]) {
                idxs.add(i);
            }
        }
        // Choose one of the collected indices uniformly at random.
        return idxs.get(r.nextInt(idxs.size()));
    }
}
Simple Reservoir Sampling approach
import java.util.Random;

public class Solution {

    int[] nums;
    Random rnd;

    public Solution(int[] nums) {
        this.nums = nums;
        this.rnd = new Random();
    }

    public int pick(int target) {
        int result = -1;
        int count = 0; // number of matches seen so far
        for (int i = 0; i < nums.length; i++) {
            if (nums[i] != target)
                continue;
            // Replace the kept index with probability 1/count (count is incremented first).
            if (rnd.nextInt(++count) == 0)
                result = i;
        }

        return result;
    }
}
Simple Reservoir Sampling
Suppose we see a sequence of items, one at a time. We want to keep a single item in memory, and we want it to be selected at random from the sequence. If we know the total number of items (n), then the solution is easy: select an index i between 1 and n with equal probability, and keep the i-th element. The problem is that we do not always know n in advance. A possible solution is the following:
- Keep the first item in memory.
- When the i-th item arrives (for i > 1):
  - with probability 1/i, keep the new item (discard the old one)
  - with probability 1 - 1/i, keep the old item (ignore the new one)
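The procedure above can be sketched for an arbitrary stream roughly as follows. This is a minimal illustration, not part of the original solution: the class name ReservoirSketch and the method sampleOne are hypothetical, and the stream is assumed to be available as a java.util.Iterator.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical helper class, named here only for illustration.
class ReservoirSketch {

    // Returns one element chosen uniformly at random from the stream,
    // or null if the stream is empty. Only one element is held at a time.
    static <T> T sampleOne(Iterator<T> stream, Random rnd) {
        T kept = null;
        int seen = 0; // number of items that have arrived so far
        while (stream.hasNext()) {
            T item = stream.next();
            seen++;
            // nextInt(seen) is uniform on [0, seen), so it equals 0 with
            // probability 1/seen. For the first item this is always true.
            if (rnd.nextInt(seen) == 0) {
                kept = item;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d");
        // Prints one of a, b, c, d; which one varies from run to run.
        System.out.println(sampleOne(items.iterator(), new Random()));
    }
}
```

The length of the stream is never needed up front; the counter seen plays the role of i in the steps above.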
So:
- when there is only one item, it is kept with probability 1;
- when there are 2 items, each of them is kept with probability 1/2;
- when there are 3 items, the third item is kept with probability 1/3, and each of the previous 2 items is also kept with probability (1/2)(1-1/3) = (1/2)(2/3) = 1/3;
- by induction, it is easy to prove that when there are n items, each item is kept with probability 1/n.
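The induction can also be checked numerically. The sketch below assumes nothing beyond the rule stated above: the k-th item is held at the end if it is selected on arrival (probability 1/k) and then survives every later arrival j (probability 1 - 1/j each). The class and method names are hypothetical.

```java
// Hypothetical helper class, named here only for illustration.
class KeepProbability {

    // Probability that the k-th item (1-based) is the one held after n items.
    static double keepProbability(int k, int n) {
        double p = 1.0 / k;              // selected when it arrives
        for (int j = k + 1; j <= n; j++) {
            p *= 1.0 - 1.0 / j;          // not replaced by the j-th item
        }
        return p;
    }

    public static void main(String[] args) {
        int n = 5;
        for (int k = 1; k <= n; k++) {
            System.out.println("item " + k + ": " + keepProbability(k, n));
        }
    }
}
```

The product telescopes, (1/k)(k/(k+1))...((n-1)/n) = 1/n, so every value printed should equal 1/n up to floating-point rounding.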