[LintCode] Word Abbreviation
Given an array of n distinct non-empty strings, generate the shortest possible abbreviation for every word according to the following rules.
- An abbreviation begins with the first character, followed by the number of characters abbreviated, followed by the last character.
- If there is a conflict, that is, more than one word shares the same abbreviation, a longer prefix is used instead of only the first character until the mapping from word to abbreviation is unique. In other words, a final abbreviation cannot map to more than one original word.
- If the abbreviation doesn't make the word shorter, keep the original word.
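The first and third rules can be sketched as one small helper. This is only an illustration under assumptions: the class name AbbrevRule and the method abbreviate(word, prefixLen) are hypothetical names, and prefixLen is assumed to mean the number of leading characters kept (1 for the plain first-character form).

```java
class AbbrevRule {
    // Hypothetical helper: keep the first prefixLen characters, replace the
    // middle of the word with its length, and keep the last character.
    static String abbreviate(String word, int prefixLen) {
        // Number of characters that the count would stand for.
        int hidden = word.length() - prefixLen - 1;
        // Rule 3: hiding fewer than two characters never shortens the word.
        if (hidden < 2) {
            return word;
        }
        return word.substring(0, prefixLen) + hidden + word.charAt(word.length() - 1);
    }
}
```

With prefixLen = 1, "like" becomes "l2e" while "god" and "me" are returned unchanged; with prefixLen = 4, "intension" becomes "inte4n".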
Notice
- Both n and the length of each word will not exceed 400.
- The length of each word is greater than 1.
- The words consist of lowercase English letters only.
- The returned answers should be in the same order as the original array.
Example
Given dict = ["like", "god", "internal", "me", "internet", "interval", "intension", "face", "intrusion"]
return ["l2e","god","internal","me","i6t","interval","inte4n","f2e","intr4n"]
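To see why "internal" and "interval" end up unabbreviated while "intension" and "intrusion" become "inte4n" and "intr4n", it can help to lengthen the prefix for just two conflicting words until their abbreviations differ. The sketch below does only that pairwise trace; it is not the full algorithm, and the names ConflictTrace, abbreviate and resolve are hypothetical. It assumes the two words are distinct, as the problem guarantees.

```java
class ConflictTrace {
    // Hypothetical helper: abbreviation that keeps prefixLen leading characters.
    static String abbreviate(String word, int prefixLen) {
        int hidden = word.length() - prefixLen - 1;
        if (hidden < 2) {
            return word; // abbreviating would not make the word shorter
        }
        return word.substring(0, prefixLen) + hidden + word.charAt(word.length() - 1);
    }

    // Lengthens the prefix until the two distinct words stop colliding and
    // returns their abbreviations at that prefix length.
    static String[] resolve(String a, String b) {
        int prefixLen = 1;
        while (abbreviate(a, prefixLen).equals(abbreviate(b, prefixLen))) {
            prefixLen++;
        }
        return new String[] {abbreviate(a, prefixLen), abbreviate(b, prefixLen)};
    }
}
```

For "internal" and "interval" the shared forms are i6l, in5l, int4l, inte3l and inter2l; the next prefix would hide only one character, so both words are kept whole.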
The difficulty of this problem lies in the implementation rather than in the algorithm. For this kind of problem,
simulating a good example by hand is enough to arrive at the following working algorithm.
1. Create an array list unprocessedIndex that stores the indices of the strings that are not yet uniquely mapped. Initially it holds every index.
2. Initialize nonAbbrLen = 2, the number of characters kept unabbreviated (the first and the last character). Per rule 3, a word whose length is at most nonAbbrLen + 1 is kept as is.
3. As long as there is still a not-uniquely mapped string, repeat the following.
a. For each not-uniquely mapped string str, compute its abbreviation abbr at the current level and
store it in a hash map from abbr to the list of indices in dict[] that produce it.
b. After going through all the strings, add every abbreviation that maps to exactly one index
to the final result.
c. Add the indices of all strings that are still not uniquely mapped to a new array list and assign
this list to unprocessedIndex.
d. Increase nonAbbrLen by 1 to further distinguish strings that
currently share the same abbreviation.
e. Repeat until unprocessedIndex is empty.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

public class Solution {
    public String[] wordsAbbreviation(String[] dict) {
        if (dict == null || dict.length == 0) {
            return new String[0];
        }
        int n = dict.length;
        String[] res = new String[n];
        // Indices of words that do not yet have a unique abbreviation.
        List<Integer> unprocessedIndex = new ArrayList<Integer>();
        for (int i = 0; i < n; i++) {
            unprocessedIndex.add(i);
        }
        // Number of characters kept verbatim: the prefix plus the last character.
        int nonAbbrLen = 2;
        while (unprocessedIndex.size() != 0) {
            // Abbreviation at the current level -> indices of the words producing it.
            HashMap<String, ArrayList<Integer>> map = new HashMap<String, ArrayList<Integer>>();
            for (int i : unprocessedIndex) {
                String curr = dict[i];
                // Rule 3: keep the word when abbreviating would not shorten it.
                String abbr = curr.length() <= (nonAbbrLen + 1)
                        ? curr
                        : curr.substring(0, nonAbbrLen - 1) + (curr.length() - nonAbbrLen)
                                + curr.charAt(curr.length() - 1);
                if (map.containsKey(abbr)) {
                    map.get(abbr).add(i);
                } else {
                    ArrayList<Integer> list = new ArrayList<Integer>();
                    list.add(i);
                    map.put(abbr, list);
                }
            }
            List<Integer> next = new ArrayList<Integer>();
            for (String word : map.keySet()) {
                if (map.get(word).size() == 1) {
                    // Unique abbreviation: record it as the final answer.
                    for (int idx : map.get(word)) {
                        res[idx] = word;
                    }
                } else {
                    // Conflict: retry these words with a longer prefix.
                    for (int idx : map.get(word)) {
                        next.add(idx);
                    }
                }
            }
            unprocessedIndex = next;
            nonAbbrLen++;
        }
        return res;
    }
}
The key point of the above solution is to use a hash map to determine whether an abbreviation at a given level maps to
exactly one word. If a map key points to a list of size 1, then we know this key uniquely maps to one word.
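The grouping idea can be looked at in isolation for a single round. The sketch below is an illustration under assumptions, not the solution itself: the class GroupingRound and the method groupByAbbr are hypothetical names, the abbreviation level is fixed at the first one (first character, count, last character), and computeIfAbsent is used in place of the explicit containsKey check.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupingRound {
    // Groups word indices by their first-level abbreviation.
    // A list of size 1 means that abbreviation is unique in this round.
    static Map<String, List<Integer>> groupByAbbr(String[] words) {
        Map<String, List<Integer>> groups = new HashMap<>();
        for (int i = 0; i < words.length; i++) {
            String w = words[i];
            // Words of length 3 or less cannot be shortened, so keep them whole.
            String abbr = w.length() <= 3
                    ? w
                    : "" + w.charAt(0) + (w.length() - 2) + w.charAt(w.length() - 1);
            groups.computeIfAbsent(abbr, k -> new ArrayList<>()).add(i);
        }
        return groups;
    }
}
```

For the words "internal", "internet" and "interval", the key "i6l" holds indices 0 and 2 (a conflict to retry with a longer prefix), while "i6t" holds only index 1 and is final.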