CareerCup: 17.14 minimize unrecognized characters
Oh, no! You have just completed a lengthy document when you have an unfortu- nate Find/Replace mishap. You have accidentally removed all spaces, punctuation, and capitalization in the document. A sentence like "I reset the computer. It still didn't boot!" would become "iresetthecomputeritstilldidntboot". You figure that you can add back in the punctation and capitalization later, once you get the individual words properly separated. Most of the words will be in a dictionary, but some strings, like proper names, will not. Given a dictionary (a list of words), design an algorithm to find the optimal way of "unconcatenating" a sequence of words. In this case, "optimal" is defined to be the parsing which minimizes the number of unrecognized sequences of characters. For example, the string "jesslookedjustliketimherbrother" would be optimally parsed as "JESS looked just like TIM her brother". This parsing has seven unrecognized char- acters, which we have capitalized for clarity.
这是CareerCup Chapter 17的第14题,我没怎么看CareerCup上的解法,但感觉这道题跟Word Break, Palindrome Partition II很像,都是有一个dictionary, 可以用一维DP来做,用一个int[] res = new int[len+1]; res[i] refers to minimized # of unrecognized chars in first i chars, res[0]=0, res[len]即为所求。
有了维护量,现在需要考虑转移方程,如下:
int unrecogNum = dict.contains(s.substring(j, i))? 0 : i-j; //看index从j到i-1的substring在不在dictionary里,如果不在,unrecogNum=j到i-1的char数
res[i] = Math.min(res[i], res[j]+unrecogNum);
亲测,我使用的case都过了,只是不知道有没有不过的Corner Case:
1 package fib; 2 3 import java.util.Arrays; 4 import java.util.HashSet; 5 import java.util.Set; 6 7 public class unconcatenating { 8 public int optway(String s, Set<String> dict) { 9 if (s==null || s.length()==0) return 0; 10 int len = s.length(); 11 if (dict.isEmpty()) return len; 12 int[] res = new int[len+1]; // res[i] refers to minimized # of unrecognized chars in first i chars 13 Arrays.fill(res, Integer.MAX_VALUE); 14 res[0] = 0; 15 for (int i=1; i<=len; i++) { 16 for (int j=0; j<i; j++) { 17 String str = s.substring(j, i); 18 int unrecogNum = dict.contains(str)? 0 : i-j; 19 res[i] = Math.min(res[i], res[j]+unrecogNum); 20 } 21 } 22 return res[len]; 23 } 24 25 26 public static void main(String[] args) { 27 unconcatenating example = new unconcatenating(); 28 Set<String> dict = new HashSet<String>(); 29 dict.add("reset"); 30 dict.add("the"); 31 dict.add("computer"); 32 dict.add("it"); 33 dict.add("still"); 34 dict.add("didnt"); 35 dict.add("boot"); 36 int result = example.optway("johnresetthecomputeritdamnstilldidntboot", dict); 37 System.out.print("opt # of unrecognized chars is "); 38 System.out.println(result); 39 } 40 41 }
output是:opt # of unrecognized chars is 8