Leetcode: Word Break
Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words. Note: The same word in the dictionary may be reused multiple times in the segmentation. You may assume the dictionary does not contain duplicate words. Example 1: Input: s = "leetcode", wordDict = ["leet", "code"] Output: true Explanation: Return true because "leetcode" can be segmented as "leet code". Example 2: Input: s = "applepenapple", wordDict = ["apple", "pen"] Output: true Explanation: Return true because "applepenapple" can be segmented as "apple pen apple". Note that you are allowed to reuse a dictionary word. Example 3: Input: s = "catsandog", wordDict = ["cats", "dog", "sand", "and", "cat"] Output: false
首先我们要决定要存储什么历史信息以及用什么数据结构来存储信息。然后是最重要的递推式,就是如从存储的历史信息中得到当前步的结果。最后我们需要考虑的就是起始条件的值。
接下来我们套用上面的思路来解这道题。首先我们要存储的历史信息res[i]是表示到字符串s的第i个元素为止能不能用字典中的词来表示,我们需要一个长度为n的布尔数组来存储信息。然后假设我们现在拥有res[0,...,i-1]的结果,我们来获得res[i]的表达式。思路是对于每个以i为结尾的子串,看看他是不是在字典里面以及他之前的元素对应的res[j]是不是true,如果都成立,那么res[i]为true,写成式子是
假设总共有n个字符串,并且字典是用HashSet来维护,那么总共需要n次迭代,每次迭代需要一个取子串的O(i)操作,然后检测i个子串,而检测是constant操作。所以总的时间复杂度是O(n^2)(i的累加仍然是n^2量级),而空间复杂度则是字符串的数量,即O(n)。代码如下:
1 class Solution { 2 public boolean wordBreak(String s, List<String> wordDict) { 3 Set<String> set = new HashSet<>(); 4 for (String word : wordDict) { 5 set.add(word); 6 } 7 boolean[] dp = new boolean[s.length() + 1]; 8 dp[0] = true; 9 for (int i = 1; i < dp.length; i ++) { 10 for (int j = 0; j < i; j ++) { 11 if (dp[j] && set.contains(s.substring(j, i))) { 12 dp[i] = true; 13 break; 14 } 15 } 16 } 17 return dp[dp.length - 1]; 18 } 19 }