Cracking the Coding Interview, Chapter 17: Moderate, Problem 14

2014-04-29 00:20

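Problem: You are given a long string and a dictionary. You may cut the string into pieces; some pieces will be found in the dictionary and some may not. Design a segmentation algorithm that keeps the unrecognized pieces to a minimum.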

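Solution: a dynamic programming algorithm that trades space for time. First, spend O(n^2) dictionary lookups to record whether each substring is in the dictionary. A trie could speed this step up and improve the time by an order of magnitude, but I did not write one; I was lazy and used an unordered_set as the dictionary. Then run the dynamic programming, where dp[i] is the minimum amount of unrecognized text in the prefix ending at position i. In the code an unrecognized piece is charged by its length, so the value is a count of unrecognized characters. There are n states and state i looks back at every earlier cut point, so this step also takes O(n^2) time. For people who read code, the code says it more clearly than words, so please see the code.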
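
Read off from the code in this post, the transition can be written as the recurrence below. This is only a restatement for illustration: `cost` is a name introduced here, not one used in the code, and indices are 0-based and inclusive, matching `contains[i][j]`.

```latex
\mathrm{cost}(a, b) =
\begin{cases}
0         & \text{if } data[a..b] \text{ is in the dictionary} \\
b - a + 1 & \text{otherwise}
\end{cases}

dp[i] = \min\Bigl( \mathrm{cost}(0, i),\;
                   \min_{0 \le j < i} \bigl( dp[j] + \mathrm{cost}(j + 1, i) \bigr) \Bigr)
```

The answer for a string of length n is dp[n - 1].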

Code:

// 17.14 Given a dictionary of words and a long string, cut the string into
// pieces, some of which may not be in the dictionary, so that as little of
// the string as possible is left unrecognized.
// Dynamic programming is a good thing, but it trades space for time.
#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

int main()
{
    string data;
    unordered_set<string> dict;
    string s;
    int i, j, n, tmp;

    // Each test case: the long string, the number of words, then the words.
    while (cin >> data >> n) {
        dict.clear();
        for (i = 0; i < n; ++i) {
            cin >> s;
            dict.insert(s);
        }
        n = (int)data.length();

        // contains[i][j] is true when data[i..j] (inclusive) is a dictionary word.
        vector<vector<bool> > contains(n, vector<bool>(n, false));
        for (i = 0; i < n; ++i) {
            s = "";
            for (j = i; j < n; ++j) {
                s.push_back(data[j]);
                contains[i][j] = (dict.find(s) != dict.end());
            }
        }

        // dp[i] is the minimum number of unrecognized characters in data[0..i].
        vector<int> dp(n);
        for (i = 0; i < n; ++i) {
            // Option 1: the whole prefix data[0..i] is a single piece.
            dp[i] = contains[0][i] ? 0 : i + 1;
            // Option 2: the last piece is data[j + 1..i]; it costs its length if unknown.
            for (j = 0; j < i; ++j) {
                tmp = dp[j] + (contains[j + 1][i] ? 0 : i - j);
                dp[i] = dp[i] < tmp ? dp[i] : tmp;
            }
        }

        // cout replaces the original printf, which was used without <cstdio>.
        cout << dp[n - 1] << endl;
    }

    return 0;
}
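
The solution mentions that a trie could speed up the dictionary check, but the post does not include one. The following is a minimal sketch of how that might look, not the author's code. The names `Trie` and `minUnrecognized` are made up for illustration. It also swaps the table-based DP for a forward DP over prefix lengths: from each start position it walks the trie, so it never builds the n-by-n `contains` table, and it treats an unrecognized piece as a run of single unrecognized characters, which gives the same total cost.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: a trie stored as a flat vector of nodes, node 0 is the root.
struct Trie {
    struct Node {
        bool isWord = false;
        std::unordered_map<char, int> next; // child node index per character
    };
    std::vector<Node> nodes;

    Trie() : nodes(1) {}

    void insert(const std::string &word) {
        int cur = 0;
        for (char c : word) {
            auto it = nodes[cur].next.find(c);
            if (it == nodes[cur].next.end()) {
                int fresh = (int)nodes.size();
                nodes[cur].next[c] = fresh; // link first, then grow the vector
                nodes.emplace_back();
                cur = fresh;
            } else {
                cur = it->second;
            }
        }
        nodes[cur].isWord = true;
    }
};

// Minimum number of unrecognized characters over all ways to cut `data`.
// best[k] covers the first k characters, so best[0] = 0 (empty prefix).
int minUnrecognized(const std::string &data, const std::vector<std::string> &words) {
    Trie trie;
    for (const std::string &w : words) trie.insert(w);

    const int n = (int)data.size();
    std::vector<int> best(n + 1);
    for (int k = 0; k <= n; ++k) best[k] = k; // worst case: nothing is recognized

    for (int i = 0; i < n; ++i) {
        // Leave data[i] unrecognized.
        best[i + 1] = std::min(best[i + 1], best[i] + 1);
        // Or start a dictionary word at i and follow it through the trie.
        int cur = 0;
        for (int j = i; j < n; ++j) {
            auto it = trie.nodes[cur].next.find(data[j]);
            if (it == trie.nodes[cur].next.end()) break; // no word continues this way
            cur = it->second;
            if (trie.nodes[cur].isWord) best[j + 1] = std::min(best[j + 1], best[i]);
        }
    }
    return best[n];
}
```

Under these assumptions the inner walk stops as soon as no dictionary word matches, so the work per start position is bounded by the longest dictionary word rather than by n. In the worst case it is still O(n^2).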

posted on 2014-04-29 00:29 by zhuli19901106
