LeetCode Solution Notes: 3. Longest Substring Without Repeating Characters

Given a string, find the length of the longest substring without repeating characters.

Examples:

Given "abcabcbb", the answer is "abc", with the length of 3.

Given "bbbbb", the answer is "b", with the length of 1.

Given "pwwkew", the answer is "wke", with the length of 3. Note that the answer must be a substring; "pwke" is a subsequence and not a substring.

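Problem: given a string, find the longest substring without repeating characters and return its length. Note that it must be a substring, not a subsequence.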

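Basic idea: a substring must be contiguous, while a subsequence need not be. If the problem asked for a subsequence it would be easy: build a hash table, look up each character in it, add 1 to the length whenever the character is not yet present, and return the length at the end. The contiguous case can follow the same idea: check whether the current substring contains a repeated character. If it does not, extend the substring; if it does, rebuild the substring starting from its second character.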
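The subsequence variant described above is not part of the actual problem, but it can be sketched in a few lines to make the contrast concrete. The function name `longestDistinctSubsequence` is hypothetical, and `std::unordered_set` stands in for the hash table mentioned in the text.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper (not from the original post): the longest subsequence
// without repeated characters simply has one copy of each distinct character.
int longestDistinctSubsequence(const std::string& s) {
    std::unordered_set<char> seen;  // the "hash table" of characters met so far
    int length = 0;
    for (char c : s) {
        // insert() returns a pair; .second is true only for a new character
        if (seen.insert(c).second) {
            ++length;
        }
    }
    return length;
}
```

For "pwwkew" this counts p, w, k and e, giving 4, whereas the substring answer for the same input is 3.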

Reference code 1 (not written by me):

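#include <map>
#include <string>
using namespace std;

class Solution {
public:
    // Returns true if some character occurs more than once in s.
    bool hasSameChar(string s)
    {
        if(s=="")
            return false;
        // A fresh map is built on every call.
        map<char,bool> exist2bool;
        for(const char & c:s)
        {
            if(exist2bool.count(c)==0)
                exist2bool[c]=true;
            else
                return true;
        }
        return false;
    }
    int lengthOfLongestSubstring(const string& s) {
        // The current window is s[from..to], both ends inclusive.
        int from=0,to=0;
        int maxLen=0;
        while(to!=static_cast<int>(s.size()))
        {
            // Guard only: a one-character window never has a repeat,
            // so from does not actually overtake to.
            while(from>to)
                to++;
            if(!hasSameChar(s.substr(from,to-from+1)))
            {
                // No repeat: record the length, then grow the window to the right.
                if((to-from+1)>maxLen)
                    maxLen=to-from+1;
                to++;
            }
            else
            {
                // Repeat found: drop the leftmost character.
                from++;
            }
        }
        return maxLen;
    }
};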

The code above runs in 885 ms, which is far too slow for what should be efficient C++.

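Where does the problem lie? I think it is this: 1. there is too much nesting, a while inside a while that then calls a for loop; 2. every call to the helper function builds a new map (and copies the substring), which adds a large number of heap allocations.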
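To illustrate the second point, here is a minimal sketch of a duplicate check that works on the original string in place, so it needs neither a substring copy nor a map. The name `hasSameCharInRange` and its index parameters are assumptions made for this illustration, not part of the reference code.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical variant of hasSameChar (not from the original post):
// checks s[from..to], both ends inclusive, without any heap allocation.
bool hasSameCharInRange(const std::string& s, std::size_t from, std::size_t to) {
    // One flag per possible byte value; lives on the stack, all false at first.
    std::array<bool, 256> seen{};
    for (std::size_t i = from; i <= to && i < s.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (seen[c]) {
            return true;   // this character was already in the range
        }
        seen[c] = true;
    }
    return false;
}
```

This only removes the allocation cost; the nested loops from the first point remain.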

Hence the following variation, reference code 2:

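#include <algorithm>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        // Seen-flags for the current window; assumes 7-bit ASCII input.
        vector<int> hashtab(128,0);
        int maxlen = 0;
        const size_t len = s.size();
        for(size_t i=0; i<len ; ++i)
        {
            // Clear the table for every start position; assign() reuses the
            // existing storage, so nothing is reallocated.
            hashtab.assign(128,0);
            int tmplen = 0;
            for(size_t j=i; j<len; ++j)
            {
                if(hashtab[s[j]] == 0)
                {
                    // New character: mark it and extend the window.
                    hashtab[s[j]]++;
                    maxlen = max(maxlen, ++tmplen);
                }else{
                    // Repeat found: move on to the next start position.
                    break;
                }
            }
        }
        return maxlen;
    }
};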

This still uses a hash table, and it runs in 35 ms. Because it no longer allocates and frees memory over and over, it is many times faster.

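Approach 2: that method is not really necessary, though. The hash table can be used differently: set every entry to -1 and let the table record the position where each character last appeared. During the loop, if the character has not appeared in the current window, its value in the table is not greater than the start position; if it has appeared, the start position moves to the position of its previous occurrence. Reference code: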

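#include <algorithm>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        // Last position of each character, -1 if not seen; assumes 7-bit ASCII input.
        vector<int> hashtab(128,-1);
        // start is the index just before the current window.
        int start = -1,index = 0, maxlen = 0;
        for(char c:s)
        {
            // The character already occurs inside the window: move start to it.
            if(hashtab[c]>start)
                start = hashtab[c];
            hashtab[c] = index++;
            // index is now one past the current character.
            maxlen = max(maxlen,index - start -1);
        }
        return maxlen;
    }
};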

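Running time: 9 ms. The advantage is that a single pass over the string is enough, whereas the two versions above both need many passes; it also avoids repeated lookups.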
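As a quick sanity check, the one-pass idea can be run against the three examples from the problem statement. The sketch below restates it as a free function so that it compiles on its own; the names `longestUniqueSubstring` and `examplesPass` are hypothetical, and the table is widened to 256 entries so that any byte value is a safe index.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Free-function restatement of the one-pass approach (names are illustrative).
int longestUniqueSubstring(const std::string& s) {
    std::vector<int> lastPos(256, -1);  // last index of each byte value, -1 if unseen
    int start = -1;                     // index just before the current window
    int maxLen = 0;
    for (int i = 0; i < static_cast<int>(s.size()); ++i) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (lastPos[c] > start) {
            start = lastPos[c];         // c repeats inside the window: shrink it
        }
        lastPos[c] = i;
        maxLen = std::max(maxLen, i - start);
    }
    return maxLen;
}

// Returns true when the three examples from the problem statement match.
bool examplesPass() {
    return longestUniqueSubstring("abcabcbb") == 3
        && longestUniqueSubstring("bbbbb") == 1
        && longestUniqueSubstring("pwwkew") == 3;
}
```

For "pwwkew", the second w moves start to index 1, and the final w moves it to index 2, which leaves the window "kew" of length 3.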

One question remains: could the hash table and start be made global variables, with for_each replacing the for loop, to improve efficiency further?
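One way to explore this question is the sketch below, which uses `std::for_each` with a lambda. The lambda captures the table and `start` by reference, so no global variables are needed. The name `lengthWithForEach` is hypothetical, and this has not been timed: whether it is any faster than the range-based for loop would have to be measured.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical for_each version of the one-pass approach (not benchmarked).
int lengthWithForEach(const std::string& s) {
    std::vector<int> lastPos(256, -1);  // last index of each byte value, -1 if unseen
    int start = -1, index = 0, maxLen = 0;
    // The lambda shares the local state by reference instead of using globals.
    std::for_each(s.begin(), s.end(), [&](char ch) {
        unsigned char c = static_cast<unsigned char>(ch);
        if (lastPos[c] > start) {
            start = lastPos[c];
        }
        lastPos[c] = index++;
        maxLen = std::max(maxLen, index - start - 1);
    });
    return maxLen;
}
```

The loop body is the same as in the for-loop version, so any difference in speed would come from how the compiler handles the two loop forms rather than from the algorithm.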

Posted on 2017-09-05 11:59 by Hello_Motty
