187. Repeated DNA Sequences

Problem description:

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC", "CCCCCAAAAA"]

 

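Approach:

Use a hash map m to store every length-10 substring of s together with the number of times it has been seen so far.

For each substring t, check whether it is already in m with a count of exactly 1; if so, append it to the result array. This reports each repeated sequence exactly once, on its second occurrence.

Then increment m[t] by 1.

Space complexity: O(n). Time complexity: O(n), treating the substring length of 10 as a constant.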

 

 

Code:

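#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

#define LEN 10
class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> ret;
        unordered_map<string, int> m; // substring -> number of times seen so far
        if(s.size() < LEN) return ret; // no window of length LEN fits
        int n = (int)s.size();
        for(int i = 0; i + LEN-1 < n; i++){
            string t = s.substr(i, LEN);
            // Report t exactly once: when it is seen for the second time.
            if(m.count(t) != 0 && m[t] == 1){
                ret.push_back(t);
            }
            m[t]++;
        }
        return ret;
    }
};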
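
A possible variation, not part of the solution above: since there are only four nucleotides, each 10-letter window fits in 20 bits, so the map can be keyed by an integer instead of a string. The sketch below assumes the input contains only A, C, G and T; the names encodeBase and findRepeatedDnaSequencesPacked are made up for illustration. It keeps the same "report on the second occurrence" rule and only changes the key type.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: map a nucleotide to a 2-bit code.
// Assumption: input holds only 'A', 'C', 'G', 'T'; any other char is treated as 'A'.
static std::uint32_t encodeBase(char c) {
    switch (c) {
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return 0; // 'A'
    }
}

// Same counting idea as the solution above, but keyed by a packed 20-bit window.
std::vector<std::string> findRepeatedDnaSequencesPacked(const std::string& s) {
    const std::size_t LEN = 10;
    const std::uint32_t MASK = (1u << (2 * LEN)) - 1; // keep only the low 20 bits
    std::vector<std::string> ret;
    std::unordered_map<std::uint32_t, int> seen; // packed window -> times seen
    std::uint32_t key = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Slide the window: shift in the new base, drop the oldest one.
        key = ((key << 2) | encodeBase(s[i])) & MASK;
        if (i + 1 < LEN) continue; // window not full yet
        // Post-increment returns the old count; 1 means this is the second sighting.
        if (seen[key]++ == 1) {
            ret.push_back(s.substr(i + 1 - LEN, LEN));
        }
    }
    return ret;
}
```

Called on the example input "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", this should return the same two sequences as the string-keyed version. The trade-off is a little extra bookkeeping in exchange for avoiding hashing and storing a 10-character string for every window.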

 

posted @ 2018-08-16 08:06  妖域大都督