187. Repeated DNA Sequences
Problem description:
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
Example:
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC", "CCCCCAAAAA"]
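Approach:
Use a hash map m to store every length-10 substring of s together with the number of times it has been seen.
For each substring t, check whether it is already in m with a count of exactly 1; if so, this is its second occurrence, so add it to the result array. Checking for exactly 1 ensures each repeated sequence is reported only once.
Then increment m[t] by 1.
Space complexity: O(n). Time complexity: O(n), treating the substring length of 10 as a constant.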
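As a side note on the complexity above, each string key costs a 10-character copy and hash. The following is a minimal sketch of a variant that is not part of the original solution: it packs each window into 20 bits (2 bits per nucleotide) and slides the window with shifts. The names findRepeatedPacked and encodeBase are hypothetical, and the sketch assumes the input contains only A, C, G, and T.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: maps a nucleotide to a 2-bit code.
// Assumption: the input contains only 'A', 'C', 'G', 'T'.
static unsigned encodeBase(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3;  // 'T'
    }
}

// Hypothetical variant of the counting approach using packed integer keys.
std::vector<std::string> findRepeatedPacked(const std::string& s) {
    const std::size_t LEN = 10;
    const unsigned MASK = (1u << (2 * LEN)) - 1;  // keep only the low 20 bits
    std::vector<std::string> ret;
    if (s.size() < LEN) return ret;

    std::unordered_map<unsigned, int> seen;  // packed window -> times seen so far
    unsigned key = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Slide the window: shift in the new base, mask off the oldest one.
        key = ((key << 2) | encodeBase(s[i])) & MASK;
        if (i + 1 < LEN) continue;  // window is not full yet
        if (seen[key]++ == 1) {
            // Second occurrence: report the window ending at i exactly once.
            ret.push_back(s.substr(i + 1 - LEN, LEN));
        }
    }
    return ret;
}
```

Calling findRepeatedPacked on the example input "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT" yields the same two sequences as the expected output. The asymptotic bounds are unchanged; the sketch only avoids building a temporary string for every window.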
Code:
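#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

#define LEN 10  // length of the sequences to look for

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> ret;
        unordered_map<string, int> m;  // substring -> number of times seen so far
        if (s.size() < LEN) return ret;
        int n = (int)s.size();
        for (int i = 0; i + LEN - 1 < n; i++) {
            string t = s.substr(i, LEN);
            // Seen exactly once before: this is the second occurrence, so report it once.
            if (m.count(t) != 0 && m[t] == 1) {
                ret.push_back(t);
            }
            m[t]++;
        }
        return ret;
    }
};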