Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].
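To see why exactly these two sequences are returned, it can help to count every 10-letter window of the example string. The sketch below is only an illustration: the class name `WindowCount` and the helper `countWindows` are hypothetical names that do not appear in the original post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WindowCount {
    // Hypothetical helper: maps each 10-letter window of s to how often it occurs.
    // A LinkedHashMap keeps the windows in order of first appearance.
    static Map<String, Integer> countWindows(String s) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i + 10 <= s.length(); i++) {
            counts.merge(s.substring(i, i + 10), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT";
        // Print only the windows that occur more than once.
        for (Map.Entry<String, Integer> e : countWindows(s).entrySet()) {
            if (e.getValue() > 1) {
                System.out.println(e.getKey() + " occurs " + e.getValue() + " times");
            }
        }
    }
}
```

The example string has 32 letters and therefore 23 windows. Only "AAAAACCCCC" (starting at indices 0 and 10) and "CCCCCAAAAA" (starting at indices 5 and 16) appear twice; every other window is unique.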
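import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> list = new ArrayList<String>();
        // With 10 or fewer letters there is at most one window, so nothing can repeat.
        if (s == null || s.length() <= 10)
            return list;
        Set<String> set = new HashSet<String>();   // every window seen so far
        Set<String> temp = new HashSet<String>();  // windows seen more than once
        for (int i = 10; i <= s.length(); i++) {
            String window = s.substring(i - 10, i);
            // add() returns false when the window is already in the set
            if (!set.add(window)) {
                temp.add(window);
            }
        }
        // HashSet iteration order is unspecified, so the result order may vary.
        return new ArrayList<String>(temp);
    }
}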
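Because the alphabet has only four letters, each nucleotide fits in 2 bits and a 10-letter window fits in 20 bits of an `int`. The following is a minimal sketch of that alternative, not the approach of the solution above; the class name `BitEncodedSolution` and the particular mapping A=0, C=1, G=2, T=3 are assumptions chosen for illustration. It stores integers rather than strings in the sets and reports each repeated sequence in the order its second occurrence is found.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BitEncodedSolution {
    // Assumed mapping: A=0, C=1, G=2, T=3 (any four distinct 2-bit codes would work).
    private static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            default:  return 3; // assumes the input contains only A, C, G, T
        }
    }

    public static List<String> findRepeatedDnaSequences(String s) {
        List<String> result = new ArrayList<>();
        if (s == null || s.length() <= 10) return result;
        Set<Integer> seen = new HashSet<>();      // keys of all windows seen so far
        Set<Integer> reported = new HashSet<>();  // keys already added to result
        final int mask = (1 << 20) - 1;           // low 20 bits = 10 letters
        int key = 0;
        for (int i = 0; i < s.length(); i++) {
            // Shift in the new letter and drop the one that left the window.
            key = ((key << 2) | code(s.charAt(i))) & mask;
            if (i < 9) continue;                  // window not yet full
            if (!seen.add(key) && reported.add(key)) {
                result.add(s.substring(i - 9, i + 1));
            }
        }
        return result;
    }
}
```

Calling `BitEncodedSolution.findRepeatedDnaSequences("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT")` yields the two sequences from the example, "AAAAACCCCC" and "CCCCCAAAAA".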

 

posted @ 2016-07-31 12:46  北叶青藤