[LeetCode] 187. Repeated DNA Sequences Java
Problem:
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].
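Approach and analysis: The string contains only the four letters A, C, G, and T. Every run of 10 consecutive letters is a candidate substring, and we need to find the ones that occur more than once. A hash table solves this directly: iterate over the string and look up each 10-letter substring. If it is not in the hash table yet, add it with a count of 1. If it is present and its count is 1, add it to the result and update the count; otherwise keep going. An alternative is to encode each of A, C, G, and T in 3 bits, so that a 10-letter substring takes 30 bits and fits in a single integer.
Code: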
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<>();
        // Maps each 10-letter substring to 1 (seen once) or 2 (already reported).
        Map<String, Integer> temp = new HashMap<>();
        // Visit every substring of length 10. If it has not been seen, store it;
        // if it has been seen exactly once so far, add it to the result.
        for (int i = 0; i < s.length() - 9; i++) {
            String subString = s.substring(i, i + 10);
            if (temp.containsKey(subString)) {
                int count = temp.get(subString);
                // A count of 1 means this is the first repeat: report it once.
                // Any other count means it was already reported, so skip it.
                if (count == 1) {
                    temp.put(subString, 2);
                    res.add(subString);
                }
            } else {
                temp.put(subString, 1);
            }
        }
        return res;
    }
}
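The bit-encoding alternative mentioned in the analysis is not implemented in the original code, so here is a minimal sketch of how it might look. It assumes the input contains only uppercase A, C, G, and T, and it takes the 3-bit code of each letter from the low 3 bits of its character value (A = 1, C = 3, G = 7, T = 4), which is one possible choice of encoding. The class name `BitPackedSolution` and the two sets `seen` and `reported` are names chosen for this illustration only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BitPackedSolution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();      // codes of windows seen at least once
        Set<Integer> reported = new HashSet<>();  // codes already added to the result
        int mask = (1 << 30) - 1;                 // keeps the low 30 bits (10 letters x 3 bits)
        int code = 0;
        for (int i = 0; i < s.length(); i++) {
            // Assumption: the low 3 bits of 'A', 'C', 'G', 'T' are distinct (1, 3, 7, 4).
            // Shift the new letter in and drop the letter that left the window.
            code = ((code << 3) | (s.charAt(i) & 7)) & mask;
            if (i < 9) {
                continue; // the first full 10-letter window ends at index 9
            }
            // seen.add returns false when the code was already present;
            // reported.add returns true only the first time we report it.
            if (!seen.add(code) && reported.add(code)) {
                res.add(s.substring(i - 9, i + 1));
            }
        }
        return res;
    }
}
```

Compared with the hash table of strings, this version stores one integer per window instead of one 10-character string, and it only builds a substring when a repeat is actually reported. Since there are just four letters, 2 bits per letter would also be enough, but that needs an explicit letter-to-code mapping instead of the simple `& 7` used here.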