Repeated DNA Sequences
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",Return:["AAAAACCCCC", "CCCCCAAAAA"].
每个长度为10的字符串,通过查找map,判断之前是否出现过。
用hash可以降低空间复杂度。
class Solution {public: vectorfindRepeatedDnaSequences(string s) { vector ret; if(s.size() < 10) return ret; unordered_map m; hash strh; for(int i = 0; i <= s.size()-10; i ++) { string sub = s.substr(i, 10); m[strh(sub)] ++; if(m[strh(sub)] == 2) { //find repeat ret.push_back(sub); } } return ret; }};