Encode and Decode Strings: Solution
Question
Design an algorithm to encode a list of strings to a string. The encoded string is then sent over the network and is decoded back to the original list of strings.
Machine 1 (sender) has the function:
string encode(vector<string> strs) {
    // ... your code
    return encoded_string;
}
Machine 2 (receiver) has the function:
vector<string> decode(string s) {
    // ... your code
    return strs;
}
So Machine 1 does:
string encoded_string = encode(strs);
and Machine 2 does:
vector<string> strs2 = decode(encoded_string);
strs2 in Machine 2 should be the same as strs in Machine 1.
Implement the encode and decode methods.
Note:
- The string may contain any possible characters out of 256 valid ASCII characters. Your algorithm should be generalized enough to work on any possible characters.
- Do not use class member/global/static variables to store states. Your encode and decode algorithms should be stateless.
- Do not rely on any library method such as eval or serialize methods. You should implement your own encode/decode algorithm.
Solution 1 -- JSON format
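The first approach borrows the rules of JSON.
Encode: we wrap the input list of strings as a JSON array. [ (left bracket) and ] (right bracket) mark the start and the end, strings are separated by , (comma), and each string is wrapped in double quotes.
Because a string may itself contain double quotes or backslashes (\), these two characters must be escaped. The rule is to prefix each of them with one extra backslash.
For example, the original string \"aafg" becomes \\\"aafg\" (shown here without the surrounding quotes).
JSON has more elaborate string-handling rules, but our only goal is that encoding and then decoding returns the same strings, so we do not need that complexity.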
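The escaping rule on its own can be sketched as below. This is a minimal illustration, not the author's code: the class name EscapeSketch and the helper escape are hypothetical names introduced here, and the sketch handles a single string only, without the surrounding quotes or brackets.

```java
final class EscapeSketch {
    // Hypothetical helper: prefixes every double quote and every backslash
    // in s with one extra backslash, leaving all other characters unchanged.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\\\"aafg\"";        // the characters \"aafg"
        System.out.println(escape(original));  // prints \\\"aafg\"
    }
}
```

Since every backslash in the output is followed by either a backslash or a double quote, a decoder can undo the escaping by reading the character after each backslash literally.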
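Decode follows these rules:
1. A boolean variable records whether the next double quote opens a new string or closes the current one.
2. On an unescaped double quote, depending on whether it opens or closes a string, either start a new empty string or add the current string to the result.
3. On a backslash, check whether the next character is a backslash or a double quote. If it is, append that next character to the current string and advance the index by one extra position.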
import java.util.ArrayList;
import java.util.List;

public class Codec {
    private final char start = '[';
    private final char end = ']';
    private final char include = '"';
    private final char strSplit = ',';

    // Encodes a list of strings to a single string.
    public String encode(List<String> strs) {
        StringBuilder sb = new StringBuilder();
        sb.append(start);
        for (String str : strs) {
            sb.append(include);
            int len = str.length();
            for (int i = 0; i < len; i++) {
                char current = str.charAt(i);
                // Escape double quotes and backslashes with a leading backslash.
                if (current == '"' || current == '\\') {
                    sb.append('\\');
                }
                sb.append(current);
            }
            sb.append(include);
            // A comma follows every string, including the last one.
            sb.append(strSplit);
        }
        sb.append(end);
        return sb.toString();
    }

    // Decodes a single string to a list of strings.
    public List<String> decode(String s) {
        List<String> result = new ArrayList<String>();
        if (s == null || s.length() < 1) {
            return result;
        }
        int len = s.length();
        if (s.charAt(0) != start || s.charAt(len - 1) != end) {
            return result;
        }
        // true: the next unescaped quote opens a string; false: it closes one.
        boolean startSymbol = true;
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < len - 1; i++) {
            char current = s.charAt(i);
            if (current == include) {
                if (startSymbol) {
                    sb = new StringBuilder();
                } else {
                    result.add(sb.toString());
                }
                startSymbol = !startSymbol;
                continue;
            }
            // Skip the separator commas that sit between strings.
            if (current == strSplit && startSymbol) {
                continue;
            }
            if (current == '\\') {
                // i + 1 is at most len - 1, so this read stays in bounds.
                char next = s.charAt(i + 1);
                if (next == '\\' || next == '"') {
                    sb.append(next);
                    i++;
                    continue;
                }
            }
            sb.append(current);
        }
        return result;
    }
}

// Your Codec object will be instantiated and called as such:
// Codec codec = new Codec();
// codec.decode(codec.encode(strs));
Solution 2
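This approach uses Java's String method int indexOf(int ch, int fromIndex).
Each string is stored together with its length: the length, a '/' separator, and then the string itself. Because the decoder knows how many characters to read, it never has to inspect the string's contents, so no escaping is needed.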
import java.util.ArrayList;
import java.util.List;

public class Codec {

    // Encodes a list of strings to a single string.
    public String encode(List<String> strs) {
        StringBuilder sb = new StringBuilder();
        for (String str : strs) {
            // Write each string as <length>/<string>.
            sb.append(str.length()).append('/').append(str);
        }
        return sb.toString();
    }

    // Decodes a single string to a list of strings.
    public List<String> decode(String s) {
        List<String> result = new ArrayList<String>();
        int length = s.length();
        int i = 0;
        while (i < length) {
            // The first '/' at or after i always ends the length prefix,
            // because the prefix itself contains only digits.
            int slash = s.indexOf('/', i);
            int size = Integer.parseInt(s.substring(i, slash));
            result.add(s.substring(slash + 1, slash + size + 1));
            i = slash + size + 1;
        }
        return result;
    }
}
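As a quick check that strings containing '/' or digits do not confuse the length-prefix format, here is a small round-trip sketch. The class name LengthPrefixSketch and its static methods are hypothetical names used only for this illustration. They follow the same length, slash, string layout as Solution 2 but are written as static helpers so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LengthPrefixSketch {
    // Writes each string as <length>/<string>.
    static String encode(List<String> strs) {
        StringBuilder sb = new StringBuilder();
        for (String str : strs) {
            sb.append(str.length()).append('/').append(str);
        }
        return sb.toString();
    }

    // Reads a length prefix up to the next '/', then exactly that many characters.
    static List<String> decode(String s) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            int slash = s.indexOf('/', i);
            int size = Integer.parseInt(s.substring(i, slash));
            result.add(s.substring(slash + 1, slash + 1 + size));
            i = slash + 1 + size;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("ab", "c/d", "", "3/x");
        String wire = encode(input);
        System.out.println(wire);                        // prints 2/ab3/c/d0/3/3/x
        System.out.println(decode(wire).equals(input));  // prints true
    }
}
```

The string "3/x" looks like a length prefix, but the decoder only searches for '/' at the start of each record and then skips the payload by length, so the payload is never parsed.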