输入自动提示与补全功能的设计与实现
包含一个简短而完整的Web示例, 演示如何根据用户输入的字符进行自动提示和补全。
一、 场景与目标
在使用 IDE 开发软件时, IDE 会提供一种“智能提示”, 根据所输入的字符列出可能的词组; 在日常Web开发中,根据用户输入进行自动提示和补全,也能很好地改善使用体验。本文实现输入自动提示与补全功能。
输入自动补全功能实际上是“前缀匹配问题”, 即给定一个前缀以及一个单词列表, 找出所有包含该前缀的单词。
本文实现的功能是: 根据用户输入的关键字, 给出与之匹配的 Java 关键字。
二、 算法与设计
最简单直观的方案莫过于直接遍历单词列表, 检测每个单词是否包含前缀, 并返回。这样做的缺点是, 每次都要遍历单词列表, 效率非常低下。 一个更好的思路是, 先构建一个前缀匹配映射 Map<Prefix, List<Matcher>>, key 是每一个单词中所包含的前缀, value 是包含该 key 的所有单词列表。 那么, 问题就转化为给定一个单词列表 list<Word>, 将其转换为 Map<Prefix, List<Matcher>> , 这里 Word, Prefix, Matcher
均为 String 类型。
一种思路是, 遍历每一个单词包含的每一个前缀, 找出所有包含该前缀的单词。
for word in words
for prefix in word(0,i)
for word in words
if (word.startWith(prefix)) {
result.put(prefix, result.get(prefix).add(word));
}
显然, 其效率是 O(总前缀数*总单词数), 在单词列表比较大的情况下, 其效率是比较低的。 要想避免这种嵌套遍历, 就必须充分利用每一次遍历,获取充分的信息。
另一种思路是, 先找出每个单词中所包含的前缀匹配对, 再将这些前缀匹配对合并为最终的前缀匹配映射。 类似 Map - Reduce 方式。
for word in words
for prefix in word(0,i)
pairs.add(new Pair(prefix, word))
mergePairs(pairs)
其效率是O(总的前缀数)。
三、 代码设计与实现
下面给出代码设计与实现。 注意到, 这是通过多次小步重构达到的结果, 而不是一次性实现。 具体是, 先写出一个最简单的实现, 可以把应用跑起来; 然后, 思考更有效率的实现, 最后进行了抽象。
1. 定义接口
package autocomplete; import java.util.Set; public interface PrefixMatcher { Set<String> obtainMatchedWords(String inputText); }
1 package autocomplete; 2 3 import java.util.Collections; 4 import java.util.HashMap; 5 import java.util.HashSet; 6 import java.util.Map; 7 import java.util.Set; 8 9 public abstract class AbstractPrefixMatcher implements PrefixMatcher { 10 11 protected final String[] javaKeywords = new String[] { 12 "abstract", "assert", 13 "boolean", "break", "byte", 14 "case", "catch", "char", "class", "const", "continue", 15 "default", "do", "double", 16 "else", "enum", "extends", 17 "final", "finally", "float", "for", 18 "goto", 19 "if", "implements", "import", "instanceof", "int", "interface", 20 "long", 21 "native", "new", 22 "package", "private", "protected", "public", 23 "return", 24 "strictfp", "short", "static", "super", "switch", "synchronized", 25 "this", "throw", "throws", "transient", "try", 26 "void", "volatile", 27 "while" 28 }; 29 30 protected Map<String, Set<String>> prefixMatchers = new HashMap<String, Set<String>>(); 31 32 abstract void dynamicAddNew(String inputText); 33 34 public Set<String> obtainMatchedWords(String inputText) { 35 Set<String> matchers = prefixMatchers.get(inputText); 36 if (matchers == null) { 37 Set<String> input = new HashSet<String>(); 38 input.add(inputText); 39 dynamicAddNew(inputText); 40 return input; 41 } 42 return matchers; 43 } 44 45 protected Map<String, Set<String>> obtainPrefixMatchers() { 46 return Collections.unmodifiableMap(prefixMatchers); 47 } 48 49 }
package autocomplete; import java.util.HashMap; import java.util.HashSet; import java.util.Map; import java.util.Set; public class SimpleWordMatcher extends AbstractPrefixMatcher { public SimpleWordMatcher() { prefixMatchers = buildPrefixMatchers(javaKeywords); } /** * 将输入的单词组转化为前缀匹配的映射 * @param keywords * @return * * eg. {"abc", "acd", "bcd"} ===> * {"a": ["abc", "acd"], "ab": ["abc"], "abc": ["abc"], * "ac": ["acd"], "acd": ["acd"], "b": ["bcd"], "bc": ["bcd"], "bcd": ["bcd"] * } */ public Map<String, Set<String>> buildPrefixMatchers(String[] keywords) { HashMap<String, Set<String>> prefixMatchers = new HashMap<String, Set<String>>(); for (String keyword: keywords) { int wordLen = keyword.length(); for (int i=1; i < wordLen; i++) { String prefix = keyword.substring(0, i); for (String keyword2: javaKeywords) { if (keyword2.startsWith(prefix)) { Set<String> matchers = prefixMatchers.get(prefix); if (matchers == null) { matchers = new HashSet<String>(); } matchers.add(keyword2); prefixMatchers.put(prefix, matchers); } } } } return prefixMatchers; } public static void main(String[] args) { SimpleWordMatcher wordMatcher = new SimpleWordMatcher(); MapUtil.printMap(wordMatcher.obtainPrefixMatchers()); String[] prefixes = new String[] {"a", "b", "c", "d", "e", "f", "g", "i", "l", "n", "p", "r", "s", "t", "v", "w", "do", "finally"}; for (String prefix: prefixes) { System.out.println(wordMatcher.obtainMatchedWords(prefix)); } } @Override void dynamicAddNew(String inputText) { } }
4. 性能更好的实现
package autocomplete; import java.util.ArrayList; import java.util.HashMap; import java.util.HashSet; import java.util.Iterator; import java.util.List; import java.util.Map; import java.util.Set; public class EffectiveWordMatcher extends AbstractPrefixMatcher { public EffectiveWordMatcher() { prefixMatchers = buildPrefixMatchers(javaKeywords); } static class Pair { private String key; private String value; public Pair(String key, String value) { this.key = key; this.value = value; } public String getKey() { return key; } public String getValue() { return value; } public String toString() { return "<" + key + "," + value + ">"; } } private Map<String, Set<String>> buildPrefixMatchers(String[] javakeywords) { List<Pair> pairs = strarr2pairs(javakeywords); return mergePairs(pairs); } /* * 将 字符串数组转化为前缀匹配对 * eg. ["ab", "ac"] ===> * [<"a","ab">, <"ab", "ab">, <"a", "ac">, <"ac", "ac">] */ private List<Pair> strarr2pairs(String[] javakeywords) { List<Pair> pairs = new ArrayList<Pair>(); for (String keyword: javakeywords) { int wordLen = keyword.length(); for (int i=1; i < wordLen; i++) { String prefix = keyword.substring(0, i); Pair pair = new Pair(prefix, keyword); pairs.add(pair); } } return pairs; } /* * 将多个 <key,value> 合并为一个映射 * eg. [<"a", "abstract">, <"b", "boolean">, <"a", "assert">, <"b", "break">, <"c", "continue">] ===> * {"a"=>["abstract", "assert", "b"=>["boolean", "break"], "c"=>["continue"]} */ private static Map<String, Set<String>> mergePairs(List<Pair> pairs) { Map<String, Set<String>> result = new HashMap<String, Set<String>>(); if (pairs != null && pairs.size() > 0) { for (Pair pair: pairs) { String key = pair.getKey(); String value = pair.getValue(); Set<String> matchers = result.get(key); if (matchers == null) { matchers = new HashSet<String>(); } matchers.add(value); result.put(key, matchers); } } return result; } @Override void dynamicAddNew(String inputText) { if (checkValid(inputText)) { List<Pair> newpairs = strarr2pairs(new String[] {inputText}); Map<String, Set<String>> newPreixMatchers = mergePairs(newpairs); mergeMap(newPreixMatchers, prefixMatchers); } } private boolean checkValid(String inputText) { return false; } private Map<String, Set<String>> mergeMap(Map<String, Set<String>> src, Map<String, Set<String>> dest) { Set<Map.Entry<String, Set<String>>> mapEntries = src.entrySet(); Iterator<Map.Entry<String, Set<String>>> iter = mapEntries.iterator(); while (iter.hasNext()) { Map.Entry<String, Set<String>> entry = iter.next(); String key = entry.getKey(); Set<String> newMatchers = entry.getValue(); if (dest.containsKey(key)) { dest.get(key).addAll(newMatchers); } else { dest.put(key, newMatchers); } } return dest; } public static void main(String[] args) { EffectiveWordMatcher wordMatcher = new EffectiveWordMatcher(); MapUtil.printMap(wordMatcher.obtainPrefixMatchers()); String[] prefixes = new String[] {"a", "b", "c", "d", "e", "f", "g", "i", "l", "n", "p", "r", "s", "t", "v", "w", "do", "finally"}; for (String prefix: prefixes) { System.out.println(wordMatcher.obtainMatchedWords(prefix)); } } }
5. Servlet 使用
package servlets; import java.io.IOException; import java.util.Set; import javax.servlet.ServletException; import javax.servlet.http.HttpServlet; import javax.servlet.http.HttpServletRequest; import javax.servlet.http.HttpServletResponse; import autocomplete.EffectiveWordMatcher; import autocomplete.PrefixMatcher; public class AutoCompleteServlet extends HttpServlet { protected PrefixMatcher wordMatcher = new EffectiveWordMatcher(); public void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException { doPost(req, resp); } public void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException { resp.setContentType("text/plain;charset=UTF8"); String inputText = req.getParameter("inputText"); Set<String> matchers = wordMatcher.obtainMatchedWords(inputText); StringBuilder sb = new StringBuilder(); for (String m: matchers) { sb.append(m); sb.append(' '); } sb.deleteCharAt(sb.length()-1); resp.getWriter().print(sb.toString()); } }
6. 前端交互
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> <title>输入自动补全功能演示</title> <script type="text/javascript" src="jquery-1.10.2.min.js"></script> <script> $(document).ready(function() { var searchMatchers = function(keycode) { var inputText = $('#inputText').val(); $.ajax( { url: 'servlets/AutoCompleteServlet', data: { 'inputText': inputText }, dataType: 'text', timeout: 10000, success: function(data) { if (keycode == 13) { // Enter $('#inputText').val($('#matchedKeywords').val()); $('#resultRegion').empty(); return ; } if (keycode == 38 || keycode == 40) { // 上下箭头 $('#matchedKeywords').trigger('focus'); return ; } $('#resultRegion').empty(); var matchers = data.split(' '); if (matchers.length > 0 && inputText != '') { $('#resultRegion').append('<select id="matchedKeywords"></select>') $('#matchedKeywords').append('<option value="' + '' + '">' + '' + '</option>'); for (i=0; i<matchers.length; i++) { var keyword = matchers[i]; $('#matchedKeywords').append('<option value="' + keyword + '">' + keyword + '</option>'); } $('#matchedKeywords').attr('size', matchers.length+1); $('#matchedKeywords').height(20*(matchers.length+1)); $('#matchedKeywords').click(function() { $('#inputText').val($('#matchedKeywords').val()); $('#resultRegion').empty(); }); } } } ); } $(this).bind("keyup", function(eventObj) { var keycode = eventObj.which; searchMatchers(keycode); }); }); </script> <style type="text/css"> #main { margin: 15% 20% 0% 25%; } #inputText { width: 450px; height: 30px; } #matchedKeywords { width: 450px; height: 25px; } #resultRegion { text-align: left; margin: 0 0 0 128px; } </style> </head> <body> <center> <div id="main"> <h3> 输入自动补全功能演示,请输入Java关键字: </h3> <input type="text" name="inputText" id="inputText" value=""/><br/> <div id="resultRegion"></div> </div> </center> </body> </html>
四、 效果图
五、 小结
在 EffectiveWordMatcher 中还可以支持根据用户输入动态添加关键字的功能, 但由于涉及到并发问题, 暂时没有做更多深入。 读者可以阅读源码自行完善, 我也会给出后续思考。