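Let's start with how Solr starts up. Solr ships with many components, which are called one after another when Solr starts; each component has a prepare() method and a process() method.
The components are called one by one in SearchHandler.java, which lives in the org.apache.solr.handler.component package under the solr/core/src/java folder.
The component structure is shown in the figure below:
The spelling correction we want to study lives in the SpellCheck Component.
SearchHandler.java defines two loops that call the prepare() and process() methods of each component; the code is as follows: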
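As a rough illustration of that two-loop pattern, here is a minimal, self-contained sketch. It is not the SearchHandler source: SketchComponent, named() and handle() are hypothetical names introduced only for this example, and the real components receive a ResponseBuilder rather than a log list.

```java
import java.util.ArrayList;
import java.util.List;

final class ComponentLoopSketch {
    // Hypothetical stand-in for a Solr search component; not the real class.
    interface SketchComponent {
        void prepare(List<String> log);
        void process(List<String> log);
    }

    // Builds a named component that records each call in the shared log.
    static SketchComponent named(String name) {
        return new SketchComponent() {
            @Override public void prepare(List<String> log) { log.add(name + ".prepare"); }
            @Override public void process(List<String> log) { log.add(name + ".process"); }
        };
    }

    // First loop: prepare() on every component.
    // Second loop: process() on every component.
    static List<String> handle(List<SketchComponent> components) {
        List<String> log = new ArrayList<>();
        for (SketchComponent c : components) {
            c.prepare(log);
        }
        for (SketchComponent c : components) {
            c.process(log);
        }
        return log;
    }

    public static void main(String[] args) {
        List<SketchComponent> components = List.of(named("query"), named("spellcheck"));
        System.out.println(handle(components));
    }
}
```

Running main prints [query.prepare, spellcheck.prepare, query.process, spellcheck.process]: every prepare() call completes before the first process() call.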
Next, let's step into SpellCheckComponent.
SpellCheckComponent has two main methods. The prepare() method:
public void prepare(ResponseBuilder rb) throws IOException {
  SolrParams params = rb.req.getParams();
  if (!params.getBool(COMPONENT_NAME, false)) {
    return;
  }
  SolrSpellChecker spellChecker = getSpellChecker(params);
  if (params.getBool(SPELLCHECK_BUILD, false)) {
    spellChecker.build(rb.req.getCore(), rb.req.getSearcher());
    rb.rsp.add("command", "build");
  } else if (params.getBool(SPELLCHECK_RELOAD, false)) {
    spellChecker.reload(rb.req.getCore(), rb.req.getSearcher());
    rb.rsp.add("command", "reload");
  }
}
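The gating in prepare() depends entirely on request parameters. The sketch below imitates that decision with a plain Map so it can run without Solr. It assumes the constants resolve to the parameter names "spellcheck", "spellcheck.build" and "spellcheck.reload", which is my understanding of Solr's spelling parameters. PrepareGateSketch, getBool() and command() are hypothetical helpers, and Boolean.parseBoolean is a simplification of Solr's own boolean parsing.

```java
import java.util.Map;

final class PrepareGateSketch {
    // Imitates params.getBool(name, false): a missing key counts as false.
    static boolean getBool(Map<String, String> params, String name) {
        return Boolean.parseBoolean(params.getOrDefault(name, "false"));
    }

    // Returns which command prepare() would run: "build", "reload" or "none".
    static String command(Map<String, String> params) {
        if (!getBool(params, "spellcheck")) {
            return "none"; // component switched off, prepare() returns early
        }
        if (getBool(params, "spellcheck.build")) {
            return "build"; // build is checked first, so it wins over reload
        } else if (getBool(params, "spellcheck.reload")) {
            return "reload";
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(command(Map.of("spellcheck", "true", "spellcheck.build", "true")));
        System.out.println(command(Map.of("spellcheck", "true", "spellcheck.reload", "true")));
        System.out.println(command(Map.of("spellcheck.build", "true")));
    }
}
```

The last call shows the early return: without spellcheck=true, a build flag on its own does nothing.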
The process() method:
public void process(ResponseBuilder rb) throws IOException {
  ......
  SolrParams params = rb.req.getParams();
  if (!params.getBool(COMPONENT_NAME, false) || spellCheckers.isEmpty()) {
    return;
  }
  boolean shardRequest = "true".equals(params.get(ShardParams.IS_SHARD));
  String q = params.get(SPELLCHECK_Q); // get the query term
  SolrSpellChecker spellChecker = getSpellChecker(params); // get the spell checker for these parameters, including its dictionary
  Collection<Token> tokens = null;
  .......
  if (tokens != null && tokens.isEmpty() == false) {
    if (spellChecker != null) {
      int count = params.getInt(SPELLCHECK_COUNT, 1);
      boolean onlyMorePopular = params.getBool(SPELLCHECK_ONLY_MORE_POPULAR, DEFAULT_ONLY_MORE_POPULAR);
      System.out.println("onlyMorePopular"+onlyMorePopular); // prints: onlyMorePopularfalse
      boolean extendedResults = params.getBool(SPELLCHECK_EXTENDED_RESULTS, false);
      System.out.println("extendedResults"+extendedResults); // prints: extendedResultsfalse
      boolean collate = params.getBool(SPELLCHECK_COLLATE, false);
      System.out.println("collate"+collate); // prints: collatetrue
      float accuracy = params.getFloat(SPELLCHECK_ACCURACY, Float.MIN_VALUE);
      Integer alternativeTermCount = params.getInt(SpellingParams.SPELLCHECK_ALTERNATIVE_TERM_COUNT);
      Integer maxResultsForSuggest = params.getInt(SpellingParams.SPELLCHECK_MAX_RESULTS_FOR_SUGGEST);
      ModifiableSolrParams customParams = new ModifiableSolrParams(); // load some of the options from the configuration file
      for (String checkerName : getDictionaryNames(params)) {
        System.out.println("getDictionaryNames(params):"+getDictionaryNames(params)); // prints: [Ljava.lang.String;@9bdb78
        customParams.add(getCustomParams(checkerName, params));
      } // look up each dictionary name in turn and add its settings to customParams (the custom parameters)
      Integer hitsInteger = (Integer) rb.rsp.getToLog().get("hits");
      System.out.println(hitsInteger); // prints: 0
      long hits = 0;
      if (hitsInteger == null) {
        hits = rb.getNumberDocumentsFound();
      } else {
        hits = hitsInteger.longValue();
        System.out.println("hits"+hits); // prints: hits0
      }
      .......
      boolean isCorrectlySpelled = hits > (maxResultsForSuggest==null ? 0 : maxResultsForSuggest); // decide whether the term is spelled correctly; the result is isCorrectlySpelled
      NamedList suggestions = toNamedList(shardRequest, spellingResult, q, extendedResults, collate, isCorrectlySpelled); // build the suggested terms (suggestions)
      if (collate) {
        addCollationsToResponse(params, spellingResult, rb, q, suggestions, spellChecker.isSuggestionsMayOverlap());
      }
      NamedList response = new SimpleOrderedMap();
      response.add("suggestions", suggestions);
      System.out.println("suggestions"+suggestions);
      // prints: suggestions{中国银航={numFound=1,startOffset=0,endOffset=4,suggestion=[中国银行]},collation={collationQuery=中国银行,hits=3,misspellingsAndCorrections={中国银航=中国银行}}}
      rb.rsp.add("spellcheck", response);
      // key "spellcheck", value of response: {suggestions={中国银航={numFound=1,startOffset=0,endOffset=4,suggestion=[中国银行]},collation={collationQuery=中国银行,hits=3,misspellingsAndCorrections={中国银航=中国银行}}}}
    } else {
      throw new SolrException(SolrException.ErrorCode.NOT_FOUND,
          "Specified dictionaries do not exist: " + getDictionaryNameAsSingleString(getDictionaryNames(params)));
    }
  }
}
The key statements are:
boolean isCorrectlySpelled = hits > (maxResultsForSuggest==null ? 0 : maxResultsForSuggest); // decide whether the term is spelled correctly; the result is isCorrectlySpelled
NamedList suggestions = toNamedList(shardRequest, spellingResult, q, extendedResults, collate, isCorrectlySpelled); // build the suggested terms (suggestions)
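To make the first of these statements concrete, the sketch below isolates the comparison in a standalone method. The expression is copied from the listing; the class name, method name and sample values are only illustrative. A query counts as correctly spelled when its hit count exceeds the threshold, and the threshold falls back to 0 when maxResultsForSuggest is not set.

```java
final class CorrectlySpelledSketch {
    // Same expression as in process(): a null threshold is treated as 0.
    static boolean isCorrectlySpelled(long hits, Integer maxResultsForSuggest) {
        return hits > (maxResultsForSuggest == null ? 0 : maxResultsForSuggest);
    }

    public static void main(String[] args) {
        // No threshold configured, no hits: treated as misspelled.
        System.out.println(isCorrectlySpelled(0, null));
        // No threshold configured, some hits: treated as correctly spelled.
        System.out.println(isCorrectlySpelled(3, null));
        // Threshold of 5 with only 3 hits: still treated as misspelled.
        System.out.println(isCorrectlySpelled(3, 5));
    }
}
```

With the hits value of 0 traced in the listing, and assuming no maxResultsForSuggest was configured, the expression is false, which is consistent with suggestions being returned for 中国银航.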
Tracing:
Before Solr has started,
- init(NamedList args) and prepare(ResponseBuilder rb) are loaded first
- inside prepare(), params.getBool(COMPONENT_NAME, false) is false, and params.getBool(SPELLCHECK_BUILD, false) is false as well
- once Solr has started, and before the browse page is entered, prepare() is called again and dictName is loaded: SpellCheckComponent.java:dictName:[Ljava.lang.String;
Solr then logs webapp=/solr path=/browse params={} hits=415 status=0 QTime=1304091 and the browse page opens.
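The [Ljava.lang.String; text in the trace is what Java produces when a String[] is concatenated into a string directly: the array's type tag followed by an identity hash, not its contents. If the goal is to see the actual dictionary names, Arrays.toString is one option. The sketch below is a standalone illustration; the dictionary names in it are hypothetical, not values taken from the trace.

```java
import java.util.Arrays;

final class ArrayPrintSketch {
    // Renders the array elements instead of the type-and-hash form.
    static String describe(String[] names) {
        return Arrays.toString(names);
    }

    public static void main(String[] args) {
        String[] dictNames = {"default", "wordbreak"}; // hypothetical dictionary names
        // Direct concatenation yields the type tag plus a hash code.
        System.out.println("dictName:" + dictNames);
        // Arrays.toString yields the elements themselves.
        System.out.println("dictName:" + describe(dictNames));
    }
}
```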
After a query term q is submitted on the browse page