How does Solr compute the score?
Solr computes the score of a document for a query from two parts:
- Lucene's scoring model
- Boost
Lucene's scoring model includes the following factors:
1. tf - Term Frequency. The frequency with which a term appears in a document. Given a search query, the higher the term frequency, the higher the document score.
2. idf - Inverse Document Frequency. The rarer a term is across all documents in the index, the higher its contribution to the score.
3. coord - Coordination Factor. When a query contains multiple terms, the more of those terms a document matches, the higher its score.
4. fieldNorm - Field length normalization. The more words a field contains, the lower its score. This factor penalizes documents with longer field values. In other words, a match in a shorter field scores higher than a match in a longer field.
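
To make the interaction of these four factors concrete, here is a minimal Python sketch. It is loosely modeled on Lucene's classic TF-IDF similarity (square-root tf, logarithmic idf, inverse-square-root length norm), but it is not Lucene's actual implementation: query normalization, boosts, and norm encoding are left out, and all function names are illustrative assumptions.

```python
import math


def tf(freq):
    # Term frequency: grows with the number of occurrences, but sub-linearly.
    return math.sqrt(freq)


def idf(doc_freq, num_docs):
    # Inverse document frequency: rarer terms (small doc_freq) weigh more.
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def field_norm(num_terms):
    # Length normalization: longer fields get a smaller factor.
    return 1.0 / math.sqrt(num_terms) if num_terms else 0.0


def score(query_terms, field_tokens, doc_freqs, num_docs):
    # Hypothetical helper: scores one field of one document against a query.
    matched = 0
    total = 0.0
    norm = field_norm(len(field_tokens))
    for term in query_terms:
        freq = field_tokens.count(term)
        if freq == 0:
            continue
        matched += 1
        total += tf(freq) * idf(doc_freqs[term], num_docs) ** 2 * norm
    # coord: fraction of the query terms that the document matched.
    coord = matched / len(query_terms) if query_terms else 0.0
    return coord * total


# Made-up index statistics: 10 documents, "solr" in 2 of them, "score" in 1.
doc_freqs = {"solr": 2, "score": 1}
short_field = ["solr", "score"]
long_field = ["solr", "score", "a", "b", "c", "d", "e", "f"]
partial_field = ["solr", "other"]

short_score = score(["solr", "score"], short_field, doc_freqs, 10)
long_score = score(["solr", "score"], long_field, doc_freqs, 10)
partial_score = score(["solr", "score"], partial_field, doc_freqs, 10)
```

With these inputs the short field outscores the long one (fieldNorm), and the document matching both terms outscores the one matching only "solr" (coord and idf).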
Boosts can be divided into index-time boosts and query-time boosts:
Index-time boosts are applied when adding documents, and apply to the entire document or to specific fields.
Query-time boosts are applied when constructing a search query, and apply to specific fields.
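
As an illustration of a query-time boost, the sketch below builds request parameters that weight matches in one field more heavily than another, using the `field^boost` syntax in the `qf` parameter of Solr's Extended DisMax query parser. The field names `title` and `body` and the helper `build_query_params` are assumptions made for this example; only the parameter dictionary is built, no request is sent.

```python
from urllib.parse import urlencode


def build_query_params(user_query, field_boosts):
    # Hypothetical helper: turn {"title": 2.0, ...} into "title^2.0 ...".
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {
        "q": user_query,
        "defType": "edismax",  # use the Extended DisMax query parser
        "qf": qf,              # fields to search, each with its boost
        "fl": "*,score",       # return the computed score with each document
    }


# Assumed schema: a match in "title" counts twice as much as one in "body".
params = build_query_params("solr scoring", {"title": 2.0, "body": 1.0})
query_string = urlencode(params)
```

The resulting `query_string` can be appended to a collection's `/select` URL. Adjusting the numbers in `field_boosts` changes the ranking without reindexing, which is the main practical advantage of boosting at query time.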