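Solr search basics (operators and their parameters)
1. The ^ operator
(1) ^ on the query string
Search: 天后王菲 ("pop diva Faye Wong"). To raise the relevance of 王菲, use the ^ operator.
天后 王菲^10.5 gives documents that contain 王菲 a larger weight and a higher score, so they rank earlier; 10.5 is the boost weight.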
(2) ^ on a field
name^10
A match on the search string in name then carries more weight than a match in content, so the score is higher too.
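The two kinds of boost above can be sketched as plain string building. This is a minimal illustration, not part of any Solr client library: the helper names `boost` and `format_qf` are hypothetical, and the field names `name` and `content` are taken from the example above.

```python
def boost(term: str, weight: float) -> str:
    """Append a ^weight boost to a single query term or field name."""
    # The 'g' format drops a trailing ".0", so 10 stays "10" and 10.5 stays "10.5".
    return f"{term}^{weight:g}"


def format_qf(field_weights: dict[str, float]) -> str:
    """Build a space-separated field list such as 'name^10 content^1'."""
    return " ".join(boost(field, w) for field, w in field_weights.items())


query = f"天后 {boost('王菲', 10.5)}"          # term-level boost in the query string
qf = format_qf({"name": 10, "content": 1})   # field-level boost
print(query)
print(qf)
```

The first string is what would go into q, the second is the kind of value a field-weight parameter such as qf takes.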
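2. The * wildcard
When the tokenizer uses maximum-length segmentation, a search for the smaller unit "海波" finds nothing if the analyzer produced only the single token "黄海波". In that case the wildcard query *海波* does return results.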
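3. The OR operator
To search for both [黄海波 视频] ("Huang Haibo video") and [黄海波], the query string can be written as: "黄海波 视频" 黄海波, or as: "黄海波 视频" OR 黄海波. The first form relies on OR being the default operator.
Note: the characters + - && || ! ( ) { } [ ] ^ " ~ * ? : / are query syntax and must be escaped with a backslash when they should be matched literally.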
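The escaping rule in the note above can be sketched as a small helper. This is an illustrative sketch, not Solr's own code: the name `escape_query_chars` is chosen here for illustration, the backslash itself is added to the list as an assumption, and `&` and `|` are escaped one character at a time, which also covers `&&` and `||`.

```python
# Characters from the note above, plus the backslash itself (an assumption).
SPECIAL_CHARS = set('\\+-&|!(){}[]^"~*?:/')


def escape_query_chars(text: str) -> str:
    """Backslash-escape every query metacharacter found in text."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_query_chars("AC/DC (live) 1+1"))
```

Only user-supplied text should go through such a helper; operators that are meant to act as operators (for example a deliberate ^10 boost) are appended afterwards.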
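4. Parentheses: grouped search
(黄奕 OR 视频) 黄海波  ==>  searches for 黄奕 AND 黄海波, or 视频 AND 黄海波 (this reading assumes AND is the default operator).
Range search: 黄海波 AND last_modified:[2015-03-06T23:59:59.999Z TO *] returns only results whose last_modified is at or after 2015-03-06T23:59:59.999Z.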
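An open-ended range clause like the one above can be generated from a timestamp. The sketch below assumes a date field named `last_modified`, as in the example; the helper names `solr_timestamp` and `modified_since` are hypothetical.

```python
from datetime import datetime, timezone


def solr_timestamp(dt: datetime) -> str:
    """Format an aware datetime in UTC with milliseconds, e.g. 2015-03-06T23:59:59.999Z."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{utc.microsecond // 1000:03d}Z"


def modified_since(field: str, start: datetime) -> str:
    """Inclusive, open-ended range clause: field:[start TO *]."""
    return f"{field}:[{solr_timestamp(start)} TO *]"


start = datetime(2015, 3, 6, 23, 59, 59, 999000, tzinfo=timezone.utc)
query = "黄海波 AND " + modified_since("last_modified", start)
print(query)
```

Square brackets make the lower bound inclusive, and `*` leaves the upper bound open.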
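5. mm in dismax
When mm is not set: if the boolean query logic is AND, mm = 100% and every token produced from the search string must appear; if the logic is OR, mm is effectively 1 and any one of the tokens is enough.
mm: the value can be a positive integer, a negative integer, a positive percentage, or a negative percentage. A positive value is the number of analyzed tokens that must appear; a negative value is the number of tokens that may be missing.
For example, mm: -2 means that any 2 tokens may be missing.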
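The four simple forms of mm can be made concrete with a small calculator. This is a sketch under stated assumptions rather than Solr's implementation: it handles only a single integer or percentage (no conditional specs), rounds percentages down, and clamps the result to the number of optional clauses. The function name `min_should_match` is hypothetical.

```python
def min_should_match(spec: str, optional_clauses: int) -> int:
    """Number of optional clauses that must match for a simple mm spec."""
    spec = spec.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Percentages are rounded down before they are used.
        amount = optional_clauses * abs(percent) // 100
        negative = percent < 0
    else:
        value = int(spec)
        amount = abs(value)
        negative = value < 0
    # Negative specs say how many clauses may be missing.
    required = optional_clauses - amount if negative else amount
    # Clamp to the valid range [0, optional_clauses].
    return max(0, min(optional_clauses, required))


for spec in ("2", "-2", "75%", "-25%"):
    print(spec, min_should_match(spec, 5))
```

With 5 tokens, mm=-2 therefore requires 3 of them, which matches the "any 2 tokens may be missing" reading above.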
6. bq (boost query)
The bq parameter specifies an additional, optional query clause that will be added to the user's main query to influence the score. For example, if you wanted to add a relevancy boost for recent documents:
bq=date:[NOW/DAY-1YEAR TO NOW/DAY] raises the relevance of documents dated within the past year.
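As a rough sketch, a dismax request carrying this bq clause can be assembled with the standard library. The host, port, and core name (`localhost:8983`, `mycore`) are placeholders, not values from these notes, and the main query `cheese` is only an example term; nothing is sent over the network.

```python
from urllib.parse import urlencode, parse_qs

params = {
    "defType": "dismax",                      # use the dismax query parser
    "q": "cheese",                            # the user's main query (example term)
    "bq": "date:[NOW/DAY-1YEAR TO NOW/DAY]",  # optional clause boosting recent docs
}
query_string = urlencode(params)

# Hypothetical host and core name, for illustration only.
url = "http://localhost:8983/solr/mycore/select?" + query_string
print(url)
```

`urlencode` percent-encodes the brackets, colon, and spaces, so the range expression reaches Solr intact.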
7. Configuration parameters explained:
(1) qf (query fields): the qf parameter introduces a list of fields, each of which is assigned a boost factor to increase or decrease that particular field's importance in the query. In other words, it attaches a boost weight to a field to raise the relevance of matches in that field.
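(2) mm (Minimum Should Match):
The value can be a positive integer, a negative integer, a positive percentage, or a negative percentage. A positive value is the number of analyzed tokens that must appear; a negative value is the number of tokens that may be missing.
For example, mm: -2 means that any 2 tokens may be missing.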
(3) pf (phrase fields): Once the list of matching documents has been identified using the fq and qf parameters, the pf parameter can be used to "boost" the score of documents in cases where all of the terms in the q parameter appear in close proximity.
The format is the same as that used by the qf parameter: a list of fields and "boosts" to associate with each of them when making phrase queries out of the entire q parameter. In short, pf names a set of fields, and a document is boosted when the whole query matches as a phrase in one of them.
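(4) ps (phrase slop): the amount of slop allowed on the phrase queries built from the pf fields.
(5) qs (query phrase slop): the amount of slop allowed on phrase queries that the user writes explicitly in the query string.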
(6) tie (tie breaker): The tie parameter specifies a float value (which should be something much less than 1) to use as tiebreaker in DisMax queries.
When a term from the user's input is tested against multiple fields, more than one field may match. If so, each field will generate a different score based on how common that word is in that field (for each document relative to all other documents). The tie parameter lets you control how much the final score of the query will be influenced by the scores of the lower scoring fields compared to the highest scoring field.
A value of "0.0" makes the query a pure "disjunction max query": that is, only the maximum scoring subquery contributes to the final score. A value of "1.0" makes the query a pure "disjunction sum query" where it doesn't matter what the maximum scoring subquery is, because the final score will be the sum of the subquery scores. Typically a low value, such as 0.1, is useful. This parameter is rarely needed in practice.
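The effect of tie can be sketched numerically. The function below follows the description above (the best field score plus tie times the remaining field scores); the name `dismax_score` and the sample scores are made up for illustration.

```python
def dismax_score(field_scores: list[float], tie: float) -> float:
    """Combine per-field scores for one term: max + tie * (sum of the others)."""
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie * others


scores = [2.0, 1.0, 0.5]  # hypothetical scores of one term in three fields
for tie in (0.0, 0.1, 1.0):
    print(tie, dismax_score(scores, tie))
```

With tie=0.0 only the best field counts, with tie=1.0 the result is the plain sum, and a small value such as 0.1 lets the weaker fields act mainly as a tiebreaker.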
(7) bq (boost query): The bq parameter specifies an additional, optional query clause that will be added to the user's main query to influence the score. For example, if you wanted to add a relevancy boost for recent documents:
q=cheese
bq=date:[NOW/DAY-1YEAR TO NOW/DAY]
(8) bf (boost function): The bf parameter specifies functions (with optional boosts) that will be used to construct FunctionQueries which will be added to the user's main query as optional clauses that will influence the score. Any function supported natively by Solr can be used, along with a boost value.
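(9) qt (query type): selects which request handler processes the query request. It usually does not need to be set; the default is standard. (Reportedly, from 4.1 on the default standard handler uses the dismax query parser, while before 4.1 it used the standard query parser; check this against the Solr version in use.)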
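(10) qf (query fields): specifies which fields Solr searches. When qf is configured in solrconfig, those fields are searched; the default search field configured in the schema, <defaultSearchField>????????</defaultSearchField>, is the fallback when qf is absent.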
(11) pf: phrase query. pf2 (phrase bigram fields): "the big pig" ---> "the big", "big pig"
pf3 (phrase trigram fields): "the nice big pig" ---> "the nice big", "nice big pig"
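The bigram and trigram splitting shown for pf2 and pf3 can be reproduced with a short helper. This only illustrates how the word groups are formed, assuming simple whitespace tokenization; the name `word_shingles` is hypothetical, and Solr's own analysis chain may tokenize differently.

```python
def word_shingles(text: str, n: int) -> list[str]:
    """Return every run of n consecutive whitespace-separated words in text."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


print(word_shingles("the big pig", 2))       # pf2-style word pairs
print(word_shingles("the nice big pig", 3))  # pf3-style word triples
```

A query with fewer than n words yields no shingles, so a pf2 or pf3 boost has nothing to apply to in that case.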
(12) bf (boost function) examples: recip(rord(myfield),1,2,3)^1.5; recip(ms(NOW,mydatefield),3.16e-11,1,1). Understanding these formulas requires reading up on FunctionQuery.
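To get a feel for the second formula, the sketch below evaluates recip by hand, using the function-query definition recip(x,m,a,b) = a / (m*x + b). The document ages are made-up inputs, and `MS_PER_YEAR` is a helper constant introduced here; 3.16e-11 is roughly 1 divided by the number of milliseconds in a year.

```python
def recip(x: float, m: float, a: float, b: float) -> float:
    """Reciprocal function: a / (m * x + b)."""
    return a / (m * x + b)


MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000  # milliseconds in a 365-day year

# Stand-in for ms(NOW, mydatefield): the age of a document in milliseconds.
for years in (0, 1, 5):
    age_ms = years * MS_PER_YEAR
    print(years, round(recip(age_ms, 3.16e-11, 1, 1), 3))
```

A brand-new document gets a boost of 1.0, a one-year-old document about half that, and the value keeps shrinking as documents age, which is why this form is commonly used as a recency boost.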