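[Solr Series, Part 2] Configuration files: solr.xml, solrconfig.xml, schema.xml
1. About the default search fields
<str name="qf"> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0 </str>
Because content is not listed in qf, it carries no weight at all: a document that contains the keyword only in content is not returned by the search. For an index built by Nutch, therefore, add a weight for content, and for url as well if needed: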
<str name="qf"> content^1.0 text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0 </str>
II. Search Handler
<requestHandler name="/browse" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <!-- VelocityResponseWriter settings -->
    <str name="wt">velocity</str>
    <str name="v.template">browse</str>
    <str name="v.layout">layout</str>
    <str name="title">Solritas_test</str>
    <!-- Query settings -->
    <str name="defType">edismax</str>
    <str name="qf">
      text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
      title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
    </str>
    <str name="df">content</str>
    <str name="mm">100%</str>
    <str name="q.alt">*:*</str>
    <str name="rows">10</str>
    <str name="fl">*,score</str>
    <!-- More Like This settings -->
    <str name="mlt.qf">
      text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
      title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
    </str>
    <str name="mlt.fl">text,features,name,sku,id,manu,cat,title,description,keywords,author,resourcename</str>
    <int name="mlt.count">3</int>
    <!-- Faceting defaults -->
    <str name="facet">on</str>
    <str name="facet.field">cat</str>
    <str name="facet.field">manu_exact</str>
    <str name="facet.field">content_type</str>
    <str name="facet.field">author_s</str>
    <str name="facet.query">ipod</str>
    <str name="facet.query">GB</str>
    <str name="facet.mincount">1</str>
    <str name="facet.pivot">cat,inStock</str>
    <str name="facet.range.other">after</str>
    <str name="facet.range">price</str>
    <int name="f.price.facet.range.start">0</int>
    <int name="f.price.facet.range.end">600</int>
    <int name="f.price.facet.range.gap">50</int>
    <str name="facet.range">popularity</str>
    <int name="f.popularity.facet.range.start">0</int>
    <int name="f.popularity.facet.range.end">10</int>
    <int name="f.popularity.facet.range.gap">3</int>
    <str name="facet.range">manufacturedate_dt</str>
    <str name="f.manufacturedate_dt.facet.range.start">NOW/YEAR-10YEARS</str>
    <str name="f.manufacturedate_dt.facet.range.end">NOW</str>
    <str name="f.manufacturedate_dt.facet.range.gap">+1YEAR</str>
    <str name="f.manufacturedate_dt.facet.range.other">before</str>
    <str name="f.manufacturedate_dt.facet.range.other">after</str>
    <!-- Highlighting defaults -->
    <str name="hl">on</str>
    <str name="hl.fl">content features title name</str>
    <str name="hl.encoder">html</str>
    <str name="hl.simple.pre"></str>
    <str name="hl.simple.post"></str>
    <str name="f.title.hl.fragsize">0</str>
    <str name="f.title.hl.alternateField">title</str>
    <str name="f.name.hl.fragsize">0</str>
    <str name="f.name.hl.alternateField">name</str>
    <str name="f.content.hl.snippets">3</str>
    <str name="f.content.hl.fragsize">200</str>
    <str name="f.content.hl.alternateField">content</str>
    <str name="f.content.hl.maxAlternateFieldLength">750</str>
    <!-- Spell checking defaults -->
    <str name="spellcheck">on</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.alternativeTermCount">2</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollationTries">5</str>
    <str name="spellcheck.maxCollations">3</str>
  </lst>
  <!-- append spellchecking to our list of components -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
2. The second-level elements include first-components, last-components, defaults, and so on.
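To make the role of each second-level element concrete, here is a minimal sketch of a handler that uses most of them. The handler name "/example", the component name "myFirstComponent", and all parameter values are assumptions chosen for illustration; appends and invariants are additional second-level lists that Solr accepts alongside defaults.

```xml
<!-- Hypothetical handler: the name and every value below are illustrative. -->
<requestHandler name="/example" class="solr.SearchHandler">
  <!-- defaults: used only when the request does not supply the parameter -->
  <lst name="defaults">
    <str name="rows">10</str>
  </lst>
  <!-- appends: added to whatever the request supplies -->
  <lst name="appends">
    <str name="fq">inStock:true</str>
  </lst>
  <!-- invariants: always enforced; the request cannot override them -->
  <lst name="invariants">
    <str name="wt">xml</str>
  </lst>
  <!-- first-components run before the default search components;
       "myFirstComponent" is a hypothetical component that would have to be
       declared elsewhere in solrconfig.xml -->
  <arr name="first-components">
    <str>myFirstComponent</str>
  </arr>
  <!-- last-components run after the default search components -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```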
3. Velocity configuration
<!-- VelocityResponseWriter settings -->
<str name="wt">velocity</str>
<str name="v.template">browse</str>
<str name="v.layout">layout</str>
<str name="title">Solritas_test</str>
- v.template: template name to use, without the .vm suffix. If not specified, "default"[.vm] will be used.
- v.template.<name>: overrides a file system template.
- debugQuery: if true, the default view displays explanations for each hit and additional debugging information in the footer.
- v.json: escapes and wraps the Velocity-generated response with the v.json parameter as a JavaScript function.
- v.layout: name of the template that wraps the main template (v.template). The main template renders to a $content variable that can be used in the layout template.
- v.base_dir: overrides the default template load path (conf/velocity/).
- v.properties: specifies a Velocity properties file to be applied, found using the Solr resource loader mechanism. If not specified, no .properties file is loaded. Example: v.properties=velocity.properties, where velocity.properties can be found using Solr's resource loader mechanism, for example in the conf/ directory (not conf/velocity, which is for templates only). The .properties file could also be located inside a JAR in the lib/ directory, or in other locations.
- v.contentType: sets the value of the HTTP response's Content-Type header. If (X)HTML pages should be encoded as UTF-8 instead of ISO-8859-1, set this option to text/xml;charset=UTF-8 for XHTML or text/html;charset=UTF-8 for HTML.
For the remaining Velocity configuration, see: http://blog.csdn.net/jediael_lu/article/details/38039267
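As a rough illustration of how several of these parameters combine, the fragment below extends the defaults from the /browse handler with v.contentType and v.properties. The file name velocity.properties is an assumption (it would have to exist in conf/), and the fragment assumes the velocity response writer is already registered in solrconfig.xml.

```xml
<lst name="defaults">
  <!-- render the response through the Velocity response writer -->
  <str name="wt">velocity</str>
  <!-- main template conf/velocity/browse.vm, wrapped by layout.vm -->
  <str name="v.template">browse</str>
  <str name="v.layout">layout</str>
  <!-- serve the page as UTF-8 encoded HTML -->
  <str name="v.contentType">text/html;charset=UTF-8</str>
  <!-- hypothetical properties file, resolved by the Solr resource loader -->
  <str name="v.properties">velocity.properties</str>
</lst>
```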
4. Query fields: qf
<str name="qf">
  text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
  title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0
</str>
5. Choosing the query parser: defType. Common values are defType=lucene and defType=edismax.
<str name="defType">edismax</str>
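6. Default search field: df
If the query does not specify a field, the field named in df is used as the default search field.
Comparing df, qf, and defaultSearchField:
(1) Use the df parameter in solrconfig.xml in place of defaultSearchField in schema.xml.
(2) df is the default field and only takes effect if qf is not defined.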
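The difference between the two can be sketched with a pair of handlers. Both handler names are hypothetical, and the field names title and content are assumed to exist in the schema.

```xml
<!-- Hypothetical handler A: qf is defined, so unfielded terms are searched
     in title and content with the given boosts; df does not take effect. -->
<requestHandler name="/with-qf" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">title^10.0 content^1.0</str>
    <str name="df">content</str>
  </lst>
</requestHandler>

<!-- Hypothetical handler B: no qf, so unfielded terms fall back to df. -->
<requestHandler name="/with-df" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">content</str>
  </lst>
</requestHandler>
```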
7. Default query
<str name="q.alt">*:*</str>
q.alt: the fallback query used when the q parameter is empty. It is usually set to *:*.
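8. mm: minimum should match. Solr supports three kinds of query clause: "must occur", "must not occur", and "may occur". They are written with a + prefix (as with AND), a - prefix, and no prefix (as with OR), respectively.
<str name="mm">100%</str>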
When dealing with queries there are 3 types of "clauses" that Lucene knows about: mandatory, prohibited, and "optional" (aka "SHOULD"). By default all words or phrases specified in the "q" param are treated as "optional" clauses unless they are preceded by a "+" or a "-". When dealing with these "optional" clauses, the "mm" option makes it possible to say that a certain minimum number of those clauses must match. Specifying this minimum number can be done in complex ways, equating to ideas like:
- At least 2 of the optional clauses must match, regardless of how many clauses there are: "2"
- At least 75% of the optional clauses must match, rounded down: "75%"
- If there are fewer than 3 optional clauses, they all must match; if there are 3 or more, then 75% must match, rounded up: "2<-25%"
- If there are fewer than 3 optional clauses, they all must match; for 3 to 5 clauses, one less than the number of clauses must match; for 6 or more clauses, 80% must match, rounded down: "2<-1 5<80%"
Full details on the supported expressions are in the Solr documentation for the mm (Minimum Should Match) parameter.
In Solr 1.4 and prior, you should basically set mm=0 if you want the equivalent of q.op=OR, and mm=100% if you want the equivalent of q.op=AND. In 3.x and trunk the default value of mm is dictated by the q.op param (q.op=AND => mm=100%; q.op=OR => mm=0%). Keep in mind that the default operator is affected by your schema.xml <solrQueryParser defaultOperator="xxx"/> entry. In older versions of Solr the default value is 100% (all clauses must match).
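As a small worked example of a conditional mm expression, the fragment below sets the "2<-25%" rule from the list above as a handler default. One detail that is easy to miss: inside solrconfig.xml the "<" character has to be written as &lt;, otherwise the file is not well-formed XML. The clause counts in the comment are worked out by hand from the rule, not taken from Solr output.

```xml
<lst name="defaults">
  <str name="defType">edismax</str>
  <!-- "2<-25%", with "<" escaped for XML:
       1 or 2 optional clauses: all of them must match;
       4 clauses: 25% of 4 = 1 may be missing, so 3 must match;
       5 clauses: 25% of 5 = 1.25, rounded down to 1, so 4 must match;
       8 clauses: 25% of 8 = 2 may be missing, so 6 must match. -->
  <str name="mm">2&lt;-25%</str>
</lst>
```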
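9. Number of rows returned per page
<str name="rows">10</str>
10. Set of fields returned
<str name="fl">*,score</str>
fl: a comma-separated list that specifies which fields are returned for each document in the results. The default is "*", meaning all fields. The setting above returns all fields plus score.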
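To return only a few fields instead of everything, the list can be narrowed. The sketch below assumes the schema defines id, title, and url (as a Nutch-generated index typically would); adjust the names to your own schema.

```xml
<lst name="defaults">
  <!-- return at most 10 documents per page -->
  <str name="rows">10</str>
  <!-- return only these three stored fields plus the relevance score;
       the field names are assumptions about the schema -->
  <str name="fl">id,title,url,score</str>
</lst>
```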
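11. Sorting the results
(1) The field used for sorting must be declared with indexed="true".
(2) <str name="sort">tstamp asc</str>
If this element is placed inside <lst name="defaults">, it sets a default value that a query can still override.
If it is placed inside <lst name="invariants">, a query cannot override it.
The same should apply to the other parameters.
Reference: http://stackoverflow.com/questions/24966924/how-to-change-the-default-rank-field-from-score-to-other-filed-in-solr/24971353#24971353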
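The two placements of sort can be compared side by side. Both handler names below are hypothetical; tstamp is the field used in the example above and is assumed to be indexed.

```xml
<!-- Hypothetical handler: sort is only a default, so a request with
     sort=score desc would replace it. -->
<requestHandler name="/sort-default" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="sort">tstamp asc</str>
  </lst>
</requestHandler>

<!-- Hypothetical handler: sort is an invariant, so a sort parameter
     sent with the request is ignored. -->
<requestHandler name="/sort-fixed" class="solr.SearchHandler">
  <lst name="invariants">
    <str name="sort">tstamp asc</str>
  </lst>
</requestHandler>
```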