Solr -- Solr Facet 2
solr将以导航为目的的查询结果称为facet. 它并不会修改查询结果信息, 只是在查询结果上根据分类添加了count信息, 然后用户根据count信息做进一步的查询, 比如淘宝的查询列表中, 上面会表示不同的类目相关查询结果的数量.
比如搜索数码相机, 在搜索结果栏会根据厂商, 分辨率等维度列出, 这里厂商, 分辨率就是一个个facet.
然后在厂商下面会有nikon, canon, sony等品牌, 这个叫约束(constraints)
接下来是根据选择, 列出当前的导航路径, 这个叫面包屑(breadcrumb).
solr有几种facet:
普通facet, 比如从厂商品牌的维度建立fact
查询facet, 比如根据价格查询时, 将根据价格, 设置多个区间, 比如0-10, 10-20, 20-30等
日期facet, 也是一种特殊的范围查询, 比如按照月份进行facet.
facet的主要好处就是可以任意对搜索条件进行组合, 避免无效搜索, 改善搜索体验.
facet都是在查询时通过参数指定. 比如
在http api中这样写:
引用
"&facet=true&facet.field=manu"
.
java代码这样写:
- new SolrQuery("*:*").setFacet(true).addFacetField("manu");
而xml返回的结果为这样:
- <lst name="facet_fields">
- <lst name="manu">
- <int name="Canon USA">17</int>
- <int name="Olympus">12</int>
- <int name="Sony">12</int>
- <int name="Panasonic">9</int>
- <int name="Nikon">4</int>
- </lst>
- </lst>
通过java代码可以这样获取facet结果:
- List<FacetField> facetFields = queryResponse.getFacetFields();
在已有的查询基础上增加facet query, 可以这样写:
- solrQuery.addFacetQuery("quality:[* TO 10]")
比如对价格按照指定的区间进行facet, 可以这样加上facet后缀:
引用
&facet=true&facet.query=price:[* TO 100]
&facet.query=price:[100 TO 200];&facet.query=[price:200 TO 300]
&facet.query=price:[300 TO 400];&facet.query=[price:400 TO 500]
&facet.query=price:[500 TO *]
如果要对价格在400到500期间的产品做进一步的搜索, 那么可以这样写(使用了solr的过滤查询):
引用
http://localhost:8983/solr/select?q=camera &facet=on&facet.field=manu&facet.field=camera_type &fq=price:[400 to 500]
注意这里的facet field不再包含price了
如果这里对类型做进一步的查询, 那么query语句可以这样写:
引用
http://localhost:8983/solr/select?q=camera &facet=on&facet.field=manu &fq=price:[400 to 500] &fq=camera_type:SLR
facet的使用场景:
1.类目导航
2.自动提示, 需要借助一个支持多值的tag field.
3.热门关键词排行, 也需要借助一个tag field
I've gone through the related questions on this site but haven't found a relevant solution.
When querying my Solr4 index using an HTTP request of the form
&facet=true&facet.field=country
The response contains all the different countries along with counts per country.
How can I get this information using SolrJ? I have tried the following but it only returns total counts across all countries, not per country:
solrQuery.setFacet(true);
solrQuery.addFacetField("country");
The following does seem to work, but I do not want to have to explicitly set all the groupings beforehand:
solrQuery.addFacetQuery("country:usa");
solrQuery.addFacetQuery("country:canada");
Secondly, I'm not sure how to extract the facet data from the QueryResponse object.
So two questions:
1) Using SolrJ how can I facet on a field and return the groupings without explicitly specifying the groups?
2) Using SolrJ how can I extract the facet data from the QueryResponse object?
Thanks.
Update:
I also tried something similar to Sergey's response (below).
List<FacetField> ffList = resp.getFacetFields();
log.info("size of ffList:" + ffList.size());
for(FacetField ff : ffList){
String ffname = ff.getName();
int ffcount = ff.getValueCount();
log.info("ffname:" + ffname + "|ffcount:" + ffcount);
}
The above code shows ffList with size=1 and the loop goes through 1 iteration. In the output ffname="country" and ffcount is the total number of rows that match the original query.
There is no per-country breakdown here.
I should mention that on the same solrQuery object I am also calling addField and addFilterQuery. Not sure if this impacts faceting:
solrQuery.addField("user-name");
solrQuery.addField("user-bio");
solrQuery.addField("country");
solrQuery.addFilterQuery("user-bio:" + "(Apple OR Google OR Facebook)");
Update 2:
I think I got it, again based on what Sergey said below. I extracted the List object using FacetField.getValues().
List<FacetField> fflist = resp.getFacetFields();
for(FacetField ff : fflist){
String ffname = ff.getName();
int ffcount = ff.getValueCount();
List<Count> counts = ff.getValues();
for(Count c : counts){
String facetLabel = c.getName();
long facetCount = c.getCount();
}
}
In the above code the label variable matches each facet group and count is the corresponding count for that grouping.