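Facet ['fæsɪt] is hard to translate; it is best understood through examples. Solr's creator, Yonik Seeley, also offers more direct names for it: Guided Navigation and Parametric Search.
The above is a fairly straightforward Faceted Search example: brand (品牌), product features (产品特性), and seller (卖家) are each a Facet. Brands such as Apple and Lenovo are Facet values, also called Constraints, and the number attached to each Facet value is its Facet count, or Constraint count.
2. Using Facets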
q = 超级本
facet = true
facet.field = 产品特性
facet.field = 品牌
facet.field = 卖家
http://…/select?q=超级本&facet=true&wt=json
&facet.field=品牌&facet.field=产品特性&facet.field=卖家
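As a rough illustration of how the request above is assembled, here is a minimal sketch that builds the query string with one repeated facet.field parameter per facet. The class FacetQueryString and its methods are hypothetical helpers, not part of Solr or SolrJ, and the ASCII names brand, features, and seller stand in for the fields 品牌, 产品特性, and 卖家.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper (not a Solr/SolrJ class): builds the query string
// for a faceted /select request with one facet.field per facet.
class FacetQueryString {

    // URL-encodes a single parameter value as UTF-8 (needs Java 10+).
    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    static String build(String q, List<String> facetFields) {
        StringBuilder sb = new StringBuilder();
        sb.append("q=").append(enc(q)).append("&facet=true&wt=json");
        // facet.field may be repeated; each occurrence adds one facet.
        for (String field : facetFields) {
            sb.append("&facet.field=").append(enc(field));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ASCII stand-ins for the query term and field names used in the text.
        System.out.println(build("ultrabook", List.of("brand", "features", "seller")));
    }
}
```

The result is appended after the `?` of the select URL; non-ASCII field names and query terms are percent-encoded by enc.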
You can also submit a filter condition by setting fq (filter query).
q = 超级本
facet = true
fq = 价格:[8000 TO *]
facet.mincount = 1 // once fq has filtered out non-matching documents, some values would otherwise be listed with a count of 0
facet.field = 产品特性
facet.field = 品牌
facet.field = 卖家
http://…/select?q=超级本&facet=true&wt=json
&fq=价格:[8000 TO *]&facet.mincount=1
&facet.field=品牌&facet.field=产品特性&facet.field=卖家
"facet_counts": { "facet_fields": { "品牌": [ "Apple", 4, "Lenovo", 39 …], "产品特性": [ "显卡", 42, "酷睿", 38 …] …}}
If the user selects the Apple category, add another fq condition to the query and remove the facet.field that Apple belongs to.
http://…/select?q=超级本&facet=true&wt=json
&fq=价格:[8000 TO *]&fq=品牌:Apple&facet.mincount=1
&facet.field=产品特性&facet.field=卖家
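The drill-down step can be sketched as a tiny piece of state: selecting a value adds an fq and drops that field from the facet.field list. DrillDownState is a hypothetical class for illustration only, and parameter values are left unencoded so the output stays readable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of drill-down state for a faceted search UI.
class DrillDownState {
    final List<String> filterQueries = new ArrayList<>();
    final List<String> facetFields = new ArrayList<>();

    // The user picked `value` in facet `field`: filter on it and stop
    // faceting on that field, since only one value remains.
    void select(String field, String value) {
        filterQueries.add(field + ":" + value);
        facetFields.remove(field);
    }

    // Renders the facet-related parameters (unencoded, for readability).
    String toParams() {
        StringBuilder sb = new StringBuilder("facet=true");
        for (String fq : filterQueries) {
            sb.append("&fq=").append(fq);
        }
        for (String field : facetFields) {
            sb.append("&facet.field=").append(field);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        DrillDownState state = new DrillDownState();
        state.facetFields.addAll(List.of("brand", "features", "seller"));
        state.filterQueries.add("price:[8000 TO *]");
        state.select("brand", "Apple");
        System.out.println(state.toParams());
    }
}
```

Before sending, each value would still need URL encoding, as in any real request.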
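3. Facet Parameters
facet.prefix – only return constraints that start with the given prefix
facet.mincount=0 – minimum constraint count a value needs in order to be returned; defaults to 0
facet.sort=count – sort order, either count or index
facet.offset=0 – offset into the constraint list under the current sort order; can be used for paging
facet.limit=100 – maximum number of constraints returned
facet.missing=false – whether to also return a count for documents that have no value in the field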
facet.date – Deprecated, use facet.range
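To show how these parameters interact, here is an in-memory sketch that applies prefix, mincount, sort, offset, and limit to a map of counts. It only imitates the semantics described above and is not how Solr implements them; FacetParamSketch is a hypothetical name, ties under count sorting are broken by value order as an assumption, and offset and limit are assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory imitation of facet.prefix, facet.mincount,
// facet.sort, facet.offset and facet.limit.
class FacetParamSketch {

    static Map<String, Integer> apply(Map<String, Integer> counts, String prefix,
                                      int mincount, boolean sortByCount,
                                      int offset, int limit) {
        // facet.prefix and facet.mincount: filter the constraints.
        List<Map.Entry<String, Integer>> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getKey().startsWith(prefix) && e.getValue() >= mincount) {
                kept.add(e);
            }
        }
        // facet.sort: count (descending) or index (value order).
        Comparator<Map.Entry<String, Integer>> byIndex = Map.Entry.comparingByKey();
        Comparator<Map.Entry<String, Integer>> byCountDesc =
                Map.Entry.<String, Integer>comparingByValue().reversed();
        kept.sort(sortByCount ? byCountDesc.thenComparing(byIndex) : byIndex);
        // facet.offset and facet.limit: take one page of the sorted list.
        Map<String, Integer> page = new LinkedHashMap<>();
        for (int i = offset; i < kept.size() && page.size() < limit; i++) {
            page.put(kept.get(i).getKey(), kept.get(i).getValue());
        }
        return page;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Lenovo", 39);
        counts.put("Apple", 4);
        counts.put("Acer", 0);
        counts.put("Asus", 7);
        // Brands starting with "A" that have at least one hit, by count.
        System.out.println(apply(counts, "A", 1, true, 0, 10)); // prints {Asus=7, Apple=4}
    }
}
```

Paging through a long facet then amounts to keeping limit fixed and advancing offset by limit on each request.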
facet.query
Specifies a query string to use as a Facet Constraint.
facet.query = rank:[* TO 20]
facet.query = rank:[21 TO *]
<result numFound="27" ... />
...
<lst name="facet_counts">
<lst name="facet_queries">
<int name="rank:[* TO 20]">2</int>
<int name="rank:[21 TO *]">15</int>
</lst>
...
"facet_counts": { "facet_fields": { "品牌": [ "Apple", 4, "Lenovo", 10 …], "产品特性": [ "显卡", 11, "酷睿", 20 …] …}}
facet.range
http://…/select?facet=true
&facet.range=price
&facet.range.start=5000
&facet.range.end=8000
&facet.range.gap=1000
WARNING: each range is half-open, inclusive of its lower bound and exclusive of its upper bound: [start, end)
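The half-open bucketing can be illustrated with a short in-memory sketch. It assumes integer prices and that end - start is an exact multiple of gap; RangeFacetSketch is a hypothetical name, and keying each bucket by its lower bound is a choice made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of range faceting over [start, end) in steps of gap.
class RangeFacetSketch {

    static Map<Integer, Integer> count(int[] prices, int start, int end, int gap) {
        if (gap <= 0 || (end - start) % gap != 0) {
            throw new IllegalArgumentException("end - start must be a positive multiple of gap");
        }
        // One bucket per lower bound: [lower, lower + gap).
        Map<Integer, Integer> buckets = new LinkedHashMap<>();
        for (int lower = start; lower < end; lower += gap) {
            buckets.put(lower, 0);
        }
        for (int p : prices) {
            if (p < start || p >= end) {
                continue; // outside [start, end): not counted in any bucket
            }
            int lower = start + ((p - start) / gap) * gap;
            buckets.merge(lower, 1, Integer::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        int[] prices = {4999, 5000, 5999, 6000, 7999, 8000};
        // 4999 and 8000 fall outside [5000, 8000); 6000 lands in the second bucket.
        System.out.println(count(prices, 5000, 8000, 1000)); // prints {5000=2, 6000=1, 7000=1}
    }
}
```

Note how a price of exactly 6000 is counted in the 6000 bucket and not in the 5000 one, which is what the half-open interval means in practice.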
facet.pivot
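This is a new feature in Solr 4.0. Like facet, pivot is hard to grasp in the abstract, so let's go through an example. The grade values used below are 好 (good), 中 (average), and 差 (bad).
Syntax: facet.pivot=field1,field2,field3...
e.g. facet.pivot=comment_user,grade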
               | #docs | #docs grade:好 | #docs grade:中 | #docs grade:差
comment_user:1 | 10    | 8             | 1             | 1
comment_user:2 | 20    | 18            | 2             | 0
comment_user:3 | 15    | 12            | 2             | 1
comment_user:4 | 18    | 15            | 2             | 1
"facet_counts":{
  "facet_pivot":{
    "comment_user,grade":[{
        "field":"comment_user",
        "value":"1",
        "count":10,
        "pivot":[{
            "field":"grade",
            "value":"好",
            "count":8}, {
            "field":"grade",
            "value":"中",
            "count":1}, {
            "field":"grade",
            "value":"差",
            "count":1}]
      }, {
        "field":"comment_user",
        "value":"2",
        "count":20,
        "pivot":[{
        …
Without the pivot mechanism, getting the same result could take several queries:
http://...q=comment&fq=grade:好&facet=true&facet.field=comment_user
http://...q=comment&fq=grade:中&facet=true&facet.field=comment_user
http://...q=comment&fq=grade:差&facet=true&facet.field=comment_user
facet.pivot: "Computes a Matrix of Constraint Counts across multiple Facet Fields." (Yonik Seeley)
That explanation is a good one, though it is easier to understand than to translate.
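To make the "matrix of constraint counts" concrete, the sketch below computes the same kind of nested counts in memory from a list of (comment_user, grade) pairs. It only illustrates the idea and says nothing about how Solr computes pivots; PivotSketch is a hypothetical name, and the ASCII grades good, average, and bad stand in for 好, 中, and 差.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory illustration of a two-field pivot:
// outer key = first field value, inner map = second field value -> count.
class PivotSketch {

    // Each doc is {comment_user, grade}.
    static Map<String, Map<String, Integer>> pivot(String[][] docs) {
        Map<String, Map<String, Integer>> matrix = new LinkedHashMap<>();
        for (String[] doc : docs) {
            matrix.computeIfAbsent(doc[0], k -> new LinkedHashMap<>())
                  .merge(doc[1], 1, Integer::sum);
        }
        return matrix;
    }

    public static void main(String[] args) {
        String[][] docs = {
            {"1", "good"}, {"1", "good"}, {"1", "bad"}, {"2", "good"}
        };
        System.out.println(pivot(docs)); // prints {1={good=2, bad=1}, 2={good=1}}
    }
}
```

One pass over the documents fills every cell of the matrix, which is why a single pivot request can replace one filtered query per grade.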
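My own understanding of facet.pivot is that it is a grouped query over multiple dimensions. Below is code from a real project of mine that counts along two dimensions, newsType and property: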
// Imports needed by the enclosing class (SolrJ 4.x)
import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.PivotField;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;

public List<ReportNewsTypeDTO> queryNewsType(ReportQuery reportQuery) {
    // SolrServer here is the project's own wrapper class, not SolrJ's class of the same name
    HttpSolrServer solrServer = SolrServer.getInstance().getServer();
    SolrQuery sQuery = new SolrQuery();
    List<ReportNewsTypeDTO> list = new ArrayList<ReportNewsTypeDTO>();
    try {
        String para = this.initReportQueryPara(reportQuery, 0);
        sQuery.setFacet(true);
        sQuery.add("facet.pivot", "newsType,property"); // group by these two dimensions
        sQuery.setQuery(para);
        QueryResponse response = solrServer.query(sQuery, SolrRequest.METHOD.POST);
        NamedList<List<PivotField>> namedList = response.getFacetPivot();
        // Print this value to see why the checks below are structured the way they are
        System.out.println(namedList);
        if (namedList != null) {
            for (int i = 0; i < namedList.size(); i++) {
                List<PivotField> pivotList = namedList.getVal(i);
                if (pivotList == null) {
                    continue;
                }
                // One top-level PivotField per newsType value
                for (PivotField pivot : pivotList) {
                    ReportNewsTypeDTO dto = new ReportNewsTypeDTO();
                    dto.setNewsTypeId((Integer) pivot.getValue());
                    dto.setNewsTypeName(News.newsTypeMap.get((Integer) pivot.getValue()));
                    int pos = 0;
                    int neg = 0;
                    // Nested PivotFields hold the per-property counts for this newsType
                    List<PivotField> fieldList = pivot.getPivot();
                    if (fieldList != null) {
                        for (PivotField field : fieldList) {
                            int proValue = (Integer) field.getValue();
                            int count = field.getCount();
                            if (proValue == 1) {
                                pos = count;
                            } else {
                                neg = count;
                            }
                        }
                    }
                    dto.setPositiveCount(pos);
                    dto.setNegativeCount(neg);
                    list.add(dto);
                }
            }
        }
    } catch (SolrServerException e) {
        log.error("Solr query failed", e);
    } finally {
        // NOTE: only safe if getServer() hands out a new client on every call
        solrServer.shutdown();
        solrServer = null;
    }
    return list;
}
Printed value of namedList:
{newsType,property=
[
newsType:8 [4260] [property:1 [3698] null, property:0 [562] null],
newsType:1 [1507] [property:1 [1389] null, property:0 [118] null],
newsType:2 [1054] [property:1 [909] null, property:0 [145] null],
newsType:6 [715] [property:1 [581] null, property:0 [134] null],
newsType:4 [675] [property:1 [466] null, property:0 [209] null],
newsType:3 [486] [property:1 [397] null, property:0 [89] null],
newsType:7 [458] [property:1 [395] null, property:0 [63] null],
newsType:5 [289] [property:1 [263] null, property:0 [26] null],
newsType:9 [143] [property:1 [138] null, property:0 [5] null]
]
}
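That should make it clear. While writing this, it occurred to me that any grouped count, whether over one dimension or two, can be computed with facet.pivot. A handy feature.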