Elasticsearch Missing-Field Sorting and Custom Sorting

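Example 1

Requirement: documents that have a VR image must come first, followed by documents that have a picture, and finally documents that have neither a VR image nor a picture.

First, implement this with the missing-value option of the ES sort: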

{
    "from": 0,
    "size": 10,
    "query": {
       "bool": {
            "filter": [
                {
                    "term": {
                        "city_id": {
                            "value": 11,
                            "boost": 1.0
                        }
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
        }
    },
    "_source": {
        "includes": [
            "id",
            "name",
            "area",
            "addr",
            "useage",
            "price",
            "thumb_url",
            "vr_cover",
            "vr_house_id",
            "vr_url"
        ],
        "excludes": []
    },
    "sort": [
        {
            "vr_url": {
                "missing": "_last"
            }
        },
        {
            "thumb_url": {
                "missing": "_last"
            }
        }
    ],
    "track_total_hits": true  // set to true to get an exact total when more than 10,000 documents match
}

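With the JSON above, documents that have a VR image come first, then documents that have a picture, and documents with neither come last. Because missing values are pinned to the end by "missing": "_last", this holds whether the sort order is ascending or descending.

The corresponding Java code: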

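import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.FieldSortBuilder;
import org.elasticsearch.search.sort.SortBuilders;

// Documents without the field are placed last
FieldSortBuilder vrSortBuilder = SortBuilders.fieldSort("vr_url").missing("_last");
FieldSortBuilder imgSortBuilder = SortBuilders.fieldSort("thumb_url").missing("_last");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder()
        .sort(vrSortBuilder)    // sort by VR image first
        .sort(imgSortBuilder);  // then by picture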

Likewise, missing values can be placed first instead of last: just replace _last with _first.
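
The ordering rule from Example 1 can also be checked offline. The following is a minimal plain-Java sketch, not Elasticsearch code: the Listing class and the missingLast and sortedNames helpers are hypothetical names introduced for illustration, and null stands in for a missing field. It only imitates the effect of "missing": "_last" on a two-level sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class MissingSortDemo {
    // Hypothetical stand-in for an indexed document; null means the field is missing.
    static final class Listing {
        final String name;
        final String vrUrl;
        final String thumbUrl;

        Listing(String name, String vrUrl, String thumbUrl) {
            this.name = name;
            this.vrUrl = vrUrl;
            this.thumbUrl = thumbUrl;
        }
    }

    // Comparator for one field that keeps missing (null) values last in either direction.
    static Comparator<String> missingLast(boolean ascending) {
        Comparator<String> byValue = ascending
                ? Comparator.<String>naturalOrder()
                : Comparator.<String>reverseOrder();
        return Comparator.nullsLast(byValue);
    }

    // Sorts by vrUrl first, then thumbUrl, and returns the names in result order.
    static List<String> sortedNames(List<Listing> docs, boolean ascending) {
        List<Listing> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator
                .comparing((Listing l) -> l.vrUrl, missingLast(ascending))
                .thenComparing(l -> l.thumbUrl, missingLast(ascending)));
        List<String> names = new ArrayList<>();
        for (Listing l : sorted) {
            names.add(l.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Listing> docs = Arrays.asList(
                new Listing("plain", null, null),
                new Listing("photo", null, "p.jpg"),
                new Listing("vrB", "b.html", "t.jpg"),
                new Listing("vrA", "a.html", null));
        System.out.println(sortedNames(docs, true));   // [vrA, vrB, photo, plain]
        System.out.println(sortedNames(docs, false));  // [vrB, vrA, photo, plain]
    }
}
```

The direction only changes the order among documents that actually have a value; the documents without one stay at the end in both runs. Swapping Comparator.nullsLast for Comparator.nullsFirst would imitate _first.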


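Example 2

Requirement: fetch data from an index so that, whether the sort is ascending or descending, documents whose price is null or 0 are always placed last.

The method from Example 1 can no longer meet this requirement. With the missing-value sort provided by ES, a field whose value is 0 counts as having a value, so documents with a price of 0 take part in the sort like any other. A custom, script-based sort is therefore introduced.

First, the ascending case: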

{
    "from": 0,
    "size": 10,
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "city_id": {
                            "value": 11,
                            "boost": 1.0
                        }
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
        }
    },
    "_source": {
        "includes": [
            "id",
            "name",
            "area",
            "addr",
            "useage",
            "price",
            "thumb_url",
            "vr_cover",
            "vr_house_id",
            "vr_url"
        ],
        "excludes": []
    },
    "sort": [
        {
            "_script": {
                "script": {
                    "source": "if(doc['price'].size()<=0){return params.factor;}else if(doc['price'].size()>0){if(Double.parseDouble(doc['price'].value)==0){return params.factor;}else if(Double.parseDouble(doc['price'].value)>0){return Double.parseDouble(doc['price'].value);}}",
                    "lang": "painless",
                    "params": {
                      "factor": 10000000
                    }
                },
                "type": "number",
                "order": "asc"
            }
        }
    ]
}

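The key to this approach is that the sort value is computed by a script. Note that the script parses doc['price'].value with Double.parseDouble, so it assumes price is indexed as a string (keyword) field. The core of the customization is the following section: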

{
    "source":
    "if(doc['price'].size()<=0){                                  // price is missing (null)
        return params.factor;                                     // substitute a sufficiently large number for null; factor comes from params below, so it can be changed per request
    }else if(doc['price'].size()>0){                              // price has a value
        if(Double.parseDouble(doc['price'].value)==0){            // price is 0
            return params.factor;                                 // substitute the same large number for 0
        }else if(Double.parseDouble(doc['price'].value)>0){       // price is greater than 0
            return Double.parseDouble(doc['price'].value);        // sort by the actual price
        }
    }",
    "lang": "painless",
    "params": {
        "factor": 10000000
    }
}
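
To see what the ascending script does to the result order, its logic can be reproduced in plain Java. This is only a sketch under stated assumptions: ascendingKey and sortAscending are hypothetical helper names, prices are modeled as strings (null meaning the field is missing) because the script parses the value with Double.parseDouble, and prices are assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class AscendingPriceKeyDemo {
    // Mirrors the ascending script for non-negative prices:
    // a missing or zero price is replaced by factor, anything else keeps its own value.
    static double ascendingKey(String price, double factor) {
        if (price == null) {
            return factor;
        }
        double value = Double.parseDouble(price);
        return value == 0 ? factor : value;
    }

    // Sorts a copy of the prices in ascending order of the computed key.
    static List<String> sortAscending(List<String> prices, double factor) {
        List<String> sorted = new ArrayList<>(prices);
        sorted.sort(Comparator.comparingDouble((String p) -> ascendingKey(p, factor)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> prices = Arrays.asList("0", "350", null, "120");
        System.out.println(sortAscending(prices, 10000000));  // [120, 350, 0, null]
    }
}
```

The sketch also makes the one constraint on factor visible: it has to be larger than every real price in the index, otherwise a genuine price above it would sort after the null and 0 entries.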

With the ascending case solved, consider the descending case:

{
    "from": 0,
    "size": 10,
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "city_id": {
                            "value": 11,
                            "boost": 1.0
                        }
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1.0
        }
    },
    "_source": {
        "includes": [
            "id",
            "name",
            "area",
            "addr",
            "useage",
            "price",
            "thumb_url",
            "vr_cover",
            "vr_house_id",
            "vr_url"
        ],
        "excludes": []
    },
    "sort": [
        {
            "_script": {
                "script": {
                    "source": "if(doc['price'].size()<=0){return 0;}else if(doc['price'].size()>0){return Double.parseDouble(doc['price'].value);}",
                    "lang": "painless"
                },
                "type": "number",
                "order": "desc"
            }
        }
    ]
}

The core of this part is:

if(doc['price'].size()<=0){                         // price is missing (null)
    return 0;                                       // substitute 0 for null in the sort
}else if(doc['price'].size()>0){                    // price has a value
    return Double.parseDouble(doc['price'].value);  // sort by the actual price
}

Java implementation

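import java.util.HashMap;
import java.util.Map;
import org.elasticsearch.script.Script;
import org.elasticsearch.script.ScriptType;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.ScriptSortBuilder;
import org.elasticsearch.search.sort.SortOrder;

// OrderDirection holds the requested direction, "asc" or "desc"
Map<String, Object> params = new HashMap<>();
String scriptStr = "";
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
ScriptSortBuilder guideSort = null;
if ("asc".equals(OrderDirection)) {
	// Ascending: a missing or zero guide_price is replaced by a large number so it sorts last
	scriptStr = "if(doc['guide_price'].size()<=0){return 10000000;}else if(doc['guide_price'].size()>0){if(Double.parseDouble(doc['guide_price'].value)==0){return 10000000;}else if(Double.parseDouble(doc['guide_price'].value)>0){return Double.parseDouble(doc['guide_price'].value);}}";
	Script script = new Script(ScriptType.INLINE, "painless", scriptStr, params);
	guideSort = new ScriptSortBuilder(script, ScriptSortBuilder.ScriptSortType.NUMBER).order(SortOrder.ASC);
	searchSourceBuilder.sort(guideSort);
} else if ("desc".equals(OrderDirection)) {
	// Descending: a missing guide_price is replaced by 0 so it sorts last together with real zeros
	scriptStr = "if(doc['guide_price'].size()<=0){return 0;}else if(doc['guide_price'].size()>0){return Double.parseDouble(doc['guide_price'].value);}";
	Script script = new Script(ScriptType.INLINE, "painless", scriptStr, params);
	// The default order is ascending, so descending must be set explicitly
	guideSort = new ScriptSortBuilder(script, ScriptSortBuilder.ScriptSortType.NUMBER).order(SortOrder.DESC);
	searchSourceBuilder.sort(guideSort);
}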
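
The descending case needs a different replacement value because a descending sort puts the largest keys first: a large stand-in number would push null prices to the top, so the script uses 0, the smallest key, instead. The plain-Java sketch below imitates that logic; descendingKey and sortDescending are hypothetical helper names, prices are modeled as strings with null meaning a missing field, and prices are assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class DescendingPriceKeyDemo {
    // Mirrors the descending script: a missing price becomes 0, anything else keeps its value.
    static double descendingKey(String price) {
        return price == null ? 0 : Double.parseDouble(price);
    }

    // Sorts a copy of the prices in descending order of the computed key.
    static List<String> sortDescending(List<String> prices) {
        List<String> sorted = new ArrayList<>(prices);
        sorted.sort(Comparator.comparingDouble((String p) -> descendingKey(p)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> prices = Arrays.asList("0", "350", null, "120");
        System.out.println(sortDescending(prices));  // [350, 120, 0, null]
    }
}
```

Real zeros need no special handling here, since 0 is already the smallest key; a negative price, if one existed, would sort after the null and 0 entries.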

With script-based custom sorting, ES can implement field-weight sorting and most other rule-based custom orderings.

posted @ 2021-10-13 17:20  阿伦啊