Elasticsearch error: Text fields are not optimised for operations that require per-document field data
1. Elasticsearch reports an error when running an aggregation query
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [interests] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
]
The DSL query that was sent:
GET rest_logs-*/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "requestUrl"
      }
    }
  },
  "size": 0
}
Query result:
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [requestUrl] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}]
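The error occurs because requestUrl is a text field, and text fields do not support aggregations or sorting by default. The value of requestUrl is analyzed (tokenized) when it is stored. If requestUrl were "music forestry sports", it would be split into music, forestry, and sports, and those terms would be stored in the inverted index. The inverted index maps terms to documents rather than documents to their original value, so when we aggregate or sort on requestUrl, ES has no single per-document value to work with.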
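To make the explanation above concrete, here is a toy simulation in plain Python. It is not how Elasticsearch is implemented: `analyze` is a crude stand-in for a real analyzer, the documents are invented for illustration, and all function names are hypothetical. It only shows that an analyzed field yields per-token entries, while a non-analyzed field keeps the whole value that a terms aggregation would want to bucket on.

```python
from collections import Counter

# Invented sample documents, for illustration only.
docs = [
    {"requestUrl": "music forestry sports"},
    {"requestUrl": "music sports"},
    {"requestUrl": "music forestry sports"},
]

def analyze(text):
    """Crude stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def build_inverted_index(docs, field):
    """Text-style indexing: map each term to the ids of documents containing it."""
    index = {}
    for doc_id, doc in enumerate(docs):
        for term in analyze(doc[field]):
            index.setdefault(term, set()).add(doc_id)
    return index

def terms_agg_keyword(docs, field):
    """Keyword-style aggregation: count documents per whole, untokenized value."""
    return Counter(doc[field] for doc in docs)

inverted = build_inverted_index(docs, "requestUrl")

# Bucketing on the analyzed field can only see individual tokens.
token_buckets = {term: len(ids) for term, ids in inverted.items()}

# Bucketing on the untokenized value gives one bucket per original string.
keyword_buckets = terms_agg_keyword(docs, "requestUrl")

print(token_buckets)
print(dict(keyword_buckets))
```

The original string "music forestry sports" never appears as a key in the inverted index, which is why a separate, non-analyzed copy of the value is needed for aggregations.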
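To solve this problem, ES offers two approaches:
1. Index the same string in two ways by including two fields in the document: an analyzed field for search and a non-analyzed field for sorting or aggregations. However, storing the same string twice in _source wastes space. What we really want is to pass in a single field but index it in two ways. All core field types (strings, numbers, Booleans, dates) accept a fields parameter, which leads to the second approach.
2. Add a multi-field to the field's mapping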
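The original mapping is shown below. Aggregating directly on requestUrl with this mapping raises the error above.
"requestUrl": {
  "type": "text",
  "analyzer": "english"
}
Add a keyword sub-field to the mapping. In Elasticsearch 5.0 and later, the versions that produce this error message, the old string type is split into text and keyword, so the non-analyzed sub-field uses "type": "keyword" instead of "type": "string" with "index": "not_analyzed".
"requestUrl": {
  "type": "text",
  "analyzer": "english",
  "fields": {
    "keyword": {
      "type": "keyword"
    }
  }
}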
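As a rough sketch of how a mapping like this could be applied from a script, the Python below builds the request body and defines a small helper for sending it. Several things here are assumptions rather than anything stated in this post: the typeless `PUT <index>/_mapping` endpoint (Elasticsearch 7 or later), a cluster at `http://localhost:9200`, and the hypothetical helper names `build_keyword_subfield_mapping` and `send_json`. The helper is defined but not called, since it needs a running cluster.

```python
import json
import urllib.request

def build_keyword_subfield_mapping(field, analyzer="english", subfield="keyword"):
    """Build a mapping-update body that keeps the text field and adds a keyword sub-field.

    The existing type and analyzer are repeated on the assumption that the
    update must agree with the field's current settings.
    """
    return {
        "properties": {
            field: {
                "type": "text",
                "analyzer": analyzer,
                "fields": {subfield: {"type": "keyword"}},
            }
        }
    }

def send_json(method, url, body):
    """Send a JSON request and return the decoded JSON response.

    Hypothetical helper; requires a reachable Elasticsearch cluster.
    """
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

mapping_body = build_keyword_subfield_mapping("requestUrl")
print(json.dumps(mapping_body, indent=2))
```

Against a live cluster, a call might look like `send_json("PUT", "http://localhost:9200/rest_logs-*/_mapping", mapping_body)`, with the URL adjusted to your own environment.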
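Then run the query again, this time against the requestUrl.keyword sub-field, and the problem is solved. Note that a newly added sub-field is only populated for documents indexed after the mapping change; existing documents need to be reindexed (for example with _update_by_query) before they show up in the aggregation.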
GET rest_logs-*/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "requestUrl.keyword"
      }
    }
  },
  "size": 0
}
https://blog.csdn.net/wngpenghao/article/details/123001881
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html