Elasticsearch: text fields do not support aggregations, and how to fix it

 


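Symptom

Running an aggregation on data in ES fails with an error.


Failing request:

curl -XPOST http://10.11.3.63:9200/hadoop_impala_2021-04/_search -H 'Content-Type: application/json' -d ' { "aggs": { "queryTypes": { "cardinality": { "field": "queryType" } } } }'

Error response: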

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "hadoop_impala_2021-04",
        "node": "r_1YWLc-RbejCWrguDHj3g",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
    }
  },
  "status": 400
}

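Cause

Fielddata is disabled by default on text fields because building it can use a lot of memory. If you really need to aggregate on a text field, you can set fielddata=true on that field, which loads fielddata into memory by uninverting the inverted index.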
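As a minimal sketch of what the error message suggests, fielddata can be switched on for an existing text field through the put-mapping API. The host, index, and the type name event are taken from the commands in this post; the typed path _mapping/event assumes a 5.x/6.x cluster (7.x and later drop the type segment), and ES_URL, INDEX, DOC_TYPE and the function names are only illustrative.

```shell
#!/bin/sh
# Assumed values, copied from the commands in this post; override as needed.
ES_URL="${ES_URL:-http://10.11.3.63:9200}"
INDEX="${INDEX:-hadoop_impala_2021-04}"
DOC_TYPE="${DOC_TYPE:-event}"   # mapping type; assumption: a 5.x/6.x cluster

# Print the mapping update that turns on fielddata for queryType.
# If the field already has other settings (for example sub-fields),
# they should be repeated here alongside "fielddata".
fielddata_body() {
  cat <<'EOF'
{ "properties": { "queryType": { "type": "text", "fielddata": true } } }
EOF
}

# Send the update; nothing reaches the cluster until this is called.
enable_fielddata() {
  curl -s -X PUT "$ES_URL/$INDEX/_mapping/$DOC_TYPE" \
    -H 'Content-Type: application/json' \
    -d "$(fielddata_body)"
}
```

Calling enable_fielddata applies the change. As the error message warns, fielddata lives in memory, so this route is probably best kept for fields with few distinct values.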


Solution

Option 1:

Append .keyword to the field being aggregated.

For example:

"aggs": { "answer": { "terms": { "field": "queryType.keyword" } } }

This did not work in this case, so I moved on to option 2.
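One possible reason the .keyword approach fails is that queryType.keyword only exists when the field's mapping defines a keyword sub-field. Dynamic mapping adds one to string fields by default, but an explicitly declared text field does not get it. A rough way to check, assuming the host and index used in this post (the function names are illustrative, and the grep is a crude stand-in for a real JSON parser):

```shell
#!/bin/sh
# Assumed values, copied from the commands in this post; override as needed.
ES_URL="${ES_URL:-http://10.11.3.63:9200}"
INDEX="${INDEX:-hadoop_impala_2021-04}"

# Fetch the mapping of a single field; nothing is sent until this is called.
get_field_mapping() {
  curl -s -X GET "$ES_URL/$INDEX/_mapping/field/$1"
}

# Read a field mapping on stdin and succeed if it declares a sub-field
# named "keyword", i.e. a "keyword" key whose value is an object.
has_keyword_subfield() {
  grep -q '"keyword" *: *{'
}
```

For example, get_field_mapping queryType | has_keyword_subfield && echo "queryType.keyword is available" reports whether the sub-field is there to aggregate on.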

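Option 2:

Change the ES mapping. ES does not allow changing the type of an existing field, so you have to create a new index with the right mapping and copy the data from the old index into it.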

  1. Create the new index, mapping queryType as keyword
curl -X PUT "10.11.3.63:9200/hadoop_impala_copy_2021-04?pretty" -H 'Content-Type: application/json' -d ' { "mappings": { "event": { "properties": { "impala_version": { "type": "text" }, "query_status": { "type": "text" }, "ddl_type": { "type": "text" }, "queryType": { "type": "keyword" }, "queryState": { "type": "keyword" } } } } }'
  2. Copy the data from the original index into the new index
curl -u elastic:elastic -X POST "10.11.3.63:9200/_reindex" -H 'Content-Type: application/json' -d' { "source": { "index": "hadoop_impala_2021-04" }, "dest": { "index": "hadoop_impala_copy_2021-04" } }'
  3. Run the aggregation
curl -XPOST http://10.11.3.63:9200/hadoop_impala_copy_2021-04/_search -H 'Content-Type: application/json' -d ' { "aggs": { "queryTypes": { "terms": { "field": "queryType" } } } }'

The aggregation now returns results normally.

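  4. Delete the old index
curl -XDELETE 10.11.3.63:9200/hadoop_impala_2021-04
  5. Recreate the mapping (create hadoop_impala_2021-04 again with the same mapping used for the new index)

  6. Copy the data back

curl -u elastic:elastic -X POST "10.11.3.63:9200/_reindex" -H 'Content-Type: application/json' -d' { "source": { "index": "hadoop_impala_copy_2021-04" }, "dest": { "index": "hadoop_impala_2021-04" } }'

Running the aggregation again on the data under the original index name returns results normally!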
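Because the procedure deletes the old index once the copy exists, it may be worth confirming beforehand that the reindex copied every document. A minimal sketch using the _count API, assuming the host and index names from this post; the helper names are illustrative, and the count is extracted with sed rather than a JSON parser to keep the example dependency-free.

```shell
#!/bin/sh
# Assumed value, copied from the commands in this post; override as needed.
ES_URL="${ES_URL:-http://10.11.3.63:9200}"

# Pull the numeric "count" value out of a _count response read from stdin.
extract_count() {
  sed -n 's/.*"count" *: *\([0-9][0-9]*\).*/\1/p'
}

# Ask the cluster how many documents an index holds.
doc_count() {
  curl -s -X GET "$ES_URL/$1/_count" | extract_count
}

# Succeed only when both indices report the same, non-empty count.
same_doc_count() {
  old=$(doc_count "$1")
  new=$(doc_count "$2")
  [ -n "$old" ] && [ "$old" = "$new" ]
}
```

For example, same_doc_count hadoop_impala_2021-04 hadoop_impala_copy_2021-04 && echo "counts match" could be run before the DELETE request is sent.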



Author: 彬在俊
Link: https://www.cnblogs.com/erlou96/p/16878359.html
Copyright notice: unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting.