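Elasticsearch: text fields do not support aggregation queries, and how to fix it
Symptom
An aggregation over data in ES failed with an error.
Failing request:
curl -XPOST http://10.11.3.63:9200/hadoop_impala_2021-04/_search -H 'Content-Type: application/json' -d '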
{
"aggs": {
"queryTypes": {
"cardinality": {
"field": "queryType"
}
}
}
}'
Error response
{
"error": {
"root_cause": [{
"type": "illegal_argument_exception",
"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
}],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [{
"shard": 0,
"index": "hadoop_impala_2021-04",
"node": "r_1YWLc-RbejCWrguDHj3g",
"reason": {
"type": "illegal_argument_exception",
"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
}
}],
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
}
},
"status": 400
}
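Cause
Fielddata is disabled by default on text fields because loading it can use a large amount of memory. If you really need to aggregate on a text field, you can set fielddata=true on that field so that fielddata is loaded into memory by uninverting the inverted index.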
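For completeness, here is a minimal sketch of the fielddata=true route that the error message suggests. The helper names fielddata_body and enable_fielddata are hypothetical, and the mapping type "event" is an assumption borrowed from the new index created later in this post; the original index may use a different type name. The memory warning in the error message applies in full, so treat this as a last resort.

```shell
# Hypothetical helper: prints the mapping-update body that turns on fielddata
# for one text field. Printing it first lets you review the JSON before sending.
fielddata_body() {
  printf '{"properties":{"%s":{"type":"text","fielddata":true}}}\n' "$1"
}

# Hypothetical helper (not called here): sends the body to the cluster.
# $1 = index name, $2 = field name. The type name "event" is an assumption.
enable_fielddata() {
  curl -X PUT "10.11.3.63:9200/$1/_mapping/event" \
    -H 'Content-Type: application/json' \
    -d "$(fielddata_body "$2")"
}

# Show the body that would be sent for queryType.
fielddata_body queryType
```

Calling enable_fielddata hadoop_impala_2021-04 queryType would apply the change to the index from this post.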
Solutions
Option 1:
Append the following suffix to the name of the field being aggregated:
.keyword
For example:
"aggs": {
"answer": {
"terms": {
"field": "queryType.keyword"
}
}
}
However, this did not work in this case, so Option 2 was used.
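Whether queryType.keyword can be aggregated depends on the mapping: the sub-field exists only if queryType was mapped as a multi-field. Dynamic mapping does this for strings by default, but an explicit plain text mapping does not. The sketch below prints what such a mapping fragment looks like; the helper name multi_field_mapping is hypothetical, and the ignore_above value of 256 simply mirrors the dynamic-mapping default.

```shell
# Hypothetical helper: prints a mapping fragment for a text field that also
# carries a keyword sub-field, addressable as <field>.keyword in aggregations.
multi_field_mapping() {
  printf '{"%s":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}\n' "$1"
}

# Fragment for the field used in this post.
multi_field_mapping queryType
```

If the index mapping (GET <index>/_mapping) shows no "fields" entry like this under queryType, aggregating on queryType.keyword will not return the expected buckets.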
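Option 2:
Change the ES mapping. ES does not allow changing the type of an existing field, so a new index has to be created with the desired mapping and the data from the old index reindexed into it.
- Create the new index
curl -X PUT "10.11.3.63:9200/hadoop_impala_copy_2021-04?pretty" -H 'Content-Type: application/json' -d '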
{
"mappings": {
"event": {
"properties": {
"impala_version": {
"type": "text"
},
"query_status": {
"type": "text"
},
"ddl_type": {
"type": "text"
},
"queryType": {
"type": "keyword"
},
"queryState": {
"type": "keyword"
}
}
}
}
}'
- Reindex the data from the original index into the new index
curl -u elastic:elastic -X POST "10.11.3.63:9200/_reindex" -H 'Content-Type: application/json' -d'
{
"source": {
"index": "hadoop_impala_2021-04"
},
"dest": {
"index": "hadoop_impala_copy_2021-04"
}
}'
- Run the aggregation query
curl -XPOST http://10.11.3.63:9200/hadoop_impala_copy_2021-04/_search -H 'Content-Type: application/json' -d '
{
"aggs": {
"queryTypes": {
"terms": {
"field": "queryType"
}
}
}
}'
The aggregation query now returns normal results.
- Delete the old index
curl -XDELETE 10.11.3.63:9200/hadoop_impala_2021-04
- Recreate the index hadoop_impala_2021-04 with the new mapping
- Import the data back from the copy
curl -u elastic:elastic -X POST "10.11.3.63:9200/_reindex" -H 'Content-Type: application/json' -d'
{
"source": {
"index": "hadoop_impala_copy_2021-04"
},
"dest": {
"index": "hadoop_impala_2021-04"
}
}'
Running the aggregation query against the data once more returns normal results.
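The steps above delete the original index right after copying it, so comparing document counts before the delete is a cheap safeguard. This is a minimal sketch using the _count endpoint; the helper names extract_count, doc_count and counts_match are hypothetical, and the sample response is illustrative rather than output from the cluster in this post.

```shell
# Hypothetical helper: pulls the numeric "count" value out of a _count
# response read from stdin (works for compact and ?pretty output).
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper (not called here): asks the cluster how many
# documents an index holds. $1 = index name.
doc_count() {
  curl -s "10.11.3.63:9200/$1/_count" | extract_count
}

# Succeeds only when both counts are non-empty and identical.
counts_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Offline check with an illustrative response in the _count shape.
printf '%s\n' '{"count":42,"_shards":{"total":5,"successful":5,"failed":0}}' | extract_count
```

Against the real cluster, something like counts_match "$(doc_count hadoop_impala_2021-04)" "$(doc_count hadoop_impala_copy_2021-04)" && echo "counts match" could be run before the DELETE. Newly reindexed documents may need a refresh before _count reflects them.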