Elasticsearch: text fields do not support aggregations, and how to fix it

Symptom

An aggregation on data in Elasticsearch fails with an error.

Failing request:

curl -XPOST http://10.11.3.63:9200/hadoop_impala_2021-04/_search -d '
{
  "aggs": {
    "qyeryTypes": {
      "cardinality": {
        "field": "queryType"
      }
    }
  }
}'

Error response:

{
	"error": {
		"root_cause": [{
			"type": "illegal_argument_exception",
			"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
		}],
		"type": "search_phase_execution_exception",
		"reason": "all shards failed",
		"phase": "query",
		"grouped": true,
		"failed_shards": [{
			"shard": 0,
			"index": "hadoop_impala_2021-04",
			"node": "r_1YWLc-RbejCWrguDHj3g",
			"reason": {
				"type": "illegal_argument_exception",
				"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
			}
		}],
		"caused_by": {
			"type": "illegal_argument_exception",
			"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
		}
	},
	"status": 400
}

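Cause

Fielddata is disabled on text fields by default because it can consume a lot of heap memory. If you really need to aggregate on a text field, you can set fielddata=true on that field, so that fielddata is loaded into memory by uninverting the inverted index.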
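For completeness, enabling fielddata is a mapping update on the existing text field. The following is a minimal Python sketch of the request body only, assuming the field is `queryType`. The helper name `fielddata_mapping_body` is hypothetical, and nothing is sent to the cluster. The body would go in a PUT to the index's `_mapping` endpoint (on versions that still use mapping types, the type name is part of that path).

```python
import json


def fielddata_mapping_body(field):
    """Build a mapping-update body that enables fielddata on a text field.

    Hypothetical helper for illustration; it only builds the JSON body.
    """
    return {
        "properties": {
            field: {
                "type": "text",     # the field stays a text field
                "fielddata": True,  # allow fielddata to be loaded into heap
            }
        }
    }


print(json.dumps(fielddata_mapping_body("queryType"), indent=2))
```

This keeps the memory cost the error message warns about, which is why the rest of this post switches the field to keyword instead.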


Solutions

Option 1:

Append .keyword to the field being aggregated.

For example:

"aggs": {
  "answer": {
    "terms": {
      "field": "queryType.keyword"
    }
  }
}

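But this did not work here, most likely because queryType has no keyword sub-field in this index's mapping. So I went with option 2.

Option 2:

Change the mapping. Elasticsearch does not allow changing the type of an existing field, so you have to create a new index and reindex the old data into it.

  1. Create the new index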
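Whether `queryType.keyword` exists can be checked from the index mapping: the sub-field is only there if the mapping defines `keyword` under `fields`, which dynamic mapping adds for strings by default but an explicit `text` mapping does not. A minimal sketch, assuming `properties` is the `properties` object taken from a `GET <index>/_mapping` response. The helper name and both sample mappings are made up for illustration and are not the real index mapping.

```python
def has_keyword_subfield(properties, field, subfield="keyword"):
    """Return True if `field` has a multi-field `subfield` of type keyword."""
    fields = properties.get(field, {}).get("fields", {})
    return fields.get(subfield, {}).get("type") == "keyword"


# Illustrative mapping with only a plain text field.
plain_text = {"queryType": {"type": "text"}}

# Shape that dynamic mapping produces for string values by default.
dynamic_default = {
    "queryType": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }
}

print(has_keyword_subfield(plain_text, "queryType"))
print(has_keyword_subfield(dynamic_default, "queryType"))
```

The first call returns False and the second True. In the False case, aggregating on `queryType.keyword` has nothing to work with.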
curl -X PUT "10.11.3.63:9200/hadoop_impala_copy_2021-04?pretty" -d '
{
  "mappings": {
    "event": {
      "properties": {
        "impala_version": {
          "type": "text"
        },
        "query_status": {
          "type": "text"
        },
        "ddl_type": {
          "type": "text"
        },
        "queryType": {
          "type": "keyword"
        },
        "queryState": {
          "type": "keyword"
        }
      }
    }
  }
}'
  2. Reindex the data from the original index into the new index
curl -u elastic:elastic -X POST "10.11.3.63:9200/_reindex" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "hadoop_impala_2021-04"
  },
  "dest": {
    "index": "hadoop_impala_copy_2021-04"
  }
}'
  3. Run the aggregation
curl -XPOST http://10.11.3.63:9200/hadoop_impala_copy_2021-04/_search -d '
{
  "aggs": {
    "qyeryTypes": {
      "terms": {
        "field": "queryType"
      }
    }
  }
}'

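The aggregation returns the expected result.

  4. Delete the old index
curl -XDELETE 10.11.3.63:9200/hadoop_impala_2021-04
  5. Recreate the index under its original name, with the same mapping as the copy index

  6. Reindex the data back into it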

curl -u elastic:elastic -X POST "10.11.3.63:9200/_reindex" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "hadoop_impala_copy_2021-04"
  },
  "dest": {
    "index": "hadoop_impala_2021-04"
  }
}'
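The two `_reindex` calls above differ only in direction, so they can be built from one helper. A minimal Python sketch using only the standard library. The function name `build_reindex_request` is hypothetical, the request is built but not sent, and the basic auth from the curl commands is left out.

```python
import json
import urllib.request


def build_reindex_request(base_url, source_index, dest_index):
    """Build (but do not send) a POST request for the _reindex API."""
    body = {"source": {"index": source_index}, "dest": {"index": dest_index}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


BASE = "http://10.11.3.63:9200"

# Copy into the new index first, then back once the original is recreated.
copy_out = build_reindex_request(
    BASE, "hadoop_impala_2021-04", "hadoop_impala_copy_2021-04"
)
copy_back = build_reindex_request(
    BASE, "hadoop_impala_copy_2021-04", "hadoop_impala_2021-04"
)

print(copy_out.get_method(), copy_out.full_url)
```

Sending either request would be a call to `urllib.request.urlopen(...)` against a reachable cluster.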

Running the aggregation again on the reimported data returns the expected result.
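To check the result programmatically rather than by eye, the buckets of the terms aggregation can be flattened into a dictionary. A minimal sketch, assuming the standard terms-aggregation response shape (`aggregations.<name>.buckets` with `key` and `doc_count`). The helper name and the sample response are made up, and the counts are not real data from this index.

```python
def bucket_counts(response, agg_name):
    """Map each bucket key of a terms aggregation to its doc_count."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    return {b["key"]: b["doc_count"] for b in buckets}


# Made-up response in the shape a terms aggregation returns.
sample = {
    "aggregations": {
        "qyeryTypes": {
            "buckets": [
                {"key": "QUERY", "doc_count": 3},
                {"key": "DDL", "doc_count": 1},
            ]
        }
    }
}

print(bucket_counts(sample, "qyeryTypes"))
```

The aggregation name `qyeryTypes` is the one used in the requests above. An error response has no `aggregations` key, so the helper returns an empty dictionary for it.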

posted @ 2022-11-10 19:26  彬在俊