Symptom

An aggregation on data stored in Elasticsearch fails with an error.

The failing request:

curl -XPOST http://10.11.3.63:9200/hadoop_impala_2021-04/_search -d '
{
  "aggs": {
    "queryTypes": {
      "cardinality": {
        "field": "queryType"
      }
    }
  }
}'

Error response

{
	"error": {
		"root_cause": [{
			"type": "illegal_argument_exception",
			"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
		}],
		"type": "search_phase_execution_exception",
		"reason": "all shards failed",
		"phase": "query",
		"grouped": true,
		"failed_shards": [{
			"shard": 0,
			"index": "hadoop_impala_2021-04",
			"node": "r_1YWLc-RbejCWrguDHj3g",
			"reason": {
				"type": "illegal_argument_exception",
				"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
			}
		}],
		"caused_by": {
			"type": "illegal_argument_exception",
			"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [queryType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
		}
	},
	"status": 400
}

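Cause

Fielddata is disabled by default on text fields because it can use a lot of memory. If you really need to aggregate on a text field, you can set fielddata=true on that field, so that fielddata is loaded into memory by uninverting the inverted index.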
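
If memory use is acceptable, the fielddata=true route described above could look roughly like the sketch below. It assumes Elasticsearch 5.x/6.x style mappings with a mapping type, and that the type in the existing index is named event (borrowed from the mapping used later in this post, so check your own index first). ES_HOST, INDEX, DOC_TYPE and the function names are illustrative, not part of the original setup.

```shell
# Assumed values: adjust host, index and mapping type to your cluster.
ES_HOST="${ES_HOST:-10.11.3.63:9200}"
INDEX="hadoop_impala_2021-04"
DOC_TYPE="event"   # assumption: mapping type of the existing index

# Print the mapping update that turns on fielddata for the text field.
fielddata_body() {
  cat <<'EOF'
{
  "properties": {
    "queryType": {
      "type": "text",
      "fielddata": true
    }
  }
}
EOF
}

# Send the update to the put-mapping endpoint (defined only, not run here).
enable_fielddata() {
  curl -XPUT "http://${ES_HOST}/${INDEX}/_mapping/${DOC_TYPE}" \
    -H 'Content-Type: application/json' \
    -d "$(fielddata_body)"
}
```

Calling enable_fielddata applies the change. On Elasticsearch 7 and later, mapping types are gone, so the URL would end at /_mapping. Aggregations on an analyzed text field operate on individual tokens rather than whole values, which is one more reason the keyword-based options below are usually preferable.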


Solutions

Option 1:

Append .keyword to the name of the field being aggregated.

For example:

"aggs": {
  "answer": {
    "terms": {
      "field": "queryType.keyword"
    }
  }
}

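This did not work here, presumably because the existing mapping defines queryType as plain text without a keyword sub-field. So Option 2 was used instead.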
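
A .keyword sub-field only exists when the mapping defines one (dynamic mapping adds it to string fields by default, but an explicit text mapping does not). A minimal sketch for checking this before relying on Option 1 follows. ES_HOST, INDEX and both function names are assumptions made for illustration, and the grep pattern is a rough text match, not a real JSON parser.

```shell
# Assumed values: adjust to your cluster.
ES_HOST="${ES_HOST:-10.11.3.63:9200}"
INDEX="hadoop_impala_2021-04"

# Fetch the index mapping (defined only, not run here).
show_mapping() {
  curl -s -XGET "http://${ES_HOST}/${INDEX}/_mapping?pretty"
}

# Rough text check: does the mapping JSON on stdin define a
# keyword sub-field for the field named in $1?
# Whitespace is stripped first so the pattern matches on a single line.
has_keyword_subfield() {
  tr -d ' \n\t' |
    grep -q "\"$1\":{[^}]*\"fields\":{\"keyword\":{\"type\":\"keyword\""
}
```

For example, show_mapping | has_keyword_subfield queryType && echo "queryType.keyword exists". If the check fails, queryType.keyword cannot be aggregated on, and the field has to be remapped as in Option 2.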

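Option 2:

Change the index mapping. Elasticsearch does not allow changing the type of an existing field, so a new index has to be created and the data from the old index copied into it.

Create the new index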

curl -X PUT "10.11.3.63:9200/hadoop_impala_copy_2021-04?pretty" -d '
{
        "mappings": {
                "event": {
                        "properties": {
                                "impala_version": {
                                        "type": "text"
                                },
                                "query_status": {
                                        "type": "text"
                                },
                                "ddl_type": {
                                        "type": "text"
                                },
                                "queryType": {
                                        "type": "keyword"
                                },
                                "queryState": {
                                        "type": "keyword"
                                }
                        }
                }
        }
}'

Copy the data from the original index into the new index

curl -u elastic:elastic -X POST "10.11.3.63:9200/_reindex" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "hadoop_impala_2021-04"
  },
  "dest": {
    "index": "hadoop_impala_copy_2021-04"
  }
}'
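
Before the old index is touched, it is worth confirming that the copy is complete. The sketch below compares document counts using the _count API. ES_HOST and the function names are hypothetical, credentials are left out (add -u to curl if your cluster requires them, as the reindex call above does), and the sed pattern assumes the compact response format that _count returns by default.

```shell
# Assumed value: adjust to your cluster.
ES_HOST="${ES_HOST:-10.11.3.63:9200}"

# Pull the numeric "count" value out of a _count response on stdin.
extract_count() {
  sed -n 's/.*"count":\([0-9][0-9]*\).*/\1/p'
}

# Ask Elasticsearch how many documents an index holds (defined only, not run here).
doc_count() {
  curl -s "http://${ES_HOST}/$1/_count" | extract_count
}

# Compare source and destination counts after the reindex.
compare_counts() {
  src="$(doc_count hadoop_impala_2021-04)"
  dst="$(doc_count hadoop_impala_copy_2021-04)"
  if [ -n "$src" ] && [ "$src" = "$dst" ]; then
    echo "counts match: $src"
  else
    echo "counts differ or unavailable: source=$src dest=$dst"
  fi
}
```

Running compare_counts shortly after the reindex may show a lower destination count until the new index has been refreshed, so a mismatch is worth re-checking after a few seconds.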

Run the aggregation

curl -XPOST http://10.11.3.63:9200/hadoop_impala_copy_2021-04/_search -d '
{
  "aggs": {
    "queryTypes": {
      "terms": {
        "field": "queryType"
      }
    }
  }
}'

The aggregation now returns results normally.

Delete the old index

curl -XDELETE 10.11.3.63:9200/hadoop_impala_2021-04

Recreate the index under its original name with the new mapping
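
The post does not show the command for this step. A minimal sketch, assuming it is the same create request as for the copy index with only the index name changed, and with queryType mapped as keyword (which is what the terms aggregation needs). ES_HOST, INDEX and the function names are illustrative.

```shell
# Assumed values: adjust to your cluster.
ES_HOST="${ES_HOST:-10.11.3.63:9200}"
INDEX="hadoop_impala_2021-04"

# Print the index body, using the same fields as the copy index.
index_body() {
  cat <<'EOF'
{
  "mappings": {
    "event": {
      "properties": {
        "impala_version": { "type": "text" },
        "query_status": { "type": "text" },
        "ddl_type": { "type": "text" },
        "queryType": { "type": "keyword" },
        "queryState": { "type": "keyword" }
      }
    }
  }
}
EOF
}

# Create the index under the original name (defined only, not run here).
recreate_index() {
  curl -X PUT "http://${ES_HOST}/${INDEX}?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(index_body)"
}
```

Calling recreate_index must happen after the old index has been deleted, otherwise Elasticsearch rejects the request because the index already exists.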
Import the data back

curl -u elastic:elastic -X POST "10.11.3.63:9200/_reindex" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "hadoop_impala_copy_2021-04"
  },
  "dest": {
    "index": "hadoop_impala_2021-04"
  }
}'

Running the aggregation again on the restored index returns normal results.


Author: 彬在俊
Link: https://www.cnblogs.com/erlou96/p/16878359.html
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting.