Handling Elasticsearch indices with more than 10,000 documents

Problem 1: hits.total.value in search results is capped at 10000

Description: when querying an index with a SearchSourceBuilder through the RestHighLevelClient, the total count read from the hits is capped at 10000 by default.

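// Read the count from TotalHits.value instead of parsing the toString() output
String totalHits = String.valueOf(response.getHits().getTotalHits().value);
// Logging the TotalHits object itself prints the count plus "+" when it is only a lower bound
logger.info("pageQuery totalHits={}", response.getHits().getTotalHits());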

 Printed result: pageQuery totalHits=10000+ hits
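
The "+" in that output reflects how Elasticsearch 7.x reports totals: hits.total carries a value and a relation, where "eq" means the value is exact and "gte" means it is only a lower bound. The following is a minimal sketch of that interpretation in plain Java. The class and method names are hypothetical and are not part of the Elasticsearch client.

```java
public class TotalHitsSketch {

    // Hypothetical helper: true when the reported total is exact ("eq"),
    // false when it is only a lower bound ("gte").
    public static boolean isExact(String relation) {
        return "eq".equals(relation);
    }

    // Hypothetical helper: renders a total in the same shape as the log line above,
    // appending "+" when the value is only a lower bound.
    public static String describe(long value, String relation) {
        return value + (isExact(relation) ? "" : "+") + " hits";
    }

    public static void main(String[] args) {
        System.out.println(describe(10000, "gte"));
        System.out.println(describe(940000, "eq"));
    }
}
```

With the real client, the same information is available as the value and relation fields of the object returned by getTotalHits().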

 Solution:

Set "track_total_hits": true on the request.

  1. In a REST request:
POST /test_index/_search?pretty
{
  "track_total_hits":true,
  "query": {
    "match_all": {}
  }
}
  2. In the Java API:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().trackTotalHits(true);

After rerunning the code above, the log shows the exact total:

pageQuery totalHits=940000 hits

Note:

To fetch a list of results and the total count together, use _search with track_total_hits=true.
To get only the total count, use _count (which does not need track_total_hits=true), or use _search with size=0 and track_total_hits=true.
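
The three request shapes from the note can be sketched as plain strings. This is only an illustration in stdlib Java: the class and method names are hypothetical, test_index is the example index used in this post, and match_all stands in for whatever query is actually needed.

```java
public class TotalCountRequestSketch {

    // Hits plus an exact total: _search with track_total_hits=true.
    public static String searchWithTotal(String index, int size) {
        return "POST /" + index + "/_search\n"
             + "{\"track_total_hits\": true, \"size\": " + size
             + ", \"query\": {\"match_all\": {}}}";
    }

    // Total only, via _search: same request with size=0 so no hits are returned.
    public static String totalViaSearch(String index) {
        return searchWithTotal(index, 0);
    }

    // Total only, via _count: no track_total_hits parameter is needed.
    public static String totalViaCount(String index) {
        return "POST /" + index + "/_count\n"
             + "{\"query\": {\"match_all\": {}}}";
    }

    public static void main(String[] args) {
        System.out.println(searchWithTotal("test_index", 20));
        System.out.println(totalViaSearch("test_index"));
        System.out.println(totalViaCount("test_index"));
    }
}
```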

 

Problem 2: paged queries fail once from + size exceeds 10000

Exception message:

...
Caused by: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Result window is too large, from + size must be less than or equal to: [10000] but was [10100]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Result window is too large, from + size must be less than or equal to: [10000] but was [10100]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.]];
...
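
The rule stated in the exception message is that from + size must not exceed index.max_result_window, which is 10000 by default. The sketch below checks that rule on the client side before a request is sent. The class and method names are hypothetical, and the values mirror the message above (from=10000 with size=100 gives 10100).

```java
public class ResultWindowSketch {

    // Default limit, as reported in the exception message.
    public static final int DEFAULT_MAX_RESULT_WINDOW = 10000;

    // True when a from/size page fits inside the result window.
    // The sum is computed as a long so that large offsets cannot overflow int.
    public static boolean fitsWindow(int from, int size, int maxResultWindow) {
        return (long) from + size <= maxResultWindow;
    }

    public static void main(String[] args) {
        // 10000 + 100 = 10100, which is over the default limit
        System.out.println(fitsWindow(10000, 100, DEFAULT_MAX_RESULT_WINDOW));
        // 9900 + 100 = 10000, which is exactly at the limit and still allowed
        System.out.println(fitsWindow(9900, 100, DEFAULT_MAX_RESULT_WINDOW));
    }
}
```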

 

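Solution:
Raise the index-level max_result_window setting, which limits from + size. The change must be submitted with PUT. For example, to allow from + size up to 1,000,000 (as the exception message notes, the scroll API is the more efficient choice for very large result sets):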

PUT /test_index/_settings
{ "max_result_window" : 1000000}
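
The same settings update could also be issued from Java. The sketch below only builds the request using the JDK's java.net.http classes (Java 11+) and does not send it. The base URL http://localhost:9200 and the class and method names are assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class MaxResultWindowRequestSketch {

    // JSON body matching the console request above.
    public static String settingsBody(int maxResultWindow) {
        return "{ \"max_result_window\" : " + maxResultWindow + " }";
    }

    // Builds (but does not send) PUT <baseUrl>/<index>/_settings.
    public static HttpRequest buildRequest(String baseUrl, String index, int maxResultWindow) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(maxResultWindow)))
                .build();
    }

    public static void main(String[] args) {
        // Assumed local node address; adjust for the real cluster.
        HttpRequest request = buildRequest("http://localhost:9200", "test_index", 1000000);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(settingsBody(1000000));
    }
}
```

To actually apply the setting, the request would be passed to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) against a reachable node.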

 

 

posted on 2013-05-07 09:41  duanxz