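Special handling when an ES index holds more than 10,000 documents
Problem 1: hits.total.value in query results is capped at 10000
Problem description: when querying an index with SearchSourceBuilder through the RestHighLevelClient, the total count read from the hits is capped at 10000 by default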
String totalHits = response.getHits().getTotalHits().toString().split(" ")[0];
logger.info("pageQuery totalHits={}", response.getHits().getTotalHits());
Output: pageQuery totalHits=10000+ hits
Solution:
Set "track_total_hits": true in the request
- REST request:
POST /test_index/_search?pretty
{ "track_total_hits": true, "query": { "match_all": {} } }
- Java API:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().trackTotalHits(true);
After rerunning the code above with this setting:
pageQuery totalHits=940000 hits
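The trailing `+` in the first printed value marks the count as a lower bound rather than an exact total, so taking the first space-separated token of that string yields `10000+`, which is not a plain number. Below is a minimal sketch of reading either printed form safely, using only the JDK. The class and method names (`TotalHitsText`, `parse`) are hypothetical helpers for illustration and are not part of the Elasticsearch client.

```java
public class TotalHitsText {

    /** Holds the number and whether it is only a lower bound ("10000+"). */
    public static final class Parsed {
        public final long value;
        public final boolean lowerBound;

        Parsed(long value, boolean lowerBound) {
            this.value = value;
            this.lowerBound = lowerBound;
        }
    }

    /** Parses text shaped like "10000+ hits" or "940000 hits". */
    public static Parsed parse(String text) {
        // First token is the number, optionally followed by '+'.
        String firstToken = text.trim().split(" ")[0];
        boolean lowerBound = firstToken.endsWith("+");
        String digits = lowerBound
                ? firstToken.substring(0, firstToken.length() - 1)
                : firstToken;
        return new Parsed(Long.parseLong(digits), lowerBound);
    }

    public static void main(String[] args) {
        Parsed capped = parse("10000+ hits");
        Parsed exact = parse("940000 hits");
        System.out.println(capped.value + " lowerBound=" + capped.lowerBound);
        System.out.println(exact.value + " lowerBound=" + exact.lowerBound);
    }
}
```

A `lowerBound` of true is a hint that the query ran without track_total_hits, so the real total may be larger than the number shown.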
Notes:
When you want the result list and the total count together, use _search with track_total_hits=true
When you only want the total count, use _count (no track_total_hits=true needed), or use _search with size=0 and track_total_hits=true
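The two count-only options can be sketched as plain HTTP requests built with the JDK's `java.net.http` classes (Java 11 or later). This is an illustration only: the base URL `http://localhost:9200`, the class name `TotalOnlyRequests`, and the `match_all` query body are assumptions, and the code builds the requests without sending them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class TotalOnlyRequests {
    // Assumption: a local cluster address; the index name is the one used in this article.
    static final String BASE_URL = "http://localhost:9200";
    static final String INDEX = "test_index";

    /** Option 1: _count returns the total without track_total_hits. */
    static HttpRequest countRequest() {
        String body = "{\"query\":{\"match_all\":{}}}";
        return jsonPost("/" + INDEX + "/_count", body);
    }

    /** Option 2: _search with size=0 returns no documents, only the total. */
    static HttpRequest searchTotalOnlyRequest() {
        String body = "{\"size\":0,\"track_total_hits\":true,\"query\":{\"match_all\":{}}}";
        return jsonPost("/" + INDEX + "/_search", body);
    }

    private static HttpRequest jsonPost(String path, String body) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + path))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Only builds and prints the requests; sending them needs a running cluster.
        HttpRequest count = countRequest();
        HttpRequest search = searchTotalOnlyRequest();
        System.out.println(count.method() + " " + count.uri());
        System.out.println(search.method() + " " + search.uri());
    }
}
```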
Problem 2: paginated queries fail when from + size exceeds 10000
Exception message
... Caused by: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Result window is too large, from + size must be less than or equal to: [10000] but was [10100]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Result window is too large, from + size must be less than or equal to: [10000] but was [10100]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.]]; ...
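The check behind this message is simple arithmetic: the request is rejected when from + size is greater than the window, which the message gives as 10000. A small sketch of that check follows. The helper names and the page size of 100 are assumptions chosen so that the sum matches the 10100 reported above.

```java
public class ResultWindowCheck {
    /** Window limit quoted in the error message. */
    static final long DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** True when the request would be rejected: from + size exceeds the window. */
    static boolean exceedsWindow(long from, long size, long maxResultWindow) {
        return from + size > maxResultWindow;
    }

    /** Offset ("from") of a 1-based page number for a fixed page size. */
    static long fromForPage(long page, long size) {
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        long size = 100; // assumed page size

        // Page 100: from = 9900, from + size = 10000, still allowed.
        long fromPage100 = fromForPage(100, size);
        System.out.println(exceedsWindow(fromPage100, size, DEFAULT_MAX_RESULT_WINDOW));

        // Page 101: from = 10000, from + size = 10100, rejected.
        long fromPage101 = fromForPage(101, size);
        System.out.println(exceedsWindow(fromPage101, size, DEFAULT_MAX_RESULT_WINDOW));
    }
}
```

Running a check like this before issuing the query lets the caller fall back to another strategy instead of catching the exception.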
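Solution:
Raise the index-level max_result_window setting. Note that the request must be sent with PUT. The error message itself recommends the scroll API as a more efficient way to fetch large data sets, so treat a larger window as a trade-off. For example, to allow a result window of up to 1,000,000 documents: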
PUT /test_index/_settings
{ "max_result_window" : 1000000}
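The same settings update can be sketched from Java with the JDK's `java.net.http` classes (Java 11 or later). The class name `MaxResultWindowUpdate`, its methods, and the base URL passed in `main` are assumptions for illustration. The code only builds the request, because sending it requires a running cluster.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class MaxResultWindowUpdate {

    /** JSON body in the same shape as the PUT example above. */
    static String settingsBody(int maxResultWindow) {
        return "{\"max_result_window\":" + maxResultWindow + "}";
    }

    /** Builds PUT {baseUrl}/{index}/_settings carrying the new window size. */
    static HttpRequest build(String baseUrl, String index, int maxResultWindow) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(maxResultWindow)))
                .build();
    }

    public static void main(String[] args) {
        // Assumption: a local cluster address.
        HttpRequest request = build("http://localhost:9200", "test_index", 1_000_000);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(settingsBody(1_000_000));
    }
}
```

To actually apply the change, the request could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and the response status checked.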