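(Repost) Elasticsearch Analytics and Aggregations
Elasticsearch is not only suited to full-text search; its aggregation features are also handy for analytics. The examples below walk through the most common aggregations.
1. Preparing the data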
{"index":{ "_index": "books", "_type": "IT", "_id": "1" }}
{"id":"1","title":"Java编程思想","language":"java","author":"Bruce Eckel","price":70.20,"year": 2007,"description":"Java学习必读经典,殿堂级著作!赢得了全球程序员的广泛赞誉。"}
{"index":{ "_index": "books", "_type": "IT", "_id": "2" }}
{"id":"2","title":"Java程序性能优化","language":"java","author":"葛一鸣","price":46.50,"year": 2012,"description":"让你的Java程序更快、更稳定。深入剖析软件设计层面、代码层面、JVM虚拟机层面的优化方法"}
{"index":{ "_index": "books", "_type": "IT", "_id": "3" }}
{"id":"3","title":"Python科学计算","language":"python","author":"张若愚","price":81.40,"year": 2016,"description":"零基础学python,光盘中作者独家整合开发winPython运行环境,涵盖了Python各个扩展库"}
{"index":{ "_index": "books", "_type": "IT", "_id": "4" }}
{"id":"4","title":"Python基础教程","language":"python","author":"张若愚","price":54.50,"year": 2014,"description":"经典的Python入门教程,层次鲜明,结构严谨,内容翔实"}
{"index":{ "_index": "books", "_type": "IT", "_id": "5" }}
{"id":"5","title":"JavaScript高级程序设计","language":"javascript","author":"Nicholas C.Zakas","price":66.40,"year":2012,"description":"JavaScript技术经典名著"}
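Save these 5 records in books.json. The bulk API expects one JSON object per line (an action line followed by its document), and the file must end with a newline. Then import them in one batch: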
curl -XPOST "http://localhost:9200/_bulk?pretty" --data-binary @books.json
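Because the bulk format is line-oriented, a file whose records run together on one line will be rejected. The following is a minimal sketch, not part of the original article, of a local check that could be run on the contents of books.json before sending it. The function name `check_bulk_payload` is hypothetical, and the check assumes every action is an `index` action, as in this article.

```python
import json


def check_bulk_payload(text):
    """Return the number of documents in a bulk payload, or raise ValueError.

    Assumption: every action is an "index" action followed by one source line.
    """
    if not text.endswith("\n"):
        raise ValueError("bulk payload must end with a newline")
    lines = text[:-1].split("\n")
    if len(lines) % 2 != 0:
        raise ValueError("expected pairs of action line and source line")
    for number, line in enumerate(lines, start=1):
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number} is not valid JSON: {exc}") from exc
        if not isinstance(obj, dict):
            raise ValueError(f"line {number} is not a JSON object")
        # Odd-numbered lines are action lines, even-numbered lines are documents.
        if number % 2 == 1 and "index" not in obj:
            raise ValueError(f"line {number} should be an index action")
    return len(lines) // 2


# A shortened one-document payload in the same shape as books.json.
sample = (
    '{"index":{"_index":"books","_type":"IT","_id":"1"}}\n'
    '{"id":"1","language":"java","price":70.20,"year":2007}\n'
)
print(check_bulk_payload(sample))
```

To check the real file, read it with `open("books.json", encoding="utf-8").read()` and pass the result to the function.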
2. Group By (counting documents per group)
Run:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{ "size": 0, "aggs": { "per_count": { "terms": { "field": "language" } } } }'
Result:
{ "took" : 3, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "per_count" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "java", "doc_count" : 2 }, { "key" : "python", "doc_count" : 2 }, { "key" : "javascript", "doc_count" : 1 } ] } } }
Grouped by programming language, there are 2 Java books, 2 Python books, and 1 JavaScript book.
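As a local cross-check of the bucket counts, the same grouping can be sketched in plain Python. This only mirrors what the terms aggregation counts for the five sample books; it does not talk to Elasticsearch, and the variable names are illustrative.

```python
from collections import Counter

# "language" field of the five sample books, in document-id order.
languages = ["java", "java", "python", "python", "javascript"]

# One bucket per distinct value, with the number of documents in each.
counts = Counter(languages)

# most_common() lists the largest buckets first.
for key, doc_count in counts.most_common():
    print(key, doc_count)
```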
3. Max
Run the following to find the highest price:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{ "size": 0, "aggs": { "max_price": { "max": { "field": "price" } } } }'
Result:
{ "took" : 2, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "max_price" : { "value" : 81.4 } } }
4. Min
Find the price of the cheapest book:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{ "size": 0, "aggs": { "min_price": { "min": { "field": "price" } } } }'
Result:
{ "took" : 3, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "min_price" : { "value" : 46.5 } } }
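The max and min aggregations return only a value, not the matching document. As a rough local cross-check on the five sample books, the sketch below finds both extremes and also shows which title each belongs to. It is an illustration in plain Python, not something Elasticsearch runs.

```python
# (title, price) pairs for the five sample books.
books = [
    ("Java编程思想", 70.20),
    ("Java程序性能优化", 46.50),
    ("Python科学计算", 81.40),
    ("Python基础教程", 54.50),
    ("JavaScript高级程序设计", 66.40),
]

# Compare books by their price (second element of each pair).
cheapest = min(books, key=lambda book: book[1])
priciest = max(books, key=lambda book: book[1])

print("min:", cheapest)
print("max:", priciest)
```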
5. Average
Group the books by language and compute the average price within each group:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{ "size": 0, "aggs": { "per_count": { "terms": { "field": "language" }, "aggs": { "avg_price": { "avg": { "field": "price" } } } } } } '
Result:
{ "took" : 4, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "per_count" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "java", "doc_count" : 2, "avg_price" : { "value" : 58.35 } }, { "key" : "python", "doc_count" : 2, "avg_price" : { "value" : 67.95 } }, { "key" : "javascript", "doc_count" : 1, "avg_price" : { "value" : 66.4 } } ] } } }
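The nested request first buckets by language and then averages the price inside each bucket. A minimal local sketch of the same two steps, using only the language and price of the five sample books, might look like this (the variable names are illustrative):

```python
from collections import defaultdict

# (language, price) pairs for the five sample books.
books = [
    ("java", 70.20),
    ("java", 46.50),
    ("python", 81.40),
    ("python", 54.50),
    ("javascript", 66.40),
]

# Step 1: bucket the prices by language (the outer terms aggregation).
prices_by_language = defaultdict(list)
for language, price in books:
    prices_by_language[language].append(price)

# Step 2: average inside each bucket (the inner avg aggregation).
avg_price = {
    language: sum(prices) / len(prices)
    for language, prices in prices_by_language.items()
}

for language, value in avg_price.items():
    print(language, round(value, 2))
```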
6. Sum
Compute the total price of the 5 books:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d ' { "size": 0, "aggs": { "sum_price": { "sum": { "field": "price" } } } }'
Result:
{ "took" : 6, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "sum_price" : { "value" : 319.0 } } }
7. Basic statistics
The stats aggregation returns the count, minimum, maximum, average, and sum of a field in a single request:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{ "size": 0, "aggs": { "grades_stats": { "stats": { "field": "price" } } } }'
Result:
{ "took" : 2, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "grades_stats" : { "count" : 5, "min" : 46.5, "max" : 81.4, "avg" : 63.8, "sum" : 319.0 } } }
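The five numbers in the stats output, including the sum from the previous section, can be re-derived locally from the sample prices. The sketch below is only a cross-check in plain Python; floating-point results are rounded for display.

```python
import math

# "price" field of the five sample books.
prices = [70.20, 46.50, 81.40, 54.50, 66.40]

stats = {
    "count": len(prices),
    "min": min(prices),
    "max": max(prices),
    "sum": math.fsum(prices),  # fsum limits floating-point rounding error
}
stats["avg"] = stats["sum"] / stats["count"]

for name, value in stats.items():
    print(name, round(value, 2))
```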
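8. Extended statistics
The extended_stats aggregation additionally returns the sum of squares, variance, standard deviation, and standard deviation bounds: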
curl -XPOST "http://localhost:9200/books/_search?pretty" -d' { "size": 0, "aggs": { "grades_stats": { "extended_stats": { "field": "price" } } } } '
Result:
{ "took" : 3, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "grades_stats" : { "count" : 5, "min" : 46.5, "max" : 81.4, "avg" : 63.8, "sum" : 319.0, "sum_of_squares" : 21095.46, "variance" : 148.65199999999967, "std_deviation" : 12.19229264740638, "std_deviation_bounds" : { "upper" : 88.18458529481276, "lower" : 39.41541470518724 } } } }
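The extra fields follow from the sample prices with a few lines of arithmetic. The sketch below assumes the population variance (dividing by the count) and bounds of two standard deviations around the mean, which is what the numbers in the output above are consistent with. It is a local cross-check, not Elasticsearch's implementation.

```python
import math

# "price" field of the five sample books.
prices = [70.20, 46.50, 81.40, 54.50, 66.40]

count = len(prices)
avg = math.fsum(prices) / count
sum_of_squares = math.fsum(p * p for p in prices)

# Population variance: mean of the squares minus the square of the mean.
variance = sum_of_squares / count - avg * avg
std_deviation = math.sqrt(variance)

# Assumption: bounds are avg +/- 2 standard deviations.
sigma = 2
std_deviation_bounds = {
    "upper": avg + sigma * std_deviation,
    "lower": avg - sigma * std_deviation,
}

print(round(sum_of_squares, 2), round(variance, 3), round(std_deviation, 4))
print({k: round(v, 4) for k, v in std_deviation_bounds.items()})
```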
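9. Percentiles
The percentiles aggregation, applied here to the year field, returns the 1st, 5th, 25th, 50th, 75th, 95th, and 99th percentiles when no percents are specified: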
curl -XPOST "http://localhost:9200/books/_search?pretty" -d ' { "size": 0, "aggs": { "load_time_outlier": { "percentiles": { "field": "year" } } } } '
Result:
{ "took" : 3, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "load_time_outlier" : { "values" : { "1.0" : 2007.2, "5.0" : 2008.0000000000002, "25.0" : 2012.0, "50.0" : 2012.0, "75.0" : 2014.0, "95.0" : 2015.6000000000001, "99.0" : 2015.92 } } } }
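Elasticsearch computes percentiles approximately, so its algorithm is not reproduced here. For this five-value dataset, however, the reported values happen to coincide with plain linear interpolation between neighbouring sorted values, which makes them easy to cross-check. The helper `percentile` below is a hypothetical illustration of that interpolation, not the method Elasticsearch uses.

```python
def percentile(sorted_values, p):
    """Linearly interpolate the p-th percentile (0-100) of a sorted list."""
    rank = p / 100 * (len(sorted_values) - 1)
    lower = int(rank)  # index of the value at or below the rank
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + fraction * (
        sorted_values[upper] - sorted_values[lower]
    )


# "year" field of the five sample books, sorted ascending.
years = sorted([2007, 2012, 2016, 2014, 2012])

values = {p: percentile(years, p) for p in (1, 5, 25, 50, 75, 95, 99)}
for p, value in values.items():
    print(p, round(value, 2))
```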
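10. Range buckets
Count the books priced below 50, from 50 up to 80, and 80 or above (the result is a document count per range, not a percentage):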
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '{ "size": 0, "aggs": { "price_ranges": { "range": { "field": "price", "ranges": [{ "to": 50 }, { "from": 50, "to": 80 }, { "from": 80 }] } } } } '
Result:
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 5, "max_score" : 0.0, "hits" : [ ] }, "aggregations" : { "price_ranges" : { "buckets" : [ { "key" : "*-50.0", "to" : 50.0, "to_as_string" : "50.0", "doc_count" : 1 }, { "key" : "50.0-80.0", "from" : 50.0, "from_as_string" : "50.0", "to" : 80.0, "to_as_string" : "80.0", "doc_count" : 3 }, { "key" : "80.0-*", "from" : 80.0, "from_as_string" : "80.0", "doc_count" : 1 } ] } } }
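In a range aggregation each bucket includes its `from` value and excludes its `to` value, so a price of exactly 50 would land in the second bucket and not the first. The sketch below applies the same rule to the five sample prices as a local cross-check; the helper `in_range` is a hypothetical name.

```python
# "price" field of the five sample books.
prices = [70.20, 46.50, 81.40, 54.50, 66.40]

# (from, to) pairs matching the request; None means the side is open.
ranges = [(None, 50), (50, 80), (80, None)]


def in_range(value, lower, upper):
    """True if lower <= value < upper, treating None as unbounded."""
    return (lower is None or value >= lower) and (upper is None or value < upper)


doc_counts = [
    sum(1 for price in prices if in_range(price, lower, upper))
    for lower, upper in ranges
]
print(doc_counts)
```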