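(Repost) Elasticsearch Analytics and Aggregations
Elasticsearch is not only suited to full-text search; its aggregation features for analytics are also very handy. The examples below walk through them.
1. Preparing the data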
{"index": {"_index": "books", "_type": "IT", "_id": "1"}}
{"id": "1", "title": "Java编程思想", "language": "java", "author": "Bruce Eckel", "price": 70.20, "year": 2007, "description": "Java学习必读经典,殿堂级著作!赢得了全球程序员的广泛赞誉。"}
{"index": {"_index": "books", "_type": "IT", "_id": "2"}}
{"id": "2", "title": "Java程序性能优化", "language": "java", "author": "葛一鸣", "price": 46.50, "year": 2012, "description": "让你的Java程序更快、更稳定。深入剖析软件设计层面、代码层面、JVM虚拟机层面的优化方法"}
{"index": {"_index": "books", "_type": "IT", "_id": "3"}}
{"id": "3", "title": "Python科学计算", "language": "python", "author": "张若愚", "price": 81.40, "year": 2016, "description": "零基础学python,光盘中作者独家整合开发winPython运行环境,涵盖了Python各个扩展库"}
{"index": {"_index": "books", "_type": "IT", "_id": "4"}}
{"id": "4", "title": "Python基础教程", "language": "python", "author": "张若愚", "price": 54.50, "year": 2014, "description": "经典的Python入门教程,层次鲜明,结构严谨,内容翔实"}
{"index": {"_index": "books", "_type": "IT", "_id": "5"}}
{"id": "5", "title": "JavaScript高级程序设计", "language": "javascript", "author": "Nicholas C.Zakas", "price": 66.40, "year": 2012, "description": "JavaScript技术经典名著"}
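Prepare these five records, save them in books.json (one JSON object per line, with a trailing newline at the end of the file), and bulk-import them: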
curl -XPOST "http://localhost:9200/_bulk?pretty" --data-binary @books.json
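The bulk file has to be newline-delimited: an action line followed by a source line for every document, and a final newline at the end. As a minimal sketch of how such a body could be generated instead of typed by hand, the Python below builds it from a list of dictionaries. The helper name `to_bulk_body` and the shortened two-book list are illustrative assumptions, not part of the original article.

```python
import json

# Illustrative subset of the sample data; the other books follow the same shape.
books = [
    {"id": "1", "title": "Java编程思想", "language": "java", "price": 70.20, "year": 2007},
    {"id": "2", "title": "Java程序性能优化", "language": "java", "price": 46.50, "year": 2012},
]


def to_bulk_body(docs, index="books", doc_type="IT"):
    """Build a _bulk request body: one action line, then one source line, per document."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk API expects the body to end with a newline character.
    return "\n".join(lines) + "\n"


bulk_body = to_bulk_body(books)
print(bulk_body, end="")
```

Writing `bulk_body` to books.json with UTF-8 encoding would give a file that can be passed to the curl command above via `--data-binary`.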
2. Group By: counting documents per group
Run the command:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "per_count": {
      "terms": { "field": "language" }
    }
  }
}'
Result:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "per_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "java",
        "doc_count" : 2
      }, {
        "key" : "python",
        "doc_count" : 2
      }, {
        "key" : "javascript",
        "doc_count" : 1
      } ]
    }
  }
}
Grouped by programming language, there are 2 Java books, 2 Python books, and 1 JavaScript book.
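To sanity-check the buckets, the same grouping can be reproduced locally. The sketch below is plain Python over a hand-copied version of the sample data, not a call to Elasticsearch; the `BOOKS` list and variable names are assumptions made for illustration.

```python
from collections import Counter

# (language, price, year) for the five sample books, copied from the bulk data.
BOOKS = [
    ("java", 70.20, 2007),
    ("java", 46.50, 2012),
    ("python", 81.40, 2016),
    ("python", 54.50, 2014),
    ("javascript", 66.40, 2012),
]

# Count documents per language, the local equivalent of a terms aggregation.
language_counts = Counter(language for language, _price, _year in BOOKS)

# most_common() sorts by descending count, like the default order of terms buckets.
buckets = [
    {"key": key, "doc_count": count}
    for key, count in language_counts.most_common()
]
print(buckets)
```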
3. Max
Run the following command to find the highest price:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "max_price": {
      "max": { "field": "price" }
    }
  }
}'
Response:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "max_price" : {
      "value" : 81.4
    }
  }
}
4. Min
Find the price of the cheapest book:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "min_price": {
      "min": { "field": "price" }
    }
  }
}'
Result:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "min_price" : {
      "value" : 46.5
    }
  }
}
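The max and min metric aggregations return only a value, not the matching document. As a rough local cross-check of both numbers, and to see which title each price belongs to, the following Python sketch works on a hand-copied title-to-price mapping (`PRICE_BY_TITLE` is an illustrative name, not something from the original article).

```python
# Title -> price for the five sample books, copied from the bulk data.
PRICE_BY_TITLE = {
    "Java编程思想": 70.20,
    "Java程序性能优化": 46.50,
    "Python科学计算": 81.40,
    "Python基础教程": 54.50,
    "JavaScript高级程序设计": 66.40,
}

# Local equivalents of the max and min metric aggregations.
max_price = max(PRICE_BY_TITLE.values())
min_price = min(PRICE_BY_TITLE.values())

# The aggregations report only the value; looking up the title is an extra step.
most_expensive = max(PRICE_BY_TITLE, key=PRICE_BY_TITLE.get)
cheapest = min(PRICE_BY_TITLE, key=PRICE_BY_TITLE.get)

print(most_expensive, max_price)
print(cheapest, min_price)
```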
5. Average
Group the five books by language and compute the average price within each group:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "per_count": {
      "terms": { "field": "language" },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}'
Response:
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "per_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "java",
        "doc_count" : 2,
        "avg_price" : {
          "value" : 58.35
        }
      }, {
        "key" : "python",
        "doc_count" : 2,
        "avg_price" : {
          "value" : 67.95
        }
      }, {
        "key" : "javascript",
        "doc_count" : 1,
        "avg_price" : {
          "value" : 66.4
        }
      } ]
    }
  }
}
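A nested aggregation runs the inner metric once per outer bucket. The Python sketch below mirrors that idea locally: group prices by language first, then average each group. It is an illustration over hand-copied data, with rounding to two decimals added as an assumption to keep floating-point noise out of the printout.

```python
from collections import defaultdict

# (language, price) pairs for the five sample books.
BOOKS = [
    ("java", 70.20),
    ("java", 46.50),
    ("python", 81.40),
    ("python", 54.50),
    ("javascript", 66.40),
]

# Outer step: bucket the prices by language (the terms aggregation).
prices_by_language = defaultdict(list)
for language, price in BOOKS:
    prices_by_language[language].append(price)

# Inner step: average within each bucket (the avg sub-aggregation).
avg_price = {
    language: round(sum(prices) / len(prices), 2)
    for language, prices in prices_by_language.items()
}
print(avg_price)
```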
6. Sum
Compute the total price of the five books:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "sum_price": {
      "sum": { "field": "price" }
    }
  }
}'
Response:
{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "sum_price" : {
      "value" : 319.0
    }
  }
}
7. Basic statistics
The stats aggregation returns the count, minimum, maximum, average, and sum of a field in a single request:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "grades_stats": {
      "stats": { "field": "price" }
    }
  }
}'
Response:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "grades_stats" : {
      "count" : 5,
      "min" : 46.5,
      "max" : 81.4,
      "avg" : 63.8,
      "sum" : 319.0
    }
  }
}
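The five numbers in the stats response can be checked by hand. A minimal Python sketch over the five sample prices follows; it is a local recomputation for comparison, and the dictionary layout simply imitates the response fields.

```python
# Prices of the five sample books, copied from the bulk data.
PRICES = [70.20, 46.50, 81.40, 54.50, 66.40]

total = sum(PRICES)
stats = {
    "count": len(PRICES),
    "min": min(PRICES),
    "max": max(PRICES),
    "avg": total / len(PRICES),
    "sum": total,
}

# Round for display only; floating-point addition may leave tiny residues.
for name, value in stats.items():
    print(name, round(value, 2))
```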
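8. Extended statistics
The extended_stats aggregation additionally returns the sum of squares, variance, standard deviation, and standard deviation bounds: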
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "grades_stats": {
      "extended_stats": { "field": "price" }
    }
  }
}'
Result:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "grades_stats" : {
      "count" : 5,
      "min" : 46.5,
      "max" : 81.4,
      "avg" : 63.8,
      "sum" : 319.0,
      "sum_of_squares" : 21095.46,
      "variance" : 148.65199999999967,
      "std_deviation" : 12.19229264740638,
      "std_deviation_bounds" : {
        "upper" : 88.18458529481276,
        "lower" : 39.41541470518724
      }
    }
  }
}
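The extra fields follow from the basic ones. Judging by the numbers in the response above, the variance is the population variance (mean of squares minus squared mean) and the bounds sit two standard deviations either side of the average. The Python sketch below recomputes them under that reading; the `sigma = 2` value is inferred from the output, so treat it as an assumption.

```python
import math

# Prices of the five sample books, copied from the bulk data.
PRICES = [70.20, 46.50, 81.40, 54.50, 66.40]

count = len(PRICES)
avg = sum(PRICES) / count
sum_of_squares = sum(price * price for price in PRICES)

# Population variance: mean of the squares minus the square of the mean.
variance = sum_of_squares / count - avg ** 2
std_deviation = math.sqrt(variance)

# Assumed multiplier: the response is consistent with avg +/- 2 standard deviations.
sigma = 2
std_deviation_bounds = {
    "upper": avg + sigma * std_deviation,
    "lower": avg - sigma * std_deviation,
}

print(round(sum_of_squares, 2), round(variance, 3), round(std_deviation, 4))
print({name: round(value, 4) for name, value in std_deviation_bounds.items()})
```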
9. Percentile statistics
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "load_time_outlier": {
      "percentiles": { "field": "year" }
    }
  }
}'
Response:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "load_time_outlier" : {
      "values" : {
        "1.0" : 2007.2,
        "5.0" : 2008.0000000000002,
        "25.0" : 2012.0,
        "50.0" : 2012.0,
        "75.0" : 2014.0,
        "95.0" : 2015.6000000000001,
        "99.0" : 2015.92
      }
    }
  }
}
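The fractional years come from interpolation between neighbouring values. Elasticsearch computes percentiles with an approximation algorithm (TDigest by default), so results on large datasets are estimates. On this five-document sample, though, the values happen to match plain linear interpolation over the sorted years. The Python sketch below shows that simpler method; it is a stand-in for illustration, not Elasticsearch's actual implementation, and the `percentile` helper is a hypothetical name.

```python
# Publication years of the five sample books, sorted ascending.
YEARS = sorted([2007, 2012, 2016, 2014, 2012])


def percentile(sorted_values, p):
    """Linear-interpolation percentile of an ascending list, with p in [0, 100]."""
    rank = (p / 100) * (len(sorted_values) - 1)
    lower = int(rank)  # index of the value at or just below the rank
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Same percentile points as the default percentiles response.
for p in (1, 5, 25, 50, 75, 95, 99):
    print(p, round(percentile(YEARS, p), 2))
```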
10. Range statistics
Count the books priced below 50, from 50 up to (but not including) 80, and 80 or above:
curl -XPOST "http://localhost:9200/books/_search?pretty" -d '
{
  "size": 0,
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 50 },
          { "from": 50, "to": 80 },
          { "from": 80 }
        ]
      }
    }
  }
}'
Response:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "price_ranges" : {
      "buckets" : [ {
        "key" : "*-50.0",
        "to" : 50.0,
        "to_as_string" : "50.0",
        "doc_count" : 1
      }, {
        "key" : "50.0-80.0",
        "from" : 50.0,
        "from_as_string" : "50.0",
        "to" : 80.0,
        "to_as_string" : "80.0",
        "doc_count" : 3
      }, {
        "key" : "80.0-*",
        "from" : 80.0,
        "from_as_string" : "80.0",
        "doc_count" : 1
      } ]
    }
  }
}
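In a range aggregation each bucket includes its `from` value and excludes its `to` value, so a price of exactly 50 would land in the middle bucket. The response contains document counts only; turning them into percentages is a separate step. The Python sketch below reproduces the counts locally and derives the shares; the `in_range` helper and the use of `None` for an open end are illustrative choices.

```python
# Prices of the five sample books, copied from the bulk data.
PRICES = [70.20, 46.50, 81.40, 54.50, 66.40]

# (from, to) pairs; None marks an open-ended side, as in the request above.
RANGES = [(None, 50), (50, 80), (80, None)]


def in_range(value, lower, upper):
    """True if lower <= value < upper, treating None as unbounded."""
    return (lower is None or value >= lower) and (upper is None or value < upper)


# Document count per bucket, matching doc_count in the response.
counts = [sum(in_range(price, lower, upper) for price in PRICES) for lower, upper in RANGES]

# Share of all books per bucket, in percent (computed client-side).
shares = [100 * count / len(PRICES) for count in counts]

print(counts)
print(shares)
```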