Counting statistics on Linux

Scenario:

Count how many times each value appears in the category field of the data below.

Data source

es_ip10000.json

{"_index":"order","_type":"service","_id":"107.151.83.180:22","_score":1,"_source":{"ip":"107.151.83.180","parent_category":["支撑系统"],"category":["其他支撑系统"]}}
{"_index":"order","_type":"service","_id":"107.151.84.167:22","_score":1,"_source":{"ip":"107.151.84.167","parent_category":["支撑系统"],"category":["其他支撑系统"]}}
{"_index":"order","_type":"service","_id":"107.151.84.177:22","_score":1,"_source":{"ip":"107.151.84.177","parent_category":["支撑系统"],"category":["其他支撑系统"]}}
{"_index":"order","_type":"service","_id":"107.152.188.252:1723","_score":1,"_source":{"ip":"107.152.188.252","parent_category":["网络产品"],"category":["路由器"]}}
{"_index":"order","_type":"service","_id":"107.151.89.125:1025","_score":1,"_source":{"ip":"107.151.89.125"}}
{"_index":"order","_type":"service","_id":"107.152.58.217:22","_score":1,"_source":{"ip":"107.152.58.217","parent_category":["支撑系统"],"category":["服务"]}}
{"_index":"order","_type":"subdomain","_id":"107.15.221.83:443","_score":1,"_source":{"ip":"107.15.221.83","parent_category":["办公外设","系统软件"],"category":["打印机","操作系统"]}}

Extract the category field under _source

cat es_ip10000.json | jq ._source.category > category.txt

Output

[
  "其他支撑系统"
]
[
  "其他支撑系统"
]
[
  "其他支撑系统"
]
[
  "路由器"
]
null
[
  "服务"
]
[
  "打印机",
  "操作系统"
]
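
If the goal is one category per line, jq can flatten the arrays itself. The following is a minimal sketch, not the method used in this post. It assumes a reasonably recent jq, and the file names es_sample.json and category_flat.txt are hypothetical (the three records are copied from the data above). The [] iterates over the array elements, the ? suppresses the error for records that have no category, and -r prints the strings without quotes.

```shell
# Hypothetical sample file: three records copied from es_ip10000.json
# (one category, no category, two categories).
cat > es_sample.json <<'EOF'
{"_index":"order","_type":"service","_id":"107.152.188.252:1723","_score":1,"_source":{"ip":"107.152.188.252","parent_category":["网络产品"],"category":["路由器"]}}
{"_index":"order","_type":"service","_id":"107.151.89.125:1025","_score":1,"_source":{"ip":"107.151.89.125"}}
{"_index":"order","_type":"subdomain","_id":"107.15.221.83:443","_score":1,"_source":{"ip":"107.15.221.83","parent_category":["办公外设","系统软件"],"category":["打印机","操作系统"]}}
EOF

# Emit every element of every category array on its own line.
# Records without a category produce no output instead of null.
jq -r '._source.category[]?' es_sample.json > category_flat.txt
```

With this form there are no brackets, commas, quotes, or null lines to clean up before counting.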

Use an editor to remove the , [ and ] characters from category.txt

Result after cleanup


  "其他支撑系统"


  "其他支撑系统"


  "其他支撑系统"


  "路由器"

null

  "服务"


  "打印机"
  "操作系统"
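
The same cleanup can probably be done without an editor. Below is a minimal sketch using tr, assuming that no category value itself contains a comma or a square bracket (those characters would be deleted too). The file names category_raw.txt and category_clean.txt are hypothetical.

```shell
# Hypothetical sample of the pretty-printed jq output.
cat > category_raw.txt <<'EOF'
[
  "路由器"
]
null
[
  "打印机",
  "操作系统"
]
EOF

# Delete every comma and square bracket; all other characters pass through.
# Lines that held only a bracket become empty lines.
tr -d ',[]' < category_raw.txt > category_clean.txt
```

The line count stays the same, so the empty lines left behind are still counted by a later uniq -c unless they are filtered out.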

Sort -> deduplicate -> count -> sort again

cat category.txt | sort | uniq -c | sort -n >category_count.txt

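Notes:

uniq -c # collapse adjacent duplicate lines and prefix each with its count (this is why the input is sorted first)

sort -n # sort numerically, in ascending order

sort -r # reverse the sort order (descending); combine as sort -nr for a numeric descending sort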
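
A small sketch of why the first sort matters: uniq only compares neighbouring lines, so duplicates that are not adjacent are counted separately. The file names below are hypothetical and the input is made-up sample data.

```shell
# Sample input: the value b appears three times, never on adjacent lines.
printf 'b\na\nb\nc\nb\n' > sample.txt

# Without sorting, uniq -c sees no adjacent duplicates
# and reports every line with a count of 1.
uniq -c sample.txt > unsorted_count.txt

# Sorting first makes duplicates adjacent, so uniq -c counts them together.
# sort -nr then orders the result by count, largest first.
sort sample.txt | uniq -c | sort -nr > sorted_count.txt
```

Here unsorted_count.txt has five lines, while sorted_count.txt has three and starts with the entry for b.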

Output:

      1 null
      1   "操作系统"
      1   "打印机"
      1   "服务"
      1   "路由器"
      3   "其他支撑系统"
     12
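
In the output above, the line with count 12 and no label is the blank lines left behind after the brackets were removed, and null comes from the record that has no category. The following is a minimal sketch of filtering both out before counting, assuming that is the desired result (the post itself keeps them). The file names are hypothetical and the sample is a shortened version of the cleaned data.

```shell
# Hypothetical shortened sample of the cleaned category.txt.
cat > category_sample.txt <<'EOF'

  "其他支撑系统"

  "其他支撑系统"
null
  "路由器"
EOF

# 1. Drop whitespace-only lines and the literal null line.
# 2. Strip leading whitespace and double quotes.
# 3. Sort, count, and order by count (largest first).
grep -v -e '^[[:space:]]*$' -e '^null$' category_sample.txt \
  | sed -e 's/^[[:space:]]*//' -e 's/"//g' \
  | sort | uniq -c | sort -nr > category_sample_count.txt
```

The result contains only real category names, with the most frequent one on the first line.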

Posted 2021-08-09 15:21 by HaimaBlog
