聚合查询

聚合框架

MongoDB的聚合框架,可以处理进行类似于SQL的group,left outer join之类的运算,只不过可以来控制执行的顺序。
适应的范围目前看优势还是在分析(OLAP),当然OLTP也支持

基本格式

常规的用法是生成一个数组(Pipeline),然后在Pipeline里写处理的步骤(Stage),最后在表(Collection)上调用arrgegate来进行运算。

## 这个是伪代码
pipeline = [$stage1, $stage2, ... $stageN];
db.<COLLECTION>.aggregate(
    pipline,
    { options }
);

与MySQL的对比

与SQL运算符的比较

步骤作用SQL
$match 过滤 where
$project 投影 as
$sort 排序 order by
$group 分组 group by
$skip/$limit 结果限制 skip/limit
$lookup 左外连接 left outer join

步骤中运算符

$match,过滤

$eq/$gt/$lt/$gte/$lte
$and/$or/$not/$in
$geoWithin/$intersect


$project,选择字段

$map/$reduce/$filter
$range
$multiply/$divide/$substract/$add
$year/$month/$dayOfMonth/$hour/$minute/$second


$group,分组

$sum/$avg
$push/$addToSet
$first/$last/$max/$min


特有的步骤,SQL里没有对应的

步骤作用
$unwind 展开数组
$graphLookup 图搜索
$facet/$bucket 分面搜索

例子1

SQL

SELECT
  first_name as '名',
  last_name as '姓'
FROM users
WHERE gender = '男'
SKIP 100
LIMIT 20

MQL

db.users.aggregate([
  { $match: { gender: '男' } },
  { $skip: 100 },
  { $limit: 20 },
  { $project: 
    {
      '名': '$first_name',
      '姓': '$last_name'
    }
  }
]);

例子2

SQL

select
  department,
  count(null) as emp_qty
from
  users
where
  gender = '女'
group by
  department
having
  count(*) < 10

MQL

db.users.aggregate([
  { $match: { 'gender' : '女' }},
  { $group: {
      _id: '$department'
      emp_qty: { $sum: 1 }
  }},
  { $match: { emp_qty: { $lt: 10 }}}
])

例子3 $unwind

说白了就是将文档的树型结构展开成行的结构

文档
> db.students.findOne()
{
  name: '张三',
  score: [
    { subject: '语文', score: 84 },
    { subject: '数学', score: 90 },
    { subject: '外语', score: 69 }
  ]
}
展开成行
db.students.aggregate([$unwind: '$score'}])
{ name: '张三', score: {subject: '语文', score: 84 }}
{ name: '张三', score: {subject: '数学', score: 90 }}
{ name: '张三', score: {subject: '外语', score: 69 }}

$bucket,就是针对一个数据,按区间来统计

image.png

db.products.aggregate([{
  $bucket: {
    groupBy: '$price',
    boundaries: [0, 10, 20, 30, 40],
    default: 'Other',
    output: { 'count': { $sum: 1 }}
  }
}])

$facet,还是按区间统计,但可针对多个数据分别统计

image.png

 
db.products.aggregate([{
  $facet: {
    price: {
      $bucket: {...}
    },
    year: {
      $bucket: {...}
    }
  }
}])




使用的表

> db.orders.findOne()
{
    "_id" : ObjectId("5dbe7a545368f69de2b4d36e"),
    "street" : "493 Hilll Curve",
    "city" : "Champlinberg",
    "state" : "Texas",
    "country" : "Malaysia",
    "zip" : "24344-1715",
    "phone" : "425.956.7743 x4621",
    "name" : "Destinee Schneider",
    "userId" : 3573,
    "orderDate" : ISODate("2019-03-26T03:20:08.805Z"),
    "status" : "created",
    "shippingFee" : NumberDecimal("8.00"),
    "orderLines" : [
        {
            "product" : "Refined Fresh Tuna",
            "sku" : "2057",
            "qty" : 25,
            "price" : NumberDecimal("56.00"),
            "cost" : NumberDecimal("46.48")
        },
        {
            "product" : "Refined Concrete Ball",
            "sku" : "1738",
            "qty" : 61,
            "price" : NumberDecimal("47.00"),
            "cost" : NumberDecimal("47")
        },
        {
            "product" : "Rustic Granite Towels",
            "sku" : "500",
            "qty" : 62,
            "price" : NumberDecimal("74.00"),
            "cost" : NumberDecimal("62.16")
        },
        {
            "product" : "Refined Rubber Salad",
            "sku" : "1400",
            "qty" : 73,
            "price" : NumberDecimal("93.00"),
            "cost" : NumberDecimal("87.42")
        },
        {
            "product" : "Intelligent Wooden Towels",
            "sku" : "5674",
            "qty" : 72,
            "price" : NumberDecimal("84.00"),
            "cost" : NumberDecimal("68.88")
        },
        {
            "product" : "Refined Steel Bacon",
            "sku" : "5009",
            "qty" : 8,
            "price" : NumberDecimal("53.00"),
            "cost" : NumberDecimal("50.35")
        }
    ],
    "total" : NumberDecimal("407")
}

计算所有订单的总销售额

> db.orders.aggregate([{
$group:
  {
    _id: null,
    total: { $sum: "$total"}
  }
}])
{ "_id" : null, "total" : NumberDecimal("44019609") }
>

查询2019年第一季度(1月1日~3月31日)已完成订单(completed)的订单总金额和订单总数

 
>db.orders.aggregate(
[{$match: {
  status: "completed",
  orderDate: {
    $gte: ISODate('2019-01-01'),
    $lt: ISODate('2019-04-01')
  }
}}, {$group: {
  _id: null,
  total: {
    $sum: "$total"
  },
  shippingFee: {
    $sum: "$shippingFee"
  },
  count: {
    $sum: 1
  }
}}, {$project: {
  grandTotal: {
    $add: ["$total", "$shippingFee"]
  },
  count: 1,
    _id: 0
}}]
)

{ "count" : 5875, "grandTotal" : NumberDecimal("2636376.00") }






posted @ 2023-11-10 09:54  Cetus-Y  阅读(10)  评论(0编辑  收藏  举报