MongoDB in Practice, Part 8 (Aggregation)

Aggregation in MongoDB

Aggregation operations:

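db.collection.aggregate()
db.<collection>.aggregate(<pipeline>, <options>)

<pipeline>: an array of stage documents that defines the pipeline stages and the aggregation operators they use
<options>: a document that sets parameters for the aggregation operation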

Field path expressions:

$<field>: use $ to refer to a field path
$<field>.<sub-field>: use $ and . to refer to a field inside an embedded document
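As a rough illustration of how a field path is read, the following plain-JavaScript sketch resolves a path against a document. resolveFieldPath is a hypothetical helper written for this note, not a MongoDB API, and it ignores the array traversal that real field paths also support.

```javascript
// Plain-JavaScript sketch of field-path resolution.
// resolveFieldPath is a hypothetical helper, not part of MongoDB.
function resolveFieldPath(doc, path) {
  if (!path.startsWith("$")) {
    throw new Error("a field path must start with $");
  }
  // Strip the leading "$", then walk the dotted path one key at a time.
  return path
    .slice(1)
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Sample document (made up for this example).
const account = { name: { firstName: "alice", lastName: "wong" }, balance: 50 };

console.log(resolveFieldPath(account, "$balance"));         // 50
console.log(resolveFieldPath(account, "$name.firstName"));  // alice
console.log(resolveFieldPath(account, "$name.middleName")); // undefined
```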

System variable expressions

$$<variable>: use $$ to refer to a system variable
$$CURRENT: the document currently being processed in the pipeline

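Constant expressions

$literal: <value> denotes the constant <value>
$literal: "$name" denotes the constant string "$name"; the $ here is treated as a literal character, not as a field path expression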
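The difference between a field path and a $literal can be sketched with a tiny evaluator. evaluateExpr is a hypothetical helper for illustration only; it handles top-level field paths and $literal, nothing else.

```javascript
// Minimal sketch: how "$name" and { $literal: "$name" } are read differently.
// evaluateExpr is a hypothetical helper, not a MongoDB API.
function evaluateExpr(expr, doc) {
  if (expr !== null && typeof expr === "object" && "$literal" in expr) {
    return expr.$literal; // returned untouched, never parsed as a path
  }
  if (typeof expr === "string" && expr.startsWith("$")) {
    return doc[expr.slice(1)]; // top-level field path only, for brevity
  }
  return expr; // any other value is already a constant
}

const doc = { name: "alice" };

console.log(evaluateExpr("$name", doc));               // alice
console.log(evaluateExpr({ $literal: "$name" }, doc)); // $name
```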

Aggregation pipeline stages

$project reshapes each input document (gives flexible control over the format of the output documents)

db.accounts.aggregate([{$project:{_id:0,balance:1,clientName:"$name.firstName"}}]) // the firstName field inside name is output as clientName
db.accounts.aggregate([{$project:{_id:0,balance:1,nameArray:["$name.firstName","$name.middleName","$name.lastName"]}}])
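To make the second projection concrete, here is a plain-JavaScript sketch of what it produces for one document. The sample account is made up, and projectNameArray is a hypothetical stand-in for the $project stage. It assumes, as I understand the behaviour, that a field missing from the document shows up as null inside the array.

```javascript
// Sketch of {_id:0, balance:1, nameArray:[...]} applied to one document.
// projectNameArray is a hypothetical helper, not a MongoDB API.
function projectNameArray(doc) {
  const name = doc.name ?? {};
  return {
    balance: doc.balance,
    // A field that does not exist becomes null in the array (assumed behaviour).
    nameArray: [name.firstName ?? null, name.middleName ?? null, name.lastName ?? null],
  };
}

// Sample document without a middleName.
const sampleAccount = { _id: 1, name: { firstName: "alice", lastName: "wong" }, balance: 50 };

console.log(projectNameArray(sampleAccount));
// { balance: 50, nameArray: [ 'alice', null, 'wong' ] }
```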

$match filters the input documents

db.accounts.aggregate([{$match:{"name.firstName":"alice"}},{$project:{"_id":0,balance:1,"client.name":"$name.firstName"}}])

$limit keeps only the first N documents in the pipeline
$skip skips the first N documents in the pipeline

db.accounts.aggregate([{$match:{"name":"alice"}},{$skip:1},{$limit:1}])
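The order of $skip and $limit matters. A small sketch with array slices shows why; skipStage and limitStage are hypothetical stand-ins for the two stages, and the documents are made up.

```javascript
// skipStage and limitStage are hypothetical stand-ins for $skip and $limit.
const skipStage = (docs, n) => docs.slice(n);
const limitStage = (docs, n) => docs.slice(0, n);

// Pretend these are the documents that passed $match.
const matched = [{ _id: 1 }, { _id: 2 }, { _id: 3 }];

// Equivalent of [{ $skip: 1 }, { $limit: 1 }]: drop one, then keep one.
console.log(limitStage(skipStage(matched, 1), 1)); // [ { _id: 2 } ]

// Reversing the order gives a different answer: keep one, then drop it.
console.log(skipStage(limitStage(matched, 1), 1)); // []
```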

$unwind expands an array field in the input documents, emitting one document per array element

db.accounts.aggregate([{$unwind:{path:"$currency"}}])
db.accounts.aggregate([{$unwind:{path:"$currency",includeArrayIndex:"ccyIndex"}}])

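includeArrayIndex stores each element's position in the array in the named field (here ccyIndex)
To keep documents whose array is empty, null, or missing when unwinding: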

db.accounts.aggregate([{$unwind:{path:"$currency",preserveNullAndEmptyArrays:true}}])
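The three $unwind options can be sketched in plain JavaScript as follows. unwindStage is a hypothetical stand-in that only handles top-level fields; the way it treats empty arrays (field dropped), null or missing values, and non-array values follows my reading of the documented behaviour, so treat those branches as assumptions.

```javascript
// unwindStage is a hypothetical plain-JavaScript stand-in for $unwind.
function unwindStage(docs, { path, includeArrayIndex, preserveNullAndEmptyArrays = false }) {
  const field = path.slice(1); // "$currency" -> "currency" (top-level fields only)
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    if (Array.isArray(value) && value.length > 0) {
      // One output document per array element.
      value.forEach((element, index) => {
        const copy = { ...doc, [field]: element };
        if (includeArrayIndex) copy[includeArrayIndex] = index;
        out.push(copy);
      });
    } else if (value == null || Array.isArray(value)) {
      // null, missing, or empty array: dropped unless the option is set.
      if (preserveNullAndEmptyArrays) {
        const copy = { ...doc };
        if (Array.isArray(value)) delete copy[field]; // assumed: empty array is removed
        if (includeArrayIndex) copy[includeArrayIndex] = null;
        out.push(copy);
      }
    } else {
      // Non-array value: treated like a one-element array (assumed behaviour).
      const copy = { ...doc };
      if (includeArrayIndex) copy[includeArrayIndex] = null;
      out.push(copy);
    }
  }
  return out;
}

// Sample documents (made up for this example).
const accountsToUnwind = [
  { _id: 1, currency: ["CNY", "USD"] },
  { _id: 2, currency: [] },
  { _id: 3 },
];

console.log(unwindStage(accountsToUnwind, { path: "$currency", includeArrayIndex: "ccyIndex" }));
console.log(unwindStage(accountsToUnwind, { path: "$currency", preserveNullAndEmptyArrays: true }));
```

Without preserveNullAndEmptyArrays, documents 2 and 3 disappear from the output; with it, they pass through without a currency field.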

$sort sorts the input documents

db.accounts.aggregate([{$sort:{balance:1,"name.lastName":-1}}])

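$lookup queries another collection for each input document, comparable to a left outer join in MySQL: localField = foreignField is the ON condition, from names the collection to join, and as names the output field (like a column alias)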

db.accounts.aggregate([{$lookup:{from:"forex",localField:"currency",foreignField:"ccy",as:"forexData"}}]).pretty()
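The equality form of $lookup behaves like a left outer join. The sketch below imitates it on in-memory arrays; lookupStage and the sample data are hypothetical. It assumes that when localField holds an array, each element is compared against foreignField.

```javascript
// lookupStage is a hypothetical plain-JavaScript stand-in for the equality
// form of $lookup (localField / foreignField).
function lookupStage(docs, foreignDocs, { localField, foreignField, as }) {
  return docs.map((doc) => {
    const local = doc[localField];
    // An array-valued localField matches on any of its elements (assumed).
    const wanted = Array.isArray(local) ? local : [local];
    const matches = foreignDocs.filter((f) => wanted.includes(f[foreignField]));
    // Every input document is kept, even when nothing matches (left outer join).
    return { ...doc, [as]: matches };
  });
}

// Sample data (made up for this example).
const accountDocs = [
  { _id: 1, currency: ["CNY", "USD"] },
  { _id: 2, currency: "GBP" },
];
const forexDocs = [
  { ccy: "USD", rate: 6.91 },
  { ccy: "CNY", rate: 1 },
];

const joined = lookupStage(accountDocs, forexDocs, {
  localField: "currency",
  foreignField: "ccy",
  as: "forexData",
});
console.log(JSON.stringify(joined, null, 2));
```

Account 2 has no matching forex document, so it is returned with an empty forexData array rather than being dropped.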

$lookup: {
  from: <collection to join>,
  localField: <field from the input documents>,
  foreignField: <field from the documents of the "from" collection>,
  as: <output array field>
}

Querying with more complex conditions

Uncorrelated subquery (the lookup does not depend on the input document)

db.accounts.aggregate([{$lookup:{from:"forex",pipeline:[{$match:{date:new Date('2018-12-21')}}],as:'forexData'}}]).pretty()

Correlated subquery (the lookup refers to fields of the input document)

db.accounts.aggregate([{$lookup:{ from:"forex",let:{bal:"$balance"},pipeline:[{$match: {$expr: {$and:[{$eq:["$date",new Date('2018-12-21')]},{$gt:["$$bal",100]} ] }}}],as :"forexData"}}]);

$lookup: {
  from: <collection to join>,
  let: { <var_1>: <expression>, ..., <var_n>: <expression> },  // optional; if the stages in pipeline need fields from the input documents, those fields must be declared here and are then referenced as $$<var>
  pipeline: [ <pipeline to execute on the collection to join> ],  // aggregation stages applied to the documents of the joined collection
  as: <output array field>
}

$group groups the input documents

Use an accumulator operator to build an array field

db.transactions.aggregate([ { $group:{ _id: "$currency", symbols:{$push:"$symbol"} } } ]);

db.transactions.aggregate([ { $group:{ _id:"$currency" } }]);

db.transactions.aggregate([
  {
    $group: {
      _id: "$currency",  // use null as _id to aggregate over every document in the collection
      totalQty: { $sum: "$qty" },
      totalNotional: { $sum: { $multiply: ["$price", "$qty"] } },
      avgPrice: { $avg: "$price" },
      count: { $sum: 1 },
      maxNotional: { $max: { $multiply: ["$price", "$qty"] } },
      minNotional: { $min: { $multiply: ["$price", "$qty"] } }
    }
  }
])

$group: {
  _id: <expression>,
  <field1>: { <accumulator1>: <expression1> },
  ...
}
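What the accumulators compute can be checked by hand with a small in-memory sketch. groupByCurrency is a hypothetical helper covering only $sum, $avg and the document count; the transactions are made up, and unlike this sketch MongoDB does not promise any particular order for the output groups.

```javascript
// groupByCurrency is a hypothetical plain-JavaScript stand-in for a $group
// stage with _id: "$currency" and a few accumulators.
function groupByCurrency(docs) {
  const groups = new Map();
  for (const { currency, price, qty } of docs) {
    const g = groups.get(currency) ??
      { _id: currency, totalQty: 0, totalNotional: 0, priceSum: 0, count: 0 };
    g.totalQty += qty;                // $sum: "$qty"
    g.totalNotional += price * qty;   // $sum: { $multiply: ["$price", "$qty"] }
    g.priceSum += price;              // running sum used for $avg below
    g.count += 1;                     // $sum: 1
    groups.set(currency, g);
  }
  // Turn the running price sum into the average and drop the helper field.
  return [...groups.values()].map(({ priceSum, ...rest }) => ({
    ...rest,
    avgPrice: priceSum / rest.count,  // $avg: "$price"
  }));
}

// Sample transactions (made up for this example).
const sampleTransactions = [
  { symbol: "AAPL", currency: "USD", price: 100, qty: 2 },
  { symbol: "AMZN", currency: "USD", price: 200, qty: 1 },
  { symbol: "600519", currency: "CNY", price: 50, qty: 4 },
];

console.log(groupByCurrency(sampleTransactions));
```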

$out writes the documents in the pipeline to a collection

Write the documents in the aggregation pipeline to a new collection:

db.transactions.aggregate([ { $group:{ _id: "$currency",symbols:{$push:"$symbol"} } },{$out:"output"} ]);

Write the documents in the aggregation pipeline to an existing collection (its previous contents are overwritten):

db.transactions.aggregate([ { $group:{ _id:"$symbol", totalNotional:{$sum:{$multiply:["$price","$qty"]}}} }, { $out:"output" } ]);

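If the aggregation fails with an error, the $out stage neither creates the new collection nor overwrites the contents of an existing one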

db.<collection>.aggregate(<pipeline>,<options>)

allowDiskUse: <boolean>

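Each aggregation pipeline stage may use at most 100 MB of memory.
With large data sets, enable the allowDiskUse option to keep a stage from exceeding this memory limit and raising an error.
With allowDiskUse enabled, a stage can write its working data to temporary files when memory runs short.
The temporary files are written to the _tmp directory under dbPath; the default dbPath is /data/db.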

Optimizing aggregation operations

Stage order optimization

$project + $match: the parts of $match that do not depend on fields computed by $project run first

db.transactions.aggregate([
  {
    $project: {
      _id: 0, symbol: 1, currency: 1,
      notional: { $multiply: ["$price", "$qty"] }
    }
  },
  {
    $match: {
      currency: "USD",
      notional: { $gt: 1000 }
    }
  }
]);

{ "symbol" : "AMZN", "currency" : "USD", "notional" : 1377.5 }

After optimization: db.transactions.aggregate([ { $match:{ currency:"USD" } }, { $project:{ _id:0,symbol:1,currency:1,notional:{$multiply:["$price","$qty"]} } }, {$match:{notional:{$gt:1000}}} ]);
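The two pipelines return the same documents; the rewritten one just computes notional for fewer of them. The sketch below imitates both orderings on an in-memory array and counts how many documents reach the projection. The data, the counters, and the helper names are all made up for illustration.

```javascript
// Sample transactions (made up for this example).
const txns = [
  { symbol: "AAPL", currency: "USD", price: 100, qty: 2 },
  { symbol: "AMZN", currency: "USD", price: 500, qty: 3 },
  { symbol: "BABA", currency: "CNY", price: 400, qty: 5 },
];

// Stand-in for the $project stage: computes notional = price * qty.
const projectNotional = (d) => ({
  symbol: d.symbol,
  currency: d.currency,
  notional: d.price * d.qty,
});

let projectedOriginal = 0;
let projectedOptimized = 0;

// Original order: $project everything, then $match on both conditions.
const original = txns
  .map((d) => { projectedOriginal += 1; return projectNotional(d); })
  .filter((d) => d.currency === "USD" && d.notional > 1000);

// Optimized order: $match on currency first, then $project, then $match on notional.
const optimized = txns
  .filter((d) => d.currency === "USD")
  .map((d) => { projectedOptimized += 1; return projectNotional(d); })
  .filter((d) => d.notional > 1000);

console.log(original);  // same documents as optimized
console.log(optimized);
console.log(projectedOriginal, projectedOptimized); // 3 2
```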

$sort + $match: the $match stage runs before the $sort stage

db.transactions.aggregate([ { $match:{ currency:"USD" } }, { $sort:{ price:1 }} ]);

$project + $skip: the $skip stage runs before the $project stage

$sort + $limit

If no stage that changes the number of documents (for example $match) sits between them, the $sort and $limit stages can be merged
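Why merging helps: a sort that knows the limit only has to remember the best N documents seen so far instead of sorting everything. The sketch below contrasts the two approaches in plain JavaScript. Both functions are hypothetical illustrations of the idea, not MongoDB's actual implementation.

```javascript
// Naive version: sort every document, then cut to the first n.
function sortThenLimit(docs, field, n) {
  return [...docs].sort((a, b) => a[field] - b[field]).slice(0, n);
}

// Merged version: keep at most n documents in memory while scanning.
function topN(docs, field, n) {
  const kept = []; // always sorted ascending by field, length <= n
  for (const doc of docs) {
    // Find the insertion point from the right so equal values keep input order.
    let i = kept.length;
    while (i > 0 && kept[i - 1][field] > doc[field]) i -= 1;
    kept.splice(i, 0, doc);
    if (kept.length > n) kept.pop(); // discard the current worst document
  }
  return kept;
}

// Sample documents (made up for this example).
const priced = [{ price: 30 }, { price: 10 }, { price: 50 }, { price: 20 }, { price: 40 }];

console.log(sortThenLimit(priced, "price", 2)); // [ { price: 10 }, { price: 20 } ]
console.log(topN(priced, "price", 2));          // [ { price: 10 }, { price: 20 } ]
```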

$lookup + $unwind

When a $lookup stage is immediately followed by a $unwind stage, and the $unwind operates on the as field created by the $lookup, the two stages can be merged

db.accounts.aggregate([
  {
    $lookup: {
      from: "forex",
      localField: "currency",
      foreignField: "ccy",
      as: "forexData"
    }
  },
  {
    $unwind: "$forexData"
  }
]);

Posted 2021-03-03 14:16 by year12
