Finding Duplicate Records in MongoDB and Saving Them to a CSV File

 

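A customer has about 10,000 user records, and a small portion of them turned out to be duplicates.

We need to find the duplicates and compare them to see which fields differ.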

https://docs.mongodb.org/manual/reference/operator/aggregation/#aggregation-pipeline-operator-reference

https://docs.mongodb.org/manual/reference/operator/aggregation/group/#pipe._S_group

https://docs.mongodb.org/manual/reference/operator/aggregation/addToSet/#grp._S_addToSet

 

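// Collect the field names of the first document; they become the CSV columns.
var keys = [];
db.users.find().limit(1).forEach(function (u) {
    for (var p in u) {
        keys.push(p);
    }
});
print(keys.join(",")); // print the CSV header row

// Group by prid, collect every document of each group, and keep only the
// groups that contain more than one document. Each document has a unique
// _id, so $addToSet keeps all of them.
db.users.aggregate([
    {$group: {_id: "$prid", values: {$addToSet: "$$CURRENT"}, total: {$sum: 1}}},
    {$match: {total: {$gt: 1}}}
]).forEach(function (g) {
    g.values.forEach(function (v) {
        // Emit the values in header order so each row lines up with the columns;
        // a field missing from this document becomes an empty cell.
        var line = keys.map(function (key) {
            return v[key];
        });
        print(line.join(",")); // print one duplicated record
    });
});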
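The script above joins raw values with commas, so a value that itself contains a comma, a double quote, or a line break would shift the columns of that row. A minimal sketch of a quoting step follows. The helper names `csvEscape` and `toCsvLine` are hypothetical and are not part of the original script, and the quoting rule (wrap the field in double quotes and double any embedded quote) is the common CSV convention rather than anything MongoDB prescribes.

```javascript
// Hypothetical helper: quote one value for CSV output.
function csvEscape(value) {
    // Missing fields become empty cells.
    if (value === undefined || value === null) {
        return "";
    }
    var text = String(value);
    // Quote the field if it contains a comma, a double quote, or a line break.
    if (/[",\r\n]/.test(text)) {
        return '"' + text.replace(/"/g, '""') + '"';
    }
    return text;
}

// Hypothetical helper: build one CSV row from a document,
// following the column order given in `keys`.
function toCsvLine(doc, keys) {
    return keys.map(function (key) {
        return csvEscape(doc[key]);
    }).join(",");
}
```

In the shell script, `print(toCsvLine(v, keys))` could then take the place of the plain join when the data may contain such characters.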

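Save the code above to a file, for example D:\mongojs\aggregate.js.

Run it from that directory and redirect the output to a CSV file (--quiet keeps the shell's startup messages out of the file):

mongo --quiet yourdb aggregate.js > repeated.records.csv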
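The exported CSV still leaves the actual comparison to the reader. As a possible extension, the sketch below reports which fields differ within one group of duplicates. The function name `differingFields` and its `ignore` parameter are hypothetical and not part of the original script. Values are compared by their string form, which is an assumption that suits flat records; nested sub-documents would all look alike under this comparison.

```javascript
// Hypothetical helper: return the names of the fields whose values are not
// identical across all documents in one group of duplicates.
// `ignore` lists fields to skip; _id is skipped by default because it is
// always unique.
function differingFields(docs, ignore) {
    ignore = ignore || ["_id"];
    var seen = {}; // every field name found in any document of the group
    docs.forEach(function (doc) {
        Object.keys(doc).forEach(function (key) {
            seen[key] = true;
        });
    });
    return Object.keys(seen).filter(function (key) {
        if (ignore.indexOf(key) !== -1) {
            return false;
        }
        // Compare by string form; a missing field counts as a difference.
        var first = String(docs[0][key]);
        return docs.some(function (doc) {
            return String(doc[key]) !== first;
        });
    });
}

// Possible use inside the aggregate callback of the script (mongo shell only):
//   print(g._id + ": " + differingFields(g.values).join(","));
```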

 

posted @ 2016-01-22 16:20  金天笔记