pymongo Query Tips

from pymongo import MongoClient
mdb = MongoClient('120.xxx.xxx.xxx:20002', username='xxx', password='xxx')

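# The collection holds about 2.4 million documents.
# no_cursor_timeout=True stops the server from closing the cursor after its idle timeout (10 minutes by default), so a long-running job can keep reading. Close the cursor when you are done with it.
# batch_size(2000) fetches 2000 documents per round trip.
# limit(100) returns at most 100 documents.
# skip(n) skips the first n matching documents.
# Example: to split a job across three machines, the first takes the first 1 million documents, the second skips 1 million and takes the next 1 million, and the third skips 2 million.
# In find(), the first dict is the query filter and the second is the projection (1 = return the field, 0 = omit it). Returning only the fields you need noticeably improves performance on large result sets.
# A projection cannot mix 1 and 0 (apart from "_id"), so only the wanted field is listed here.
images = mdb['testdb']['image'].find({"image_size.height": {"$exists": True}}, {"url": 1}, no_cursor_timeout=True).batch_size(2000).limit(100)
images = mdb['testdb']['image'].find({"image_size.height": {"$exists": True}}, {"url": 1}, no_cursor_timeout=True).batch_size(2000).skip(100)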
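
The three-machine split described in the comments can be sketched as a small helper that computes each worker's skip and limit. This is only a sketch: `worker_ranges` and `fetch_range` are hypothetical names, not part of pymongo, and sorting by `_id` is an assumption added here, because without an explicit sort the order of results is not guaranteed and the slices could overlap.

```python
def worker_ranges(total_docs, chunk_size):
    """Return (skip, limit) pairs that cover total_docs in chunks of chunk_size."""
    ranges = []
    for start in range(0, total_docs, chunk_size):
        # The last chunk may be smaller than chunk_size.
        ranges.append((start, min(chunk_size, total_docs - start)))
    return ranges


def fetch_range(collection, skip, limit):
    """Return a cursor over one worker's slice of the matching documents.

    `collection` is a pymongo Collection, e.g. mdb['testdb']['image'].
    """
    return (
        collection.find({"image_size.height": {"$exists": True}}, {"url": 1})
        .sort("_id", 1)  # assumption: a fixed order keeps the slices disjoint
        .skip(skip)
        .limit(limit)
        .batch_size(2000)
    )


# 2.4 million documents split into chunks of 1 million, one chunk per machine.
print(worker_ranges(2_400_000, 1_000_000))
# [(0, 1000000), (1000000, 1000000), (2000000, 400000)]
```

Each machine would then call `fetch_range(mdb['testdb']['image'], skip, limit)` with its own pair. Note that a large skip still makes the server walk past the skipped documents, so the later workers start more slowly.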

Fetching documents in bulk by image ID:

image_ids = ["xxx", "yyy", "zzz"]

image_infos = mdb['testdb']['image_info'].find({"image_id": {"$in": image_ids}})
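
When the ID list grows very large, a single `$in` filter can become unwieldy, since the whole query document has to fit within MongoDB's 16 MB BSON document limit. One possible approach is to query in chunks, sketched below. `chunked` and `find_by_image_ids` are hypothetical helper names, and the default chunk size of 1000 is an arbitrary assumption rather than a recommended value.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def find_by_image_ids(collection, image_ids, chunk_size=1000):
    """Yield documents whose image_id is in image_ids, one $in query per chunk.

    `collection` is a pymongo Collection, e.g. mdb['testdb']['image_info'].
    """
    for chunk in chunked(image_ids, chunk_size):
        # Each chunk becomes its own query, so no single filter grows too large.
        yield from collection.find({"image_id": {"$in": chunk}})
```

Usage would look like `for info in find_by_image_ids(mdb['testdb']['image_info'], image_ids): ...`. An index on `image_id` is what keeps each of these lookups fast.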

posted @ 2019-04-13 16:12 Adamanter
