使用pymongo批量清空表数据

背景

性能测试环境,有些入侵的表数据特别多(可能超过1000W),如果使用SQL删除会很慢。但是如果直接把表drop表,索引会丢失。可以写一个脚本,批量清空mongo表数据。

实现思路

说明:以下 xx 为对应的表名

1、备份索引

db.xx.getIndexes();

2、删除表

db.xx.drop();

3、创建表

db.createCollection("xx");
4、使用上面备份的索引重建索引
db.runCommand(
   {
     createIndexes: "app_ids_detection",
     indexes: [
        {
            "v" : 2,
            "key" : {
                "uniqueKey" : 1
            },
            "name" : "idx_unique_key_1",
            "background" : true
        },
        {
            "v" : 2,
            "unique" : true,
            "key" : {
                "detectionId" : 1
            },
            "name" : "uniq_detection_id_1",
            "background" : true
        }
    ]
   }
 )

特别说明:创建表时,默认会有主键索引_id,所以第4步重建索引时需要排除掉主键索引。

 

脚本

新建一个 python 脚本:

#!/usr/bin/python3
#_*_encoding:utf-8_*_
 
"""
    清除表数据
"""
 
import pymongo
 
mongoclient = pymongo.MongoClient("mongodb://root:oK09J8foFJFOZwfASETKyjwB5CYbqvAw@10.106.18.217:57019/?authSource=admin&authMechanism=SCRAM-SHA-256&readPreference=primary&directConnection=true&ssl=false")
db_name = "tenant_b3afdca83bc00b013eda"
collection_names = [
    "app_ids_detection",
    "app_ids_detection_temp",
    "app_ids_element",
    "app_ids_activity",
    "app_ids_incident",
    "app_ids_incident_element",
    "app_ids_element_path",
    "app_ids_incident_statistics",
    "app_ids_incident_statistics_history",
    "app_ids_disposal_record",
    "app_ids_scan_result",
    "app_ids_detection_whiterule",
    "app_ids_white_rule",
    "app_ids_whiterule_hit_detection",
    "app_ids_login_event",
    "app_ids_notify_aggregate_event",
    "app_ids_script_dynamic_whitelist",
    "app_ids_success_login_event",
    "app_ids_memwebshell_enable_agent",
    "app_ids_memwebshell_inject",
    "app_ids_ioa",
    "app_ids_ioa_history",
    "app_ids_ioa_sync_send_status",
    "app_ids_ioa_sync_agent_history",
    "app_ids_yara_lib_version",
    "app_ids_element_operation_history"
]
db = mongoclient[db_name]
 
# 查询表的所有索引(不包含主键ID)
def get_collection_indexes(collection_name):
    collection = db[collection_name]
    index_info_dict = collection.index_information()
 
    index_list = []
    for index_key in index_info_dict:
        if index_key != "_id_":
            index_value = index_info_dict[index_key]
 
            # 增加索引名称
            index_value["name"] = index_key
 
            # 修改key结构:list->obj
            key_obj = {}
            key_list = index_value["key"]
            for field, field_sort in key_list:
                key_obj[field] = field_sort
            index_value["key"] = key_obj
 
            index_list.append(index_value)
 
    return index_list
 
# 重建表结构(清空表数据)
def re_create_collection(collection_name):
    collection = db[collection_name]
    collection.drop()
    db.create_collection(collection_name)
 
# 重建索引
def create_index(collection_name, collection_indexes):
    db.command({
        "createIndexes": collection_name,
        "indexes": collection_indexes
    })
 
if __name__ == "__main__":
    for collection_name in collection_names:
        collection_indexes = get_collection_indexes(collection_name)
        if len(collection_indexes) > 0:
            re_create_collection(collection_name)
            create_index(collection_name, collection_indexes)
            print(f"{collection_name} clear data success~")
        else:
            print(f"{collection_name} indexes is empty!")
清除表数据脚本

说明:

  • 执行脚本前,环境信息:根据具体环境,替换mongo账户名/密码,以及ip地址
    • python3环境
    • 安装pymongo模块:sudo pip3 install pymongo
  • 根据具体环境,替换mongo账户名/密码,以及ip地址
  • 根据具体需求,替换需要清除的库和表名称
 
posted @ 2023-05-23 09:23  仅此而已-远方  阅读(331)  评论(0编辑  收藏  举报