使用pymongo批量清空表数据
背景
性能测试环境,有些入侵的表数据特别多(可能超过1000W),如果使用SQL删除会很慢。但是如果直接把表drop表,索引会丢失。可以写一个脚本,批量清空mongo表数据。
实现思路
说明:以下 xx 为对应的表名
1、备份索引
db.xx.getIndexes();
2、删除表
db.xx.drop();
3、创建表
db.createCollection("xx");
4、使用上面备份的索引重建索引
db.runCommand( { createIndexes: "app_ids_detection", indexes: [ { "v" : 2, "key" : { "uniqueKey" : 1 }, "name" : "idx_unique_key_1", "background" : true }, { "v" : 2, "unique" : true, "key" : { "detectionId" : 1 }, "name" : "uniq_detection_id_1", "background" : true } ] } )
特别说明:创建表时,默认会有主键索引_id,所以第4步重建索引时需要排除掉主键索引。
脚本
新建一个 python 脚本:
#!/usr/bin/python3 #_*_encoding:utf-8_*_ """ 清除表数据 """ import pymongo mongoclient = pymongo.MongoClient("mongodb://root:oK09J8foFJFOZwfASETKyjwB5CYbqvAw@10.106.18.217:57019/?authSource=admin&authMechanism=SCRAM-SHA-256&readPreference=primary&directConnection=true&ssl=false") db_name = "tenant_b3afdca83bc00b013eda" collection_names = [ "app_ids_detection", "app_ids_detection_temp", "app_ids_element", "app_ids_activity", "app_ids_incident", "app_ids_incident_element", "app_ids_element_path", "app_ids_incident_statistics", "app_ids_incident_statistics_history", "app_ids_disposal_record", "app_ids_scan_result", "app_ids_detection_whiterule", "app_ids_white_rule", "app_ids_whiterule_hit_detection", "app_ids_login_event", "app_ids_notify_aggregate_event", "app_ids_script_dynamic_whitelist", "app_ids_success_login_event", "app_ids_memwebshell_enable_agent", "app_ids_memwebshell_inject", "app_ids_ioa", "app_ids_ioa_history", "app_ids_ioa_sync_send_status", "app_ids_ioa_sync_agent_history", "app_ids_yara_lib_version", "app_ids_element_operation_history" ] db = mongoclient[db_name] # 查询表的所有索引(不包含主键ID) def get_collection_indexes(collection_name): collection = db[collection_name] index_info_dict = collection.index_information() index_list = [] for index_key in index_info_dict: if index_key != "_id_": index_value = index_info_dict[index_key] # 增加索引名称 index_value["name"] = index_key # 修改key结构:list->obj key_obj = {} key_list = index_value["key"] for field, field_sort in key_list: key_obj[field] = field_sort index_value["key"] = key_obj index_list.append(index_value) return index_list # 重建表结构(清空表数据) def re_create_collection(collection_name): collection = db[collection_name] collection.drop() db.create_collection(collection_name) # 重建索引 def create_index(collection_name, collection_indexes): db.command({ "createIndexes": collection_name, "indexes": collection_indexes }) if __name__ == "__main__": for collection_name in collection_names: collection_indexes = get_collection_indexes(collection_name) if len(collection_indexes) > 0: re_create_collection(collection_name) create_index(collection_name, collection_indexes) print(f"{collection_name} clear data success~") else: print(f"{collection_name} indexes is empty!")
说明:
- 执行脚本前,环境信息:根据具体环境,替换mongo账户名/密码,以及ip地址
- python3环境
- 安装pymongo模块:sudo pip3 install pymongo
- 根据具体环境,替换mongo账户名/密码,以及ip地址
- 根据具体需求,替换需要清除的库和表名称
知识改变世界