rsync - find - perl - Super fast delete a folder with large number of files


http://www.dxulab.com/wiki/superfastdeleteafolderwithlargenumberoffiles


posted Jul 4, 2016, 12:08 AM by Dong Xu   [ updated Jul 6, 2016, 9:20 AM]

time perl -e 'for(<*>){((stat)[9]<(unlink))}'

time find ./ -type f -delete (the BEST)

time rsync -a --delete blanktest/ test/

TIME TAKEN (half a million files)
rm command                 not capable of deleting a large number of files
find command with -exec    14 minutes
find command with -delete  5 minutes
perl                       1 minute
rsync with --delete        2 minutes 56 seconds
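
The rm failure in the list above is usually caused by the shell rather than by rm itself: a pattern such as "rm *" expands to one argument per file, and the kernel rejects argument lists above a fixed size ("Argument list too long"). A minimal sketch of that limit and of a glob-free delete, assuming a find that supports -delete (GNU and BSD find do) and using a throwaway directory from mktemp instead of a real folder:

```shell
#!/bin/sh
# Sketch only: everything happens inside a scratch directory from mktemp.
scratch=$(mktemp -d)
mkdir "$scratch/test"

# Create a modest number of files (the post used 500000).
i=1
while [ "$i" -le 2000 ]; do
    : > "$scratch/test/$i.txt"
    i=$((i + 1))
done
created=$(find "$scratch/test" -type f | wc -l | tr -d ' ')

# Upper bound, in bytes, on the argument list accepted for a single command.
echo "ARG_MAX: $(getconf ARG_MAX)"

# find walks the directory itself, so the shell never builds an argument list.
find "$scratch/test" -type f -delete

remaining=$(find "$scratch/test" -type f | wc -l | tr -d ' ')
echo "created $created files, $remaining left after find -delete"

rm -rf "$scratch"
```

With half a million names the expanded "rm *" command line can exceed the ARG_MAX value printed here, while the find invocation stays the same length however many files there are.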
===============================================

Nice article. It inspired me to compare find -delete, rsync, and perl myself, and I got a different ranking: on my PC the leader is find. Setup: Linux 4.2, Ubuntu 14.04, Intel i5 (4 cores), Intel SSD 5xx series, EncFS encryption.

$ time for i in $(seq 1 500000); do echo testing >> $i.txt; done

real 1m13.263s
user 0m7.756s
sys 0m57.268s

Operation was repeated for each test with similar results.

$ time rsync --delete -av ../empty/ ./

real 4m5.197s
user 0m4.308s
sys 1m43.400s

$ time find ./ -delete

real 2m19.819s
user 0m1.044s
sys 0m59.100s

$ time perl -e 'unlink for ( <*> ) '

real 3m17.482s
user 0m2.524s
sys 1m29.196s
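
Since the two sets of timings in this post disagree, it may be worth repeating the comparison at a small scale on your own machine. The following is a rough sketch, not the commenter's actual procedure: the populate and left helpers, the directory names, and the count of 1000 are made up for illustration, everything runs inside a mktemp directory, and date +%s only resolves whole seconds, so n has to be raised considerably before the numbers mean anything.

```shell
#!/bin/sh
# Small-scale rerun of the find vs. perl comparison inside a scratch directory.
scratch=$(mktemp -d)

# populate DIR COUNT: create COUNT small files in DIR (hypothetical helper).
populate() {
    mkdir -p "$1"
    i=1
    while [ "$i" -le "$2" ]; do
        echo testing > "$1/$i.txt"
        i=$((i + 1))
    done
}

# left DIR: print how many regular files remain under DIR (hypothetical helper).
left() {
    find "$1" -type f | wc -l | tr -d ' '
}

n=1000   # the post used 500000; raise this for meaningful timings

populate "$scratch/find_test" "$n"
t0=$(date +%s)
find "$scratch/find_test" -type f -delete
find_secs=$(( $(date +%s) - t0 ))
find_left=$(left "$scratch/find_test")
echo "find -delete: ${find_secs}s, $find_left files left"

populate "$scratch/perl_test" "$n"
perl_left=skipped
if command -v perl >/dev/null 2>&1; then
    t0=$(date +%s)
    # Same one-liner as in the comment; it only sees the current directory.
    (cd "$scratch/perl_test" && perl -e 'unlink for ( <*> )')
    perl_secs=$(( $(date +%s) - t0 ))
    perl_left=$(left "$scratch/perl_test")
    echo "perl unlink: ${perl_secs}s, $perl_left files left"
fi

rm -rf "$scratch"
```

Filesystem, encryption layer, and cache state all affect the result, which is probably why the ranking differs between the two machines described here.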

In Perl, use glob to expand the pattern and pass the result to unlink:

unlink glob "'/tmp/*.*'";

The extra apostrophes make glob treat a pattern that contains spaces as a single string.

Caveats: this won't delete files with no "." in their names, won't delete files whose names start with ".", and does no error reporting.
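
The limits of the "*.*" pattern are easy to check on a handful of files. This sketch uses the shell's own globbing, on the assumption that it treats these cases the same way Perl's glob does (a leading "." must be matched explicitly, and "*.*" requires a literal dot in the name); the file names are made up and live in a mktemp directory:

```shell
#!/bin/sh
scratch=$(mktemp -d)
orig_dir=$PWD
cd "$scratch" || exit 1

# Four sample files; only the first two have a dot that is not the first character.
: > report.txt
: > "with space.log"
: > README
: > .hidden

# Let the shell expand the pattern, then count the resulting arguments.
set -- *.*
matched=$#
total=$(find . -type f | wc -l | tr -d ' ')

echo "pattern *.* matched $matched of $total files:"
printf '  %s\n' "$@"

cd "$orig_dir" || exit 1
rm -rf "$scratch"
```

README and .hidden are left out of the match, so a delete driven by this pattern would leave them behind.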

mkdir empty_dir
rsync -a --delete -P empty_dir/ your_folder/

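Note: keep the trailing "/" on both paths! On the source in particular, it makes rsync sync the contents of the directory rather than the directory itself.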
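
The effect of the trailing slash can be tried on a scratch copy first. With "empty_dir/" as the source, rsync syncs the contents of the empty directory into the target, so --delete removes everything in the target that is not in the source; without the slash, rsync would copy the directory empty_dir itself into the target instead. A minimal sketch, assuming rsync is installed (the script skips the step otherwise) and using made-up file names under mktemp:

```shell
#!/bin/sh
scratch=$(mktemp -d)
mkdir "$scratch/empty_dir" "$scratch/your_folder"

# Sample content: a plain file, a dotfile, and a nested subfolder.
: > "$scratch/your_folder/a.txt"
: > "$scratch/your_folder/.hidden"
mkdir "$scratch/your_folder/sub"
: > "$scratch/your_folder/sub/b.txt"

status="rsync not installed"
if command -v rsync >/dev/null 2>&1; then
    # Trailing slashes: sync the contents of empty_dir into your_folder.
    rsync -a --delete "$scratch/empty_dir/" "$scratch/your_folder/"
    status=emptied
fi

# Count everything left inside your_folder, including dotfiles and subfolders.
left_count=$(find "$scratch/your_folder" -mindepth 1 | wc -l | tr -d ' ')
echo "$status: $left_count entries left in your_folder"

rm -rf "$scratch"
```

Unlike the "*.*" glob, this also removes dotfiles and subfolders, while your_folder itself stays in place.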
===========================
Check a folder and list its sub-folders sorted by size:
du -sh * | sort -h
============================


posted by 张同光