
Traversing Directories and Finding Files with Multiple Processes

2012-10-15 14:38  钱吉

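Sometimes you need to find files of a particular type somewhere under a deeply nested directory. This post walks the directory tree recursively and uses multiple processes to speed up the search.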

If you only need to search a single directory, the script getfile_single.pl is:

use strict;
use warnings;
use File::Spec;

@ARGV == 2 || die "usage: perl getfile_single.pl dir logfile\n";
my $dir     = $ARGV[0];    ## input directory
my $logfile = $ARGV[1];    ## result file that receives the matching paths
## Regular expression that the wanted file paths must match.
## '\.txt$' is only a placeholder; replace it with your own pattern.
my $grep = qr/\.txt$/;
open(FILE, '>', $logfile) || die "can't write the file: $!\n";
::ErgodicDirToGetFile($dir);
close(FILE);

## Recursively walk $dir and log every file whose path matches $grep.
sub ::ErgodicDirToGetFile
{
    my ($dir) = @_;
    return unless -d $dir;
    opendir(my $dh, $dir) || return;    ## skip directories that cannot be read
    my @entries = grep(!/^\.\.?$/, readdir($dh));    ## drop "." and ".."
    closedir($dh);

    foreach my $str (@entries)
    {
        ## catfile picks the right separator for the current platform
        my $path = File::Spec->catfile($dir, $str);
        if (-d $path)
        {
            ::ErgodicDirToGetFile($path);
        }
        elsif ($path =~ $grep)
        {
            print FILE "$path\n";
        }
    }
}
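
The same recursive walk can also be written with Perl's core File::Find module instead of hand-written recursion. The sketch below is an alternative, not the script above: the helper name find_matching_files and the \.txt pattern are made up for illustration, and it returns the matching paths as a list rather than writing them to a log file.

```perl
use strict;
use warnings;
use File::Find;

## Hypothetical helper (not part of the original scripts): returns every
## non-directory entry under $dir whose full path matches $pattern.
sub find_matching_files
{
    my ($dir, $pattern) = @_;
    my @found;
    find(
        {
            no_chdir => 1,    ## keep $_ equal to the full path
            wanted   => sub {
                push @found, $File::Find::name
                    if !-d $_ && $File::Find::name =~ $pattern;
            },
        },
        $dir
    );
    return sort @found;
}

## Usage: perl find_sketch.pl dir   (".txt" is only an example pattern)
if (@ARGV)
{
    print "$_\n" for find_matching_files($ARGV[0], qr/\.txt$/);
}
```

File::Find always joins path components with "/", so on Windows the printed paths may look different from those produced by the recursive version.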

 

If there are several directories to search, running one process per directory may speed up the search:

use strict;
my @all;
my $str;
my @pid;
my $i;
## dir.list lists every directory you want to search, one per line
open(FILE, '<', "dir.list") || die "can't open the file: $!\n";
@all=<FILE>;
chomp(@all);
close(FILE);

for($i=0; $i<@all; $i++)
{
    $str = $all[$i];
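    defined($pid[$i] = fork()) || die "can't fork: $!\n";
    if ($pid[$i] == 0)    ## child process starts here
    {
        ## result_$i.scp receives the paths found under directory $i
        my @cmd = ('perl', 'getfile_single.pl', $str, "result_$i.scp");
        print "@cmd\n";
        ## passing a list keeps each argument intact, even if a path contains spaces
        my $status = system(@cmd);
        exit($status == 0 ? 0 : 1);
    }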
}
for($i=0; $i<@all; $i++)
{
    waitpid($pid[$i],0);
}
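
The script above starts one child per line of dir.list, so a long list launches many processes at once. One possible variation is to cap how many children run at the same time. The sketch below assumes a Unix-like system; run_limited and the limit of 4 are hypothetical and do not come from the original scripts.

```perl
use strict;
use warnings;

## Hypothetical helper: runs each code ref in @jobs in its own child
## process, keeping at most $max children alive at any moment.
## Returns a hash that maps each child pid to its exit code.
sub run_limited
{
    my ($max, @jobs) = @_;
    my %exit_code;
    my $running = 0;
    foreach my $job (@jobs)
    {
        if ($running >= $max)    ## pool is full: wait for one child to end
        {
            my $done = wait();
            $exit_code{$done} = $? >> 8;
            $running--;
        }
        defined(my $pid = fork()) || die "can't fork: $!\n";
        if ($pid == 0)    ## child: do the work, then leave
        {
            $job->();
            exit(0);
        }
        $running++;
    }
    while ((my $done = wait()) != -1)    ## reap the remaining children
    {
        $exit_code{$done} = $? >> 8;
    }
    return %exit_code;
}

## Usage: perl pool_sketch.pl dir.list
## Follows the same dir.list / result_$i.scp convention as the script above.
if (@ARGV)
{
    open(my $list, '<', $ARGV[0]) || die "can't open the file: $!\n";
    chomp(my @all = <$list>);
    close($list);
    my $i = 0;
    my @jobs = map {
        my @cmd = ('perl', 'getfile_single.pl', $_, 'result_' . $i++ . '.scp');
        sub { exit(system(@cmd) == 0 ? 0 : 1) };
    } @all;
    run_limited(4, @jobs);    ## 4 is an arbitrary example limit
}
```

A sensible limit depends on the disks involved: directories on the same physical drive compete for I/O, so more processes do not always mean a faster search.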