Re: readdir formulated badly? gives wrong count



On Dec 1, 1:47 am, rea...@xxxxxxxxxxx wrote:
Sorry to plop so much shabby code up here but I think I've stared at
this a little too long and am now incapable of catching my error.

This started out as just a helper script to help solve this problem:

Needing to cp files with number names from one directory to another.
Sometimes there might be some numbered files already in the target
directory. So the problem was really two fold.

1) Find the right files to copy by specific content.

I'm doing this job with grep -rl, it seems to be the fastest by
a hefty margin, to compile a list of the right files.

2) Once the list is compiled, copy the files over to a new directory
making sure not to clobber possible exisiting files.

I'll probably try to turn the perl script into something more general
that does the whole job eventually if this chore looks like it might
recur enough.

So (2) is all this script was meant to do at this point.

The script does the job and appears to run error free. But after the
work is done I attempt to report 2 things. The number of files copied
and the number of files now in the Target.

The second count is consistently wrong by a wide wide margin and I
am too blind to see why.

Here is a sample run (code encluded at the end) running the script
the way it was intended to be used (script.pl TargetDir SourceList):

./cpNoClobber.pl test3 SourceList

<16> Files copied to:
/home/reader/News/agent/nntp/news.gmane.org/gmane/comp/lang/perl/test3

<17> files now in:
/home/reader/News/agent/nntp/news.gmane.org/gmane/comp/lang/perl/test3

Now lets look at the results of `ls' on the target:

ls test3/[0-9]*|wc -l
336

The script gets numbering right with no clobbers I think, but the reporting
is way wrong.

=====* Code *===== =====* Code *===== =====* Code *=====

./cpNoClobber.pl

#!/usr/local/bin/perl -w


# The following lines are your friend
use strict;
use warnings;

if(!$ARGV[1]){
usage();
print "Too few arguments.. exiting\n";
exit;

}

if($ARGV[2]){
usage();
print "Too many arguments... exiting\n";
exit;}

use Cwd;
my $Cwd = getcwd;

use File::Copy;


Define variables in the smallest scope
Plus only define variables you use, otherwise it can get confusing

my ($HiNum, $Copied, $TrgDir, @DirContent, $Cnt, $Line);
$TrgDir = shift;
$HiNum = 0;
$Copied = 0;


#Use lexical file and directory handles:

opendir my $DIR, $TrgDir or die "can't opendir $TgtDir: $!;

opendir(DIR,$TrgDir) or die "can't opendir $TrgDir: $!";

## Find out if target directory has any numbered files in it.
## If it does get the highest number in there

To declare in smallest scope:
my @DirContent = grep { /^[0-9]/ } readdir($DIR);
@DirContent = grep { /^[0-9]/ } readdir(DIR);

This test add nothing. The for loop will fail if there is no content

if ($DirContent[0]){
for(@DirContent){

Do these files only contain numbers?

if($_ > $HiNum){
$HiNum = $_;
}
}

}

## If we already have numbered files, Start our counter with
## the highest one
if($HiNum){
$Cnt = $HiNum;

}

while(<>){
chomp;
$Cnt++ ;
## The first if clause here is unnecessary now... to be removed
if($TrgDir){
## Add the correct path if our line doesn't start with '/'

Use different delimiters if searching for a slash( also add some
whitespace
e.g. if ( !#^/#)

if(!/^\//){
$Line = $Cwd ."/". $_;
}
if($TrgDir !~ /^\//){
$TrgDir = $Cwd."/".$TrgDir;
}
## Do what we came here for: copy the target files into target
## directory with fnames that won't clobber any existing filenames
copy($Line,$TrgDir. "/". $Cnt) or die "can't copy $Line to <$TrgDir>: $!";
$Copied++;
}

}

## Report how many files were copied
print " <$Copied> Files copied to:\n $TrgDir\n\n";

print " ".CountTrgDir()."\n";

should be closedir($DIR);

close(DIR);

sub CountTrgDir {
my $DirCnt = 0;
## Recount the TargetDir

The main problem, I believe, is that you are using a stale directory
handle. closedir (in main) and opendir here.

@DirCnt = grep { /^[0-9]+/ } readdir(DIR);

## Just in case there was something wrong with my technique using
## the array to get a count.. also tried the formulation commented out
## below to see the actual count as it happened... still wrong
# @DirCnt = grep { /^[0-9]+/&& print "Counting <". $DirCnt++.">\n" } readdir(DIR);
return "<".(@DirCnt + 1)."> files now in:\n $TrgDir\n";

}


use File::Basename;

sub usage {

# use this line to get the name of your script
# perldoc File::Basename
my ( $myscript ) = fileparse( "$0", "" );


print <<EOM;

Purpose: To copy files from a list on disk to the directory of choice.
Numbering them with nonClobber names as we go (even if some
numbered files are already in the directory).
Usage: \`$myscript TARGETDIRECTORY SOURCEFILE'
(DIRECTORY= name of target directory)
NOTE: SOURCEFILE is likely to have been compiled like this:
grep -rl 'REGEX' SOMEDIRECTORY >SOURCEFILE (where SOMEDIRECTORY
is a directory full of email or posts. We are looking to skim
off those with REGEX placing the filenames in SOURCEFILE, and
copy them to another directory using \`$myscript ', so as to leave
mail or news directory unharmed.

EOM

}


Made a few comments in the code. None of them tested - off the top of
my head.
HTH, Ken

.



Relevant Pages

  • readdir formulated badly? gives wrong count
    ... Sorry to plop so much shabby code up here but I think I've stared at ... Sometimes there might be some numbered files already in the target ... I'll probably try to turn the perl script into something more general ... NOTE: SOURCEFILE is likely to have been compiled like this: ...
    (perl.beginners)
  • Re: readdir formulated badly? gives wrong count
    ... Sometimes there might be some numbered files already in the target ... I'll probably try to turn the perl script into something more general ... You are doing boolean tests on the contents of array elements when you ... sub usage { ...
    (perl.beginners)
  • Re: .old files being created by root
    ... The external program running on another server but connected to the overall system network apparently picks them up with an ftp program. ... the other end must be running a script (not a human logging in ... She should be in the target directory or some directory ... you never said if this is an SCO UNIX OS or Linux or some other UNIX. ...
    (comp.unix.sco.misc)
  • Re: kbuild make deb patch
    ... > target in any way. ... +# Simple script to generate a deb package for a Linux kernel. ... +# package install and removal. ... +cat <<EOF> debian/changelog ...
    (Linux-Kernel)
  • Re: Slightly more complex Copy script - a challenge perhaps?
    ... I'll start digging through this script ... have a new folder created within ... where the SOURCE is newer than the TARGET. ... Call DisplaySourceFileInformation ...
    (microsoft.public.scripting.vbscript)