Re: Problem with glob and filenames containing '[' and ']'



On 09/27/2006 06:33 AM, David Squire wrote:
Hi folks,

I'm having trouble using glob to find filenames that contain '[' and
']', even though I am escaping those meta-characters. Here is an example
script and output:

----

#!/usr/bin/perl

use strict;
use warnings;

use CGI::Deurl;

for my $EncodedFile (
'/damocles/documents/ENH1260/2006/2/Short
assignment/20331975_week9%5B1%5D.txt',
'/damocles/documents/ENH1260/2006/2/Short
assignment/20331975_week9.txt',

Each of these literals is wrapped across two source lines, so both strings contain an embedded newline ("Short\nassignment").

I think that's going to confuse glob big-time.

) {
my $OriginalFileBase = deurlstr($EncodedFile);
$OriginalFileBase =~ s/\.[^.]+$//; # trim extension
$OriginalFileBase =~ s/([\[\]{}?*~\ ,'`"])/\\$1/g; # escape characters that are meta in glob;

Maybe quotemeta is better.

print "\$OriginalFileBase = $OriginalFileBase\n";
my @CandidateOrigFiles = glob ("$OriginalFileBase*");
print "\@CandidateOrigFiles:\n", join "\n", @CandidateOrigFiles;
print "\n###########################################################\n";
}

----

Output:

Sep 27 - 9:31pm % ./test.pl
<ENTER THE CGI QUERY. End with CTRL+D>
$OriginalFileBase = /damocles/documents/ENH1260/2006/2/Short\
assignment/20331975_week9\[1\]
@CandidateOrigFiles:

###########################################################
$OriginalFileBase = /damocles/documents/ENH1260/2006/2/Short\
assignment/20331975_week9
@CandidateOrigFiles:
/damocles/documents/ENH1260/2006/2/Short
assignment/20331975_week9%5B1%5D.txt
/damocles/documents/ENH1260/2006/2/Short
assignment/20331975_week9%5B1%5D.txt.webbed
/damocles/documents/ENH1260/2006/2/Short assignment/20331975_week9[1].doc
###########################################################


----

As you can see, the first iteration of the for loop produces no matches.
I have included the second, shortened filename, example to demonstrate
that the file I want really does exist. Likewise, at the bash prompt I
can do:

Sep 27 - 9:31pm % ls /damocles/documents/ENH1260/2006/2/Short\
assignment/20331975_week9\[1\]*
/damocles/documents/ENH1260/2006/2/Short assignment/20331975_week9[1].doc

I am at a loss...


DS

I changed your program slightly to run on my system, and I get different output:

#!/usr/bin/perl

use strict;
use warnings;


if (1) {
mkdir 'assignment';
system ('touch','assignment/file-1.txt',
'assignment/file[2].txt', 'assignment/file[3].txt');
}


for my $EncodedFile (
'assignment/file[',
'/damocles/documents/ENH1260/2006/2/Short
assignment/20331975_week9%5B1%5D.txt',
'/damocles/documents/ENH1260/2006/2/Short
assignment/20331975_week9.txt',
) {
my $OriginalFileBase = deurlstr($EncodedFile);
$OriginalFileBase =~ s/\.[^.]+$//; # trim extension
#$OriginalFileBase =~ s/([\[\]{}?*~\ ,'`"])/\\$1/g; # escape characters that are meta in glob;
$OriginalFileBase = quotemeta $OriginalFileBase;
print "\$OriginalFileBase = $OriginalFileBase\n";
my @CandidateOrigFiles = glob ("$OriginalFileBase*");
print "\@CandidateOrigFiles:\n", join "\n", @CandidateOrigFiles;
print "\n###########################################################\n";
}

sub deurlstr {
my $url = shift();
$url =~ s/%([0-9a-f]{2})/chr(hex($1))/ige;
$url
}

----------
Output:

$OriginalFileBase = assignment\/file\[
@CandidateOrigFiles:
assignment/file[2].txt
assignment/file[3].txt
###########################################################
$OriginalFileBase = \/damocles\/documents\/ENH1260\/2006\/2\/Short\
assignment\/20331975_week9\[1\]
@CandidateOrigFiles:

###########################################################
$OriginalFileBase = \/damocles\/documents\/ENH1260\/2006\/2\/Short\
assignment\/20331975_week9
@CandidateOrigFiles:

###########################################################

-------

As you can see, $OriginalFileBase contains a newline on the last two iterations. Once that newline is escaped (by quotemeta, as in my version), glob looks for a directory literally named "Short\nassignment" and finds nothing, because the real directory "Short assignment" has a space there, not a newline. (The directory also doesn't exist on my system :-) )

I don't know what a newline character in a directory name means to glob, but it probably doesn't help ;-)
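
A quick way to confirm the embedded newline is to test the string before it ever reaches glob. This is only a minimal sketch; the variable names ($wrapped, $has_newline, $visible) are made up for illustration:

```perl
use strict;
use warnings;

# A single-quoted literal wrapped across two source lines keeps the line break.
my $wrapped = '/damocles/documents/ENH1260/2006/2/Short
assignment/20331975_week9.txt';

# Check for the newline explicitly.
my $has_newline = ($wrapped =~ /\n/) ? 1 : 0;
print "contains newline: ", ($has_newline ? "yes" : "no"), "\n";

# Print the string with the newline made visible as a literal \n.
(my $visible = $wrapped) =~ s/\n/\\n/g;
print "$visible\n";
```

Printing the string with the newline shown as \n makes the problem much easier to spot than in the wrapped output above.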

Build up the original file name strings this way instead:

'/damocles/documents/ENH1260/2006/2/Short '
. 'assignment/20331975_week9%5B1%5D.txt'
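
Putting the pieces together, here is a rough sketch of how the lookup might work once the newline is gone. It rests on several assumptions: a Unix-like system, a temporary directory standing in for /damocles/documents/ENH1260/2006/2, and bsd_glob from File::Glob swapped in for plain glob, because bsd_glob does not split its pattern on whitespace. The escape_glob helper is hypothetical, not something from the original script:

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Same decoding as deurlstr in the script above.
sub deurlstr {
    my ($url) = @_;
    $url =~ s/%([0-9a-f]{2})/chr(hex($1))/ige;
    return $url;
}

# Hypothetical helper: backslash-escape characters that bsd_glob treats specially.
sub escape_glob {
    my ($path) = @_;
    $path =~ s/([\\\[\]{}?*~])/\\$1/g;
    return $path;
}

# Temporary stand-in for the real directory tree (an assumption for this sketch).
my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/Short assignment" or die "mkdir failed: $!";
for my $name ('20331975_week9[1].doc', '20331975_week9.txt') {
    open my $fh, '>', "$dir/Short assignment/$name" or die "open failed: $!";
    close $fh;
}

# Concatenate the pieces instead of wrapping the literal, so no newline sneaks in.
my $encoded = 'Short '
            . 'assignment/20331975_week9%5B1%5D.txt';

my $base = "$dir/" . deurlstr($encoded);
$base =~ s/\.[^.]+$//;    # trim extension

my @matches = bsd_glob(escape_glob($base) . '*');
print "$_\n" for @matches;
```

With the brackets escaped and the space left alone, the pattern should match only the file with [1] in its name. On Windows, backslash escaping in File::Glob behaves differently, so treat this as Unix-only.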


HTH


--
paduille.4058.mumia.w@xxxxxxxxxxxxx