my script crashes when I try to rename the file!
- From: He Who Greets With Fire <Entwadumayla@xxxxxxxxxxxxxxx>
- Date: Tue, 04 Mar 2008 20:40:37 GMT
This is a post from another perl group, but I thought I would post it
here, too.
I worked through the program and everything seems to work except the
file renaming code. See my comments at the bottom....
On Tue, 04 Mar 2008 11:13:44 -0600, He Who Greets With Fire
<Entwadumayla@xxxxxxxxxxxxxxx> wrote:
On Tue, 4 Mar 2008 16:15:20 +0000, Ben Morrow <ben@xxxxxxxxxxxx>
wrote:
Quoth He Who Greets With Fire <Entwadumayla@xxxxxxxxxxxxxxx>:
OK, thanks, but the script does not seem to rename the files.
I added some troubleshooting code, most of which I commented out. I
also moved a copy of the personalinjury folder and all its files
inside the C:\Perl directory so it can access it directly.
Don't do that. You can set the working directory from within your Perl
script using the chdir function. In any case, the working directory may
not be what you expect under Win32.
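A minimal sketch of the chdir suggestion, not taken from the thread. A temporary directory stands in for the real folder, and the variable names are made up for illustration:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# A temporary directory stands in for the real folder (an assumption for this sketch).
my $dir   = tempdir( CLEANUP => 1 );
my $start = getcwd();

# chdir returns false on failure, so check it and report $! like any other system call.
chdir $dir or die "could not chdir to '$dir': $!";
my $now = getcwd();
print "working directory is now $now\n";

# Relative paths such as 'personalinjury/*.htm' are resolved against this directory.
chdir $start or die "could not chdir back to '$start': $!";
```

Printing getcwd() once at the top of a script is a cheap way to confirm which directory a relative glob will actually search.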
The directory/folder location is not a problem. Like I said, the
script is indeed able to access the files in the folder, open them,
and step through them. So everything seems to be OK on that
front.
See below for my additional comments.
#!/bin/perl
Perl is *never* installed as /bin/perl.
But it already works in that regard: the script executes.
#sleep 2;
print "here I am! \n";
Diagnostics like this are better given with warn, which will (a) print
them to STDERR, where they ought to be, and (b) tell you where you are in
the script.
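To illustrate the difference, here is a small sketch. The file name is hypothetical, and the __WARN__ handler is there only so the example can show what warn produced; a normal script would not need it:

```perl
use strict;
use warnings;

my $file = 'personalinjury/example.htm';   # hypothetical file name

# Keep a copy of each warning in @seen and still print it to STDERR.
# This handler exists only to make the output of warn inspectable here.
my @seen;
$SIG{__WARN__} = sub { push @seen, $_[0]; print STDERR $_[0] };

print "here I am!\n";             # goes to STDOUT, with no location information
warn  "about to open '$file'";    # goes to STDERR; Perl appends " at <script> line <N>."
```

Because the warn message does not end in a newline, Perl adds the script name and line number, which is what makes it more useful than a bare print for tracing.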
Well, I'm not actually a programmer, just someone trying to do some
organization of my files. So, that issue is not a concern right now.
#sleep 2;
my $counter =1;
foreach my $file ( glob 'personalinjury/*.htm' ) {
# print "here I am A \n";
# sleep 1;
open my $PI, '<', $file or die "could not open '$file' $!";
# print "here I am! B \n";
# sleep 1;
print $counter;
print "\n";
while ( <$PI> ) {
# print "\n inside whileloop";
I AM getting to this point here.
next unless /Citation: [\d-]+.*([\d.]+)/;
but I never get to this point here--apparently the regex never sees a
match for the "Citation:" etc string.
Here is a screen shot of the typical file, with a red arrow pointing
to the string in this particular file that I want to match.
I do not know why the regex does not see a match, because it looks
like it matches it???
See here:
http://img225.imageshack.us/img225/91/citationue2.jpg
*DON'T* do that.
Don't do what?
Had you done the right thing, and copy-pasted a small
section of the relevant file into your message, you would have found
that the file doesn't in fact contain the string 'Citation: whatever' at
all. It's an HTML file, so there is markup in there as well, and the
string may well be spread across several lines. Get into the habit of
looking at files in a text editor before you try parsing them with Perl.
That is a good point. When I wrote my financial news project that
parsed news stories for negative and positive words, I passed over all
words that were surrounded by html brackets.
Here are two excerpts from the source HTML for a typical file in that
folder.
First, here is the HTML source snippet that the script is looking for:
<td class="toolbar" align=right valign=top width="1%"
nowrap>Citation: </td>
<td class="toolbar" valign=top width="99%"><b>21-340 Dorsaneo, Texas
Litigation Guide § 340.02</b></td>
Yes, you are correct: the HTML code is throwing off the script.
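One way around both the markup and the line break, sketched here as an illustration rather than a tested fix for the real files, is to match against the whole file contents at once instead of line by line. The here-document below stands in for one file's contents; with a real file handle the same string could be read with `do { local $/; <$PI> }`. The variable names are made up for the example:

```perl
use strict;
use warnings;

# Stand-in for the contents of one .htm file, copied from the excerpt above.
my $html = <<'HTML';
<td class="toolbar" align=right valign=top width="1%"
nowrap>Citation: </td>
<td class="toolbar" valign=top width="99%"><b>21-340 Dorsaneo, Texas
Litigation Guide § 340.02</b></td>
HTML

# The /s modifier lets . match newlines, so the pattern can span lines.
# It finds "Citation:", skips the markup up to the <b> cell, and captures
# the section number that follows the § sign.
my ($section) = $html =~ m{Citation:\s*</td>.*?<b>.*?§\s*([\d.]+)</b>}s;

print defined $section ? "section: $section\n" : "no match\n";
```

This assumes every file lays out the citation cell the same way; a file with different markup would simply report no match.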
Here is another snippet that looks much more promising: the TITLE of
the HTML page. This is not the instance of "Citation..." that I
was looking for, but now that I see it, it looks like a good candidate
for use as a filename:
<title>Get a Document - by Citation - 21-340 Dorsaneo, Texas
Litigation Guide § 340.02</title>
Are the angle brackets special characters in perl so that they have to
be backslashed inside the regex?
I wonder if this regex would work?
next unless /\<title\>Get a Document - by Citation -
[\d-]+.*([\d.]+)\<\/title\>/;
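A quick check of that question, as a sketch with the title joined onto one line (an assumption; in the file it may be wrapped). Angle brackets are ordinary characters in a Perl regex, so the backslashes are harmless but unnecessary. The bigger issue is the greedy `.*` in front of the capture group, which swallows most of the number:

```perl
use strict;
use warnings;

# The title from the excerpt, joined onto a single line for this test.
my $title = '<title>Get a Document - by Citation - 21-340 Dorsaneo, Texas Litigation Guide § 340.02</title>';

# The pattern as proposed, minus the backslashes on < and >. Using m{...}
# as the delimiter means the / in </title> needs no escaping either.
my ($greedy) = $title =~ m{<title>Get a Document - by Citation - [\d-]+.*([\d.]+)</title>};

# Anchoring on the § sign keeps .* from eating the digits to be captured.
my ($section) = $title =~ m{<title>.*§\s*([\d.]+)</title>};

print "greedy version captured:   $greedy\n";
print "anchored version captured: $section\n";
```

The greedy version captures only the final "2", because `.*` takes as much as it can and leaves the group a single character; the anchored version captures "340.02".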
my $newfile = $1;
rename $file, "$newfile.htm" or die "could not mv '$file' $!";
print "\n renamed a file";
sleep 1;
last;
}#end while
If you had used proper indentation, you would be able to see that
comments like this are completely useless.
Not sure what you mean?
Ben
Well, I changed the program quite a bit so that it targets the
title string shown above. I am now able to find the title string,
extract the needed numbers, and place those numbers in a string
variable.
BUT the problem is that the program crashes whenever I try to rename
the file using the string that I extracted.
Here is the program.
#!/bin/perl
#sleep 2;
print "here I am! \n";
sleep 2;
my $counter =1;
foreach my $file ( glob 'personalinjury/*.htm' ) {
# print "here I am A \n";
sleep 1;
open my $PI, '<', $file or die "could not open '$file' $!";
# print "here I am! B \n";
# sleep 1;
print $counter;
print "\n";
while ( <$PI> ) {
print "\n inside whileloop";
sleep 1;
#<title>Get a Document - by Citation - 21-340 Dorsaneo, Texas
#Litigation Guide § 340.02</title>
warn;
next unless /\<title\>.+Guide\s+§\s+(\d+\.\d+).?\<\/title\>/;
my $newfile = $+;
This rename line below is what causes it to crash, so I commented it
out:
#rename $file, "$newfile.htm" or die "could not mv '$file' $!";
But I cannot read what the error message says because the dos window
just closes. Where can I read what was in the window before it
crashed? And how can I rename the file? What went wrong with the
renaming?
The $newfile variable DOES contain the accurate and desired
information at this point, as shown by the print statement below.
print "\n renamed file to ";
print $newfile, "\n";
sleep 1;
last;
}#end while
$counter++;
print "\n count is ";
print $counter;
print "\n";
#sleep 1;
close $PI;
} #end foreach
sleep 5;
any thoughts????
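Two things may be worth checking, offered as guesses rather than a diagnosis. First, the error message stays readable if the script is run from an already open command prompt (cmd.exe) instead of by double-clicking it. Second, in the program above the file is still open when rename is called, and Windows usually refuses to rename an open file; the new name also lacks the folder part, so a successful rename would move the file out of personalinjury. The sketch below closes the handle first and builds the target path in the same directory. It uses a throwaway temporary directory and a made-up sample file, so it does not touch the real files:

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;
use File::Temp qw(tempdir);

# Throwaway directory and sample file (assumptions for this sketch only).
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'sample.htm' );
open my $out, '>', $file or die "could not create '$file': $!";
print {$out} "<title>Get a Document - by Citation - 21-340 Dorsaneo, Texas Litigation Guide § 340.02</title>\n";
close $out or die "could not close '$file': $!";

# Read the file and pull the section number out of the title line.
open my $PI, '<', $file or die "could not open '$file': $!";
my $newname;
while ( my $line = <$PI> ) {
    next unless $line =~ m{<title>.+§\s*(\d+\.\d+).?</title>};
    $newname = $1;
    last;
}
close $PI;    # close BEFORE renaming; an open handle can block rename on Windows

# Build the new path in the same directory as the original file.
my $target;
if ( defined $newname ) {
    $target = File::Spec->catfile( dirname($file), "$newname.htm" );
    rename $file, $target or die "could not rename '$file' to '$target': $!";
    print "renamed to $target\n";
}
```

If two files yield the same section number, the second rename would overwrite the first, so a check such as `-e $target` before renaming may be wise.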