Re: Please critique this short script that scans a log file
From: Brian McCauley (nobull_at_mail.com)
Date: 12/17/03
- Next message: ngoc: "PAR error"
- Previous message: Brian McCauley: "Re: What CPU/Memory does a Wait cost?"
- In reply to: J.W.: "Please critique this short script that scans a log file"
- Next in thread: J.W.: "Re: Please critique this short script that scans a log file"
- Reply: J.W.: "Re: Please critique this short script that scans a log file"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: 17 Dec 2003 18:18:14 +0000
J.W. <jwngaa@att.net> writes:
> I'm new to Perl and wrote a simple script to scan a database log file
> with embedded timestamps. This is similar to scanning other common
> text files with timestamped entries such web server utilization logs,
> transaction logs, etc.
>
> This script already works correctly and the -w doesn't report any
> warnings. However, I'd like to get any feedback on making it better.
Oooh.
First, "use warnings" rather than -w unless you need to run on Perl <5.6
> My experience has been with C++ and ksh/awk, so I'm not sure my brain
> is solving problems the "Perl" way yet.
> The example input file and resultant output file is at the bottom of
> this message.
>
> Notes:
> -- purposely didn't use Date::Calc (or any other non-standard module)
> -- could've used longer regex instead of nested if(regex) {if (regex)}
> but avoided it because of NFA regex engine
> -- the timestamp regex could've been simplified to /(...) (...) (..)
> (..):(..):(..) (....)/ but I deliberately wanted to use character
> classes to only extract valid date strings
Hmmm... OK I won't criticise that. (But it's a bad idea).
> -----
>
> #!/usr/bin/perl
> #
> # Read db2uext2.log.NODE0000 and calculate archive copy
> # durations (in seconds) between "Time Started" and
> # "Time Completed"
>
> use strict;
Add "use warnings"
> use Time::Local;
> use Time::Local 'timegm_nocheck';
>
> my $regex_ctime;
> my %MonthIndex;
> my $ts1;
> my $ts2;
> my $ts1_started;
> my $ts2_completed;
> my $rc;
Don't declare all your variables up-top, declare them in the smallest
applicable lexical scope as it defeats half the point of having
explicit declaration in the first place. 99% of the time this means
the declaration should be combined with the first assignment. This
has nothing particular to do with Perl, it applies equally in many
languages including C++.
> # $1 $2 $3 $4 $5 $6 $7
> #Time Started: Mon Dec 17 13:47:42 2001
> #Time Started: Wed Jul 23 17:35:35 2003 (20030723173535)
>
> $regex_ctime = qr/ (Sun|Mon|Tue|Wed|Thu|Fri|Sat)
> (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) ([ |0-3][0-9])
> ([0-2][0-9]):([0-5][0-9]):([0-5][0-9]) ([0-9][0-9][0-9][0-9])/o;
Never use the /o regex qualifier without understanding it. Since /o
does nothing in the line above I can reasonably assume you don't
understand it. :-)
> %MonthIndex = (
> "Jan", 0,
> "Feb", 1,
> "Mar", 2,
> "Apr", 3,
> "May", 4,
> "Jun", 5,
> "Jul", 6,
> "Aug", 7,
> "Sep", 8,
> "Oct", 9,
> "Nov", 10,
> "Dec", 11);
Consider using the "fat comma" => for readability.
> while (<>) {
> # chomp; # not needed for this type of processing
> if (/^Time Started:/) {
> if ($_ =~ $regex_ctime) {
IMHO this is better written as
if (/$regex_ctime/) {
> $ts1_started = timegm($6, $5, $4, $3,
> $MonthIndex{$2}, $7);
> $ts1 = "$1 $2 $3 $4:$5:$6 $7";
> }
You probably wanted an else clause to do something if Time Started:
wasn't followed by a valid time.
If not you may as well have combined the regex into
if ( /^Time Started:\s*$regex_ctime/)
> if (/^RC\s+:\s+(\d+)/) { # Return Code
> $rc = $1;
> }
Consider using post-condition syntax
$rc = $1 if /^RC\s+:\s+(\d+)/ # Return Code
> if (/^Time Completed:/ && defined($ts1_started)) {
Why are you checking defined($ts1_started)? Surely if the input was
invalid you'd _want_ your program to crash out or do something.
It is the bane of my life that inexperienced programmers often think
it is a good idea to make a program that continues without comment if
fed invalid data. Yes, there are times when this makes sense but in
general it is better that a program terminate or emit some sort of
error rather than do something undefined in the case that it is fed
invalid input. This has nothing to do with Perl.
> if ($_ =~ $regex_ctime) {
> $ts2_completed = timegm($6, $5, $4, $3,
> $MonthIndex{$2}, $7);
> $ts2 = "$1 $2 $3 $4:$5:$6 $7";
This code is a duplication of what you have above. Whenever you find
yourself writing the same thing twice you should always look to see
how it can be avoided. This has nothing to do with Perl, it applies
in all programming languages.
In this case you could have captured both started and completed in one
operation and stored the result in a hash.
if ( /^Time (Started|Completed):\s*($regex_ctime)/ )
$t{$1} = timegm($8, $7, $6, $5, $MonthIndex{$4}, $9);
$ts{$1} = $2;
)
> printf("$ts1 $ts2 %4d %3d\n", $rc ,
Do not compute the format string of printf() unless you
are really are computing a string of printf template elements.
Instead put %s in the format string and put the data to be
interpolated in the argument list. This has nothing to do with Perl,
it applies equally in all languages with printf().
> ($ts2_completed - $ts1_started));
> }
> }
> }
>
>
> -----
>
> #example log file excerpt used as input
> ********************************************************************************
> Time Started: Mon Dec 17 13:18:12 2001
>
> Parameters Passed:
> db2uext2 -OSAIX -RLSQL07010 -RQARCHIVE -DBKP1 -NNNODE0000
> -LP/db2/KP1/db2kp1/NODE0000/SQL00001/SQLOGDIR/ -LNS0007259.LOG
> System Action: ARCHIVE
> Target: DISK
> RC : 0
> Time Completed: Mon Dec 17 13:18:27 2001
>
> ********************************************************************************
You probably want a line of ******* to make your program clear all
its variables so that if one stanza is missing a particular data item
id does not carry forward from the previouis stanza.
--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
- Next message: ngoc: "PAR error"
- Previous message: Brian McCauley: "Re: What CPU/Memory does a Wait cost?"
- In reply to: J.W.: "Please critique this short script that scans a log file"
- Next in thread: J.W.: "Re: Please critique this short script that scans a log file"
- Reply: J.W.: "Re: Please critique this short script that scans a log file"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|