Re: crisis Perl
- From: "jl_post@xxxxxxxxxxx" <jl_post@xxxxxxxxxxx>
- Date: Wed, 22 Oct 2008 12:48:02 -0700 (PDT)
On Oct 16, 6:48 am, cartercc <carte...@xxxxxxxxx> wrote:
> As to writing good code to begin with, it's easier said than done.
> When it's almost midnight, and you've been at work for about 14 hours,
> and you have people (not just your manager, but his boss, the big
> boss, and the guy in charge of the project) calling you every 15
> minutes, and the people you had promised things mad at you because you
> were tasked with doing a job THAT THEY KNEW ABOUT FOUR WEEKS AGO BUT
> DIDN'T GIVE YOU A HEADS UP (!!!) ... well, pretty code takes a back
> seat.
Okay, I know people have already beaten this thread to death, but I
still wanted to add my input.
I've discovered writing any code, including crisis code, is vastly
improved when the lines "use strict;" and "use warnings;" are used at
the top of the script. You might be saying, "I don't have enough time
to put those in!" but hear me out:
It really doesn't take any more time to program with those lines
than without, and they are HUGE timesavers. They basically point out
obscure errors that would take hours to find without them, and they
end up forcing you to write cleaner code to begin with (which doesn't
take any longer than writing "un-clean" code).
I once had a co-worker who was learning Perl. He was writing a
script, and later told me that his script wouldn't compile, but
eventually figured out how to fix it. I praised him for fixing the
bug and asked how he fixed his script, and he replied, "I removed the
'use strict;' line and the script ran just fine." I told my co-worker
that that didn't actually remove the real problem and offered to look
over his code. Sure enough, there was a simple little error that was
easily fixed (that he didn't have enough experience to identify).
Why am I telling you this story? Because a lot of people
mistakenly think that programming with "use strict;" and "use
warnings;" takes lots longer than programming without them, but in
practice, I've found just the opposite is true (they tell you exactly
where a potential error is). All you have to do is learn to program
using them, and then you'll see your development times plummet (or at
least significantly drop). And if you don't regularly program with
them, I suggest learning to as soon as possible, and not waiting for
the next crisis to learn!
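To illustrate the kind of error "use strict;" points out, here is a minimal sketch. The names $total and $totl and the helper compile_error() are made up for illustration; the snippet is compiled in a string eval only so that the surrounding script keeps running and can show the message.

```perl
use strict;
use warnings;

# Compile a snippet in a string eval so this script keeps running
# and we can look at the error that "use strict" produces.
# (compile_error is a hypothetical helper, not part of any library.)
sub compile_error {
    my ($snippet) = @_;
    eval $snippet;    # string eval: compiles, then runs, the snippet
    return $@;        # empty string if the snippet compiled and ran cleanly
}

my $typo_snippet = <<'END';
use strict;
my $total = 0;
$totl = $total + 1;   # typo: $totl was never declared
END

my $err = compile_error($typo_snippet);
print "strict says: $err";
```

The message names the undeclared variable and the line it appears on, which is usually enough to spot the typo right away.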
As an added bonus, the "use warnings;" line won't actually change
the behavior of your program, but will spit out warnings when it
thinks something's wrong with your script. You might consider this a
bonus because if you discover weeks later that, due to the warnings
your script is giving, your input is in the wrong format, you may have
an easier time convincing your manager that the script needs to be
cleaned up. Otherwise, the script will be silent about your incorrect
input assumptions -- giving the managers the false impression that
everything is okay. (I've seen this a lot myself.)
I also recommend adding "use strict;" and "use warnings;" to any
script that doesn't already have it. I know it's easier said than
done when dealing with "crisis code" that you didn't write yourself,
but the effort you take to convert the script (such as declaring
variables) will be worth it. I've had to do it several times myself,
and I can tell you that it gets easier the more times you do it.
And if you contest that you don't have the time to convert your
script to run with "use strict;" then I seriously recommend that AT
THE VERY LEAST you still put in "use warnings;". Like I said before,
"use warnings;" won't change the behavior of your script -- but IT
WILL report many simple errors, like misspelled variable names (which
are often difficult to track down), uninitialized variables, and input
in the wrong format (like trying to add two non-numbers together).
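As a rough sketch of what those warnings look like, the following collects them in an array instead of letting them go to STDERR. The helper warnings_from() and the sample values are assumptions made up for this example.

```perl
use strict;
use warnings;

# Run a block of code and return the warnings it produced.
# (warnings_from is a hypothetical helper for this example.)
sub warnings_from {
    my ($code) = @_;
    my @caught;
    local $SIG{__WARN__} = sub { push @caught, $_[0] };
    $code->();
    return @caught;
}

my @w = warnings_from(sub {
    my $count = "12abc";              # made-up, badly formatted input field
    my $sum   = $count + 1;           # still computes 13, but warns
    my $name;                         # never assigned a value
    my $label = "station: $name";     # warns about the uninitialized value
});

print @w;
```

Note that the script keeps running in both cases; the warnings only report that the data is not what the code assumed.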
Also, be aware that you can still use "strict" and "warnings" in
blocks of code that you add in yourself. If you find yourself adding
a new block of code (such as in a loop or a condition), you can still
put "use strict;" and "use warnings;" in that block to take advantage
of their special powers of error-finding. (If you use any outside
variables in that block, however, you'll have to declare them, either
with "my" where they're first used or with "our" inside the block, to
keep "use strict;" from complaining. But that should be fairly easy
to do.)
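Here is a minimal sketch of that idea, assuming a legacy script that does not use strict at the top. The variable names are invented for the example; only the new block runs under "use strict;" and "use warnings;".

```perl
# Legacy part of the script: no "use strict" up here.
# The variables the new block needs are declared with "my"
# where they are first used, so strict accepts them inside the block.
my @readings = (3, 5, 8);      # made-up legacy data
my $sum      = 0;
$report_title = "totals";      # undeclared global, allowed without strict

{
    # Newly added block: both pragmas apply only inside these braces.
    use strict;
    use warnings;

    for my $reading (@readings) {
        $sum += $reading;
    }
}

print "$report_title: $sum\n";
```

The pragmas are lexically scoped, so the rest of the legacy script is left exactly as it was.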
Also, I highly recommend that you use "or die "[error message goes
here]: $!\n";" (or something similar) after every open(), opendir(),
and chdir() statement you use. Now, several people I've worked with
shun them saying "I don't want the program to die, I want it to
continue." This is usually just a cop-out, but if their reasons are
legitimate, then use "or warn" instead of "or die". The program will
still continue, but it will give a nice warning when something doesn't
happen that it expected to happen.
Another excuse I hear is, "I don't have time to type out 'or die
"blah blah blah: $!\n";' hundreds of times in my code." That's a lousy
excuse, but if they insist, then have them use just " or die;" (or AT
LEAST " or warn;"). Those seven extra keystrokes should be easy
enough for anyone to type that it won't increase development times
(presuming they CAN type).
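A short sketch of both styles, using a made-up path that is assumed not to exist. The sub first_line() is hypothetical; it uses "or die" for the open() and "or warn" for the close().

```perl
use strict;
use warnings;

# Return the first line of a file, dying with a useful message
# if the file cannot be opened. (first_line is a hypothetical helper.)
sub first_line {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die "Cannot open '$path' for reading: $!\n";
    my $line = <$fh>;
    close($fh)
        or warn "Cannot close '$path': $!\n";   # report it, but carry on
    return $line;
}

# Made-up path, assumed not to exist on this machine.
my $line  = eval { first_line("/no/such/dir/input.txt") };
my $error = $@;
print "caught: $error" unless defined $line;
```

Because $! holds the system's reason for the failure, the message says which file failed and why, instead of the script silently carrying on with no data.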
> I mostly grab data from one place, massage it, and send it to another
> place. The format of the data I get, and/or the format of the final
> form, changes several times a year. It's not that the code breaks a
> lot, but that the specifications change. Since we can't 'fix' the
> specifications, we have to 'fix' the code, and guess who gets blamed
> if the code isn't 'fixed.'
I do a lot of "data massaging" myself, and I can say that it's a
life-saver to be able to code with "use strict;" and "use warnings;".
Otherwise, a simple misspelled variable name will create a bug that
can take hours to track down, and a parsed out value that I expect to
be a number but isn't can go undetected for years (literally!).
And, like you, I have also had cases where the specifications
change. Especially in those cases, I will take care to DOCUMENT IN
THE FORM OF COMMENTS what the input is supposed to look like,
including GIVING EXAMPLES of the input. The examples are a
lifesaver! That's because when the specifications DO change, then
I'll have a record of what the OLD specifications looked like, and
have an easier time massaging the new input into the old data
structures.
Here's an example of how I'd document examples:
    # Look for a line that looks like:
    #    MGAP 30N50W ...INCREASING PRESSURE...
    # and store the latitude and longitude in the %location hash:
    if (m/^(\w{4})\s+(\d+[NS])(\d+[EW])\b/)
    {
        $location{$1}{latitude}  = $2;  # $2 looks like "30N"
        $location{$1}{longitude} = $3;  # $3 looks like "50W"
    }
That way, when the next maintainer encounters the code, they won't
have to decipher the regular expression to figure out what it was
meant to do. They may have to verify that the regular expression does
as advertised, but in my experience this is MUCH simpler to do than
deciphering the expression from scratch, as deciphering the regular
expression (with no intent known as to what it's supposed to be
looking for) makes it almost impossible to figure out if it contains a
bug -- because there is no way to verify if its actual behavior is
what the original programmer meant it to be.
In other words, placing comments of expected input gives the
maintainer (which I suspect will probably be you) an easier time of
following the input along the code, and of being able to tell if a
possible bug lies in one area of code (versus another).
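One way a maintainer might do that verification is to feed the documented example line straight to the regular expression. This is only a sketch; the variable $example is an assumption, and the line and pattern are the ones from the comment above.

```perl
use strict;
use warnings;

my %location;

# The example line copied from the comment that documents the input.
my $example = 'MGAP 30N50W ...INCREASING PRESSURE...';

if ($example =~ m/^(\w{4})\s+(\d+[NS])(\d+[EW])\b/) {
    $location{$1}{latitude}  = $2;   # expected to look like "30N"
    $location{$1}{longitude} = $3;   # expected to look like "50W"
}

print "$location{MGAP}{latitude} $location{MGAP}{longitude}\n";
```

If the printed values do not match what the comment promises, either the comment or the expression is wrong, and the maintainer knows where to look.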
One more thing:
Once I read a post by a poster named "A. Sinan Unur" that
recommended declaring variables IN AS SMALL A SCOPE AS POSSIBLE.
That is, instead of declaring all your variables at the top of your
script, declare them inside the smallest block needed. So if a
variable isn't needed OUTSIDE a loop or BETWEEN its iterations, go
ahead and declare it INSIDE the loop brackets.
So instead of code like this (which I've seen before and had to
clean up):
    $level = 0;
    $station = "";
    .
    .
    .
    while (blah)
    {
        # some code here
        # code that uses $level and $station here
        # more code here
        $level = 0;        # re-initializing for the next loop
        $station = $var2;  # re-initializing for the next loop
    }
do this instead:
    while (blah)
    {
        # some code here
        my $level = 0;
        my $station = "";
        # code that uses $level and $station here
        # more code here
    }
Note that the first form had to initialize the variables TWICE (once
at the top, and once at the end of a loop to "prepare" the variables
for the next loop-iteration) but the second form only initialized the
variables ONCE and nothing has to be done about them at the end of the
loop.
So if you follow A. Sinan Unur's advice of declaring your variables
in as small a scope as possible, you'll find that your variables
won't have to be set as often, and they'll be easier to follow (and
you'll avoid a lot of unnecessary bugs).
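A runnable sketch of the second form, with made-up input lines and a made-up line format, shows why the narrow scope avoids bugs: a line that fails to parse cannot inherit values left over from the previous iteration.

```perl
use strict;
use warnings;

# Made-up input: the middle line does not match the expected format.
my @lines = ("KJFK 3", "bad line", "KLAX 7");
my @seen;

for my $line (@lines) {
    # Declared inside the loop, so each iteration starts fresh
    # and nothing needs resetting at the bottom of the loop.
    my $level   = 0;
    my $station = "";

    if ($line =~ /^(\w{4})\s+(\d+)$/) {
        ($station, $level) = ($1, $2);
    }

    push @seen, "$station:$level";
}

print join(", ", @seen), "\n";
```

With the variables declared at the top of the script instead, forgetting the reset at the end of the loop would make the bad line silently reuse the station and level from the line before it.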
(Just so you know, I used to prefer declaring all my variables at
the top of a function, but once I decided to try A. Sinan Unur's
advice I discovered that both my Perl code and my C++ code became much
cleaner and easier to follow! Naturally, now I prefer declaring my
variables in as small a scope as possible. Nowadays I cringe when I
see Perl and C++ with dozens of variables declared at the top --
almost always a handful of those variables are never even used in the
code!)
In conclusion, I recommend you follow these principles, even when
writing and modifying "crisis code":
1. If you don't already, ALWAYS program with "use strict;" and "use
warnings;" and learn to fix the errors they give out.
2. If you're fixing a script someone else has written that doesn't
use "strict" and "warnings", ALWAYS add "use warnings;" and strongly
consider adding "use strict;".
3. ALWAYS handle open(), opendir(), and chdir() statements. I
recommend you use "or die "[error message]: $!\n"", but using "or die
$!;", "or warn;" or any other method of handling a failure is also
acceptable.
4. When massaging input, ALWAYS give examples of the expected input.
The actual input will eventually change, but having examples of
previous expected input will let the future maintainer know which new
input goes into which existing data structures.
5. Get in the habit of declaring your variables in as small a scope
as possible. This will lead to less-polluted namespaces, fewer
variables to keep track of at any one time, and fewer chances of
like-named variables writing over each other's values.
I hope this helps. I've been in your situation before, and I've
found that following all of these suggestions helps immensely.
Take care, CC.
-- Jean-Luc Romano