Re: trying to understand fork and wait

From: Ben Morrow (usenet_at_morrow.me.uk)
Date: 11/21/03


Date: Fri, 21 Nov 2003 01:26:22 +0000 (UTC)

jguad98@hotmail.com (John) wrote:
> Ben Morrow <usenet@morrow.me.uk> wrote in message
> news:<bpj1t3$7te$1@wisteria.csv.warwick.ac.uk>...
> > You do need
> >
> > use warnings;
> > use strict;
>
> I understand that 'use strict' forces 'good programming' but I don't
> know where to find more thorough explanation of the precise 'good
> programming' structures that 'strict' is enforcing ... i.e., I think
> it would be nice to know to some degree what is legal and illegal
> when using 'strict' before I write & execute (even though my
> tendency is usually to write first and debug later. :-)

perldoc strict
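
As a rough taste of what that page covers, here is a small sketch of the three checks (vars, refs, subs). The variable names are made up for illustration, and the offending code is wrapped in eval so that the sketch itself still runs.

```perl
use strict;
use warnings;

# strict 'vars': an undeclared (here, misspelled) variable is a compile-time
# error. String eval is used so that this script itself still compiles.
my $vars_ok = eval q{ my $count = 1; $cuont = 2; 1 };
my $vars_error = $@;

# strict 'refs': using a plain string as if it were a reference dies at run time.
my $name = 'count';
my $refs_ok = eval { my $value = $$name; 1 };
my $refs_error = $@;

# strict 'subs': a bareword that is not a known sub is a compile-time error.
my $subs_ok = eval q{ my $word = bareword; 1 };
my $subs_error = $@;

print "vars: $vars_error";
print "refs: $refs_error";
print "subs: $subs_error";
```

Each of the three messages names the check that tripped, which is usually enough to find the offending line.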

> > > $mypid=$$;
> >
> > my $mypid = $$;
> >
> > and so on throughout.
>
> is this to say I should use whitespace in or around variable
> assignments?
> that would cost me 2 more keystrokes! (j/k)

:) no, no, the important addition is the 'my'.

> > > open(PATTERNS,$patternfile);
> >
> > open my $PATTERNS, $patternfile or die "can't open pattern file: $!";
>
> is the use/lack of use of parentheses here a personal style issue,
> or is there a reason for this choice?

Sorry, personal style. It seems to be generally preferred in the Perl
community... but really doesn't matter. The important thing is the 'or
die'.
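
One place where the spelling does matter is 'or' versus '||' when the parentheses are left off. A minimal sketch, assuming the (hypothetical) path below does not exist:

```perl
use strict;
use warnings;

my $missing = '/nonexistent/patterns.txt';   # hypothetical path, assumed absent

# Low-precedence 'or': parsed as (open ...) or die ..., so the failure is caught.
my $or_died = !eval {
    open my $fh, '<', $missing or die "can't open pattern file: $!\n";
    1;
};

# High-precedence '||': parsed as open(..., ($missing || die ...)), so die
# only runs if $missing itself is false, and a failed open goes unnoticed.
my $pipes_died = !eval {
    open my $fh, '<', $missing || die "can't open pattern file: $!\n";
    1;
};

print "with 'or': ", ($or_died    ? "died" : "carried on"), "\n";
print "with '||': ", ($pipes_died ? "died" : "carried on"), "\n";
```

If you prefer '||', put the parentheses back: open(...) || die ... behaves like the 'or' form.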

> > > foreach $domain (@domainlist) {
> >
> > for my $domain (@domainlist) {
>
> why "for" and not "foreach" ?

Again, simply style... 'for' and 'foreach' are precise synonyms, so I
prefer to save typing :).

> > > $return=`grep 'known end of session message' $logfile`;
<snip>
> > > if (! $return) {# grep failed, ergo file is active
> >
> > grep exits with 0 if it succeeds, so this test is precisely backwards
> > :).
>
> hmmm... variable $return is being instantiated with output from `grep`
> (note backticks?) so I would expect that if grep fails, there is no
> output, so $return will be empty.

Sorry, yes, brain not switched on... I would have used

  my $return = system "grep 'known end' $logfile";

which returns the wait status of 'grep': 0 if it found a match,
non-zero otherwise (the exit code proper is $return >> 8: 1 for no
match, 2 for an error), so I assumed you had :).
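
Strictly speaking, what system hands back is the wait status, with the child's exit code in the high byte. A small sketch of unpacking it; to avoid assuming anything about grep or a logfile, it runs the current perl ($^X) as a stand-in child that exits with 1:

```perl
use strict;
use warnings;

# Run a child that exits with status 1, standing in for a grep that found
# nothing. $^X is the path of the running perl, so no external tool is assumed.
my $status = system $^X, '-e', 'exit 1';

my $exit_code = $status >> 8;    # high byte: the child's exit code
my $signal    = $status & 127;   # low bits: signal that killed it, if any

print "system returned $status, exit code $exit_code, signal $signal\n";
```

Either way, a test like 'if ($return == 0)' still means 'grep found the string'.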

> Is there a better way to test for the "EOS" string in $logfile?

I would always open the file and grep it in Perl, rather than forking
an external process. This can be as simple as: <untested>

  open my $FILE, '<', $file or die "horribly: $!";
  my @file = <$FILE>;
  my $live = not grep /end of session marker/, @file;
  close $FILE;
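
A slightly fuller sketch of the same idea, reading line by line and stopping at the first match instead of slurping the whole file. The helper name log_is_live and the two throwaway logs are assumptions made up for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: true if no line of the file matches the marker.
sub log_is_live {
    my ($path, $marker) = @_;
    open my $fh, '<', $path or die "can't open $path: $!";
    while (my $line = <$fh>) {
        return 0 if $line =~ $marker;   # marker seen: session has finished
    }
    return 1;                           # marker never seen: still active
}

# Two throwaway logs to try it on.
my ($live_fh, $live_log) = tempfile(UNLINK => 1);
print {$live_fh} "login\nsome activity\n";
close $live_fh;

my ($done_fh, $done_log) = tempfile(UNLINK => 1);
print {$done_fh} "login\nend of session marker\n";
close $done_fh;

my $marker = qr/end of session marker/;
printf "log without marker is live: %d\n", log_is_live($live_log, $marker);
printf "log with marker is live:    %d\n", log_is_live($done_log, $marker);
```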

[I've re-ordered the text below as I think (hope:) it makes things clearer]

> > > $SIG{'CHLD'} = sub { wait(); };
> >
> > When a process exits, it returns a status code (like grep exited with
> > 0 or 1 above). The parent process can collect this by calling wait or
> > waitpid. Until it does, the process has to sit around occupying a slot
> > in the process table, just to keep a hold of the exit code.

> hmmm ... okay, I understand that I need to reap(?)

To 'reap' a child process is to call 'wait' or 'waitpid' on it. The
analogy is with the Grim Reaper collecting up dead men's souls.

> my children to prevent zombies.

A zombie is one of these processes that has died but is still hanging
around to return its exit code.

> But I am still hazy on the mechanism and its effects on the rest of
> the program. As you see, the main body is designed to loop around
> looking for user logs to read ... if at some loop iteration I find
> an active log and fork the child, I need some tool to wait on the
> child I just spawned.

> The thing that has been bothering me is this: if at loop iteration
> 34 I spawn a child and issue "wait", will my main program continue
> on to loop iteration 35, or does it pause there on loop 34 until the
> child exits to satisfy the wait function?

When you call wait, your process stops until a child dies. So in your
program, the main parent loop never wants to call wait: that would
stop the loop, which isn't what you want.
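
To see the blocking for yourself, here is a minimal sketch: the child just sleeps for half a second (a stand-in for real work), and the parent times how long its wait takes.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

my $kidpid = fork;
die "can't fork: $!" unless defined $kidpid;

if ($kidpid == 0) {      # child: pretend to work for a while, then exit
    sleep 0.5;
    exit 0;
}

my $start  = time;
my $reaped = wait;       # parent blocks here until the child has exited
my $waited = time - $start;

printf "wait returned pid %d after %.1f seconds\n", $reaped, $waited;
```

The parent does nothing at all during that half second, which is exactly what you don't want in your main loop.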

This is what the CHLD signal is for. Whenever a child process dies,
the parent process is sent SIGCHLD, which means that perl will stop
whatever it is doing and jump (asynchronously) to the piece of code
you put in $SIG{CHLD}. If you put the wait in there, then your parent
process will never try to wait for a child until one has just died,
which means that it'll never hang around waiting for one to die.
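
A minimal sketch of that arrangement with a single child. The short select calls stand in for the parent's real work, and the iteration cap is only a safety net for the example:

```perl
use strict;
use warnings;

my $reaped = 0;
$SIG{CHLD} = sub {
    my $pid = wait;                 # returns at once: a child has just died
    $reaped = $pid if $pid > 0;
};

my $kidpid = fork;
die "can't fork: $!" unless defined $kidpid;
if ($kidpid == 0) {                 # child: exit shortly after starting
    select undef, undef, undef, 0.2;
    exit 0;
}

# Parent: keep looping; the handler runs whenever the child dies.
my $iterations = 0;
until ($reaped or $iterations >= 100) {   # the cap is only a safety net
    $iterations++;
    select undef, undef, undef, 0.05;     # stand-in for real work
}
$SIG{CHLD} = 'DEFAULT';

print "looped $iterations times before child $reaped was reaped\n";
```

The parent never blocks in wait: it only calls it from inside the handler, when there is known to be something to collect.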

One small caveat is that if several children die 'at the same time',
i.e. another child dies before the parent gets a chance to handle the
SIGCHLD from the first, you will only receive one signal for the
lot. So a CHLD handler has to keep wait()ing for dead children until
they're all gone, which is what

> > the discussion under 'Signals' in perldoc perlipc.

is about. However,

> > The easy way to solve this is $SIG{CHLD} = 'IGNORE';, which says you
> > don't care about exit codes.

...i.e. all children will be automatically reaped as soon as they die,
and you don't need to worry about it any more :).
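
A small sketch of what that looks like from the parent's side. The behaviour is system-dependent; the assumption here is a system such as Linux that reaps ignored children automatically, in which case there is nothing left for waitpid to collect:

```perl
use strict;
use warnings;

$SIG{CHLD} = 'IGNORE';     # ask the system to reap children for us

my $kidpid = fork;
die "can't fork: $!" unless defined $kidpid;
exit 0 if $kidpid == 0;    # child: exit immediately

# Assumption: on systems that honour IGNORE this way (Linux, for one), the
# child is reaped automatically, so once it is gone there is no status left
# to collect and waitpid reports -1.
my $result = waitpid $kidpid, 0;

print "waitpid returned $result\n";
```

The price is that you can no longer find out how a child exited, which is fine if you never look at the exit code anyway.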

> > # From this point on we have two almost identical copies of the
> > # program running. The only difference is that one has $kidpid set to
> > # 0, the other to some positive number.
> >
> > if($kidpid) {
> > # parent
> > } else {
> > # child
> > }
> >
>
> My intent was to aid my own newbie understanding by explicitly testing
> 'kidpid == 0' as opposed to the implicit "exists" test indicated above
> in order to identify the child process.

This is not an 'exists' test. This is a test of truth: a value is
false in Perl if it is undef, 0, "0", or ""; true otherwise. So

  if($kidpid != 0)

is equivalent to

  if($kidpid)

provided $kidpid is always going to be numeric.
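
If you want to check the truth rules for yourself, a quick sketch (the labels and the sample pid are just illustrative values):

```perl
use strict;
use warnings;

# Each entry is [ label, value to test ].
my @candidates = (
    [ 'undef'         => undef ],
    [ '0'             => 0     ],
    [ '""'            => ''    ],
    [ '"0"'           => '0'   ],
    [ '"0.0"'         => '0.0' ],
    [ '"00"'          => '00'  ],
    [ '" "'           => ' '   ],
    [ '12345 (a pid)' => 12345 ],
);

my %truth;
for my $case (@candidates) {
    my ($label, $value) = @$case;
    $truth{$label} = $value ? 'true' : 'false';
    printf "%-14s is %s\n", $label, $truth{$label};
}
```

One thing neither form of the test distinguishes is a failed fork, which returns undef; checking 'defined $kidpid' first covers that case.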

> I want to ensure that the children do not step on each other (i.e.,
> no more than one child per userlog),

This is the tricky part: your business with grepping the
logfile *should* ensure that is the case. If it isn't reliable (or
even if it is), you may prefer to keep a hash in the parent program of
the files you have created processes for. So you want

  use POSIX qw/:sys_wait_h/; # for WNOHANG, which says 'don't wait if
                             # we've run out of dead children'
  my (%pids, %files); # the brackets matter: without them only %pids is declared

  $SIG{CHLD} = sub {
      while( (my $dead = waitpid -1, WNOHANG) > 0 ) {
          delete $pids{$files{$dead}}; # remove the dead child from
          delete $files{$dead}; # our records
      }
  };

at the top (if your system uses SysV signal semantics, see the
aforementioned section of perlipc). Then replace the 'unless' below with

  if($kidpid) { # parent
      $pids{$file} = $kidpid; # record the new child
      $files{$kidpid} = $file;
  }
  else { # child

and then test $pids{$file} to see if you have a child reading that
file. The reason for the two hashes is that we need to go both ways:
from a pid to a file, and from a file to a pid.
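
Put together, the bookkeeping might look something like the sketch below. The log names are hypothetical and the child just sleeps instead of following a log. It also glosses over one race: a child that dies before the parent has recorded it would leave a stale entry behind.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my (%pids, %files);    # file => pid, and pid => file

$SIG{CHLD} = sub {
    while ((my $dead = waitpid -1, WNOHANG) > 0) {
        delete $pids{ $files{$dead} };   # remove the dead child from
        delete $files{$dead};            # our records
    }
};

my @logs    = ('alice.log', 'bob.log', 'alice.log');   # hypothetical names
my $started = 0;

for my $file (@logs) {
    next if $pids{$file};          # a child is already reading this file

    my $kidpid = fork;
    die "can't fork: $!" unless defined $kidpid;

    if ($kidpid) {                 # parent: record the new child
        $pids{$file}    = $kidpid;
        $files{$kidpid} = $file;
        $started++;
    }
    else {                         # child: stand-in for following the log
        sleep 1;
        exit 0;
    }
}

# Idle (without calling wait ourselves) until the handler has cleared the
# records; the retry cap is only a safety net for the example.
my $tries = 0;
sleep 1 while %pids and $tries++ < 10;
$SIG{CHLD} = 'DEFAULT';

print "started $started children; ", scalar(keys %pids), " still recorded\n";
```

The second 'alice.log' is skipped because $pids{'alice.log'} is still set when the loop reaches it.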

> and I want to ensure that the one original parent image is the only
> one to spawn children (hence the little quip about no
> grandchildren).

Yes: here, the child process can never escape the 'unless' block
(because of the exit at the end) so it will never fork again.

> > unless($kidpid) { # if we are the parent, just loop back
>
> so if $kidpid is > 0 we are the parent ... the "unless kidpid" will
> fail if "$kidpid ge 0", am I getting that correct?

Yes, except that you mean $kidpid > 0: ge in Perl is string
comparison, the opposite of shell.
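
A two-line illustration of why the difference matters, using a made-up pid of 10:

```perl
use strict;
use warnings;

my $kidpid = 10;    # illustrative value only

my $numeric = ($kidpid >= 9) ? 'true' : 'false';   # compares 10 and 9 as numbers
my $string  = ($kidpid ge 9) ? 'true' : 'false';   # compares "10" and "9" as strings

print "10 >= 9 is $numeric\n";
print "10 ge 9 is $string\n";
```

As strings, "10" sorts before "9" because "1" comes before "9", so the 'ge' test fails.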

> > open my $LOG, $file or die "can't open logfile $file: $!";
> >
> > while ( (my $line = <LOG>) !~ /known EOS/ ) {

Sorry, typo: ^^^^^ <$LOG>

> is that "my $line = <LOG>" an implicit loop?

No. The loop is the 'while'. The stuff inside the brackets of the
while is equivalent to

  my $line; # declare $line
  $line = <$LOG>; # read the next line from $LOG into $line
  $line !~ /known EOS/; # this is true iff the line doesn't match

so it will keep reading a line at a time into $line until it hits one
which matches.
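
Spelled out as a runnable sketch, with an in-memory string standing in for the logfile. The 'defined' check is an addition of mine so that the example also stops at end of file; the one-line version above doesn't have it.

```perl
use strict;
use warnings;

# An in-memory "logfile" (made-up contents) so the sketch needs no real file.
my $text = "user logged in\nran a command\nknown EOS\nnever read\n";
open my $LOG, '<', \$text or die "can't open in-memory log: $!";

my @seen;
while (1) {
    my $line = <$LOG>;                  # read the next line from $LOG
    last unless defined $line;          # added here: stop at end of file
    last unless $line !~ /known EOS/;   # stop once a line matches
    push @seen, $line;
}
close $LOG;

print "read ", scalar @seen, " lines before the marker\n";
```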

> > my $now = localtime;
> >
> > system "logger $now $domain $user $line"
> > for grep { $line =~ /$_/ } @patternlist;
> > # or you would probably be better off using Sys::Syslog
> > }
>
> setting $now on each line of LOG is unnecessary for my purposes

True... you could do

  my $now; # declared first: a 'my' isn't visible until the next statement
  $now = localtime, system "..." for grep {...} @patternlist;

if you'd rather, or

  if( grep {...} @patternlist ) {
      my $now = localtime;
      system "...";
  }

which may be better anyway as the others will make multiple log
entries if more than one pattern matches.
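
A quick sketch of that difference, collecting the would-be log entries in arrays instead of calling logger. The patterns and the log line are made up:

```perl
use strict;
use warnings;

my @patternlist = ('failed', 'login');       # hypothetical patterns
my $line        = "failed login for root";   # hypothetical log line

# 'for grep': one entry per matching pattern.
my @per_pattern;
push @per_pattern, "entry: $line" for grep { $line =~ /$_/ } @patternlist;

# 'if grep': grep in boolean context counts the matches, so one entry at most.
my @per_line;
if (grep { $line =~ /$_/ } @patternlist) {
    my $now = localtime;
    push @per_line, "$now $line";
}

print scalar(@per_pattern), " entries with 'for grep', ",
      scalar(@per_line), " with 'if grep'\n";
```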

> > > exit 0;
> >
> > No need for this: 'falling off the end' is a perfectly valid way to
> > end a Perl program.
>
> the explicit exit, along with a bit more flower-boxing is a habit I
> developed to ensure that I know where the intended bottom of the
> script is ... I've had bad experiences with cut-n-paste'd code and
> incomplete copies where a hunk of a script has gone missing without
> me realizing it.

The standard way to indicate the end of a Perl script is __END__,
which perl itself will also understand. It makes anything after that
available as the DATA filehandle, so if you really want you can check
it's there at the top of the script with

  die "End of script missing!" unless *DATA{IO};

DATA and the *foo{THING} notation are described in perldoc perldata,
or *foo{THING} rather better in the Camel, chapter 8, the section
entitled 'Symbol table references'. Or you can just treat it as black
magic :).

> > See wait(2).
>
> Is "wait(2)" different from "wait"?

wait(2) is a standard way to refer to the manpage 'wait' in section 2
(which deals with syscalls). Read it with 'man 2 wait'. The reason I
wrote wait(2) was to show I specifically meant your system's manpage,
rather than the perl documentation.

Ben

-- 
'Deserve [death]? I daresay he did. Many live that deserve death. And some die
that deserve life. Can you give it to them? Then do not be too eager to deal
out death in judgement. For even the very wise cannot see all ends.'
 :-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-: ben@morrow.me.uk