Re: Wait for background processes to complete



On 2008-01-18 17:47, pgodfrin <pgodfrin@xxxxxxxxx> wrote:
This was a lot of fun. I would like to respond to the various
observations, especially the ones about the Perl Documentation being
misleading.
[...]
To restate the original task I wanted to solve:

To be able to execute commands in the background and wait for their
completion.

The documentation I am referring to is http://perldoc.perl.org/.

If you search on the concept of "background processes" this
documentation points you to the following in the Language reference >
perlipc > Background Process :
<begin quote>
You can run a command in the background with:

system("cmd &");

The command's STDOUT and STDERR (and possibly STDIN, depending on your
shell) will be the same as the parent's. You won't need
to catch SIGCHLD because of the double-fork taking place (see below
for more details).
<end quote>
There is no further reference to a "double fork" (except in the
perlfaq8 and only in the context of zombies, which it says are not an
issue using system("cmd &") ). This is confusing.

I find it more confusing that this seems to be in the section with the
title "Using open() for IPC". I fail to see what one has to do with the
other.

There is a general problem with perl documentation: Perl evolved in the
Unix environment, and a lot of the documentation was written at a time
when the "newbie perl programmer" could be reasonably expected to have
already some programming experience on Unix (in C, most likely) and know
basic Unix concepts like processes, fork(), filehandles, etc. So in a
lot of places the documentation doesn't answer the question "how can I
do X on Unix?" but the question "I already know how to do X in C, now
tell me how I can do it in perl!". When you lern Perl without knowing
Unix first, this can be confusing, because the Perl documentation
generelly explains only Perl, but not Unix.

I am not sure if that should be fixed at all: It's the perl
documentation and not the unix documentation after all, and perl isn't
unix specific, but has been ported to just about every OS.


The documentation for wait(), waitpid() and fork() do not explain that
the executing code should be placed in the "child" section of the if
construct.

Of course not, that would be wrong. There can be code in both (if you
wanted one process to do nothing, why fork?). What the parent should do
and what the child should do depend on what you want them to do. It just
happened that for your particular problem the parent had nothing to do
between forking off children.

Some of the examples in perlipc show code occurring in both
the parent and the child section - so it is still not clear. If it
were, why was I insisting on trying to execute the "cp" command in the
parent section?

I don't know. What did you expect that would do?

Your problem was:

I want a process to gather a list of files. Then, for each file, it
should start another process which copies the file. These processes
should run in parallel. Finally, it should wait for all these processes
to terminate, and then terminate itself.

Even this description makes it rather implicit, that the original
process creates children and then the children do the copying. If you
add the restriction that a process can only wait for its children, any
other solution becomes extremely awkward.

Besides it is exactly the same what your shell script did: The shell
(the parent process) created child processes in a loop, each child
process executed one cp command, and finally the parent waited for all
the children.


Furthermore, while the perlipc section is quite verbose, nowhere is
the code snippet do {$x=wait; print "$x\n"} until $x==-1 or any
variation of that wait() call mentioned.

Again, this is a solution to your specific problem. You said you wanted
to wait for all children, so Xho wrote a loop which would do that. This
is a rather rare requirement - I think I needed it maybe a handful of
times in 20+ years of programming outside of a child reaper function.


There are references to a
while loop and the waitpid() function, but being in the context of a
signal handler and 'reaping' - it is not clear.

This is the one situation where this construct is frequently used. It
can happen that several children die before the parent can react to the
signal - in this case the reaper function will be called only once, but
it must wait for several children. This isn't obvious, so the docs
mention it.

However, in your case you *know* you started several children and you
*want* to wait for all of them, so it's obvious that you need a loop. This
hasn't anything to do with perl, it's just a requirement of your
problem.


So, once again many thanks for the help. I would like to know, Peter,
are you in a position to amend the documentation? Also, the perlfaq8
does come closer to explaining, but it is simply not clear how the
fork process will emulate the Unix background operator (&). So how can
we make this better? How's about this:

fork doesn't "emulate &". fork is one of the functions used by the shell
to execute commands. Basically, the shell does this in a loop:

display the prompt
read the next command
fork
in the child:
execute the command, then exit
in the parent:
if the command did NOT end with &:
wait for the child to terminate

So the Shell always forks before executing commands and it always
executes them in a child process. But normally it waits for the child
process to terminate before displaying the next prompt. If you add the
"&" at the end of the line, it doesn't wait but displays the next prompt
immediately.

(ok, that's a bit too simplistic, but I don't want to go into the arcana
of shell internals here - that would be quite off-topic).



In the perlipc, under background tasks. make the statement - "the call
system("cmd &") will run the cmd in the background, but will spawn a
shell to execute the command, dispatch the command, terminate the
shell and return to the calling program. The calling program will lose
the ability to wait for the process as is customary with the shell
script command 'wait'. In order to both execute one or more programs
in the background and have the calling program wait for the executions
to complete it will be necessary to use the fork() function."

I think that would confuse just about anybody who doesn't have exactly
your problem for exactly the same reason you were confused by the
"double fork". It's very specific and should be clear to anyone who
knows what system does and what a unix shell does when you use "&" - I
suspect that sentence about the double fork was added after a discussion
like this one. But I agree that fork should definitely be mentioned here.
system("cmd &") almost never what you would want to do.


Then in the fork section. Show a simple example like the ones we have
been working with AND show a simple approach to using the wait
function.
Furthermore - add sample code to the wait() and fork()
functions that are simple and realistic,

fork and wait are very simple building blocks which can be combined in a
lot of different ways. Any sample code here can cover only basic
constructs like

my $pid = fork();
if (!defined $pid) {
die "fork failed: $!";
} elsif ($pid == 0) {
# child does something here
# ...
exit;
} else {
# parent does something here
# ...
# and then waits for child:
waitpid($pid, 0);
}

More complex examples should go into perlipc (which needs serious
cleanup, IMHO).

unlike the code same in the waitpid() function.

That code is simple and realistic.


In closing, it is perhaps non-intuitive to me that a fork process
should have the child section actually executing the code,

If you don't want the child process to do anything, why would you create
it in the first place?

but I ask you how one can intuit that from the sample in the camel
book and the samples in the http://perldoc.perl.org/. To really drive
the point home, Xho's code:

fork or exec("cp $_ $_.old") ;
do {$x=wait;} until $x==-1 ;

Is STILL not intuitive that the child is executing the code!

True. This code isn't meant to be intuitive. It's meant to be short.
I wouldn't write that in production code, much less in an answer to a
newbie question.

hp

.