Re: question about forked processes writing to the same file




Gunnar Hjalmarsson wrote:
> it_says_BALLS_on_your forehead wrote:
> > i think things are going well with the flock. slowed things down a
> > bit, but what can you do? :-).
>
> I have to ask: Did flock() actually make a difference (besides the
> execution time)?
>
> --
> Gunnar Hjalmarsson
> Email: http://www.gunnar.cc/cgi-bin/contact.pl

previously there would be web log records that had bits of another web
log record spliced in the middle of them.

after loading into a table, many records (many more than normal) were
thrown into an error table. after adding the flock, the number of error
records seems to be back to normal.

it's difficult to look at the processed flat file that was produced to
check for these splices--they appear sporadically, and may be
infrequent relative to the size of the data set (50 million), but once
it was loaded into a table, we could query against it and see if things
turned out ok.
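
A rough way to scan the flat file itself for splices, sketched in Python rather than Perl. The delimiter and the expected field count are assumptions, since the thread does not describe the record layout, and `count_suspect_lines` is a hypothetical helper name. A record with part of another record spliced into it will usually end up with the wrong number of fields, so counting those lines gives a quick estimate before loading.

```python
def count_suspect_lines(path, delimiter="\t", expected_fields=10):
    """Return (total, suspects); a suspect line has an unexpected field count.

    delimiter and expected_fields are assumptions about the log format.
    """
    total = 0
    suspects = 0
    # errors="replace" keeps the scan going if a splice broke a multibyte character
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            total += 1
            fields = line.rstrip("\n").split(delimiter)
            if len(fields) != expected_fields:
                suspects += 1
    return total, suspects
```

This only catches splices that change the field count, so it complements the error table rather than replacing it.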

we re-ran, and examined the error table in real time up until 25 million
records were loaded, and the number of errors seemed normal. it's
possible that the last 25 million records could have many, many errors,
but unlikely.

also, when i used the non-flocking method, we got high error counts for
particular servers (some high volume ones), and since i'm parsing these
files in order of decreasing size, i know that these will be in
the 1st half of the total number of records. after the flocking, the
biggest files were still processed first, so when we looked at the 1st
25 million records, most of the big servers were done, so we know that
they, at least, had far fewer errors. so we think this solved the
problem...

still, i have the code written in such a way that i'm opening, locking,
writing to, and closing a file handle for every record. i wish there
were a way to lock and release without having to open/close every
time. i don't know if that would save much time or not. maybe the extra
time is not from the extra opening/closing, but from the locking and
having to wait to write. i will do more research and share if i find a
solution.
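
A minimal sketch of locking and releasing per record while keeping one handle open for the whole run, written in Python rather than Perl; `append_records` is a hypothetical name, not code from this thread. It assumes each forked worker opens the file itself after the fork: flock locks belong to the open file description, so a handle opened before the fork and inherited by the children would not make them exclude one another. In Perl the same shape would be a single open in append mode, then flock with LOCK_EX and LOCK_UN from Fcntl around each print.

```python
import fcntl
import os

def append_records(path, records):
    """Append each record under an exclusive flock, opening the file only once."""
    # Call this inside each worker, after the fork, so every process gets
    # its own open file description (and therefore its own lock).
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        for record in records:
            data = record.encode("utf-8")
            fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until no other process holds the lock
            try:
                # os.write is unbuffered; loop in case of a short write
                while data:
                    written = os.write(fd, data)
                    data = data[written:]
            finally:
                fcntl.flock(fd, fcntl.LOCK_UN)  # release, but keep the descriptor open
    finally:
        os.close(fd)
```

Whether this is faster than open/lock/write/close per record depends on where the time goes. If most of it is spent waiting for the lock, the saving will be small, and writing several records per lock acquisition would matter more.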
