Re: question about forked processes writing to the same file
- From: "it_says_BALLS_on_your forehead" <simon.chao@xxxxxxx>
- Date: 22 Oct 2005 16:48:03 -0700
Gunnar Hjalmarsson wrote:
> it_says_BALLS_on_your forehead wrote:
> > i think things are going well with the flock. slowed things down a
> > bit, but what can you do? :-).
>
> I have to ask: Did flock() actually make a difference (besides the
> execution time)?
>
> --
> Gunnar Hjalmarsson
> Email: http://www.gunnar.cc/cgi-bin/contact.pl
previously there would be web log records that had bits of another web
log record spliced in the middle of them.
after loading into a table, many records (far more than normal) were
thrown into an error table. after the flocking, we seem to have reduced
the number of error records to normal.
it's difficult to look at the processed flat file that was produced to
check for these splices--they appear sporadically, and may be
infrequent relative to the size of the data set (50 million), but once
it was loaded into a table, we could query against it and see if things
turned out ok.
we re-ran, and examined the error table in real time until 25 million
records were loaded, and the number of errors seemed normal. it's
possible that the last 25 million records could have many errors,
but that seems unlikely.
also, when i used the non-flocking method, we got high error counts for
particular servers (some high volume ones), and since i'm parsing these
files in reverse-sorted order by size (largest first), i know that these
will be in the 1st half of the total number of records. after the
flocking, the biggest files were still processed first, so when we
looked at the 1st 25 million records, most of the big servers were done,
so we know that they, at least, had far fewer errors. so we think this
solved the problem...
still, i have the code written in such a way that i'm opening, locking,
writing to, and closing a file handle for every record. i wish there
were a way to lock and release without having to open/close every
time. i don't know if that would save much time or not. maybe the extra
time is not from the extra opening/closing, but from the locking and
having to wait to write. i will do more research and share if i find a
solution.
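
one possible way to lock and release without reopening the file for
every record is sketched below. the code in this thread is not shown, so
this is only an illustration in Python, whose fcntl.flock normally wraps
the same flock(2) call; it is Unix-only. the names append_record and
run_workers, the record layout, and the worker counts are all made up
for the example. the two details that seem to matter are that each child
opens the file itself after the fork (a handle inherited from the parent
would share one lock among all children), and that buffered output is
flushed before the lock is released.

```python
import fcntl
import os


def append_record(fh, record):
    """Append one record under an exclusive advisory lock.

    fh is a handle opened in append mode that stays open between calls.
    """
    fcntl.flock(fh, fcntl.LOCK_EX)       # wait until no other process holds the lock
    try:
        fh.write(record)
        fh.flush()                       # hand buffered bytes to the OS before unlocking
    finally:
        fcntl.flock(fh, fcntl.LOCK_UN)   # release the lock, keep the handle open


def run_workers(path, n_workers, n_records):
    """Fork n_workers children; each opens path once and appends n_records lines."""
    pids = []
    for w in range(n_workers):
        pid = os.fork()
        if pid == 0:                     # child process
            status = 1
            try:
                # opened after the fork, so every child has its own handle and lock
                with open(path, "a") as fh:
                    for i in range(n_records):
                        # hypothetical record layout, padded to resemble a log line
                        append_record(fh, f"worker={w} record={i} " + "x" * 200 + "\n")
                status = 0
            finally:
                os._exit(status)         # never fall through into the parent's code
        pids.append(pid)
    for pid in pids:
        os.waitpid(pid, 0)               # parent waits for every child to finish
```

timing a run of this against the open/lock/write/close-per-record
version on the same input would be one way to tell whether the slowdown
comes from the extra opens and closes or from waiting on the lock.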