Re: perl multithreading performance
- From: "Peter J. Holzer" <hjp-usenet2@xxxxxx>
- Date: Fri, 29 Aug 2008 21:28:40 +0200
On 2008-08-28 17:49, Leon Timmermans <fawaka@xxxxxxxxx> wrote:
> On Wed, 27 Aug 2008 14:25:32 -0700, dniq00 wrote:
>> Thanks for the link - trying to figure out whattahellisgoingon there :)
>> Looks like he basically mmaps the input and begins reading it starting
>> at different points. Thing is, I'm using <> as input, which can contain
>> hundreds of gigabytes of data, so I'm not sure how that's going to work
>> out...
> Is your computer 64 or 32 bits? In the former case mmap will work for
> such large files, but in the latter it won't.
Assuming <> actually refers to a single file (if it doesn't, you
can just process several files in parallel), the same approach can be
used even without mmap:
Fork $num_cpu worker processes, numbered $i = 0 .. $num_cpu-1. With
$length being the size of the file, let each process seek to position
$i * $length / $num_cpu and search for the start of the next line
(process 0 simply starts at the beginning of the file). Then process
lines until you get to position ($i+1) * $length / $num_cpu; a line
that starts before that position is finished by the same process.
Finally, report the result to the parent process and let it aggregate
the results.
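The steps above can be sketched roughly as follows. This is only an
illustration under some assumptions, not tested against real data of
that size: the "processing" is simply counting lines, each worker
reports its result over a pipe, and the names process_chunk and
parallel_count are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: handle every line that STARTS in [$start, $end).
# Counting lines stands in for the real per-line work.
sub process_chunk {
    my ($file, $start, $end) = @_;
    open my $fh, '<', $file or die "open $file: $!";
    if ($start > 0) {
        # Step back one byte and discard up to the next newline. If the
        # byte before $start is a newline, a line starts exactly at
        # $start and is kept; otherwise the partial line belongs to the
        # previous worker.
        seek $fh, $start - 1, 0 or die "seek: $!";
        <$fh>;
    }
    my $count = 0;
    while (tell($fh) < $end) {
        defined(my $line = <$fh>) or last;
        $count++;    # replace with the real per-line processing
    }
    close $fh;
    return $count;
}

# Hypothetical driver: fork $num_cpu workers and sum their results.
sub parallel_count {
    my ($file, $num_cpu) = @_;
    my $length = (stat $file)[7];
    die "stat $file: $!" unless defined $length;

    my @readers;
    for my $i (0 .. $num_cpu - 1) {
        my $start = int($i * $length / $num_cpu);
        my $end   = int(($i + 1) * $length / $num_cpu);
        pipe(my $r, my $w) or die "pipe: $!";
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: do the work, write one line with the result, leave.
            close $r;
            print {$w} process_chunk($file, $start, $end), "\n";
            close $w;
            POSIX::_exit(0);    # skip END blocks inherited from the parent
        }
        close $w;               # parent keeps only the read end
        push @readers, $r;
    }

    # Parent: aggregate the results, then reap the children.
    my $total = 0;
    for my $r (@readers) {
        my $n = <$r>;
        $total += $n if defined $n;
        close $r;
    }
    1 while wait() != -1;
    return $total;
}

print parallel_count($ARGV[0], $ARGV[1] || 4), "\n" if @ARGV;
```

Seeking to $start - 1 instead of $start is a design choice to make the
split exact: every line is handled by the one worker in whose range it
starts, even if a line happens to begin exactly on a chunk boundary.
For a result more complex than a single number, the workers would have
to serialize it in some agreed format before writing it to the pipe.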
hp