Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- From: mike3 <mike4ty4@xxxxxxxxx>
- Date: Sat, 03 Nov 2007 01:19:36 -0700
On Oct 31, 2:50 pm, rem6...@xxxxxxxxx (Robert Maas, see http://tinyurl.com/uh3t)
wrote:
Alert: For weeks, right up through Friday aftermoon, Google Groups
was showing search results only through Oct.13, so I had no way to
find more recent articles that mentionned my name, such as
followups to stuff I previously posted. Just Friday night GG
finally enabled me to find:> Date: Wed, 24 Oct 2007 14:35:00 -0700
From: mike3 <mike4...@xxxxxxxxx>
Since Friday I've been trying to catch up with backlog.
I'm less than a week behind at the moment.
I was doing the merge completely on disk.
Then I'm confused. From what you said elsewhere, it sounded like
you had enough RAM to do the whole task in RAM. Just load original
data sequentially from disk, do FFT in RAM, sequentially dump FFT
result to disk. So why would you do it any other way?
Testing. Trying to see how far the envelope can be pushed,
you know, that type of stuff. You know, like the guys pushing
to bust the sound barrier, pushing to drive a racecar at 400mph,
overclocking a CPU up to 5GHZ, etc. You think I'd stop at 128
megs of pi? :snicker: How about 512? :)
I have two drives. The matrix data is stored on one. I can't use
the second drive since it's not formatted for Linux use.
If your one usable drive has only a single seek mechanism (a single
set of heads that move all as a single unit), then you don't want
to do disk-to-disk merge (unless the whole dataset and all
temporary files reside in a single cylinder), because even if you
are reading each input file sequentially, and writing the output
file sequentially, you are jumping back and forth between the
various cylinders that are active for your current read/write
points in those three files at any one time.
Do you have a disk partition that you can completely empty of all
files then put your dataset there, to assure that it occupies
contiguous cylinders from the start of the partition, not
intermixed with any old files still scattered around that
partition?
These multidisk techniques seem odd in the light of the fact that
programs like PiFast designed from the ground up for high-efficiency
computation of pi, can do a pretty quick calculation using only the
one hard drive. I once tried doing an 800M pi run with PiFast when I
had only 512 MB of total RAM (of which a goodly chunk was also
given to the program for it's use), and it did not access the disk
anywhere near as nastily.
what about the large merge passes at the end where the size of
the data streams being merged exceeds the size of the ram buffers?
(Assuming you have at least three separate drives, preferably four:)
When you are merging, logically you need only one record from each
input stream in RAM, and then the smallest record gets written to
output stream and a new input record from that input stream is read
in to replace it. If records are smaller than disk blocks, then
you read a whole disk block into RAM, extract records from it until
it's exhausted, then replace that disk block in RAM from the next
block on the disk. For overlapped I/O, you use double buffering,
where you have the currently-being-processed input block from each
input file in RAM, and also you have a spare input block in RAM
being loaded in background from the next disk block so it'll be all
finished loading and sitting in RAM by the time you are ready to
start extracting records from it. So you need enough RAM for six
disk blocks (2+2 input and 2 output) plus two logical records plus
pointers/indexes/etc.
I've only got two hard drives, though. So I have to figure out
THE most efficient way (or at least half of that way's efficiency)
to calculate pi using that. I'd love to have access to the source
code for world-class pi calculators like PiFast and QuickPi so I
could see how they do what they do. Too bad it isn't available... :(
Oh, by the way, is this how they often do this type of stuff
on supercomputing farms where you've often got multiple
dedicated drives in each node? Although I'd bet those nodes
have got like gigs and gigs of ram in them, however when
you've filled THAT up... Man, with 4 drives I could use
similar strategies for the arithmetic! when doing like
an addition/subtraction I could store the two numbers to
be added/subtracted on 2 drives and use the 3rd to store
the sum, then just juggle the numbers between the
different disks! But I don't really like the idea of throwing
money at hardware just to do something as fluff as calculating
pi... Although I am planning on building a new computer
when I get enough money, to do more worthwhile things
such as 3D modeling and rendering. Maybe I'll put a bunch
of drives in it.
<snip>In fact I probably could have done it all in RAM.
In that case, you just load the input data once, do everything in
RAM, then write the output data once, so you only need one disk
drive. (For safety you write output as a new file, not overwriting
input file, in case an error crashes your system in the middle of
writing output. But you knew that, right?)
So you need to optimize paging between fast cache and RAM, whereby
for each merge pass the input files in RAM are swept just once
each, and output is written back into RAM sequentially, with your
working set of RAM pages small enough to fit into fast cache.
The logic is the same except you don't need multiple RAM drives. :-)
(Although if you *do* have multiple RAM devices, such that you can
read from one during the same clock cycle as writing to another,
that does speed up the algorithm by nearly a factor of
approximately 2.)
Ooooh, baby. It looks like I posted a link to the wrong file :(
That's my disk math package, not my transpose thingy. Here:
http://www.mediafire.com/?60c0cx4o6gn
I get a similar problem as with the other URL. I get a Web page
that looks like this, and I have no idea what to do next:
That's strange. Password protected file? I'm going to check
the settings... It does not appear to be protected. I don't know,
but I'm wondering if there's a problem on your end. Did you
try, say, just entering a blank password? Also, I'm not quite
sure what's going on with that disorganized text copy of the
site. Could you provide a graphical screenshot of your
web browser?
.
- Follow-Ups:
- Prev by Date: New benchmarks (was Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question))
- Next by Date: Re: New benchmarks (was Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question))
- Previous by thread: Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- Next by thread: Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- Index(es):
Relevant Pages
|
|