Re: fast stable sort
- From: seeWebInstead@xxxxxxxxxxxxxxxx (Robert Maas, http://tinyurl.com/uh3t)
- Date: Sat, 22 Nov 2008 02:13:30 -0800
From: CBFalconer <cbfalco...@xxxxxxxxx>
if you have an existing array, you can simply arrange an array of
pointers to the items, and sort that.
That is no good on large datasets that won't fit in actual RAM and
hence require swapping pages of virtual memory, because the
comparison function will be random-accessing the actual records in
each merge run instead of accessing them in sequential RAM
addresses, so swapping will thrash.
On small datasets, it doesn't matter, but on small datasets even
bubble sort is faster than you can blink, so there's no point in
doing anything more complicated, as you suggest.
On mid-sized datasets, where bubble sort is too slow, but the
entire dataset fits in actual RAM, your idea may be workable.
Another consideration is the CPU fast cache, which is much smaller
than the totality of RAM you have. Virtually any decent sized
dataset will thrash CPU fast cache with your algorithm, reducing
effective speed by an order of magnitude. Bubble sort might
actually beat out your idea on datasets larger than the CPU fast
cache but small enough for n*n behaviour of bubble-sort not to be
too bad, because bubble sort marches up and down consecutive RAM
addresses *all* the time, at worst sweeping an average of half the
entire array 2*n times, n times each direction, swapping the
left-most and right-most parts of the array in and out with each
sweep but keeping the middle part in the fast cache if the array
isn't *too* large.
IMO you really need to consider locality of reference when choosing
among algorthms that process large datasets.
By the way, the first sorting program I wrote, circa 1970-71,
implemented virtual memory at the user level to hold portions of
long variable-length data records, using a type of radix/key sort
(first 5 7-bit characters of each record were loaded into array,
with disk pointer kept also showing where to get next 5 characters,
packing those 5 characters into one 36-bit PDP-10 word in the usual
way). Heap-sort was used to sort just those 5-character words. Then
the array was traversed to find any duplicate words whereupon that
sub-array got the next 5 characters loaded from virtual memory and
that sub-array recursively sorted. If the data was mostly already
sorted, or if I was sorting the appendation of several
already-sorted files, then fetching those next five characters was
efficient, but if I was sorting a randomly shuffled dataset, such
as if the data was sorted on one key and I wanted to re-sort on
another key, then it thrashed horribly trying to load all those
next-5-characters from a subset of records scattered all over the
datafile.
I learned my lesson. My *second* file-merge program used
Huffman-tree (*) external merge sort instead of any type of
keysort, after making merge runs that each nearly filled RAM.
* (At each merge pass, merge the N smallest files into a single new
file, where N is the order of the merge. I think I used appx.
20-way merge, because my Macintosh allowed at most appx. 25 open
files at one time, so I had 20 input and one output and others
spare for other programs I might want to run in the middle of a
merge run. Thus dataflow is a fork-order-N Huffman coding tree.
The very last merge run was usually of smaller order because
fewer total merge-run files existed at that point.)
Note: My algorithm for knowing when to stop building a merge run
and write it out to disk and start building the next merge run was
really crufty: Macintosh Allegro Common Lisp 1.2.2 provided *no*
way to check how much free storage was still available, and no way
to throw an exception if I try to allocate more data when no free
memory is available. The only way to find out when you have filled
memory was *after* it ran out of memory causing a GC to be
triggered. So what I did was just before starting to build a merge
run I invoke GC myself to make sure all RAM is marked free and
available for loading data before next GC would be triggered, then
after parsing each record of input (and pushing it into a list of
not-yet-sorted records) I checked if another GC had occured.
Meanwhile, with each new record I also allocated and immediately
discarded some slack data, so when the GC occurred it would recover
all that slack data instead of crashing due to memory full. When
that second GC occured, I stopped reading records, called the
built-in SORT on the already-loaded data, and wrote it out to a new
merge-run file.
.
- Follow-Ups:
- Re: fast stable sort
- From: pete
- Re: fast stable sort
- References:
- fast stable sort
- From: subramanian100in@xxxxxxxxx, India
- Re: fast stable sort
- From: pete
- Re: fast stable sort
- From: CBFalconer
- fast stable sort
- Prev by Date: Re: Mergesort Vs Quicksort
- Next by Date: Re: fast stable sort
- Previous by thread: Re: fast stable sort
- Next by thread: Re: fast stable sort
- Index(es):
Relevant Pages
|