Re: Mergesort algorithm for linked lists
- From: Eric Sosman <esosman@xxxxxxxxxxxxxxxxxxx>
- Date: Sun, 28 Jan 2007 18:24:57 -0500
Joerg Schoen wrote:
Eric Sosman wrote:Joerg Schoen wrote:I disagree. In my tests, the natural merges are always faster or onlyBoth N-1 and floor(N/2) are O(N), so Big-Oh doesn't tell enough
slightly slower. Please note that the additional compares to check
for natural runs is only O(N) which looks affordable to me.
of the story.
I don't quite get this. Mergesort is O(N * log(N)), adding another O(N)
only increases the leading factor but the higher N becomes the more
irrelevant gets this additional contribution.
For sufficiently large N, only the leading term and its
coefficient are important. But N is not always "sufficiently
large," and the lower-order terms can be significant or even
dominant when N is small. That's why Quicksort implementations
usually resort to insertion sort for short sub-files: even though
insertion sort is O(N*N), its simplicity typically gives a much
smaller coefficient than that of Quicksort's O(N*ln(N)), so it
winds up being faster for small N.
My interest was not in sorting lists of four million nodes,
but in developing a "general-purpose" utility that works well
over a wide range of list lengths and doesn't behave too badly
if the caller-supplied comparison function is sluggish. The
exercise has given me a renewed appreciation of the difficulties
and compromises faced by the developer of "general-purpose"
routines (and has increased my already healthy skepticism about
the practicality of code re-use). It is likely that I'd have
come to different conclusions if I'd started with a different
notion of how a "general-purpose" list-sorter would be used.
In the tests I ran, one variation of natural merge performed
respectably. But three variations of straight merge (all based
on McCarthy) did just a little better. YMMV.
On a reversed list, natural merge makes N-1 comparisons to form
N one-node sub-lists. Thereafter, it follows exactly the same pattern
of merges a straight merge would. In the worst case, then, natural
merge makes N-1 more comparisons than straight merge.
Yes, but that's neglicible for an O(N * log(N)) algorithm and improves
it in real cases with partial order.
I once tried to calculate the break-even point, the average
run length where natural merge's shallower tree repaid the up-front
investment of N-1 comparisons. IIRC (this was a while ago, and the
details of the calculation elude me at the moment), the average run
length needed to exceed 3.2 or so for natural merge to win over
straight merge. That may not seem like a lot, but it is extremely
rare in "random" input permutations. Only 15111 of the 362880
permutations of nine distinct keys have three or fewer runs and
hence an average run length of three or more; that's a little less
than 4.2%. For ten keys, only 1.3% of the permutations have three
or fewer runs, and the percentages keep diminishing. For even a
moderate N, very few permutations have fewer than N/3.2 runs.
On the other hand, it must be admitted that real input lists
are probably not "truly random." We're back to the difficult
question of developing general-purpose software for "typical"
cases, without any solid notion of what "typical" is. YM, as
I said before, MV.
--
Eric Sosman
esosman@xxxxxxxxxxxxxxxxxxx
.
- References:
- Re: Mergesort algorithm for linked lists
- From: CBFalconer
- Re: Mergesort algorithm for linked lists
- From: jo65
- Re: Mergesort algorithm for linked lists
- From: CBFalconer
- Re: Mergesort algorithm for linked lists
- From: Eric Sosman
- Re: Mergesort algorithm for linked lists
- From: CBFalconer
- Re: Mergesort algorithm for linked lists
- From: Eric Sosman
- Re: Mergesort algorithm for linked lists
- From: Joerg Schoen
- Re: Mergesort algorithm for linked lists
- From: Eric Sosman
- Re: Mergesort algorithm for linked lists
- From: Joerg Schoen
- Re: Mergesort algorithm for linked lists
- From: Eric Sosman
- Re: Mergesort algorithm for linked lists
- From: Joerg Schoen
- Re: Mergesort algorithm for linked lists
- Prev by Date: Re: checking for numbers
- Next by Date: Re: Typecasting portability in C
- Previous by thread: Re: Mergesort algorithm for linked lists
- Next by thread: Re: Mergesort algorithm for linked lists
- Index(es):
Relevant Pages
|
Loading