Re: Dominance Tree Sorts

From: Richard Harter (cri_at_tiac.net)
Date: 08/28/04


Date: 28 Aug 2004 08:14:34 -0700

mwojcik@newsguy.com (Michael Wojcik) wrote in message news:<cglfau0kjv@news1.newsguy.com>...
> In article <41227327.365327442@news.sbtc.net>, cri@tiac.net (Richard Harter) writes:
> >
> > Each
> > comparison involves adding an item to a linked list, so the
> > cost of the inner loop operations will be higher than it is in
> > simpler sorts. On the other hand memory is often plentiful and
> > the inner loop costs should be no worse than those for an
> > ordinary tree sort.
>
> On the other other hand, all that linked-list traversal likely means
> poor locality of reference, which means bad cache effects. Which,
> come to think of it, is generally true of heap sorts. If you keep
> the children as an array rather than a linked list (overflows can be
> handled in various ways) that should help.

Performance problems with heapsort are often blamed on cache effects.
No doubt this is true; however the standard heapsort has further problems
that degrade perfomance. An obvious one is the complexity of the inner
loops.

A more subtle issue is that the heapify process throws information
away. For example suppose we have three elements x,y,z at locations
a[n], a[2*n], a[2*n+1] and we heapify them. We first compare y and z
and pick the dominant one, z for the sake of argument. We then compare
z with x, the top point in our little heap; if it dominates x we swap,
moving z up and x down. We now have a[n] > {a[2*n],a[2*n+1]} and that is
all we have. However if x was the dominant one we lose the information that
a[2*n+1] > a[2*n]. That is, the data structure doesn't distinguish between
x>z>y and z>x,z>y; in the former case it records x>z (true by transitivity)
and x>y, but not z>y. In effect 1/3 of a decision is lost. This information
loss occurs at several points in the standard heap sort. For this reason
the required number of comparisons for heapsort is n*[a*log2(n)+b] where a>1.

This kind of information loss does not occur either quicksort or mergesort.
Using a simplified analysis of the mergesort cost function we get
n*(log2(n)-1.0), which is pretty close to the optimal n*(log2(n)-log2(e)).
Although quicksort doesn't throw information away, it doesn't use it
quite as efficiently (partitions are equisized) so its 'a' is greater than 1.
As noted the dominance tree sort (my version anyway) has a comparison cost
function of ~ n*(log2(n)-1.27) or better (on random data), which is better
than mergesort but not by much.

I'm not sure that putting the linked lists in arrays helps particularly with
the cache problem although it is easy enough to put them in arrays. The
catch is that the references in the linked lists are scattered. There may
be something clever that one can do with moving the data so that cache misses
are reduced but I haven't thought it out though.

> Interesting idea and nice writeup, though.

Thanks.

>I'm still thinking about
> it. On casual inspection it does look like it might be quite
> amenable to being parallelized, and with SMP systems becoming ever
> more common that's certainly useful. (Of course, mergesorts are
> inherently parallel and very easy to code.)

Offhand it seems quite straightforward to parallelize it.

As a final note both mergesort and quicksort are intrinsically cache
friendly as long as there are at least two caches.



Relevant Pages

  • Re: Dominance Tree Sorts
    ... Performance problems with heapsort are often blamed on cache effects. ... This kind of information loss does not occur either quicksort or mergesort. ... Using a simplified analysis of the mergesort cost function we get ... catch is that the references in the linked lists are scattered. ...
    (comp.theory)
  • Re: fast stable sort
    ... Can you post C code for a stable in place mergesort function? ... How do you load a herd of strings into your ... in an array. ... place sorting/merging linked lists are irrelevent because they ...
    (comp.programming)
  • Re: hash tables, non-random keys
    ... That could be a showstopper for linked lists. ... Cost: one pointer per node. ... Another possibility is to use an array. ... In used slots, keep the values, and in unused slots, keep a ...
    (comp.programming)
  • Re: Questions on Using Python to Teach Data Structures and Algorithms
    ... linked lists produces cache misses rather often, ... arrays do not. ... Small dynamic arrays are faster than linked ...
    (comp.lang.python)
  • Re: Mergesort algorithm for linked lists
    ... Everyone knows how to sort arrays (e. ... For linked lists, mergesort is the typical choice. ... Does anybody know if Mcilroys optimization is ...
    (comp.lang.c)