Re: sorting algorithms

websnarf_at_gmail.com
Date: 02/03/05


Date: 3 Feb 2005 13:57:41 -0800

Randy Crawford wrote:
> starbuz wrote:
> > I was pondering over the general interest in sorting algorithms.
> > Quicksort seems to have close this chapter and now nothing
> > seems to be written. When I was young, finding the fastest
> > sorting algorithm was a competition like finding the longuest
> > prime numbers... Is the best sorting algorithm no more a
> > challenge ?

http://www.pobox.com/~qed/sort.html -- I think Heapsort is well worth
a second look.

> > I worked a lot on sorting problems two years ago, I saw nothing
> > new after quicksort. Only some algorithms that selects a better
> > pivot value... I have my own algorithm that uses 60% less time
> > than the classic quicksort. Do you think that a software company
> > or a research project could be interested by it ? Perharps int or
> > float values sorting is no more an intesting problem...
>
> The ideal sort must deliver a worst case runtime of NlgN, as well
> as efficient use of memory. No sort does both on all input set
> orderings and all sizes.

You haven't defined what you mean by "efficient use of memory". Heap
sort uses no additional memory and is O(n*ln(n)). The only real
problem with Heap sort is that it traverses the memory in a very
non-linear manner, which can hurt practical performance quite
dramatically for large n.

Also there exists an O(n) algorithm for finding the median of an
n-element array. Though, for this algorithm to be "memory efficient"
you must allow it to modify the array (in a non-destructive way) as a
side-effect -- fortunately these modifications are just entry swapping,
so it is compatible with the needs of the pivot picking algorithm of
Quicksort.

> [...] Because quicksort will perform optimally whenever you
> can select the right pivot every time, a lot of subsequent research
> went into improving QS's ability to select pivots.

Well after we've found the O(n) algorithm for finding the median of a
list of n numbers we are done. The reference I have at my fingertips
is Saraswat and Kapoor, but I don't know if they are the original
discoverers. This result is not very well known, and in fact was
pretty stunning to me when I first learned of it only a few weeks ago
(!) -- especially when you see the *details* of the algorithm.

The problem with this is that at best, finding the true median of n
elements requires 1.5*n comparisons on average, and has a worst case of
3*n comparisons. Compare this to a typical "median of 3" estimation
pivot choosing algorithm which is O(1) and requires, at most, *3*
comparisons.

In otherwords, along the path of making quicksort a guaranteed
O(n*ln(n)) algorithm, you give up its real world practical performance
advantage by making it at least twice as slow as the typical "naive"
implementations.

> [...] At the same time, a lot
> of work went into extending and improving other sorts (like radix),
> and creating hybrid sorts which select the most appropriate sort
> from among several, given some knowledge of the data -- size, bit
> distribution in the data, presort order, etc.

Introsort was probably the most interesting result in all this. The
idea is just to use a typical naive quicksort, and when the stack depth
goes too deep, to just switch over to heap sort.

> [...] Finally, much research has continued on
> building sorts for various parallel architectures (SMPs, clusters,
> trees, data-flow, vector, on-disk data, etc).

I have not followed the research in parallel searching, but this "shear
sort", posted by Percival above, looks quite remarkable -- a sort that
is actually less then O(n), now that is quite something!

> Perhaps a reason that not a lot of new sorts have been invented
> recently is that the right combination of existing sorts already
> delivers about 100% of the performance of the ideal sort.
>
> For more on sorting: http://www.devx.com/vb2themax/Article/19900

The performance of algorithms written in Visual Basic has absolutely no
relevance to anything.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/


Relevant Pages

  • Re: Fastcode poll - Should Sort include worst case situations
    ... median algorithm, not O), there exists sequences that will beat ... quicksort and make it O. ... by many compiler vendors is "Engineering a Sort ... If it detects that a portion of an array is responding badly, ...
    (borland.public.delphi.language.basm)
  • Re: "Sorting" assignment
    ... algorithm such as tbe bubble sort should be given a free pass because ... I would tailor which algorithm to start with by how the student thinks ... If you only learn the beautiful side of programming, ...
    (comp.programming)
  • Re: Fastcode poll - Should Sort include worst case situations
    ... Quite was regarded Hoare's original Quicksort. ... Reversely sorted data ... Anyway QuickSort is very simple algorithm and it's complexity is precisely analized. ... further detailed retrospective of many QuickSort implementations as well as other sort algorithm will take too much time and space - wikipedia is starting point and give some interesting links for further analyze who is interested. ...
    (borland.public.delphi.language.basm)
  • Re: sorting
    ... The algorithm you choose depends a lot on the number of ... sort will be quick enough. ... Quicksort or another method. ... use Bubble sort algorithm which is ...
    (microsoft.public.excel.programming)
  • Re: puzzle
    ... >> Christopher Barber wrote: ... >> or understanding among programming professionals. ... > algorithm "close to" O. ... bubble sort, it is almost always acceptable to use an interchange sort: ...
    (comp.programming)