Re: Median algorithm



Marcin wrote:
> How I can find median (middle element) in T(n) = O(n) int the worst
> case?

You can find the kth element using the following method (which will
actually move some elements as a side effect.)

1. Split the list into 5 sequential parts equal in length excepting
for the last part which may be off by up to 4. Then consder the
first element from each of the 5 parts and just directly sort them
(its 5 elements, so this is O(1).) Then do the same for the second
element from each list, and so on. On the steps where there are
only 4 elements just sort the 4 of them.
2. Using recursion, find the exact median entry of the middle part,
and call it m.
3. Now partition the whole list according to the pivot "m", in
quicksort style. It can be proven that this m will end up roughly
in the middle 3 fifths of the list (exercise to the reader.)
4. If m ends up positioned exactly at position k, then you are done
(return m.) If the position is less than k, then recurse on the
subparition after m, subtracting the position from k, otherwise
recurse on the subpartition before m.

This surprising algorithm is O(N) (again, exercise to the reader).
Using different part sizes (i.e., 3 or 7, say) doesn't work! As
promising as this sounds -- if you were to actually implement this as a
partitioning mechanism for quicksort it would easily be outrun by
heapsort, and merge sort in terms of performance. Part of the whole
reason that naive quicksorts do so well in terms of performance is
because pivot choice is typically done in some O(1) manner. There are
apparently ways of improving this algorithm (probably having to do with
being able to search for any element between the k1th and k2th
positions) but I have not thought too hard about it.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

.



Relevant Pages

  • Re: Selection sort and bubble sort
    ... quicksort as a library function for the qsort() interface. ... median is estimated using the sample. ... selection sort, or a modified bubble sort), which is why I had not done so ...
    (comp.programming)
  • Re: Calculating Median on array of Longs
    ... there are duplicate array elements. ... Had a look at fixing that, but it is not that simple and ... maybe Jim Mack is right in that you really need some kind of sort. ... but can't think that I've ever had occasion to use median() in a spreadsheet. ...
    (microsoft.public.vb.general.discussion)
  • Re: any relation between median and mean?
    ... But there is in fact no need to sort the array: ... > median, and advance in from either end of the array swapping elements when ... The algorithm sketched above would be O. ...
    (sci.math.num-analysis)
  • Re: To Those Who Trash Bubble Sort
    ... > The only way FM's post made sense would be if the median meant ... since that is what quicksort does. ... mergesort, which is very often the most appropriate. ... I don't remember offhand whether or not heapsort is stable, ...
    (comp.programming)
  • Re: Pascal Compiler
    ... if one has var parameters. ... {find the median of three values from the data array} ... {sort part of the array} ...
    (comp.lang.logo)