Re: Ranking Without Sorting

From: Richard Harter (cri_at_tiac.net)
Date: 07/25/04


Date: Sun, 25 Jul 2004 15:47:02 GMT

On Sun, 25 Jul 2004 10:29:33 +0000 (UTC), Ian Woods
<newspub2@wuggyNOCAPS.org> wrote:

>doublea987@aol.com (Aarlo Stone Fish) wrote in
>news:c6894599.0407242216.bb0eee9@posting.google.com:
>
>> "Ricardo Gibert" <nospamgibert@cox.net> wrote in message
>> news:<tBDMc.9150$_K2.7568@lakeread02>...
>>> I was thinking about how one could rank the elements of an array
>>> without sorting the elements of the array. It's a simple program to
>>> do this in O(n*n), but my question is how can this be done better
>>> than this without sorting?
>>
>> Kind of an interesting question, how to rank things without
>> sorting...How about something like a bucket procedure (or a bucket
>> "sort," if you will), where you have an array "m" with 256 indices (or
>> whatever the range is of things you want to sort in array "a") and for
>> every index "i" into array "a", m[a[i]] = i? And then it's O(n +
>> p)--you have to go through array "a" to put things in their places in
>> array "m" and then go through array "m" to see which elements of "a"
>> are used... On second thought, this assumes that all values are
>> unique, but I guess you could use some kind of recovery feature. So
>> it begins to sound like a sorting issue.
>>
>> Aarlo Stone Fish
>
>It apparently must turn into a sorting issue: no matter what you do, at
>some point you want to use an ordering of the original array... and
>that's precisely what sorting provides.
>
>Personally, I'd simply use some data structor which referrs to the data
>in the original array (an array of pointers, reference, indices, whatever
>works nicest in the target language) and sort that. We're now back in
>very well trodden territory with a number of approaches to choose from.
>
>If you don't need the whole ranking but, say, merely the top 5 (or the
>top 'k' elements), then there are techniques which aren't as 'big' as
>doing a full sort. The obvious one is O(kn):
>o take a list of references into the original array of k elements in size
>o make the list refer to the first k elements of
>o iterate over the elements of the array starting at k+1
>o whenever an element is found in the original array which is 'better'
>than the 'worst' element referred to in the list, remove the reference to
>the worst element and insert a reference to the better element in the
>appropriate place.
>
>Using a different data structure can improve on the O(kn) time... O(n log
>k) for using a heap to hold the k elements if my memory isn't playing
>tricks on me about the properties of heaps.

Note 1: Since a ranking can be used to sort an array in time O(n),
and since (comparison) sorting is O(n*log n) it follows that ranking
is necessarily O(n*log n).

A simple way to do the ranking is to have an auxilliary array of
indices into the original array. Sort it using any old moldy sort
that suits your fancy with the comparison being done on the array
items being pointed to by the indices. When the sort is done you have
a ranking.

Note 2: IIANM the best k out of n task is
O(n + (log n) * k * (log k)).
The point is that the probability of the n'th item being larger than
the smallest of the currest set of k is k/n. So the costs are a fixed
cost on one comparison per item, and k/n*(log k) comparisons on
average. Since sum(1/) ~ log n, there we are.

 
Richard Harter, cri@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Whoever said money can't buy happiness
didn't know where to shop.



Relevant Pages

  • Re: "Sorting" assignment
    ... And many others prefer to call partition exchange because "quicksort" ... bin B depending on whether it is greater than, ... If the array is already sorted, this means that you end up ... attempt to sort them. ...
    (comp.programming)
  • Re: A Fast sorting algorithm for almost sorted data
    ... far my compressor has potential but is nowhere near ready. ... It does however make heavy use of sorting. ... which I am currently calling Run sort. ... entire selected run can be added to the sorted output array. ...
    (comp.compression)
  • Re: Save & Sort
    ... You can copy your array to a scratch ... "Heap" sort. ... Dim lst As Long ... Dim tmp As String ...
    (microsoft.public.excel.programming)
  • Re: fast stable sort
    ... if you have an existing array, you can simply arrange an array of ... pointers to the items, and sort that. ... hence require swapping pages of virtual memory, ... each merge run instead of accessing them in sequential RAM ...
    (comp.programming)
  • A Fast sorting algorithm for almost sorted data
    ... which I am currently calling Run sort. ... entire selected run can be added to the sorted output array. ... public class RunSort implements Comparator ... public static void sort(Comparable a, int start,int end) ...
    (comp.compression)