Re: Sorting algorithm problem
From: Richard Harter (cri_at_tiac.net)
Date: Sat, 13 Mar 2004 22:58:36 GMT
On 13 Mar 2004 10:42:03 -0800, firstname.lastname@example.org (Booser) wrote:
>email@example.com (Richard Harter) wrote in message news:<firstname.lastname@example.org>...
>> On 12 Mar 2004 19:22:51 -0800, email@example.com (Booser) wrote:
>> >firstname.lastname@example.org (Martin V. Cera) wrote in message news:<email@example.com>...
>> >> We have an almost sorted array of 'n' integers. Each integer can
>> >> be as far as 'k' positions from its target position in the sorted
>> >> array. I should suggest an algorithm that sorts this array in time
>> >> O(n * log k). I tried to modify heap sort, but I am not sure of
>> >> this modification... Do you have any ideas?
>> >Simple - use the iterative version of merge sort shown below.
>> >You can do it with heap sort, but it is much too complex.
>> >I developed this algorithm. It is a derivation of mergesort. M is
>> >the number of items in the list and n is the number of sublists.
>> [snip exposition]
>> See Knuth's section on merge sort; your algorithm is the first thing
>> that he considers. In general it is not more efficient than standard
>> mergesort. In the present case, where we are concerned with almost
>> sorted arrays, the resulting initial runs may be long, giving a small
>> n, but this needs to be proved.
>It is more efficient. There are fewer merges. Standard iterative
>merge sort is:
>10 9 8 7 6 5 4 3 2 1
>9,10 7,8 5,6 3,4 1,2
>My algorithm takes advantage of the fact that some data is naturally
>already sorted. The savings is in the levels. It is the most
>efficient way of doing merge sort. Although, given truly random data,
>one might only save one or two levels out of 20. Given real-world
>data, the significance is much greater.
>The overhead of grouping the numbers is n, but it takes care of itself
>by "merging" these sections automatically in calculating it, thus n is
Your algorithm is Knuth's natural two-way merge sort, Algorithm N in
section 5.2.4 of Volume 3, Sorting and Searching, p. 161. Straight
two-way merging (artificial runs of length 1) is his Algorithm S,
p. 164. The average cost of S is less than the average cost of N.
(See Knuth for the analysis.) The essence of the matter is this, though:
The number of comparisons required to group the numbers into runs is
n-1; the number of comparisons required in the first two passes of the
straight two-way merge (which establish runs of length 4) is 7n/6.
For random arrangements of data the average run length in the natural
two-way merge is less than 4, so the n-1 grouping comparisons buy less
than two passes' worth of merging. It is close to a wash: N is
slightly slower on average, but can be faster on partially sorted data.
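Incidentally, the O(n * log k) bound asked about in the original
question can be met without merging at all: keep a min-heap of the
next k+1 elements, since the smallest remaining element must be among
them. A rough sketch in Python (the function name and details are
mine, not from any post above):

```python
import heapq

def sort_almost_sorted(a, k):
    """Sort a list in which every element sits at most k slots from
    its final sorted position.  Each element passes through a heap of
    size k+1, so the total cost is O(n * log k)."""
    heap = a[:k + 1]      # the global minimum must be in the first k+1
    heapq.heapify(heap)   # O(k)
    out = []
    for x in a[k + 1:]:
        out.append(heapq.heappop(heap))  # O(log k) per element
        heapq.heappush(heap, x)
    while heap:                          # drain the last k+1 elements
        out.append(heapq.heappop(heap))
    return out
```

Each pop emits the minimum of a window that is guaranteed to contain
the next output element, so the result is fully sorted.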
Richard Harter, firstname.lastname@example.org
I used to do things that were good for me because they were fun.
Now I do them because they are good for me. Fun was better.