# Re: Sorting algorithm problem

From: Richard Harter (cri_at_tiac.net)
Date: 03/13/04

• Next message: Richard Harter: "Re: Sorting algorithm problem"

```Date: Sat, 13 Mar 2004 22:58:36 GMT

```

On 13 Mar 2004 10:42:03 -0800, jhall3@mix.wvu.edu (Booser) wrote:

>cri@tiac.net (Richard Harter) wrote in message news:<4052e63c.813469503@news.sbtc.net>...
>> On 12 Mar 2004 19:22:51 -0800, jhall3@mix.wvu.edu (Booser) wrote:
>>
>> >tramp@volny.cz (Martin V. Cera) wrote in message news:<5ff4db01.0403060847.588e6a93@posting.google.com>...
>> >> Let have an almost sorted array of 'n' integers. Each integer can be
>> >> as far as 'k' integers from its target position in sorted array. I
>> >> should suggest an algorithm that sorts this array in time O(n * log
>> >> k). I tried to modify heap sort, but I am not sure of this
>> >> modification... Do you have any ideas?
>> >
>> >Simple - use an iterative version of merge sort seen below.
>> >
>> >You can do it with heap sort but it is much too complex.
>> >
>> >I developed this algorithm. It is a derivation of mergesort. M is
>> >the number of items in the list and n is the number of sub lists.
>>
>> [snip exposition]
>>
>> See Knuth's section of merge sort; your algorithm is the first thing
>> that he considers. In general it is not more efficient than standard
>> mergesort. In the present instance where we are concerned with almost
>> sorted arrays the resulting initial runs may be long for a resultant
>> small n, but this needs to be proved.
>
>It is more efficient. Their are fewer merges. Standard iterative
>merge sort is:
>
>10 9 8 7 6 5 4 3 2 1
>
>9,10 7,8 5,6 3,4 1,2
>
>7,8,9,10 3,4,5,6
>
>3,4,5,6,7,8,9,10
>
>1,2,3,4,5,6,7,8,9,10
>
>
>My algorithm takes the fact this some data naturally is already
>sorted. The savings is in the levels. It is the most efficient way
>of doing merge sort. Although, given truly random data, one might
>only save one or two levels out of 20. Given real world data, the
>significants is much greater.
>
>The overhead of grouping the numbers are n but it takes care of itself
>by "merging" these sections automatically in calculating it, thus n is
>subtracted.

Your algorithm is Knuth's "Natural Two-Way Merge Sorting", algorithm N
in section 5.2.4 of volume III, sorting and searching, p161. Straight
two-way merging (artificial runs of length 1) is his algorithm S,
p164. The average cost of S is less than the average cost of N. (See
Knuth for analysis.) The essence of the matter is this, though:

The number of comparisons required to group the numbers is n-1; the
number of comparisons required in the first two passes (which
establish runs of length 4) of the straight two-way merge is 7n/6.
For random arrangements of data the average length of the runs in the
natural two-way merge is less than 4. It's a wash on average; N is
slower on average, but can be faster in partially sorted data.

Richard Harter, cri@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
I used to do things that were good for me because they were fun.
Now I do them because they are good for me. Fun was better.

• Next message: Richard Harter: "Re: Sorting algorithm problem"

## Relevant Pages

• Re: Is C99 the final C? (some suggestions)
... thread about who has access to the standard, ... >>Could you provide a reference to the description of the algorithm ... these rare cases) a very loose upper bound. ... > the worst case and say that you only have as much stack space of the ...
(comp.lang.c)
• Re: E96 Series Computation
... >>>3) multiply remaining fraction by 64 ... >>>absolute closest standard value to arbitrary input R are hopelessly lost ... >>>as noise compared to tolerance. ... >> I guess I don't understand what your algorithm is supposed to do. ...
(sci.electronics.design)
• Re: AES with SslStream
... I can't remember which version of Windows is supposed to get that support, ... my understanding is that the AES types are still ... Aes The Advanced Encryption Standard algorithm. ...
(microsoft.public.dotnet.security)
• Re: Is C99 the final C? (some suggestions)
... Well the few people in here who have the standard have chimed in. ... it may cause the occasional portability problem, ... >> Reread the Lamport's bakery algorithm. ... Dealing with the stack is an area that clearly ...
(comp.lang.c)
• Re: a rand array
... the standard deviation of the sample mean will be the standard ... sequences as a naive "fill and sort" algorithm, ... A similar call to a naive "fill and sort" algorithm gives: ...
(comp.lang.c)