# Re: Find second minimum number

*From*: Richard Heathfield <invalid@xxxxxxxxxxxxxxx>*Date*: Fri, 02 Jun 2006 00:50:10 +0000

Mark P said:

Googmeister wrote:

You can find the second minimum (along with the minimum)

in n + log n comparisons. Hint: the solution requires no fancy

techniques or mathematical analysis.

Do tell, how do you do this with only n + log n comparisons?

He's right, and it's easy if you know how. :-)

Donald Knuth discusses maximum-finding. Clearly, this is the same problem,

wearing a false moustache and ridiculously large spectacles.

He first proves that the search for a maximum amongst n items involves at

least n - 1 comparisons.

He then discusses quadratic selection. It works like this (and there is a

hidden message in the data I've chosen):

q w e r t y u i o p a s d f g h j k z x c v b n m

Rearrange in a square, as best you can:

I: q w e r t

II: y u i o p

III: a s d f g

IV: h j k z x

V: c v b n m

Group I: q < w, e < w, r < w, t < w : 4 comparisons. w is largest.

Group II: y > u, y > i, y > o, y > p : 4 comparisons. y is largest.

Group III: a < s, s > d, s > f, s > g : 4 comparisons. s is largest.

Group IV: h < j, j < k, k < z, z > x : 4 comparisons. z is largest.

Group V: c < v, v > b, v > n, v > m : 4 comparisons. v is largest.

That's 20 comparisons so far. We can now pull out the largest with 4 more

comparisons:

w < y, y > s, y < z, z > v. So z is the largest value in the group. So far

so good, and we've made 24 comparisons, which is equal to n - 1.

Now, consider the possibilities for the second largest. EITHER it's one of

the winners in the other four groups, OR it's one of the losers in z's

group.

Strictly speaking, we already know that w < y, y > s, so we don't need to

compare those again, but if z had been in Group I, we wouldn't have known

this, so let's pretend we don't.

w < y, y > s, y > v, so y is the second-greatest of the group winners. Now

we have to trawl through Group IV's losers: y > h, y > j, y > k, y > x

This gives us y as the second-largest member, in seven more comparisons - a

total of 31, which is just 1 higher than n + sqrt(n).

Okay, what about cubic selection? Here, we split the data up into a cube

(I've added @ as a rather sad glyph for "omega", arbitrarily in the middle

somewhere - we will assume it has a value > z):

q w e r t y u i o p a s d f @ g h j k l z x c v b n m

Dice it!

q w e r t y u i o

p a s d f @ g h j

k l z x c v b n m

q < w, w > e : 2, w

p > a, p < s : 2, s

k < l, l < z : 2, z

r < t, t < y : 2, y

d < f, f < @ : 2, @

x > c, x > v : 2, x

u > i, u > o : 2, u

g < h, h < j : 2, j

b < n, n > m : 2, n

18 comparisons so far.

w > s, w < z : 2, z

y < @, @ > x : 2, @

u > j, u > n : 2, u

24 so far.

z < @, @ > u : 2, @

26 comparisons, for 27 items. Perfect. Now for second place. Clearly, z is

not the highest (because @ is), and yet it is the highest out of its

original 3x3 group. Likewise u. So the second highest is either z, or u, or

y, or x (the two losers to @ in the semi-final), or d, or f (the two losers

to @ in the first round).

d < f, f < y, y > x, y < z, z > u. So z is the second highest, and we

learned this after a further five comparisons: 31 in all, for 27 items.

This is just 1 higher than n + pow(n, 1.0 / 3.0).

You can extend this scheme indefinitely. Consider n = 16.

q w e r t y u i o p a s d f g h

Split into n^(1.0/4.0) groups:

q w e r t y u i

o p a s d f g h

Split each group into n^(1.0/4.0) groups:

q w e r

t y u i

o p a s

d f g h

Split each group into n^(1.0/4.0) groups:

q w

e r

t y

u i

o p

a s

d f

g h

Compare:

q < w : 1, w

e < r : 1, r

t < y : 1, y

u > i : 1, u

o < p : 1, p

a < s : 1, s

d < f : 1, f

g < h : 1, h

8 so far (n = 16, remember).

w > r: 1, w

y > u: 1, y

p < s: 1, s

f < h: 1, h

12 so far (n = 16)

w < y: 1, y

s > h: 1, s

14 so far.

y > s: 1, y

15 so far. The second place is either s, w, or u. This takes up two more

comparisons: s < w, w > u.

So we have 17 comparisons to get the top two from 16. This is as near to n +

log n as makes no odds.

As you increase the number of "dimensions" of the search, you approximate n

+ log n ever more closely.

Let us arbitrarily choose to have a 16-dimensional search, where n is 65536.

We predict that we will take 65535 + log2(65536) searches - i.e. 65551

altogether.

We divide the data into 32768 groups, and we have one comparison in each

group. That costs us 32768 comparisons so far. Clearly we keep halving the

number of comparisons, and I don't think anyone will quibble now when I say

that this comes to 65535 searches to establish the "winner".

To find the second place, we need only look at all the data items "beaten"

by the ultimate winner. There are 15 "rounds", and so 15 items in that

group. This means we need another 14 comparisons to establish second place.

And thus we actually come in at two under par!

So yes, it's certainly possible to do this in N + log N - much to my

surprise, I must admit.

--

Richard Heathfield

"Usenet is a strange place" - dmr 29/7/1999

http://www.cpax.org.uk

email: rjh at above domain (but drop the www, obviously)

.

**Follow-Ups**:**Re: Find second minimum number***From:*gw7rib

**Re: Find second minimum number***From:*Richard Harter

**Re: Find second minimum number***From:*Rob Thorpe

**Re: Find second minimum number***From:*Mark P

**References**:**Re: Find second minimum number***From:*Mark P

- Prev by Date:
**Matching algorithm** - Next by Date:
**Re: Anyone recognise this sort algorithm?** - Previous by thread:
**Re: Find second minimum number** - Next by thread:
**Re: Find second minimum number** - Index(es):