Re: On the complexity of determining whether n numbers are distinct



goanna wrote:
Patricia Shanahan <pats@xxxxxxx> writes:

[...]
Incidentally, the minimum number of comparisons for a sort also breaks
down if you can use the elements to index an array. Proofs of it depend
on the requirement that the only operation on the elements is pairwise
comparison.

Consider an array of integers, and increment vector element n by one for
each n in the input:

Initially, the vector is:

... 0 0 0 0 0

First number is 1, so we increment element 1:

... 0 0 0 1 0

Next number: 2; we increment element two:

... 0 0 1 1 0


Next number (i.e., 3):

... 0 1 1 1 0

Next number is 1, so increment element 1 again:

... 0 1 1 2 0

Now iterate through the vector in increasing index order. If element i
is k, write i out k times:

1 1 2 3

and the original data is sorted in O(n) operations, with no pairwise
comparisons.

Technically correct, perhaps, but all a bit silly really.
You've just replaced pairwise comparisons with increments,
comparisons to zero, and decrements, which all still need
to be counted. As your vector needs to be as large as the
largest number k, which may be very much larger than n,
the number of comparisons to zero may be much larger than
the number of pairwise comparisons you've eliminated.


Agreed. It is very silly. It has the same complexity as a good pairwise
comparison sort if k is O(log n). I would only use it if k were small,
and smaller than n. For example, sorting thousands of 8 bit bytes.

However, it was intended mainly to illustrate the fact that the element
comparison lower bound for sort is based on the limitation that the only
permitted operation on the elements being sorted is pairwise comparison
between them.

I suggest adding the pairwise comparison limitation to the main question
in this thread, proving a lower bound on comparisons for determining
whether there are duplicates.

Patricia
.