# Re: eliminate duplicate rows in 2D array

*From*: Gordon Sande <g.sande@xxxxxxxxxxxxxxxx>*Date*: Sun, 17 Apr 2005 13:33:57 GMT

`zyf wrote:`

Hi

I have a 2D real array A[1e5,2] where some rows are same like the following

| .... | | x y | | .... | | x y | | .... | | x y | | .... |

the number of duplicate rows is not known in advance. And the number of rows that is duplicated isn't known in advance too.

Are there any way to eliminate the duplicate rows very fast?

Seems like a rerun of a moderately recent question.

The usual answer is to sort and then compare the adjacent entries. Sorting is n log ( n ).

(There is a (foolish) complexity result that if arithmetic is free then the existence of duplicates can be determined in constant time by evaluating a suitable interpolating polynomial. No information on which are the duplicates. This result mostly just shows that the complexity model matters!)

Since you do not specify what you intend to do once the duplicates are removed one can guess that sorting will not upset the intended uses. If you need to keep the current order then do a "formal" sort where you set up an array of pointers (subscripts in practice) and just move the pointers around. The sorting will have to be a two field sort in which you sort on x and break ties by using y, or the other way around.

But then when n is small, most things a reasonable modern computer will do will be very fast so just comparing all pairs will get the job done. .

**References**:**eliminate duplicate rows in 2D array***From:*zyf

- Prev by Date:
**Re: eliminate duplicate rows in 2D array** - Next by Date:
**Re: Starting external program?** - Previous by thread:
**Re: eliminate duplicate rows in 2D array** - Next by thread:
**Re: eliminate duplicate rows in 2D array** - Index(es):