Re: efficient comparison
- From: Richard Heathfield <rjh@xxxxxxxxxxxxxxx>
- Date: Tue, 28 Aug 2007 08:26:42 +0000
bob@xxxxxxxxxxxxxx said:
Let's say you have a set of names called set A. You also have a
different
set of names called set B. What is the most efficient way of
comparing the sets?
By comparison, I mean finding the names that need to be added to set B
and
deleted from set B to form set A.
Here is an O(n log n) method. It may not be the most efficient, but it's
a useful benchmark, since anything *less* efficient than it is
necessarily not the *most* efficient! :-)
The O(n log n) complexity comes from the two sort steps.
(Query: the algorithm is actually O(a log a + b log b + a + b) - am I
right in thinking that this reduces to O(n log n) where n = a + b?)
This method assumes that each set is unique - i.e. contains no
duplicates. Adjusting for duplicates is not difficult, but I wanted to
keep the explanation as lucid and brief as possible.
Sort A.
Sort B.
Let Ac = current record in A (start with the first).
Let Bc = current record in B (start with the first).
Let C be the (initially empty) list of records in A but not in B.
Let D be the (initially empty) list of records in B but not in A.
If we try to read a record from A into Ac and it fails, we'll call Ac
null. Same with B into Bc - failure->null.
While Ac is not null and Bc is not null
If Ac is less than Bc
Ac does not exist in B - add Ac to C
Read record from A into Ac
Else if Bc is less than Ac
Bc does not exist in A - add Bc to D
Read record from B into Bc
Else
Records match
Read record from A into Ac
Read record from B into Bc
Endif
Endwhile
While Ac is not null (might not happen)
Ac does not exist in B - add Ac to C
Read record from A into Ac
Endwhile
While Bc is not null (might not happen)
Bc does not exist in A - add Bc to D
Read record from B into Bc
Endwhile
C now contains all records found in A but not B.
D now contains all records found in B but not A.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
.
- Follow-Ups:
- Re: efficient comparison
- From: Ben Bacarisse
- Re: efficient comparison
- References:
- efficient comparison
- From: bob
- efficient comparison
- Prev by Date: Re: efficient comparison
- Next by Date: Re: data passing
- Previous by thread: Re: efficient comparison
- Next by thread: Re: efficient comparison
- Index(es):
Relevant Pages
|