Re: How to check for duplicates?

Zoran
Date: 11/15/03


Date: Sat, 15 Nov 2003 09:45:50 -0500

Hi Axel;

Unfortunately NexusDB does not support stored procedures yet. I think I've
solved the problem.

This is what I did (if you are interested): on the existing table I
concatenated the name, street and zip strings, lowercased the result, and
took out all characters except a-z and 0-9. Then I made a hash string (32
bytes long) using an MD5 hash (as Ignacio recommended) and created an
additional key on that field. On a table of 3 million records I am not sure
whether I have any duplicates or not. I don't know much about hash
algorithms, but the strings look unique to me. When inserting, I create the
same key from the input fields and check it against the table. If the key
exists, I go into a loop and check the input fields (name, address, zip)
against the same fields in the existing rows that have that key.
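
A minimal sketch of the steps described above, written in Python rather than Delphi/NexusDB. The names `normalize`, `dedup_key` and `AddressTable` are hypothetical, an in-memory dictionary stands in for the indexed hash field, and comparing the fields in normalized form is an assumption (the post does not say how the fields are compared).

```python
import hashlib
import re
from collections import defaultdict


def normalize(text):
    """Lowercase the text and drop everything except a-z and 0-9."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def dedup_key(name, street, zip_code):
    """Build the 32-character MD5 hex string from the concatenated fields."""
    cleaned = normalize(name + street + zip_code)
    return hashlib.md5(cleaned.encode("ascii")).hexdigest()


class AddressTable:
    """Hypothetical stand-in for the table with an extra key on the hash."""

    def __init__(self):
        # hash key -> list of (name, street, zip) rows sharing that key
        self.by_key = defaultdict(list)

    def find_duplicate(self, name, street, zip_code):
        """Look up the key, then loop over matching rows comparing fields."""
        candidate = (normalize(name), normalize(street), normalize(zip_code))
        key = dedup_key(name, street, zip_code)
        for row in self.by_key.get(key, []):
            if tuple(normalize(field) for field in row) == candidate:
                return row
        return None

    def insert(self, name, street, zip_code):
        """Insert the row unless a duplicate exists; report what happened."""
        if self.find_duplicate(name, street, zip_code) is not None:
            return False
        self.by_key[dedup_key(name, street, zip_code)].append(
            (name, street, zip_code)
        )
        return True
```

The field-by-field loop is not wasted work. Because the fields are concatenated before hashing, ("ab", "c") and ("a", "bc") produce the same key although they are different records, and only the comparison tells them apart.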

No big deal, but it looks like it works for me.

If the hash string is guaranteed to be unique, then this looping makes no
sense. I have to learn more about hash procedures. Do you know of a web site
where I can find some information about hashing?
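
On the uniqueness question: no fixed-length hash can be guaranteed unique, because far more input strings exist than 128-bit outputs, so the field comparison remains the safety net. A rough birthday-bound estimate is sketched below in Python. It assumes MD5 outputs behave like uniformly random 128-bit values, which is an idealization, and `collision_probability` is a hypothetical helper name.

```python
def collision_probability(n_records, hash_bits):
    """Approximate chance that any two of n_records share a hash value.

    Uses the birthday-bound approximation: pairs / 2**bits, which is
    accurate when the result is much smaller than 1.
    """
    pairs = n_records * (n_records - 1) / 2
    return pairs / 2 ** hash_bits


# 3 million records, 128-bit MD5 output
p = collision_probability(3_000_000, 128)
print(f"{p:.2e}")  # on the order of 1e-26
```

Under that assumption an accidental collision among 3 million records is extremely unlikely, though not impossible. The estimate covers accidental collisions only, not inputs deliberately crafted to collide.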

Thanks for your time.

Zoran.

"aniesen" <axeln@interform.cc> wrote in message
news:3fb56841$1@newsgroups.borland.com...
> Zoran:
>
> I don't know anything about NexusDB but if it supports stored procedures you
> can speed up the verification process immensely. Just use a nested loop for
> comparison and leave the nested loop if there is no match returning FALSE.
> Return TRUE if it finishes the loop.
>
> --
> Axel Niesen axeln@interform.cc
> interFORM Consulting Corp. http://www.interform.cc
> (866) 503-6005
>
> Don't always say what you know,
> but always know what you say!
> Matthias Claudius
> 1740-1815


