Re: Loose comparison
- From: John Wester [Group W] <rot13.wjrfgre@xxxxxxxxx>
- Date: Wed, 27 Jul 2005 09:56:04 -0600
In article <42e7a7e2@xxxxxxxxxxxxxxxxxxxxxx>, Stevewarby@xxxxxxxxxxxxxx
says...
> We collect vehicle registration numbers eg V5LTG N806FCH etc from tickets
> and manually enter them into the database. We are now automating the system
> whereby clients enter the details online or are sent via pocket pcs.
>
> The problem is we are ending up with duplicate entries because some
> people interpret a 5 as an S, a 0 as an O, etc., e.g. W520HYT becomes WS20HYT,
> and we end up with two entries for the same vehicle.
>
> The data actually arrives twice. The top hard copy is sent by the client; the
> carbon copy is sent by our staff. Therefore we need to check whether the
> registration is already on the database.
>
> Is there a fuzzy compare routine / component that could bring back results
> from the existing database, so that the client could then make an informed
> decision about the data?
>
See this entry at code central:
http://cc.borland.com/Item.aspx?id=20126
"Strcomp2k is an enhanced string comparison function that provides
better results than Levenshtein or other methods. It takes into account
common typing and phonetic errors. It is ideally suited for record
matching in data scrubbing and other lookup applications where the
quality of the input data can not be guaranteed. This is a port of the
algorithm described by Bill Winkler of the US Bureau of the Census."
I wrote it and use it for name and address scrubbing...
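
For readers who want to see the general idea, here is a minimal sketch in Python. It is not Strcomp2k and not the Code Central source; it is a plain Jaro-Winkler similarity combined with a look-alike character fold (5/S, 0/O, as in the question). The CONFUSABLE table, the fold() and candidates() helpers, and the 0.9 threshold are all assumptions made up for illustration and would need tuning against real data.

```python
# Hypothetical look-alike table: each digit is mapped to the letter it is
# most often misread as. Only 5/S and 0/O come from the original question;
# the other pairs are illustrative assumptions.
CONFUSABLE = str.maketrans({"0": "O", "5": "S", "1": "I", "8": "B", "2": "Z"})


def fold(reg: str) -> str:
    """Uppercase, drop spaces, and map look-alike characters to one form."""
    return reg.upper().replace(" ", "").translate(CONFUSABLE)


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    # Characters match only if they are within this distance of each other.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    used = [False] * len(s2)
    m1 = []  # matched characters, in s1 order
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not used[j] and s2[j] == ch:
                used[j] = True
                m1.append(ch)
                break
    if not m1:
        return 0.0
    m2 = [s2[j] for j in range(len(s2)) if used[j]]  # matched, in s2 order
    transpositions = sum(a != b for a, b in zip(m1, m2)) // 2
    m = len(m1)
    return (m / len(s1) + m / len(s2) + (m - transpositions) / m) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def candidates(new_reg, existing, threshold=0.9):
    """Return (score, registration) pairs at or above threshold, best first."""
    key = fold(new_reg)
    scored = [(jaro_winkler(key, fold(r)), r) for r in existing]
    return sorted([(s, r) for s, r in scored if s >= threshold], reverse=True)


if __name__ == "__main__":
    on_file = ["W520HYT", "N806FCH", "V5LTG"]
    for score, reg in candidates("WS20HYT", on_file):
        print(f"{reg}  {score:.3f}")
```

Because the fold deliberately treats 5 and S (and so on) as the same character, two genuinely different registrations can collide, so the result is best shown to the person entering the data as a list of possible matches rather than used to reject the entry automatically.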
--
John
Life is complex. It has real and imaginary parts.