algorithm for generating top fuzzy variations ...



Hello all,

I am interested in obtaining the top N fuzzy variations of a
string (a person or company name) using the same concept
as the Levenshtein distance. Usually Levenshtein is used to
compute the distance between two given strings, but I would
like instead to have an algorithm that generates the top N highest
scoring fuzzy variations of any given term, e.g.:

Giovanni - 100%
Giovann - 98%
iovanni - 98%
Govanni - 98%
....
anni - 55%

This way I can precompute the variations in advance instead of
computing distances during online matching.

Can anyone recommend an existing implementation, e.g. in Java?
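
To make the request concrete, here is a rough sketch of the kind of
generator I mean. It is only an illustration under simplifying
assumptions: the names FuzzyVariations and topVariations are made up,
only deletions are generated (substitutions and insertions would also
need an alphabet to draw from), and the score 1 - distance/length is an
assumed normalisation that does not reproduce the percentages listed
above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: enumerate deletion-only variants of a term,
// nearest first, and score them by normalised edit distance.
class FuzzyVariations {

    // Returns up to n variants (including the term itself) in order of
    // increasing distance, each mapped to an assumed score in [0, 1].
    static Map<String, Double> topVariations(String term, int n) {
        Map<String, Double> result = new LinkedHashMap<>();
        if (term.isEmpty()) {
            return result;
        }
        Set<String> seen = new HashSet<>();
        List<String> level = new ArrayList<>();
        level.add(term);
        seen.add(term);
        int distance = 0;
        // Breadth-first: level k holds every string reachable by k deletions.
        // For deletions alone, k is also the exact Levenshtein distance.
        while (!level.isEmpty() && result.size() < n) {
            double score = 1.0 - (double) distance / term.length();
            List<String> next = new ArrayList<>();
            for (String s : level) {
                if (result.size() < n) {
                    result.put(s, score);
                }
                for (int i = 0; i < s.length(); i++) {
                    String shorter = s.substring(0, i) + s.substring(i + 1);
                    // Skip the empty string and variants already generated.
                    if (!shorter.isEmpty() && seen.add(shorter)) {
                        next.add(shorter);
                    }
                }
            }
            level = next;
            distance++;
        }
        return result;
    }

    public static void main(String[] args) {
        topVariations("Giovanni", 10)
            .forEach((v, s) -> System.out.printf("%s - %.1f%%%n", v, s * 100));
    }
}
```

The resulting map could be built once per name and stored in a lookup
table, so that online matching becomes a plain exact-match lookup.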

Many thanks in advance,
Best Regards,
Giovanni

