# Algorithm which returns rank of how similar 2 strings are

*From*: un1.3.CalRobert@xxxxxxxxxxxxxxx (Robert Maas, http://tinyurl.com/uh3t)

*Date*: Sun, 18 May 2008 00:15:13 -0700

From: "Stephen Howe" <sjhoweATdialDOTpipexDOTcom>

I believe there are algorithms which return a rank for how similar 2 strings are.

It all depends on what *definition* you use for a measure of similarity.

One possible measure is the cosine of the angle between two vectors, where each string maps to such a vector.

Here's a simple math exercise for you: If you have two unit vectors, and you know the length d of the difference vector between them, what is the cosine of the angle C between them, expressed as a function of d?

Hint: Use either similar triangles or trigonometry to work out the answer, but the answer comes out a really simple quadratic.
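For anyone who wants to check their result, here is one way the exercise can be worked, using the law of cosines on the triangle formed by the two unit vectors. The labels u and v for the two unit vectors are just names chosen here for illustration.

$$
d^2 = \lVert u - v \rVert^2 = \lVert u \rVert^2 + \lVert v \rVert^2 - 2\,u \cdot v = 2 - 2\cos C
\quad\Longrightarrow\quad
\cos C = 1 - \frac{d^2}{2}
$$

As a sanity check, d = 0 (identical directions) gives cos C = 1, d = √2 (perpendicular) gives cos C = 0, and d = 2 (opposite directions) gives cos C = -1.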

So my proposed answer is first to clearly define a vector space model for your space of all possible strings, whereby you have a metric that tells how *different* two vectors are, which is computed simply as the Cartesian length of the difference vector.

Then to compute similarity, first divide each vector by its length to get a pair of unit vectors, compute the length d of the difference between those unit vectors, then plug d into that formula to compute the cosine of C.
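The recipe above could be sketched in Python roughly as follows. The post leaves the string-to-vector mapping open, so the character-bigram counts used here are purely an assumption for illustration, and the function names `bigram_vector`, `unit`, and `cosine_similarity` are hypothetical. The last step uses the relation cos C = 1 - d²/2, which holds for any two unit vectors.

```python
import math
from collections import Counter


def bigram_vector(s):
    """Map a string to a sparse vector of character-bigram counts.

    This particular mapping is an assumption; any vector space model would do.
    """
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def unit(v):
    """Divide a sparse vector by its Euclidean length to get a unit vector."""
    length = math.sqrt(sum(x * x for x in v.values()))
    if length == 0:
        # Strings shorter than two characters map to the zero vector here,
        # which has no direction and so cannot be normalized.
        raise ValueError("zero vector cannot be normalized")
    return {k: x / length for k, x in v.items()}


def cosine_similarity(a, b):
    """Cosine of the angle between the vectors for strings a and b."""
    u, w = unit(bigram_vector(a)), unit(bigram_vector(b))
    keys = set(u) | set(w)
    # Squared length of the difference between the two unit vectors.
    d_squared = sum((u.get(k, 0.0) - w.get(k, 0.0)) ** 2 for k in keys)
    return 1.0 - d_squared / 2.0
```

With this bigram model, identical strings score 1 and strings sharing no bigrams score 0. Going through d is the route described above; computing the dot product of the two unit vectors directly would give the same number.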
