Re: IsAnagram



CBFalconer <cbfalconer@xxxxxxxxx> wrote:

Martin wrote:

I am looking for the best algorithm to check if two words are
anagrams. Here the signature of the method :

- Sort both words and compare both arrays. O(n^2) considering that I
use BubbleSort and words are both short (less than 30 characters).
- Find a way to create a signature for each word (e.g. A = 1, B = 2,
etc.) However, I can have a problem here... The sum for both words
can be the same with different letters.

Sort both sounds like the most efficient. You should probably use
a simple sort such as shell sort, which is much faster than bubble,
and is better than O(N*N).

For a max of 30 letters? Insertion or selection. At that size it doesn't
pay to be complicated.

Alternatively, do create a signature, with unsigned long longs, using
not addition but multiplication. If the signatures don't match, they're
not anagrams. Since this is the most usual case, you win a little. If
they _do_ match, fall back to the sorting solution. Yes, the signature
may wrap around for large words, but that's not going to matter because
you need the final check anyway, and all large anagrams are going to
wrap around to the same signature. You do need _unsigned_ integers for
this, mind you. And do check whether it's actually faster or not.

And yet another: if you're going to do great numbers of these checks,
consider using a database. For each word you're being asked to check,
check to see if it's already in the DB. If not, sort it _once_, then add
it to the DB - next time you don't need to sort, you can just fetch the
sorted version from the DB. If you hit two words you've already seen
before, all you need is two DB checks and a single compare. Drawback:
for this to save you anything, you really do need _lots_ of checks, and
a good database.

Richard
.



Relevant Pages

  • Re: IsAnagram
    ... use BubbleSort and words are both short. ... Alternatively, do create a signature, with unsigned long longs, using ... not anagrams. ... There are other similar methods, for instance adding each letter's set bit to the test integer, and allow an extra bit space to expand into for the more common letters.) ...
    (comp.lang.c)
  • Re: Day one, good and bad.
    ... all anagrams of a given character string have the ... same signature, where the signature is generated by sorting the ... characters in that string such that the characters appear in some ...
    (alt.sysadmin.recovery)
  • finding anagram of words
    ... Find anagrams of words e.g. rinse, reins, siren, resin are all anagrams ... letters, sorting that array, and joining the letters back into a string. ... So the rinse examples all have signature 'einrs' ...
    (comp.lang.ruby)
  • Re: What Do You See ?
    ... You can have OE sort on anything you want and you don't have to go to any ... If I click on the "Insert Signature" icon on the toolbar it will insert this ... If I'm in the email portion of OE and click on the "Insert Signature" icon on ...
    (rec.sport.billiard)
  • Re: Reasoning behind the EventHandler(object, EventArgs) signature
    ... The signature reassures the developer that a certain behaviour is expected and that the sender of the event and some kind of event argument, either raw or derived, will be provided. ... Creating your own delegates is perfectly acceptable but, if you were writing any sort of commercial software, the choice would be nonsense because it wouldn't fit with the plan. ... Find great Windows Forms articles in Windows Forms Tips and Tricks ... Obviously I can create a custom delegate ...
    (microsoft.public.dotnet.languages.csharp)