Re: need an algo for string sorting

From: Paul E. Black (p.black_at_acm.org)
Date: 08/19/04


Date: Thu, 19 Aug 2004 12:08:15 -0400
To: paul.black@nist.gov

On Thu, 19 Aug 2004 10:59:32 +0530, Raghavendra Mahuli wrote:

> I have many strings. I have to sort them. ...
>
> But the complex part is "equivalent-values". it can be defined that "ai" and
> "ae" are equivalent-values. Then, while sorting two strings, i have to
> consider that "ai" and "ae" are equal......only if they come at same
> position in both the strings....
> str1 - dfahaesdhfkj
> str2 - dfahaisds
> ... ae [and ai] appears at position 4 and 5
>
> str3 - haehasdg
> str4 - hhsdaiskdn
> ... [ae and ai appear at different positions] ...

You might be able to replace all occurrences with equivalence class
markers, for instance:
         dfahQ7sdhfkj
         dfahQ7sds
         hQ7hasdg
         hhsdQ7skdn
where Q is some character that never appears. It is a prefix or
escape character.

If it is more complex than this, there are a number of algorithms for
string matching with errors
    http://www.nist.gov/dads/HTML/stringMatchwError.html

If the equivalents are always character-for-character, it is the
slightly simpler string matching with mismatches.

-paul-

-- 
Paul E. Black (p.black@acm.org)


Relevant Pages

  • Re: Ada bench : count words
    ... I think you should do parsing using FSM. ... It is either two character map ... Note EOL is not a character, but a string, because in some environments ... I have not tried with string matching (if that's what you mean with ...
    (comp.lang.ada)
  • Re: Weighty Second Acts
    ... Frequently they are markers for ... the progress of character. ... The fact that the reprise comes back in a ... how the character has evolved. ...
    (rec.arts.theatre.musicals)
  • What are "special characters"? (was: Newbie question about regular expression)
    ... How can I match special characters using regular expression? ... involve string matching, ... I tell whether a string contains any special character?" ...
    (comp.lang.tcl)
  • Re: Ada bench : count words
    ... What about Get (Item: out Character)? ... I think you should do parsing using FSM. ... > I have not tried with string matching (if that's what you mean with ... case Char is ...
    (comp.lang.ada)
  • Re: an unwanted page break
    ... There must be some non-table character after the table ends on page 2. ... Ctrl-End to go there, then select that line, whether or not it has paragraph ... Change the font size to 1 and the page should disappear. ... markers to delete. ...
    (microsoft.public.word.tables)