Re: KMP Algorithm supporting * notation
- From: Jym <Jean-Yves.Moyen+news@xxxxxxxxxxxx>
- Date: Sun, 07 Oct 2007 03:15:11 +0200
On Sun, 07 Oct 2007 02:52:09 +0200, Altu <altu59@xxxxxxxxx> wrote:
Hi,
Is it possible to extend the KMP algorithm to support * in patterns?
Like
ab*cd
and * can be any character but cannot be nothing.
I would not use '*' for "one caracter, any one".
Because '*' in regexp usually means "the previous caracter can be here 0, 1 or several times" so using it for another meaning will probably confuse a lot of people...
I would advice to use instead either '?' (as in shell wildcards) or '.' (sed regexp).
The problem is that when I am looking at the character with index i in
the text, if it is not a match, I don't want to look at the previous
characters of the text (just like KMP).
Is this possible?
Without thinking a lot about it, I would say "yes" because KMP basically build the automaton corresponding to the pattern and pass the string through it. And the automaton for this pattern is very easy to build.
I may be wrong, however, due to some issues that might occur while determinising the automaton or similar bad stuff.
--
Hypocoristiquement,
Jym.
.
- References:
- KMP Algorithm supporting * notation
- From: Altu
- KMP Algorithm supporting * notation
- Prev by Date: KMP Algorithm supporting * notation
- Next by Date: A question about regexes
- Previous by thread: KMP Algorithm supporting * notation
- Next by thread: Re: KMP Algorithm supporting * notation
- Index(es):
Relevant Pages
|