Re: Create DFA from regular expression?



"Matt Timmermans" <mt0000@xxxxxxxxxxxxxxxxxxxxxxxxxx> writes:

> "Paminu" <jadajada@xxxxxxx> wrote in message
> news:dli8r2$osu$1@xxxxxxxxxxxxxxxxxxxx
> > Is it possible to create a DFA directly from a regular expression or is it
> > necessary first to create the NFA?
>
> You can create it directly from the regular expression using "derivatives",
> where the derivative of a regular expression R w.r.t. some symbol 'a',
> written D_aR, is the expression that matches all strings x such that R
> matches ax. A DFA that accepts R will have a transition on 'a' from the
> start state to a state that accepts D_aR.
>
> Let \ be the expression matching nothing, and let E be the expression
> matching only the empty string.
>
> For all symbols a and b, and expressions R and S:
>
> D_a\ = \
> D_aE = \
> D_aa = E
> D_ab = \
> D_a(R|S) = D_aR | D_aS
> D_aR* = (D_aR)(R*)
> D_a(RS) = (D_aR)S | (D_aS) if R matches the empty string, and just (D_aR)S
> otherwise.

You neglected to mention two important parts of this construction:

1. Each regular expression is a state in the DFA, with theoriginal
DFA being the start state.

2. A state is accepting if the regular expression representing the
state accepts the empty string.

3. State D has transition to state D_a on the symbol a.

> Using these with some simplification rules is often the easiest way to find
> DFAs by hand.

The simplification rules are important. If you don't simplify, the
regular expressions you get will be larger and larger, and you will
never have backwards edges in the DFA. If your reduction rules are
sufficiently good to reduce equivalent regular expressions to the same
regular expression, the DFA will be minimal, but that is not trivial.
You can do with relatively simple simplification rules and
equivalences and with a reasonable number of reduction rules, the DFAs
tend to be smaller than those you get from the standard approach, but
you won't get minimality.

> If you're programming it, though, all that symbolic stuff is
> a pain, and it's much easier just to make the NFA first and convert it.

If you use a functional (or logical language) with pattern matching,
the "symbolic stuff" is fairly easy. I have used the above method in
a lexer generator I wrote. It has some advantages over the
traditional regexp -> NFA -> DFA approach, for example it can easily
be extended to handle regular expressions with intersection:

D_a(R&S) = (D_aR) & (D_aS)

Torben
.



Relevant Pages

  • Re: Henry Spencers "Tcl" Regex Library
    ... finite-state automata", because their machine is mostly like a DFA, ... but with a few NFA portions (and I mean true NFA not nonregular/regexp ... regular expression but doesn't describe a regular language. ... They probably use the hybrid term because the machine executes a DFA ...
    (comp.compilers)
  • Re: Create DFA from regular expression?
    ... > necessary first to create the NFA? ... You can create it directly from the regular expression using "derivatives", ... A DFA that accepts R will have a transition on 'a' from the ... and it's much easier just to make the NFA first and convert it. ...
    (comp.theory)
  • Re: Create DFA from regular expression?
    ... >> Is it possible to create a DFA directly from a regular expression or is ... >> it necessary first to create the NFA? ... A DFA that accepts R will have a transition on 'a' from the ... and it's much easier just to make the NFA first and convert it. ...
    (comp.theory)
  • Re: regular expressions and DFAs
    ... The regular expression: ... Every NFA has at least one DFA that accepts the same language. ... Look up "Subset Construction" for a proof of this, and also an algorithm to convert between NFAs and DFAs. ...
    (comp.theory)
  • Re: a problem on building a regular expression
    ... Indeed I do not know a *practical* reference, so I have written my own ... The first one transforms quite obviously to a regular expression, ... For the DFA in the second sketch, ...
    (sci.math)