Re: Named regexp variables, an extension proposal.



John Machin wrote:
On 13/05/2006 7:39 PM, Paddy wrote:
[snip]
Extension; named RE variables, with arguments
===================================
In this, all group definitions in the body of the variable definition
reference the literal contents of groups appearing after the variable
name, (but within the variable reference), when the variable is
referenced

So an RE variable definition like:
defs = r'(?smx) (?P/GO/ go \s for \s \1 )'

Used like:
rgexp = defs + r"""
(?P=GO (it) )
\s+
(?P=\GO (broke) )
"""
Would match the string:
"go for it go for broke"

As would:
defs2 = r'(?smx) (?P/GO/ go \s for \s (?P=subject) )'
rgexp = defs2 + r"""
(?P=GO (?P<subject> it) )
\s+
(?P=\GO (?P<subject> broke) )
"""

The above would allow me to factor out sections of REs and define
named, reusable RE snippets.


Please comment :-)


1. Regex syntax is over-rich already.

First, thanks for the reply John.

Yep, regex syntax is rich, but one of the reasons I went ahead with my
post was that it might add a new way to organize regexps into more
manageable chunks, rather like functions do.
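For comparison, the "chunks, like functions" idea can be approximated in
today's Python by building the pattern from ordinary functions that return
RE fragments. This is only a sketch under assumptions: the helper name
`go_for` and the group names `first` and `second` are made up for
illustration (Python's `re` does not allow the same group name twice in one
pattern, so the two subjects get different names).

```python
import re

def go_for(subject):
    # Hypothetical helper: returns the verbose-mode RE fragment
    # "go for <subject>", where subject is itself an RE fragment.
    return r" go \s for \s " + subject

# (?smx) must come first; x (verbose) makes unescaped whitespace insignificant.
rgexp = re.compile(
    r"(?smx)"
    + go_for(r"(?P<first> it)")
    + r" \s+ "
    + go_for(r"(?P<second> broke)")
)

match = rgexp.match("go for it go for broke")
print(match.group("first"), match.group("second"))
```

The factoring here lives in Python rather than in the RE itself, which is
exactly the difference between this workaround and the proposed extension.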

2. You may be better off with a parser for this application instead of
using regexes.
Unfortunately my experience counts against me going for parser
solutions rather than regexps. Although, being a Python user, I always
think again before using a regexp and remember to consider whether there
might be a clearer string method solution to the task, I am not
comfortable with parsers/parser generators.

The reason I dismissed parsers this time is that I have only
ever seen parsers for complete languages. I don't want to write a
complete parser for Verilog; I want to take an easier 'good enough'
route that I have used with success since my AWK days. (Don't laugh, my
exposure to AWK after years of C was just as profound as the feelings
of release from Java that more recent speakers have blogged about after
exposure to new dynamic languages - all hail AWK, not completely put
out to stud :-)
I intend to write a large regexp that picks out the things that I want
from a Verilog file, skipping the bits I am uninterested in. With a
regular expression, if I don't write something to match, say, always
blocks, then they are simply skipped. If someone wrote signal
definitions (which I am interested in) inside a task, then I would pick
those up as well as module-level signal definitions, but that would be
'good enough' for my app.
All the parser examples I see don't 'skip things'.

- Hell, despite writing my own interpreted, recursive descent language
many (many...) years ago in C, too much early lex & yacc'ing about left
me with a grudge!
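The 'good enough' scanning approach described above might look something
like the sketch below. It is not the regexp from my actual app: the pattern,
the function name `extract_signals` and the sample module are all
assumptions for illustration, and the pattern only covers simple
`wire`/`reg`/`input`/`output`/`inout` declarations that end in a semicolon.

```python
import re

# Verbose-mode pattern for simple signal declarations; anything that does
# not match (always blocks, assignments, module headers) is just skipped.
SIGNAL_DEF = re.compile(r"""
    ^ \s*
    (?P<kind> wire | reg | input | output | inout ) \b
    \s*
    (?P<range> \[ [^\]]* \] )?    # optional bus range
    \s*
    (?P<names> [^;]+? )           # comma-separated signal names
    \s* ;
""", re.MULTILINE | re.VERBOSE)

def extract_signals(source):
    """Return (kind, range, name) tuples for every declaration found."""
    found = []
    for m in SIGNAL_DEF.finditer(source):
        for name in m.group("names").split(","):
            found.append((m.group("kind"), m.group("range"), name.strip()))
    return found

SAMPLE = """\
module counter (clk, rst, count);
  input clk, rst;
  output [3:0] count;
  reg [3:0] count;
  wire tick;

  always @(posedge clk) begin
    if (rst) count <= 0;
    else count <= count + 1;
  end
endmodule
"""

for signal in extract_signals(SAMPLE):
    print(signal)
```

Because nothing in the pattern knows about scope, a declaration inside a
task would be reported just like a module-level one, which is the
limitation accepted above.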

3. "\\" is overloaded to the point of collapse already. Using it as an
argument marker could make the universe implode.

Did I truly write '=\GO' ? Twice!
Sorry, the example should have used '=GO' to refer to RE variables. I
made, then copied the error.
Note: I also tried to cut down on extra syntax by re-using the syntax
for referring to named groups (or I would have, if my proofreading were
better).

4. You could always use Python to roll your own macro expansion gadget,
like this:

Thanks for going to the trouble of writing the expander. I too had
thought of that, but that would lead to 'my little RE syntax', which
would be harder to maintain, and others might reinvent the solution but
with their own mini macro syntax.
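For readers without the snipped code, here is a minimal sketch of what a
macro expansion gadget of this kind could look like. It is not John's
expander: the `<<NAME arg>>` call marker, the `@1` placeholder and the
`expand` function are all invented for this illustration, which also shows
the 'my little RE syntax' concern in miniature.

```python
import re

# Hypothetical call syntax: <<NAME argument>> on a single line.
MACRO_CALL = re.compile(r"<<(\w+)\s+(.*?)>>")

def expand(template, snippets):
    """Replace each <<NAME arg>> with snippets[NAME], putting arg at @1."""
    def substitute(m):
        name, arg = m.group(1), m.group(2).strip()
        return snippets[name].replace("@1", arg)
    # A function replacement avoids backslash processing in the result.
    return MACRO_CALL.sub(substitute, template)

# Named RE snippet with one argument slot (made-up @1 placeholder).
snippets = {"GO": r"go \s for \s @1"}

template = r"""(?smx)
<<GO (it)>>
\s+
<<GO (broke)>>
"""

pattern = re.compile(expand(template, snippets))
m = pattern.match("go for it go for broke")
print(m.groups())
```

Anyone else writing such a gadget would be free to pick different markers
and placeholders, so two projects would rarely be able to share snippets.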


Cheers,
John

- Paddy.
