Re: "string match" and "glob" pattern rules



On Tue, 27 Feb 2007 16:48:50 +0100,
Erik Leunissen <look@xxxxxxxxxxxxxxxxxx> wrote:

MartinLemburg@UGS wrote:
yes - you are right - this would probably be a code-breaking change.
So having one glob-style pattern matching engine would only be
something for Tcl 9.
I believe that separating out a common pattern matching process
(engine or whatever name you give to it) is exactly what would
solve the observed inconsistencies. (I've stumbled across these
pattern syntax differences often enough in the past to feel your need.)

I've often thought that a centralised string matching engine would make
a huge amount of difference, especially if it could be picked up easily
by any commands that perform string matching. I'll write down what I've
been thinking, though I know not many people like my ideas because I
tend to go a little overboard. Still, maybe it'll provoke some
interesting chatter... :)


Add a new option -match=method to just about everything, with the
old-style -glob, -regexp and so forth becoming shorthand for each
command's favourite match style. [glob], for example, would use
"glob:unix", which supports the {,} notation, while plain "glob:string"
(as used by [lsearch -glob] and [switch -glob]) might not. That default
could be overridden by specifying -match=glob:XXX explicitly.
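
As a rough illustration of the option scheme just described, here is a
minimal Tcl sketch. Every name in it (parseMatchOption,
normaliseMatchOption, and the -match option itself) is made up for
illustration, and the mapping of the old switches onto default methods is
an assumption, not something any Tcl command implements.

```tcl
# Hypothetical: split "-match=glob:unix" into engine "glob" and
# variant "unix". The variant is empty when none is given.
proc parseMatchOption {opt} {
    if {![regexp -- {^-match=([^:]+)(?::(.*))?$} $opt -> engine variant]} {
        error "not a -match option: $opt"
    }
    return [list $engine $variant]
}

# Hypothetical: treat the old-style switches as shorthand for a
# default method (assumed mapping), and parse everything else.
proc normaliseMatchOption {opt} {
    switch -exact -- $opt {
        -glob   { return [list glob string] }
        -regexp { return [list regexp {}] }
        -exact  { return [list exact {}] }
        default { return [parseMatchOption $opt] }
    }
}

puts [normaliseMatchOption -glob]             ;# glob string
puts [normaliseMatchOption -match=glob:unix]  ;# glob unix
```

A command taking such an option would then hand the engine and variant
to whatever dispatch mechanism the match engines end up living behind.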

"regexp" and "exact" would also be included as supported match-types,
with the matching mechanism allowing for some parameter passing:
-nocase, -all, -indices and friends would get passed through to the
match engine. The pattern would be passed in a form that allows for
caching (preferably within the Tcl_Obj itself, so the compiled form
lasts as long as the pattern). Also supported would be a means for the
match function to return a list of ranges, a la [regexp -indices].
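
To make the "list of ranges" idea concrete, here is a small sketch,
assuming a hypothetical helper called matchRanges (not part of Tcl).
Only the regexp branch uses ranges that Tcl really reports, via
[regexp -all -indices -inline]. Treating a glob or exact match as a
single range covering the whole string is an assumption of this sketch.

```tcl
# Hypothetical helper: run one match "engine" and return a list of
# {first last} index pairs. An empty list means "no match".
proc matchRanges {engine pattern string} {
    switch -exact -- $engine {
        regexp {
            # The one engine that already reports ranges.
            return [regexp -all -indices -inline -- $pattern $string]
        }
        glob {
            set matched [string match $pattern $string]
        }
        exact {
            set matched [string equal $pattern $string]
        }
        default {
            error "unknown match engine: $engine"
        }
    }
    # glob and exact match the whole string or nothing at all.
    if {$matched} {
        return [list [list 0 [expr {[string length $string] - 1}]]]
    }
    return {}
}

puts [matchRanges regexp {[0-9]+} foo123bar45]  ;# {3 5} {9 10}
puts [matchRanges glob   foo*     foo123bar45]  ;# {0 10}
```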

This, of course, leads on to another of those internal namespaces (like
the "expression" namespace) wherein all these match engines reside.
Something like [match::regexp] would get you a function equivalent to
[regexp -all -indices -inline]. The C function would be aimed at
efficient invocation from the bytecode, allowing those original -glob
options to bypass the Tcl front-end and go straight to the internal
representation, regardless of what you've done to the names within the
match namespace. The Tcl-level command would take, for example, an
options dict as a standard argument, and a variable name into which to
drop the list of returned values. [regexp] would re-process the returned
list, extracting string segments and distributing them among the passed
variables as needed. [regsub] could, presumably, do something similar.
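
A script-level sketch of that split might look like the following. The
match namespace, the argument order (options dict, pattern, string,
variable name) and the regexpFrontEnd wrapper are all assumptions made
for illustration. The real thing would presumably be a C function
reachable from the bytecode rather than a proc.

```tcl
namespace eval match {}

# Hypothetical engine: takes an options dict and the name of a variable
# that receives the list of {first last} ranges. Returns 1 on a match.
proc match::regexp {opts pattern string rangesVar} {
    upvar 1 $rangesVar ranges
    # Build a call to the real, global regexp command.
    set cmd [list ::regexp -indices -inline]
    foreach flag {-nocase -all} {
        if {[dict exists $opts $flag] && [dict get $opts $flag]} {
            lappend cmd $flag
        }
    }
    lappend cmd -- $pattern $string
    set ranges [{*}$cmd]
    return [expr {[llength $ranges] > 0}]
}

# Hypothetical front end in the style of [regexp]: turn the ranges back
# into substrings and distribute them among the caller's variables.
proc regexpFrontEnd {pattern string args} {
    if {![match::regexp {} $pattern $string ranges]} {
        return 0
    }
    set wanted [lrange $ranges 0 [expr {[llength $args] - 1}]]
    foreach varName $args range $wanted {
        if {$range eq ""} {
            set value ""   ;# more variables than ranges
        } else {
            lassign $range first last
            set value [string range $string $first $last]
        }
        uplevel 1 [list set $varName $value]
    }
    return 1
}

regexpFrontEnd {(\d+)-(\d+)} "id 12-345" whole a b
puts "$whole / $a / $b"   ;# 12-345 / 12 / 345
```

This version needs Tcl 8.5 or later for dict, lassign and {*}.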

A means to pass extra options through the -match argument would also
relieve every command of the burden of supporting every new match option
that someone dreams up for their custom match type. The new system
could be considered "in flux" for a few releases, so that things can be
moved around without worrying too much, at first, about who you upset.


Fredderic