Re: Question about Sun JAVAC

From: Edward G. Nilges (spinoza1111_at_yahoo.com)
Date: 09/09/04


Date: 8 Sep 2004 22:14:30 -0700

mwojcik@newsguy.com (Michael Wojcik) wrote in message news:<chn74g01qh9@news2.newsguy.com>...

Thanks for an interesting post. However, regular expressions remain a
notation that was adopted eagerly around 1970 because it was expressive
relative to the hardware of the time, and today they gnomically perform
functions that could be performed by more expressive notations, notably BNF.

> In article <f5dda427.0409061533.46d0765@posting.google.com>, spinoza1111@yahoo.com (Edward G. Nilges) writes:
> > mwojcik@newsguy.com (Michael Wojcik) wrote in message news:<chi6i1027as@news1.newsguy.com>...
> > > In article <f5dda427.0409021555.748255dd@posting.google.com>, spinoza1111@yahoo.com (Edward G. Nilges) writes:
> > > >
> > > > This is because regular expressions are a form of machine language
> > > > that was designed around the limitations of input and output devices
> > > > circa 1970.
> > >
> > > Regular expressions are a mathematical construct that dates back at
> > > least to the work of Turing and Kleene. That was significantly prior
> > > to 1970.
> >
> > Yes, but the current encoding is based on the limitations of input and
> > output devices circa 1970.
>
> The RE and ERE syntax used by various Unix utilities clearly was
> developed using the I/O devices common to Unix systems of that era,
> and operating within their limitations was certainly a requirement;
> but it employs a syntax not significantly different from the
> preexisting mathematical one (or ones, since there was some
> variation, for example in using "+" or "|" for alternation).
>
> Of course the symbols available for ASCII I/O devices (that is, ASCII
> itself, sometimes limited by unfortunately under-featured terminals)
> represented the ANSI and ISO committees' best efforts to arrive at
> the most useful compromise among the sometimes-conflicting needs of
> various potential users. Thus ISO initially made the tilde and hash
> signs alternates for the same ASCII code point - a situation which
> would have changed the syntax of C, for example, had it remained.
>
> In other cases, one symbol served multiple masters. Jukka Korpela
> notes that Bob Bemer told him that the asterisk was initially
> introduced into many pre-standard computer character sets for
> commercial purposes, such as "check protecting" (filling in
> whitespace around numbers on checks to guard against alteration). Only
> later was it adopted by programming languages for multiplication and
> other mathematical functions.
>
> And some symbols were added for one (sometimes relatively obscure)
> purpose and later adapted to others. The classic ASCII example is
> the backslash, which Bemer initially advocated for use in concert with
> the slash to represent Boolean conjunction and disjunction (as "\/"
> and "/\") - there's a well-pedigreed ASCII syntax for your symbolic
> logic. It was quickly seized on to represent "reverse division" as
> well, though there apparently wasn't a consensus on just what
> "reverse division" might entail. (Apparently some languages used
> it to represent continued fractions: a\b\c...) Then, of course,
> Microsoft (in)famously used it as a path separator in MS-DOS 2.0,
> because the slash was already taken as option separator.
>
> But traditional mathematical syntax for regular expressions happened
> to use characters that were already in ASCII - not entirely a
> coincidence, of course, since the people who developed RE syntax
> elected to use already-established symbols rather than introducing
> novel ones, and ASCII incorporated symbols that were for the most
> part already widely used. The most notable feature of RE syntax, the
> Kleene closure operator, has always (AFAICT) been written with an
> asterisk (which is why it's often called the "Kleene star", or by
> wags the "Kleene X", though that pun depends on mispronouncing
> Kleene's name). Thus we cannot in fairness claim that ASCII I/O
> devices constrained Unix RE syntax; rather, they facilitated adopting
> the syntax already in use.
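
As a small illustration of the quoted point that the Unix syntax differs
little from the older mathematical one, here is a rough Python sketch. It
assumes the textbook convention in which "+" means alternation, and it
uses Python's re module as a stand-in for ERE (the two agree on grouping,
"|" and "*", though they are not identical in general). The helper name
textbook_to_ere and the sample pattern are made up for this example.

```python
import re


def textbook_to_ere(pattern: str) -> str:
    """Rewrite a textbook-style regular expression for an ERE-style engine.

    Assumption: in the input, '+' always means alternation (never
    'one or more'), so the only change needed is '+' -> '|'.
    Grouping with parentheses and the Kleene star carry over unchanged.
    """
    return pattern.replace("+", "|")


# Textbook form: any string of a's and b's that ends in "abb".
textbook = "(a+b)*abb"
ere = textbook_to_ere(textbook)


def matches(s: str) -> bool:
    """Return True if the whole string s is in the language of `ere`."""
    return re.fullmatch(ere, s) is not None


if __name__ == "__main__":
    print(ere)
    for candidate in ["abb", "babb", "aababb", "ab", "abba"]:
        print(candidate, matches(candidate))
```

The whole translation is a single character substitution, and the star
and the parentheses need no change at all.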


