Re: for a laught (???)

"Rick Smith" <ricksmith@xxxxxxx> wrote in message
news:137h6ukjstdb84d@xxxxxxxxxxxxxxxxxxxxx

"Pete Dashwood" <dashwood@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:5dr8c2F367vg3U1@xxxxxxxxxxxxxxxxxxxxx

"Rick Smith" <ricksmith@xxxxxxx> wrote in message
news:137g387h00alt56@xxxxxxxxxxxxxxxxxxxxx

"Pete Dashwood" <dashwood@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:5dpuatF34gd3gU1@xxxxxxxxxxxxxxxxxxxxx

"Roger While" <simrw@xxxxxxxxxxxx> wrote in message
news:f5802v$8co$00$1@xxxxxxxxxxxxxxxxxxxx
Be aware of the limitations of regex.

I believe I am so aware :-)

I'll let you into a little secret.
We used to use regex in OC for the runtime component
of UNSTRING.
UNTIL we came across -
UNSTRING ... DELIMITED BY LOW-VALUE ...
:-)

Regex doesn't work too well with a null byte delimiter :-)
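
One plausible reading of the LOW-VALUE problem, assuming the runtime used a C
regex API that takes NUL-terminated strings (as POSIX regcomp and regexec do):
the NUL delimiter ends the string before the matcher ever sees it. The thread
does not say which API OC used, so this is an illustration and not a
diagnosis. A minimal Python sketch using ctypes to show what C string handling
sees; the record contents are made up.

```python
import ctypes

# Hypothetical record with a NUL (LOW-VALUE) delimiter between two fields.
record = b"AB\x00CD"

# A C buffer holding every byte, plus the terminating NUL that ctypes appends.
buf = ctypes.create_string_buffer(record)

# .raw is the whole buffer; .value is the C-string view of it:
# everything up to, but not including, the first NUL.
full_bytes = buf.raw
as_c_string = buf.value

# The delimiter never reaches code that only sees the C-string view.
delimiter_visible = b"\x00" in as_c_string
```

Here as_c_string holds only the first field, so a pattern looking for the NUL
delimiter has nothing to match against.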

I think one of us is missing something here.

Yes, Mr Dashwood, you are! <g>

"regex" is the name of the header file used to include
the API (?) for a particular form of regular expressions
into C programs.

"RegEx" is, apparently, the function (or method) name
for processing a particular form of regular expressions
in C#.

After further investigation, I found that "RegEx" (spelled
Regex in a C# function a couple of days ago) is a class, or
whatever it is called in C#, and not a function name as I
stated above.

It is indeed a Class (one of several concerned with Regular Expressions) in
the System.Text.RegularExpressions namespace of the DotNET Framework Class
Library (FCL) but, not being of a pedantic nature, and realising what you
meant, I allowed your loose use of "function" as being near enough, didn't
correct you on it, and even used it myself for the sake of our
conversation....(I sometimes wish that people here would cut me the same
slack I cut them :-)). It doesn't matter, when what we were actually
discussing is the use of Regular Expressions with null terminated strings.

"regex" is not the same as "RegEx".

This assertion remains unchanged, however.

OK, point taken. But I was pretty clear about the fact that I was talking
about the Microsoft implementation.

You may be able to keep this clear in your mind if you
remember to prefix your knowledge with MVO
(Microsoft's Version Of); thus, rephrasing your following
statement, "[(MVO) regular expressions] works fine with
null delimited, ..."

I qualified that below the statement, rather than at the start, with the
phrase: "...However, my experience is with the MS RegEx engine (which some
consider to be a perversion :-)) and the engine you were using may have
been more limited...:-)"

Does any part of that NOT limit my discussion to MS?

Regardless of your experience, to follow Mr While's
"Regex doesn't work ..." with your "RegEx works fine ..."
is a non sequitur that arises from "one of us is missing
something here." That is what I attempted to address.

No, I disagree. It is incorrect to state, without any qualification, that
Regex doesn't work with null terminated strings. I refuted it with a
specific example which DOES work with null-terminated strings. I then gave
examples of how you could get it to work with nulls in general. There is no
non-sequitur in that.

I think you are letting your anti-Microsoft bias cloud your judgement. You
certainly must have read the sentence, yet chose to ignore it and post a
diatribe about me confusing a function with a generic term. That is a very
fine point at best, given the context: it doesn't matter whether we are
talking functions or generically, as I had qualified my discussion to the
Microsoft implementation and acknowledged that there are many other
implementations of Regular Expressions. Your statement, '"RegEx" is,
apparently, the function (or method) name for processing a particular form
of regular expressions in C#', is only partially correct anyway. (It is also
the function name used in VB.)


RegEx works fine with null delimited, or even null embedded strings,
PROVIDED you cater for nulls in the RegEx expression. (You may need to
include an escape...\0x00 if embedded, or $ if terminated by null. $
actually represents the "null string at the end of the string"; if you want
just the end of the string itself, specify \z). Some implementations of
RegEx allow you to specify control parameters to the RegEx engine, and one
of these parameters is whether or not strings for matching are to be null
terminated.
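
The point about catering for nulls in the expression can be sketched in any
engine that takes length-counted strings. The example below uses Python's re
module, not the MS engine under discussion, so the spellings differ: Python
writes the NUL escape as \x00 and the absolute end-of-string anchor as \Z.
The sample strings are made up for illustration.

```python
import re

# Hypothetical record: three fields separated by a single NUL byte.
record = "ALPHA\x00BETA\x00GAMMA"

# Embedded NULs: name the NUL explicitly in the pattern with an escape.
fields = re.split(r"\x00", record)

# NUL-terminated field: take everything before the first NUL.
terminated = re.match(r"[^\x00]*", "NAME\x00padding").group(0)

# End-of-string anchors differ: "$" also matches just before a trailing
# newline, while \Z matches only at the very end of the string.
dollar_hit = re.search(r"end$", "the end\n") is not None
absolute_hit = re.search(r"end\Z", "the end\n") is not None
```

In this sketch fields holds the three names, terminated is "NAME", and only
the "$" search succeeds on the string that ends with a newline.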


All usage of regex in the OC runtime has been removed :-)

That's a pity. It is very powerful.

Well, I guess you folks have your reasons.

Personally, I think that reflects more of a lack of understanding of RegEx
(or maybe using a limited engine) than it does on Regular Expressions in
general.

With all due respect Mr Dashwood, here you compare
a function name with a concept and do so poorly.

Rather than "function name", I should have used
"implementation".

Well, don't beat yourself up. I understood what you meant :-).

I don't think it matters. If I wanted to discuss the concept, I would have
done so. A quick search around the Web reveals a good number of
implementations of the concept. (Nearly every university has their own one,
and numerous software packages implement their own interpretation also. The
similarities far outweigh the differences.) That isn't the issue, and I
made it clear in the post you are referring to, and even in other posts
before that, that my experience of this concept is limited entirely to the
MS implementation.

In "Regular Expressions in general", I took "in general"
to be somewhat broader than, say, the absence of any
qualifier or the qualifier "commonly" might have suggested.

If I recall correctly, and I could be mistaken, you claimed
to have done some work with compilers and my references
suggest that regular expressions have been used, regularly
and for some time, in compiler theory and development.

Yes, I did work on a specific compiler (for COBOL) in the 1960s. (My work
consisted of disassembling it and patching it to do various things it wasn't
written to do. As it was in 17 passes, this was a fairly non-trivial task
and I wouldn't have even attempted to do it if my Boss hadn't inspired me by
writing an Operating System, then encouraging me to have a go at the
compiler...:-))

(I have been fortunate to have worked with some very clever people during my
reckless career...) Anyway, at that time I was not aware of compiler theory
or Regular Expressions (and I doubt if anyone else was either :-))... there
was no such thing as Computer Science (at least, not in NZ) and "best
practices" were still being established. You couldn't just walk to a
bookshelf and take down a tome on compiler writing. I had never encountered
state changes and transition diagramming until I did this task, and was
amazed at the elegance of the syntax scan in pass 1 which used these
techniques.


If that was the case, then you might have been exposed to
regular expressions long ago. Thus I had some reason to
believe that your experience with the concept "in general"
was not as limited as you state, above.

I first encountered Regular Expressions two years ago, when modifying the
Rational Toolset provided by IBM, in VB. The work I am currently doing in C#
for my AVS Web Site needed to do some complex searching and matching (the
validation of email addresses is just one requirement I have) so I read up
on them using an O'Reilly reference on C# ("C# in a Nutshell"), overviewed
some Web Pages (mostly from Academia), and browsed some samples from a
Microsoft forum. That's why I was careful to state that my understanding
applies to the MS interpretation. That is the total extent of my experience
with "Regular Expressions" in general, or specifically. I am satisfied that
they are a very powerful device for matching and processing strings. I am
also satisfied that the C# implementation (which is really DotNET FCL) is a
very comprehensive one and if it extends the standard, I really don't care
about that. It gives me a very powerful tool which I have already used to
good effect, and will certainly use in the future.

I really see no point in arguing this further, so I won't :-)

I appreciate you backing your statements with COBOL code, and I think that
code is worthy of exploration, and will be of use to many people here,
irrespective of our argument about Regular Expressions.

However, I am in no way persuaded that there is equivalent power in 190 lines
of COBOL, to that contained in one single expression, using a Regular
Expression, whether you write it in C#, Java, C++, PHP, Perl, or anything
else that supports Regular Expressions.

Pete.