Re: A note on computing thugs and coding bums
- From: spinoza1111 <spinoza1111@xxxxxxxxx>
- Date: Fri, 11 Jan 2008 05:05:47 -0800 (PST)
On Jan 11, 1:08 pm, Ben Bacarisse <ben.use...@xxxxxxxxx> wrote:
spinoza1111 <spinoza1...@xxxxxxxxx> writes:
<snip>
This was a technical post describing my concern, with references, to
code presented as Beautiful which seemed to me Ugly because it didn't
even work for modern strings [...]
Fairer to say that it is "old" code. Your C# "modern string" version
would not have been possible at the time. Technology marches on and
we stand, to some extent, on the shoulders of giants. Incidentally,
(talking of giants) the UTF-8 encoding of Unicode that has so helped
to make "modern strings" possible was designed and first implemented
by Ken Thompson -- prompted by Rob Pike.
and had numerous other technical flaws.
Allegedly. In another post I tried to cut through the verbiage to find
them and I think they are minor.
I then bench-marked it in a C++ [...]
I don't think it was C++. I don't know the details but it looked like
something else -- I suggested C++/CLI elsethread -- but you probably
know and could tell us.
wrapper against a C sharp version
which fixed the flaws to discover that the C sharp version is about
3 to 5 times slower...
You'd be better off porting the exact original to C# as well (alleged
bug and all). You could then compare algorithms. I suspect the
slowdown you see is simply C# doing its stuff. You would then have
comparable numbers to see if your "fixes" have any real cost.
[See below for why "fixes" is in "quotes".]
but actually works, which the Kernighan code
doesn't owing to Kernighan's and Pike's unfortunate, and C-psychology,
mindset.
No, your re-hash of the original has broken it. You have so messed up
the neat C code it is hard to find where your bugs are but they are in
there.
I'll give you two examples. Try to match "a*ab" against "aab". You
should get a match (the Pike code does) but both the C++ and the C#
versions you posted disagree. The other is more dramatic: on my C# vm
matching a pattern like "a*" gives me a helpful crash and an array
bounds violation so the error was quite easy to find. For the fix for
both, refer to Rob Pike's "ugly" code.
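For anyone who wants to try the two test cases above, here is a minimal sketch of a matcher in the style of the C code under discussion. It is written for illustration and is not claimed to be a verbatim copy of the published code; the names match, matchhere and matchstar are assumptions (only matchStar is named in this thread).

```c
/* Sketch of a small backtracking matcher supporting  c  .  ^  $  *
 * Function names are assumptions made for this illustration. */

static int matchhere(const char *regexp, const char *text);
static int matchstar(int c, const char *regexp, const char *text);

/* match: search for regexp anywhere in text */
int match(const char *regexp, const char *text)
{
    if (regexp[0] == '^')
        return matchhere(regexp + 1, text);
    do {    /* must try even if text is empty */
        if (matchhere(regexp, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}

/* matchhere: search for regexp at the beginning of text */
static int matchhere(const char *regexp, const char *text)
{
    if (regexp[0] == '\0')
        return 1;
    /* regexp[0] is not NUL here, so regexp[1] is at worst the terminator */
    if (regexp[1] == '*')
        return matchstar(regexp[0], regexp + 2, text);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text != '\0' && (regexp[0] == '.' || regexp[0] == *text))
        return matchhere(regexp + 1, text + 1);
    return 0;
}

/* matchstar: search for c*regexp at the beginning of text */
static int matchstar(int c, const char *regexp, const char *text)
{
    do {    /* a * matches zero or more instances of c */
        if (matchhere(regexp, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}
```

With this sketch, match("a*ab", "aab") and match("a*", "aab") both return 1.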
The bug originated in Rob Pike's ugly code.
Rob Pike's code makes a final call at the point of failure to
matchStar with the pointer to regex pointing past the end of the
string. However, in C, this is "usually" a Nul character, and
matchStar executes with a null string.
However, C makes no provision for preventing a pathological address of
a "string" or memory block with no terminating Nul prior to a memory
fault, and this in fact a major reason why C should not be used for
modern development. You've demonstrated a flaw in C.
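To make the point about the terminating Nul concrete, here is a hedged sketch of the same kind of matcher written with explicit lengths, so that no read depends on a terminator being present. This is roughly the kind of check a transliteration into a bounds-checked language would need. The names struct span, match_n, here and star are hypothetical and do not come from the original code.

```c
#include <stddef.h>

/* Hypothetical length-carrying view of a string; no NUL is assumed. */
struct span {
    const char *p;
    size_t len;
};

static int here(struct span re, struct span tx);

/* star: match c* followed by the rest of the pattern at the start of tx */
static int star(char c, struct span re, struct span tx)
{
    for (;;) {
        if (here(re, tx))
            return 1;
        /* stop when the text is exhausted or the next char is not c */
        if (tx.len == 0 || (c != '.' && tx.p[0] != c))
            return 0;
        tx.p++;
        tx.len--;
    }
}

/* here: match the pattern at the start of tx */
static int here(struct span re, struct span tx)
{
    if (re.len == 0)
        return 1;
    /* the length check replaces reliance on a terminator at re.p[1] */
    if (re.len >= 2 && re.p[1] == '*') {
        struct span rest = { re.p + 2, re.len - 2 };
        return star(re.p[0], rest, tx);
    }
    if (re.p[0] == '$' && re.len == 1)
        return tx.len == 0;
    if (tx.len > 0 && (re.p[0] == '.' || re.p[0] == tx.p[0])) {
        struct span r = { re.p + 1, re.len - 1 };
        struct span t = { tx.p + 1, tx.len - 1 };
        return here(r, t);
    }
    return 0;
}

/* match_n: search for the pattern anywhere in the text */
int match_n(const char *re, size_t relen, const char *tx, size_t txlen)
{
    struct span r = { re, relen };
    struct span t = { tx, txlen };

    if (r.len > 0 && r.p[0] == '^') {
        r.p++;
        r.len--;
        return here(r, t);
    }
    for (;;) {
        if (here(r, t))
            return 1;
        if (t.len == 0)
            return 0;
        t.p++;
        t.len--;
    }
}
```

Every index is guarded by a length test, so a pattern ending in a star, such as "a*", hands star an empty remainder rather than reading one element beyond the pattern.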
It was my error not to spot that, for a wide class of regular
expressions, this error would occur in the C sharp transliteration of
the C code. That transliteration was made as close as possible to the
original simply to get a comparative performance number, not to be
"perfect". Its purpose was to determine whether the supposed advantages
of a globally buggy claim that the Pike code "does regular expressions",
handles "strings", and is at all Beautiful are worth poor
practice, including relying on addresses of strings being passed, and
on all strings being representable in 8 bit bytes.
In fact, my error demonstrates why I don't believe in code snippets,
and why the 26000 lines of code I shipped for my book "Build Your
Own .Net Language and Compiler" included complete self- and stress-
testing facilities that were praised by a reviewer. I don't expect to
be always able to post bug free code when merely making a point, that
being here that the "beauty" of Pike's code is an
illusion...especially when being deliberately and with malign and
libelous intent harassed by jerks.
I will of course fix the problem you have pointed out, and post the
next version, with an additional feature to get and use the handle as
I have described.
Thanks for pointing this out, and also for a reasonable amount of
civility and self-restraint, as well as hard work, that isn't being
shown by Heathfield or Howard.