Re: A note on computing thugs and coding bums



spinoza1111 <spinoza1111@xxxxxxxxx> writes:
<snip>
> This was a technical post describing my concern, with references, to
> code presented as Beautiful which seemed to me Ugly because it didn't
> even work for modern strings [...]

Fairer to say that it is "old" code. Your C# "modern string" version
would not have been possible at the time. Technology marches on and
we stand, to some extent, on the shoulders of giants. Incidentally,
(talking of giants) the UTF-8 encoding of Unicode that has so helped
to make "modern strings" possible was designed and first implemented
by Ken Thompson -- prompted by Rob Pike.

> and had numerous other technical flaws.

Allegedly. In another post I tried to cut through the verbiage to find
them, and I think they are minor.

> I then bench-marked it in a C++ [...]

I don't think it was C++. I don't know the details, but it looked like
something else -- I suggested C++/CLI elsethread -- but you probably
know and could tell us.

> wrapper against a C sharp version
> which fixed the flaws to discover that the C sharp version is about
> 3..5 times as slower...

You'd be better off porting the exact original to C# as well (alleged
bug and all). You could then compare algorithms. I suspect the
slowdown you see is simply C# doing its stuff. You would then have
comparable numbers to see if your "fixes" have any real cost.
[See below for why "fixes" is in "quotes".]

> but actually works, which the Kernighan code
> doesn't owing to Kernighan's and Pike's unfortunate, and C-psychology,
> mindset.

No, your re-hash of the original has broken it. You have so messed up
the neat C code that it is hard to find where your bugs are, but they
are in there.

I'll give you two examples. Try to match "a*ab" against "aab". You
should get a match (the Pike code does) but both the C++ and the C#
versions you posted disagree. The other is more dramatic: on my C# VM
matching a pattern like "a*" gives me a helpful crash with an array
bounds violation, so the error was quite easy to find. For the fix for
both, refer to Rob Pike's "ugly" code.
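
For anyone who wants to check the two cases, here is a sketch of the
matcher from memory, along the lines of the published version. Treat it
as my reconstruction rather than a verbatim copy: the const qualifiers
and the forward declarations are my additions. Note that a pattern
ending in '*' is safe here only because the C string's terminating
'\0' can always be read; a port to a bounds-checked string type has to
test the length before looking at the next pattern character.

```c
/* Sketch of a small recursive matcher supporting: c . ^ $ *
   Reconstructed from memory; const qualifiers are an addition. */

static int matchhere(const char *regexp, const char *text);
static int matchstar(int c, const char *regexp, const char *text);

/* match: search for regexp anywhere in text */
int match(const char *regexp, const char *text)
{
    if (regexp[0] == '^')
        return matchhere(regexp + 1, text);
    do {    /* must look even if the text is empty */
        if (matchhere(regexp, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}

/* matchhere: search for regexp at the beginning of text */
static int matchhere(const char *regexp, const char *text)
{
    if (regexp[0] == '\0')
        return 1;
    /* regexp[1] is always readable: at worst it is the '\0' */
    if (regexp[1] == '*')
        return matchstar(regexp[0], regexp + 2, text);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text != '\0' && (regexp[0] == '.' || regexp[0] == *text))
        return matchhere(regexp + 1, text + 1);
    return 0;
}

/* matchstar: search for c*regexp at the beginning of text */
static int matchstar(int c, const char *regexp, const char *text)
{
    do {    /* a '*' matches zero or more instances of c */
        if (matchhere(regexp, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}
```

With this, match("a*ab", "aab") returns 1: the star first tries zero
'a's, fails, then gives one 'a' to the star and matches "ab" against
what is left. match("a*", "") also returns 1 without reading past the
end of either string.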

--
Ben.