Re: Brian Kernighan, maybe I'm not worthy, maybe I'm scum
- From: spinoza1111 <spinoza1111@xxxxxxxxx>
- Date: Fri, 28 Dec 2007 04:05:12 -0800 (PST)
On Dec 28, 12:55 am, spinoza1111 <spinoza1...@xxxxxxxxx> wrote:
Here is a summary of what I think are the flaws in the code Rob Pike
wrote for Brian Kernighan, and which Brian presents as Beautful Code
in a book of that name, published this year by O'Reilly, on page 3 (an
unindented copy of Rob's code as keyed by me, and wrapped in a C++
class, is at my developerDotStar blog). I read C but have long since
abandoned it because I don't think Beautiful Code can be written in C:
the Pike code, I think, is an example of code that only seems
Beautiful.
(1) Pike's code doesn't implement a regular expression interpreter
insofar as "regular expressions" have a formal, mathematical meaning.
For example, it makes no provision for a character which must occur at
least once.
(2) It is idiomatic C, which uses a parameter passed by value as the
changeable index into the string it points-at and it expects the user
to grok the fact that she has a useful value in that value parameter
(the start of the text conforming to the regular expression),
something unexpected outside of C. If the user passes the text as a
string literal, the useful point at which the regular expression
occurs is lost.
(3) While advertised as a string processor, it does not handle modern
Unicode or double-byte strings containing international input.
(4) Its comments are unilluminating especially as regards need-to-know
the fact that its value parameter contains a useful result (the start
position of the regular expression), and Brian's discussion in
Beautiful Code is unhelpful. Of course, as noted above, this address
is lost when a string literal is passed.
(5) The length of the substring of characters that satisfies the
regular expression is lost.
(6) Kernighan does express, in the text of the essay, the concern that
the code, which does heavy recursion proportional to the string
length, might run slowly or overrun stack limits.
But no consideration exists in the code that a semi-beautiful way
exists to avoid most recursion. If the regular expression does not
start with a string start carat or string end dollar sign, its first
character, if not followed by an asterisk, is a "handle" which which
can be scanned-for in a nonrecursive loop or by a strcspn function.
Even if the first character (that's not dollar or carat) is followed
by an asterisk, if the asterisked character occurs frequently, the
code can search for the first character, and return a match
immediately upon finding it, or, entering the recursive code upon
failure.
What would be Beautiful about this? The fact that it's not the sort of
thing an optimizing compiler would "think" of, whereas the apparent
efficiency of Pike's code is often discovered by an optimizing
compiler (although an optimizing compiler would not "see" the utility
of recursion in an iterative solution, to be sure).
It would, I think, amortize the cost of using a platform, whether Java
or C Sharp, that handles real strings through an abstraction.
Please feel free to point out where I've gone off the rails, since I
have such respect for Kernighan that I'm astonished that he feels that
this snippet is Beautiful Code.
If I imply anywhere that "you get the starting address of the
satisfier string unless you pass the test string in a named variable"
I was incorrect. The test variable in Pike's code is modified to point
to the satisfier string or the end of the string marker, but only on
the runtime stack, to which it is copied as part of the subroutine
call. My bad, but it means that the original objection was valid: Rob
Pike's "beautiful" code doesn't even deign to return the starting
address of the satisfier string.
It troubles me that I come here to make sure my objections to Pike's
code are on target, given that I have refused to use C for 20 years
owing to its limits, only to be involved in History Channel type
issues as to who invented OO, to be stalked by a known cyber-bully,
and to find my own errors after all. It troubles me that I also went
to the site http://programming.reddit.com/info/63vc1/comments/ only to
get amateur literary criticism from nonspecialists for the most part,
who think that "pseudo-intellectual" is a bon mot, but "anarcho-
capitalism", like not, for no discernable reason.
The clarification as to Stroustrup's role in OO was a useful and
friendly discussion. Things were civil until Randy Howard walked in:
things are now going to get rowdy since I'm sick of cyber-bullying,
whether of myself or Kathy Sierra (http://en.wikipedia.org/wiki/
Kathy_Sierra).
But I've been debugging my statements about Pike's code online, and
there seems to be disinterest in this code amongst programmers. This,
and Kernighan's apparent fall-off from the stance he took in the
1970s, in The Elements of Programming Style, troubles me,
It troubles me that such deficient code is considered by such a
distinguished man to be Beautiful Code.
.
- Follow-Ups:
- Re: Brian Kernighan, maybe I'm not worthy, maybe I'm scum
- From: spinoza1111
- Re: Brian Kernighan, maybe I'm not worthy, maybe I'm scum
- References:
- Brian Kernighan, maybe I'm not worthy, maybe I'm scum
- From: spinoza1111
- Re: Brian Kernighan, maybe I'm not worthy, maybe I'm scum
- From: spinoza1111
- Brian Kernighan, maybe I'm not worthy, maybe I'm scum
- Prev by Date: Re: Brian Kernighan, maybe I'm not worthy, maybe I'm scum
- Next by Date: Re: Brian Kernighan, maybe I'm not worthy, maybe I'm scum
- Previous by thread: Re: Brian Kernighan, maybe I'm not worthy, maybe I'm scum
- Next by thread: Re: Brian Kernighan, maybe I'm not worthy, maybe I'm scum
- Index(es):
Relevant Pages
|