Re: Brian Kernighan, maybe I'm not worthy, maybe I'm scum



On Dec 28, 12:55 am, spinoza1111 <spinoza1...@xxxxxxxxx> wrote:
Here is a summary of what I think are the flaws in the code Rob Pike
wrote for Brian Kernighan, and which Brian presents as Beautiful Code
in a book of that name, published this year by O'Reilly, on page 3 (an
unindented copy of Rob's code as keyed by me, and wrapped in a C++
class, is at my developerDotStar blog). I read C but have long since
abandoned it because I don't think Beautiful Code can be written in C:
the Pike code, I think, is an example of code that only seems
Beautiful.

(1) Pike's code doesn't implement a regular expression interpreter
insofar as "regular expressions" have a formal, mathematical meaning.
For example, it makes no provision for a character which must occur at
least once.

(2) It is idiomatic C, which uses a parameter passed by value as the
changeable index into the string it points at, and it expects the user
to grok the fact that she has a useful value in that value parameter
(the start of the text conforming to the regular expression),
something unexpected outside of C. If the user passes the text as a
string literal, the useful point at which the regular expression
occurs is lost.

(3) While advertised as a string processor, it does not handle modern
Unicode or double-byte strings containing international input.

(4) Its comments are unilluminating, especially as regards the
need-to-know fact that its value parameter contains a useful result
(the start position of the regular expression), and Brian's discussion
in Beautiful Code is unhelpful. Of course, as noted above, this address
is lost when a string literal is passed.

(5) The length of the substring of characters that satisfies the
regular expression is lost.

(6) Kernighan does express, in the text of the essay, the concern that
the code, which does heavy recursion proportional to the string
length, might run slowly or overrun stack limits.

But the code gives no consideration to a semi-beautiful way to avoid
most recursion. If the regular expression does not start with a
string-start caret or string-end dollar sign, its first character, if
not followed by an asterisk, is a "handle" which can be scanned for in
a nonrecursive loop or by a function such as strcspn.

Even if the first character (that's not dollar or caret) is followed
by an asterisk, if the asterisked character occurs frequently, the
code can search for the first character and return a match
immediately upon finding it, or enter the recursive code upon
failure.
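
For illustration only, here is a sketch in C of the "handle" idea for
the simpler case, where the first character is an ordinary literal not
followed by an asterisk. The three matcher functions are reconstructed
from the published matcher (with const added) and may differ from the
book in detail. match_handle is a hypothetical name, not part of the
original code, and treating a leading '.' as having no handle is an
assumption the post above does not discuss.

```c
#include <string.h>

static int matchhere(const char *regexp, const char *text);

/* matchstar: search for c*regexp at beginning of text */
static int matchstar(int c, const char *regexp, const char *text)
{
    do {    /* a * matches zero or more instances */
        if (matchhere(regexp, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}

/* matchhere: search for regexp at beginning of text */
static int matchhere(const char *regexp, const char *text)
{
    if (regexp[0] == '\0')
        return 1;
    if (regexp[1] == '*')
        return matchstar(regexp[0], regexp + 2, text);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text != '\0' && (regexp[0] == '.' || regexp[0] == *text))
        return matchhere(regexp + 1, text + 1);
    return 0;
}

/* match: search for regexp anywhere in text */
int match(const char *regexp, const char *text)
{
    if (regexp[0] == '^')
        return matchhere(regexp + 1, text);
    do {    /* must look even if string is empty */
        if (matchhere(regexp, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}

/* match_handle (hypothetical): same result as match, but when the
 * pattern starts with a plain literal it jumps between occurrences of
 * that literal with strchr instead of trying every position. */
int match_handle(const char *regexp, const char *text)
{
    char c = regexp[0];

    /* No usable handle: empty pattern, anchor, wildcard, or c*.
     * Fall back to the ordinary search. */
    if (c == '\0' || c == '^' || c == '$' || c == '.' || regexp[1] == '*')
        return match(regexp, text);

    /* c is a literal that must appear in the text; visit only the
     * positions where it does, and match the rest of the pattern. */
    for (const char *p = strchr(text, c); p != NULL; p = strchr(p + 1, c))
        if (matchhere(regexp + 1, p + 1))
            return 1;
    return 0;
}
```

With this sketch, match_handle("abc", "xxabcxx") enters matchhere only
at positions where an 'a' was found, rather than at every position of
the text. Whether that is a measurable gain depends on the input and on
how the C library implements strchr.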

What would be Beautiful about this? The fact that it's not the sort of
thing an optimizing compiler would "think" of, whereas the apparent
efficiency of Pike's code is often discovered by an optimizing
compiler (although an optimizing compiler would not "see" the utility
of recursion in an iterative solution, to be sure).

It would, I think, amortize the cost of using a platform, whether Java
or C Sharp, that handles real strings through an abstraction.

Please feel free to point out where I've gone off the rails, since I
have such respect for Kernighan that I'm astonished that he feels that
this snippet is Beautiful Code.

If I implied anywhere that "you get the starting address of the
satisfier string provided you pass the test string in a named
variable", I was incorrect. The test variable in Pike's code is modified
to point to the satisfier string or the end-of-string marker, but only on
the runtime stack, to which it is copied as part of the subroutine
call. My bad, but it means that the original objection was valid: Rob
Pike's "beautiful" code doesn't even deign to return the starting
address of the satisfier string.
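
As a sketch of what returning that information could look like, and
not a claim about how Pike or Kernighan would do it, the variant below
hands back the start of the match and its length. It follows the same
structure as the published matcher as I understand it, but the helper
functions return the end of the match instead of 1, and NULL instead
of 0. The names here_end, star_end and match_span are hypothetical.
Because the star loop still tries the shortest repetition first, the
length reported is that of the first match found, not necessarily the
longest one.

```c
#include <stddef.h>

static const char *here_end(const char *regexp, const char *text);

/* star_end: like matchstar, but returns the end of the match or NULL */
static const char *star_end(int c, const char *regexp, const char *text)
{
    do {    /* try zero repetitions first, then one more each time */
        const char *end = here_end(regexp, text);
        if (end != NULL)
            return end;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return NULL;
}

/* here_end: like matchhere, but returns a pointer just past the
 * matched text, or NULL if regexp does not match at text */
static const char *here_end(const char *regexp, const char *text)
{
    if (regexp[0] == '\0')
        return text;
    if (regexp[1] == '*')
        return star_end(regexp[0], regexp + 2, text);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0' ? text : NULL;
    if (*text != '\0' && (regexp[0] == '.' || regexp[0] == *text))
        return here_end(regexp + 1, text + 1);
    return NULL;
}

/* match_span (hypothetical): returns the start of the first match in
 * text and stores its length in *len, or returns NULL if there is no
 * match (in which case *len is left untouched). */
const char *match_span(const char *regexp, const char *text, size_t *len)
{
    if (regexp[0] == '^') {
        const char *end = here_end(regexp + 1, text);
        if (end == NULL)
            return NULL;
        *len = (size_t)(end - text);
        return text;
    }
    do {    /* must look even if string is empty */
        const char *end = here_end(regexp, text);
        if (end != NULL) {
            *len = (size_t)(end - text);
            return text;
        }
    } while (*text++ != '\0');
    return NULL;
}
```

Since the result comes back as a return value rather than through the
by-value parameter, it makes no difference whether the caller passes a
string literal or a named variable.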

It troubles me that I come here to make sure my objections to Pike's
code are on target, given that I have refused to use C for 20 years
owing to its limits, only to be involved in History Channel type
issues as to who invented OO, to be stalked by a known cyber-bully,
and to find my own errors after all. It troubles me that I also went
to the site http://programming.reddit.com/info/63vc1/comments/ only to
get amateur literary criticism from nonspecialists for the most part,
who think that "pseudo-intellectual" is a bon mot but that
"anarcho-capitalism" is not, for no discernible reason.

The clarification as to Stroustrup's role in OO was a useful and
friendly discussion. Things were civil until Randy Howard walked in:
things are now going to get rowdy since I'm sick of cyber-bullying,
whether of myself or Kathy Sierra
(http://en.wikipedia.org/wiki/Kathy_Sierra).

But I've been debugging my statements about Pike's code online, and
there seems to be disinterest in this code amongst programmers. This,
and Kernighan's apparent fall-off from the stance he took in the
1970s, in The Elements of Programming Style, troubles me,

It troubles me that such deficient code is considered by such a
distinguished man to be Beautiful Code.