Re: How to name variables in a program?



Arthur J. O'Dwyer wrote:
> On Sun, 29 May 2005, Joe Butler wrote:
> > http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvs600/htm
> > l/hunganotat.asp
>
> > If you take the time to read the code line-by-line (as you would if you
> > were going to actually work on it), then the naming convention is the
> > least of your problems. It's interesting to note that there are at least
> > 2 cut/past[e] errors in it. This has gone unnoticed because the general
> > layout, regardless of nameing convention, is extremely bunched up and
> > ugly. Limited use of whitespace, lack of bracing for single-line loops,
> > etc.
>
> Yes. I counted three cut-and-paste errors, not counting several
> undefined identifiers that don't look like typos. Also note that line 5
> is missing, and <> is not an operator in C.

Indeed, this has got to be the best example of why you should stay away
from hungarian notation, and/or Microsoft source code in general. I
don't think I could even construct an example that was a greater
indictment of just how
horrible these ideas are without it seeming to be an artificial joke.
Personally I think this example, above all others should be held up as
the most
representative of exactly what hungarian notation, and MS code in
general is
all about.

> > I'd be interested in seeing how each poster here would have named and
> > laid out that code example. And then again, how they would have done
> > it in a production environment with the schedule pressure.
>
> Here's my attempt to prettify the code. I didn't fix the cut-and-paste
> errors, since even with the "hints" at the bottom of the page, I wasn't
> sure how some of them should be resolved. Line numbers preserved.

Indeed a great idea! Here is my complete recoding of that
embarassment:

/* A name like "sy" is extremely unfortunate. But I have no idea what
it
was supposed to be (and still don't) so I guess I will just leave
it. */

#include "sy.h"

/* The line descriptions indicates that the next two should be named as

follows: */

extern int * dict;
extern int syEndOfs;

/* This should be declared either with "extern" or "static" in front.
Again
I cannot determine that from the sparse context of the example, so
like
them I will make the mistake of leaving off whichever its supposed
to be.
Unless this is a static function, this should *REALLY* be declared
as
extern in some header somewhere. The Psy prefix means something;
with the
P just being redundant with the *. The original declaration of ()
for the
parameters should really be outlawed. */

struct SY * PsyCreate (int);

/* Random unnamed constants should not appear in your code. Define and
name
them. The only reason not to do this is if the constant necessarily
only
appears once, and is accompanied by a comprehensive inline comment.
*/

#define BOTTOM_15_BITS (077777)

/* I have no idea what module, this code is in. So I am just prefexing
it
with the name "module", under the assumption that the code is meant
to
be an external function and the name of this module is "module". If
it
is supposed to be static, then just drop the prefix. I chose the
name
Insert, since this is what it seems to be doing. Now, char sz[] is
a
*HORRIBLE* way of declaring an read-only '\0' terminated string.
Just
const char * will be sufficient for that, and the name "key" to help
us
understand that this is a hash table, or other ADT, entry key. */

struct SY * moduleInsert (const char * key) {
struct SY *currSy;
int keyLen;
int *currEntry;

{
char * s = key;
unsigned int hashValue = 0;

/* The original code was just syntactic garbage. But it seems
that
it was supposed to be some really terrible hash function. */

while (*s) hashValue = (hashValue << 11) + *s++;
keyLen = s - key; /* Determine length of string as a
side-effect */
currEntry = &Hash[(hashValue & BOTTOM_15_BITS) % hashSz];
}

/* Search for a match, or find the tail of the current chain */
for (;*currEntry;currEntry = &currSy->next) {
char *szSy = (currSy = (struct SY *) &dict[*currEntry])->entry;
if (0 == strcmp (key, szSy)) return currSy;
}

{
int len = (keyLen < 2) ? 0 : (keyLen - 2 / sizeof(int)) + 1;
*currEntry = ((int *) (currSy = PsyCreate (cwSY + len))) -
dict;
}
Zero ((int *) currSy, cwSY);
bltbyte (key, currSy->entry, keyLen + 1);
return currSy;
}

Ok, notice the definitions of "s" and "len". Scopes were opened up in
the
short code regions where those variables were used. The point is that
the
other main variables are not narrowly used, and help us understand the
substance of the inner loops even before we get started.

Also notice that I used the standard C library function "strcmp"
instead of
trying to inline the code. The major value add being the people know
instantly what is going on, upon reading it.

I still don't really know what "Zero()" or "PsyCreate()" *really* do,
so we
can't quite know what is going on. We have an idea that they are
initializers
and creator functions, but without knowing more about them -- I mean
casting
should be minimized, and there's one right there for the Zero()
function. The
type of the parameter for PsyCreate() ... I am *guessing* that its an
int or
similar type, but it can be any numeric type that is compatible with "+
int".

The casting back and forth between struct SY * and int * is indicative
that
there is something really wrong with the core design.

Some of the other names like or "cwSy" etc, I am just too lazy to
decode what
the hell they are trying to say with such names.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

.



Relevant Pages