Re: Histogram of character frequencies



Groovy hepcat rajash@xxxxxxxxxxxxxxxxxxxxxxxx was jivin' in comp.lang.c
on Mon, 3 Dec 2007 1:39 am. It's a cool scene! Dig it.

James Kuyper wrote:
rajash@xxxxxxxxxxxxxxxxxxxxxxxx wrote:
Johannes Bauer wrote:
rajash@xxxxxxxxxxxxxxxxxxxxxxxx schrieb:

int x[256]; // frequencies
Global.

It's completely acceptable to have variables defined at file scope
in C!

What's acceptable is not always a good idea. Global objects have many
disadvantages; they should be avoided except when necessary; they
aren't necessary in this case.

In this case they help simplify the code - the array gets initialized
to 0 at compile-time, instead of needing extra code for an
initialization loop, which is bad for efficiency!

Others have shown you how to initialise block scope arrays. The
generated object code may simply be a loop in which elements are
assigned a given value. In that case initialisation may be no more
efficient than a loop you write yourself. This is also true of the
implicit initialisation of a file scope array.

Code that makes extensive use of things like global variables is often
called "spaghetti code". Only meatball programmers write code like
that.

Why does everyone have this hangup about this? I took a class in C
a while back and my teacher always used void main() { ... }. I can
confirm that it works fine with both the Microsoft compiler and
Borland.

That doesn't make it legal. A conforming implementation of C is
allowed to reject a program which declares main() that way.

But no "conforming implementation" on Windows rejects it! I don't
believe any C compiler anywhere would reject it.

Let's test that assertion, shall we? I reboot to Windoze (because I'm
using Linux), open a console window and enter these lines:

-----------------------------------------------------------------------
copy con testing.c
#include <stdio.h>

void main(void)
{
    puts("Hello, World!");
}
^z
bcc32 -A -etesting.exe -w testing.c
-----------------------------------------------------------------------

The resulting output from Borland Builder is as follows:

-----------------------------------------------------------------------
Borland C++ 5.3 for Win32 Copyright (c) 1993, 1998 Borland International
testing.c:
Error testing.c 4: main must have a return type of int in function main
*** 1 errors in Compile ***
-----------------------------------------------------------------------

That's an error message (which halts compilation), not merely a warning
(which allows compilation to continue). When invoked with the -A
command line option (which forces it to be standard compliant) the
Borland compiler rejects void as a return type for main(). Not only has
"any C compiler anywhere" rejected it, but a "'conforming
implementation' on Windows" has rejected it.
Clearly, therefore, you are wrong.

I read the answers but mostly people only comment on trivial things
that aren't even errors! I'll be glad to have substantial comments
on my code.

Several of the "trivial" things people have commented on ARE errors,
and serious ones - you don't seem to understand how serious. Most
importantly, #inclusion of the appropriate standard headers is
absolutely essential for your code to even compile, at least under
most implementations of C. If what you've given us is the complete text
of your program, and if you are using a compiler which accepts your
code as written, junk it - it's teaching you some very bad habits.

OK you're right I should remember that. However I don't think it's the
end of the world - the standard library is always linked in so the
right functions will be found in the end by the linker.

Who says? The library may not be linked without the compiler magic
contained in the headers. Or it may be linked, but its functions may not
be called properly. The point is that failing to include the proper
headers is a very serious error, and you must understand this.

Also, you're using feof() incorrectly, and until you understand why
the way that you're using it is incorrect, I would not recommend
relying upon any of your programs to function properly.

I don't really understand the problem with feof - it just checks if
the EOF indicator is set in a given FILE * struct. Anyway I'll read
about it.

Many newbies think feof()'s purpose is to indicate when the end of a
file is reached by a read function. This is incorrect. Its purpose is
to indicate when a file stream's end of file indicator is set. This
only happens when you try to read from a stream that has *already*
reached the end. To explain this more clearly, consider the following
situation.

Suppose you have a stream containing three bytes, and you are reading
one byte at a time, using getchar(), in a loop, like so:

while(!feof(stdin))
{
    c = getchar();
    putchar(c);
}

On the first iteration of the loop you test the end of file indicator
for the input stream (stdin in this example), and it is not set, so you
then read the first byte and write this out. On the second iteration of
the loop you test the end of file indicator again, and it is not set,
so you then read the second byte and write this out. On the third
iteration of the loop you test the end of file indicator again, and it
is not set, so you then read the third byte and write this out. So far
so good. But the end of file indicator is still *not* set, and feof()
will return false. So you iterate a fourth time and try another read.
The read fails, of course, because there are no more bytes in the
stream. This is the perfect time to exit the loop; and since getchar()
returns EOF to indicate that it failed to read a byte, you have the
perfect way to detect this situation. However, you ignore this value
and simply continue processing the (now invalid) data you think you've
read from the file. You send EOF (converted to unsigned char, typically
a byte with all bits set) to stdout. The failed read has set the end of
file indicator, so *now* feof() returns true. But it's too late. You
don't test for this until the beginning of the fifth iteration, *after*
you've used the invalid data. You've read in three valid bytes and
written out four bytes, one of which is not valid.

What you should be doing is this:

int c; /* c must be an int so we can detect EOF. */

while(EOF != (c = getchar()))
{
    putchar(c);
}
if(!feof(stdin)) /* Or we could use if(ferror(stdin)). */
{
    /* File read error: handle it somehow. */
}

Here we attempt to read a byte with getchar(), and only enter the loop
if the return value does not indicate a failure to read a byte. After
the failure code (EOF) has been detected, the loop is exited, and we
then attempt to determine whether the failure occurred due to an error
or an end of file condition. Here's a breakdown of how it works (using
the same 3 byte example input as before).

On the first iteration we read the first byte and test whether the
read was successful. It was, so we output the byte. On the second
iteration we read the second byte and test whether the read was
successful. It was, so we output the byte. On the third iteration we
read the third byte and test whether the read was successful. It was,
so we output the byte. On the fourth iteration we read a byte, but the
stream is exhausted and the read fails; getchar() returns EOF. We
detect this and exit the loop. We've read in three valid bytes and
written out three bytes, all of which are valid. *Now* we call feof()
to test whether the failure was due to an end of file condition, and,
if so, skip the error handling code. However, if feof() returns false,
then the read failure must have been due to an error, in which case we
handle the error somehow (perhaps by emitting a diagnostic message and
quitting).

--
Dig the sig!

----------- Peter 'Shaggy' Haywood ------------
Ain't I'm a dawg!!