Re: Bug analysis



cri@xxxxxxxx (Richard Harter) writes:

On Mon, 12 Jan 2009 00:15:18 +0000, Richard Heathfield
<rjh@xxxxxxxxxxxxxxx> wrote:

jacob navia said:

C unleashed page 266, Listing 8.2
---------------------------------
char *ReadTextFile(FILE *fp,int *Error)
{
size_t size=0;
size_t len;
char *p = NULL;
char *q;
char buffer[120];

*Error = 0;
while (fgets(buffer, sizeof buffer, fp) {
len = strlen(buffer);
q = realloc(p,size+len);
if (q != NULL) {
p = q;
strcpy(p+size,buffer);
size += len;
}
else *Error = 1;
}
return p;
}
---------------------------------------
Let's follow this program.

The first time, the fgets function fills our buffer with a line or
119 characters and the terminating zero. In any case, we have a
well formed string in buffer.

The reallocation asks for "len" more characters, forgetting that
we will copy the terminating zero with strcpy into the string.

Thank you for bringing this code to my attention. In fact it has two
bugs, neither of which is to do with the realloc, which is actually
correct.

Firstly, size should be initialised to 1, not 0. This is a mistake
also made elsewhere in the book.

Secondly, the strcpy should deduct 1 from size+len when working out
where to write the new data.

Note that if the realloc fails, the program will go on trying to
reallocate the buffer, what probably will always fail. This is
not really a bug, but it would have been obviously better to add
a "break" statement to stop reading after we can't reallocate.

It's a fair point, and in my view it /is/ a bug.

I have modified the relevant errata page accordingly.

Well, no, it is better to change the realloc. The issue is, what
is "size". It should be one of two things, either the amount of
space currently allocated or the size of the text already read.
If size is the size of the text already read, then the realloc
argument should be size+len+1. If size is the amount allocated
then initializing size to 1 is a falsehood because no space has
yet been allocated.

What it comes down to is that setting size=1 is a hack. IMO it
is a serious error in coding for variables not to have well
defined meanings that are used consistently.

All of that said, in my view the code is intrinsically broken.
As I and others have noted it will lose text if there are any \0
characters in the file. This is the sort of thing that leads to
unpleasant surprises down the road. It's not the sort of thing
that one should do in tutorial code or in production code.

May I suggest that the right way to write that routine is to use
fgetc and do the error checks. Not only is it more rigorous, it
would illustrate the proper use of error checking.

Doesn't Chuck normally suggest some library or other that he wrote in
Pascal and converted to C at this stage? ggets or something?

--
I'm not a person who particularly had heros when growing up.
- What Dennis Ritchie could potentially say when asked about the hero worship coming from c.l.c.
.



Relevant Pages

  • Re: Bug analysis
    ... char *ReadTextFile ... while (fgets(buffer, sizeof buffer, fp) { ... and in my view it /is/ a bug. ... Well, no, it is better to change the realloc. ...
    (comp.lang.c)
  • Bug analysis
    ... char *ReadTextFile ... the fgets function fills our buffer with a line or 119 ... The reallocation asks for "len" more characters, ... this bug can very well go completely undetected in many occasions giving ...
    (comp.lang.c)
  • Re: Bug analysis
    ... char *ReadTextFile ... while (fgets(buffer, sizeof buffer, fp) { ... a "break" statement to stop reading after we can't reallocate. ... and in my view it /is/ a bug. ...
    (comp.lang.c)
  • Re: Bug analysis
    ... char *ReadTextFile ... while (fgets(buffer, sizeof buffer, fp) { ... "if realloc is NOT successful". ...
    (comp.lang.c)
  • Continuously concatenating binary data
    ... This data comes from the Windows Multimedia wave input ... I need to add 22050 bytes to an ever expanding buffer. ... Just make the buffer a void* (or char*), and realloc it every 2 seconds, ...
    (comp.lang.cpp)