Re: Is this string input function safe?



ais523 wrote:
I use this function that I wrote for inputting strings. It's meant to
return a pointer to mallocated memory holding one input string, or 0 on
error. (Personally, I prefer to use 0 to NULL when returning null
pointers.) It looks pretty watertight to me, but my version of lint
complains about use of deallocated pointers, etc. Is this code
completely safe on all input, or have I missed something?

You've forgotten about end-of-file (and I/O errors).

/* Header files included in the program containing malloc_getstr */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>

/* This function gets a string of any length from stdin,
mallocating an appropriate amount of memory. It returns
0 on string length overflow or out-of-memory. Note that
slightly more space is allocated than neccesary (from
1 to 10 characters). */
char* malloc_getstr(void)
{
size_t s=0;
char* c;
char* t;
c=malloc(s+10);
if(!c)
{ /* out of memory */
return 0;
}
for(;;)
{
fgets(c+s,10,stdin);
if(strchr(c+s,'\n')) break; /* \n lets us know if the input has
ended. Check only the most recent
input. */

... except that if fgets() detected end-of-file or error,
the contents of the buffer are indeterminate (for different
reasons in the two cases, but indeterminate anyhow). The
buffer might not contain a valid string, so the behavior of
strchr() is undefined.

if(s>SIZE_MAX-9)
{ /* new string length too long to pass as realloc's argument */

Looks like the test is wrong: the argument you're about
to pass to realloc will be s+19, not s+9. Another point of
view is to say that the test is right (approximately), but
in the wrong place: since you've already got s+10 characters
in the buffer, the test as it stands is too late to avoid
trouble. If you want to make the test, I'd suggest making it
right before allocating the memory.

free(c);
return 0; /* error return value */
}
s+=9;
t=realloc(c,s+10);
if(!t)
{ /* out of memory */
free(c);
return 0; /* error return value */
}
c=t;
}
*strchr(c+s,'\n')=0; /* replace the \n with a \0 */
return c;
}

General impression: Salvageable.

A few observations:

- Everybody writes a "better fgets()" eventually, and my
first used a strategy not unlike yours: call fgets(),
check for '\n', grow the buffer, lather, rinse, repeat.
However, I found that the function became much simpler
when I discarded fgets()/strchr() in favor of getc()
and testing each character as it arrived.

- The magic number 10 appears in five places in the code
(sometimes disguised as 9). I'm not as rabid as those
who insist on using #define for every number other than
0, 1, and perhaps 2, but when an essentially arbitrary
value shows up five times ...

- The test against SIZE_MAX shows praiseworthy diligence
(I failed to include a similar test in my own function,
so my hat's off to you.) The test will probably never
ever fire, but "better safe than sorry."

- Instead of limiting the function's usefulness by hard-
wiring stdin as the data source, why not accept a FILE*
function argument?

- Figuring out a good growth pattern for the buffer is a
matter of guesswork, and there's no one "best" way.
However, you might want to consider growing the buffer
more aggressively as the line gets longer and longer,
the idea being that a line that's already proved to be
at least 200 characters long may be more likely to grow
to 300 than to stop short of 210. (Of course, this
requires you to manage the 10/9 stuff differently.)

- Similarly with the size of the initial allocation: it
seems you're expecting very short input lines most of
the time. I'd suggest a larger starting size, not only
to avoid the work of copying the first part of the line
over and over and over again, but also so realloc() and
its friends won't have to deal with huge numbers of
itty-bitty memory fragments.

--
Eric Sosman
esosman@xxxxxxxxxxxxxxxxxxx

.



Relevant Pages

  • Re: Discovering variable types...
    ... >- but I suppose MS expect us to use wrappers ... memory allocations for your variables from disk as well. ... >They most certainly are of fixed size, changing the size of a String ... >>me to keep buffer size and current postion right in the memory block. ...
    (comp.lang.pascal.delphi.misc)
  • Re: strange behaviour
    ... Pointer to the buffer to receive the null-terminated string containing ... if the Windows directory is named Windows ... string copied to the buffer, not including the terminating null character. ... You must supply memory for windows to put the directory into. ...
    (comp.lang.pascal.delphi.misc)
  • Re: Buffer or Realloc?
    ... better to allocate memory and realloc it for the size of the what is ... between deciding to use a fixed size buffer or allocating memory ... so for the string I've got to prepare as part of a message to the UK Government gateway where the specification says the string has a maximum length of 10 characters I should not use a fixed size buffer but a reallocating buffer? ... with the realloc() approach -- obviously ...
    (comp.lang.c)
  • Re: Something wrong in my program
    ... what becomes of the memory block starting at this address is no ... our text buffer can contain 15 characters ... a string is a char array *terminated ...
    (comp.lang.c)
  • Re: get text from listbox
    ... The first character is the low byte of the low word of the DWORD item data ... addressing memory in a virtual address space of 2 GB (memory allocation ... To prove that these are pointers get the item data of the list items as ... string pointer; check it by reading some bytes from the memory to which the ...
    (microsoft.public.vb.winapi)