Re: malloc + 4??
From: Kevin Torr (kevintorr_at_hotmail.com)
Date: 04/05/04
- Next message: Ben Pfaff: "Re: macros question"
- Previous message: Al Bowers: "Re: structs with fields that are structs"
- In reply to: Chris Torek: "Re: malloc + 4??"
- Next in thread: John Tsiombikas (Nuclear / the Lab): "Re: malloc + 4??"
- Reply: John Tsiombikas (Nuclear / the Lab): "Re: malloc + 4??"
- Reply: Keith Thompson: "Re: malloc + 4??"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Mon, 5 Apr 2004 11:55:58 +1000
"Chris Torek" <nospam@torek.net> wrote in message
news:c4pdra031ga@news2.newsguy.com...
> In article <4070182a$0$27645$61ce578d@news.syd.swiftdsl.com.au>
> Kevin Torr <kevintorr@hotmail.com> writes:
> >http://www.yep-mm.com/res/soCrypt.c
>
> In general, it is better to post the actual problematic code
> (preferably after shrinking it down to a "problematic nub", as it
> were), but a URL reference can work if the one reading netnews
> bothers to follow the link. :-)
>
> >I have 2 malloc's in my program, and when I write the contents of them to
> >the screen or to a file, there aren addition 4 characters.
>
> >As far as I can tell, both the code to register the malloc and to write
> >information into the malloc is solid. Why then ismy program returning an
> >additional 4 characters?
>
> It is not quite as solid as one might hope, although the problem
> has nothing to do with malloc() per se. Here are excerpts from the
> code (quoted with ">" as usual, although I had to insert the markers
> myself):
>
> >// soCrypt 1.0
> >
> >#include <stdio.h>
> >#include <stdlib.h>
> >#include <string.h>
> >#include <time.h>
> >// #include <md5.h>
>
> OK so far, although //-comments are specific to C99. You have
> included necessary headers, so you will not need to cast malloc()'s
> return value.
>
> >// Global variables
> >
> >int statCode = 0; // Status code
> >int mode; // Mode variable (1 = enc, 2 = dec)
> >int i; // Looper variable
> >int inSize = 0; // Input filesize
> >int intRand; // Random int
> >char tmp_char; // Temporary char
>
> Many of these should not be file-scope external-linkage ("global")
> variables, although this is mostly a style issue (at least in a
> program this small).
>
> Note that tmp_char has type "char"; on a typical PowerPC, it would
> hold values between 0 and 255 inclusive, because there plain "char"
> is unsigned. The variable inSize is a plain (signed) int and has
> at least the range [-32767..+32767] (although most systems, today,
> have an even wider range, about +/- 2 billion).
>
> >char *pMasterKey; // Malloc pointer to the master key
> >char *pInputData; // Malloc pointer to the input data
>
> Skipping forward, we have:
>
> >// Reads the input file
> >
> >int readFile()
> >{
> >
> > rewind(inFile);
> > i = 0;
> > tmp_char = 'a';
> > while(tmp_char != EOF)
> > {
> > i++;
> > tmp_char = getc(inFile);
> > }
> > inSize = i-1;
>
> This loop tries to count the size of the file by calling getc()
> until getc() returns EOF. The problem is that EOF is some sort of
> negative number -- typically -1, but perhaps even -2000 or some
> such -- and tmp_char is a plain "char". If tmp_char is unable to
> hold the value EOF, which will be true if plain char is unsigned
> or if EOF is less than CHAR_MIN (e.g., -2000 vs -128 for instance),
> the loop will never terminate.
>
> This is why getc() returns a value of type "int" in the first place,
> so that it can return all possible "char"s (having first converted
> any negative ones to positive values as if via "unsigned char"),
> yet also return the special marker value EOF. If you want to store
> both "any valid character" *and* EOF, you need something with a
> wider range than "any valid character".
>
> Of course, there is really no need for tmp_char at all, nor for
> correcting for the off-by-one error produced by counting inside
> the loop *before* getting a character. Just change the loop to,
> e.g.:
>
> while (getc(inFile) != EOF)
> i++;
>
> Combine this with using local variables, and perhaps a "for"
> loop to collect up the initialization, test, and increment, we
> might get something like:
>
> int readFile() {
> int i;
>
> rewind(inFile);
> for (i = 0; getc(inFile) != EOF; i++)
> continue;
> inSize = i;
>
> (although I would also pass the "FILE *" parameter to readFile,
> and probably return the allocated memory rather than an "int"
> status code).
>
> > if ((pInputData = (char *)malloc(inSize * sizeof(char))) == NULL)
> > {
> > statCode = 8;
> > return(statCode);
> > }
>
> This is OK in and of itself, but there are two important things to
> note. First, the cast is not required. It does no harm, but also
> does no help. It is a bit like saying "tmp_char = (char)getc(inFile)",
> when tmp_char is already a char. The assignment will do the
> conversion for you -- and you do not use a cast below, so why use
> one above?
>
> Second, and the actual source of the observed problem later, note
> that this allocates just enough space to store all the characters
> you intend to read from the file.
>
> Consider the C string "hello world". How many characters are in
> it? How many characters does it take to *store* it? Why, after:
>
> char hello[] = "hello world";
>
> is there a difference of 1 between "sizeof hello" and "strlen(hello")?
>
> The answer is: because C strings require a '\0' marker after all
> their valid "char"s. The array hello[] has size 12, not size 11,
> because it stores the 11 "char"s that make up the two words and
> the blank, and then one more to store the '\0' marker.
>
> The variable inSize might (for instance) hold 5 if the file contents
> are "word\n" (perhaps followed by an EOF marker, if your system
> actually uses such markers in files), but if you want to use the
> sequence {'w', 'o', 'r', 'd', '\n'} as a C string, you need *six*
> bytes: {'w', 'o', 'r', 'd', '\n', '\0'}.
>
> Of course, there is no requirement that you treat the file as
> a C string -- that part is up to you. In any case:
>
> > rewind(inFile);
> > i = 0;
> > tmp_char = 'a';
> > while (i < inSize)
> > {
> > tmp_char = getc(inFile);
> > *(pInputData + i) = tmp_char;
> > i++;
> > }
> > return 0;
> >}
>
> There is nothing *wrong* here, but the code can be simplified
> enormously. First, tmp_char is never inspected without first
> calling getc(), so there is no need to "prime the pump" -- the
> loop tests "i < inSize". Second, there is no need for tmp_char
> at all; you can just assign the value from getc() directly into
> pInputData[i]. Third, you can write pInputData[i] that way,
> rather than using the equivalent unary-"*" sequence, and again
> perhaps a "for" loop might express the whole sequence better:
>
> rewind(inFile);
> for (i = 0; i < inSize; i++)
> pInputData[i] = getc(inFile);
> return 0;
> }
>
> Skipping forward to the source of the observed problem:
>
> >// Writes the output file
> >
> >int writeFile()
> >{
> > fputs(pInputData, outFile);
> > fputs(pMasterKey, keyOut);
> > return 0;
> >}
>
> The fputs() function demands a string -- a sequence of "char"s
> ending with a '\0' termination marker. pInputData points to the
> first of a sequence of "char"s, but not one that has the "stop here
> at the \0" mark in it.
>
> Without making any other changes, you can either allocate one extra
> byte and put in the '\0', or you can change the method you use to
> write the final output. The two simply have to agree as to whether
> pInputData (and pMasterKey -- but I did not even look at that code)
> is a counted string (length inSize, for pInputData) or a C-style
> '\0'-terminated string.
>
> Note that a '\0'-terminated string cannot *contain* a '\0', so if
> you want (for whatever reason) to allow embedded '\0' bytes, you
> will have to choose the counted-string method. You could write
> out a counted string with a loop:
>
> int writeFile() {
> int i;
>
> for (i = 0; i < inSize; i++)
> putc(pInputData[i], outFile);
> /* optional:
> if (fflush(outFile) || ferror(outFile))
> ... handle output failure ... */
> /* and repeat for pMasterKey */
> }
>
> or you can use fwrite(), which essentially does the loop for you.
> (Note that fwrite() works just fine with ordinary text, and in
> fact, fputs() can be implemented internally as:
>
> int fputs(char *s, FILE *stream) {
> size_t len = strlen(s);
>
> return fwrite(s, 1, len, stream) == len ? 0 : EOF;
> }
>
> Here strlen() looks for, but does not count, the terminating '\0',
> then fwrite() loops over all the valid bytes, putc()ing each one
> to the stream. The only remaining problem is that the return value
> from fwrite() does not match that from fputs(), so it has to be
> converted.)
Wow, thanks for all that. I will have to get my head around it all.
So when do I need not cast mallocs? when I include a header? which header?
Or are you saying that I don't need to cast a malloc if I've already defined
the pointer as a data type?
What would be a possible bad thing if I did cast a malloc when I didn't need
to? Is there something that could go wrong or is it just redundant?
- Next message: Ben Pfaff: "Re: macros question"
- Previous message: Al Bowers: "Re: structs with fields that are structs"
- In reply to: Chris Torek: "Re: malloc + 4??"
- Next in thread: John Tsiombikas (Nuclear / the Lab): "Re: malloc + 4??"
- Reply: John Tsiombikas (Nuclear / the Lab): "Re: malloc + 4??"
- Reply: Keith Thompson: "Re: malloc + 4??"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|
|