Re: Code fails with Segmentation Fault





Vlad Dogaru wrote On 11/28/06 12:07,:
Hello,

I am trying to learn C, especially pointers. The following code
attempts to count the appearences of each word in a text file, but
fails invariably with Segmentation Fault. Please help me out, I've
already tried all my ideas. Also, please do comment on my coding style
or other aspects. Thank you.

You haven't quite understood that creating a pointer
does not automatically create something for it to point at.
See my remarks in-line; some are "important" and some are
just "stylistic hints."

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define INITLEN 5
#define DELTA 5

int main(int argc, char **argv)
{
char **words, *word, *string;
int *apps, max, found, i, n;
FILE *input;

if (argc != 2) {
printf("Usage: %s <file>\n", argv[0]);

It is customary to write "error messages" (as opposed to
"normal output" to stderr instead of to stdout). Consider
using the fprintf() function instead of printf().

return 0;

EXIT_FAILURE might be a better value than 0 here.

}

max = INITLEN;
n = 0;
words = (char **) calloc(max, sizeof (char *));

This is correct, but a better way to write it (because
it avoids any chance of getting the types mixed up) is

words = calloc(max, sizeof *words);

Two points: Yes, it's all right to use `*words' even though
`words' doesn't point at anything yet, because the sizeof
operator doesn't evaluate its operand. Also, there is no
need for the `(char**)' cast, and some small advantage in
leaving it out. (The rules are different in C++, though.)

An even better alternative would be

words = malloc(max * sizeof *words);

.... because your program does not need to have the array
filled with zero bits. You use `n' to keep track of how
many array positions are in use, and you always store a
value in the [n] position before increasing `n'. Thus,
you never actually "see" the zeroes provided by calloc() --
and besides, they wouldn't be useful even if you used them.

Whether you use malloc() or calloc(), it's important to
understand that either one can fail. That's unlikely so
early in the program, since you're requesting a fairly small
amount of memory and there hasn't been much activity yet that
might have exhausted the available supply. Still, it's just
possible that even this early malloc() or calloc() call could
fail, and you should check whether the returnes pointer is
NULL before you start trying to use it.

apps = (int *) calloc(max, sizeof (int));

Same remarks as above.

input = fopen(argv[1], "r");

You should check whether the fopen() succeeded, with
something like

if (input == NULL) {
fprintf (stderr, "Unable to open %s\n", argv[1]);
return EXIT_FAILURE;
}

while (fgets(string, 1000, input)) {

Here is your first "serious" error. Question: What
does `string' point to, and what makes you think the area
it points to is (at least) 1000 characters long?

word = strtok(string, " ");

You might want to use " \n" instead. Remember, fgets()
puts the line-ending '\n' character in the buffer, and if
you don't do something about it you'll find that the input

Now is the time for all
good men to come to the
aid of the party.

.... will give you two counts for the word "the" and one
count for the word "the\n".

while (word) {
found = 0;
for (i=0; i<n; i++)
if (!strcmp(words[i], word)) {
found = 1;
apps[i]++;
break;
}
if (!found) {
words[n] = strdup(word);

You should be aware that strdup() is not a Standard C
library function. Many systems provide strdup() as an
extension, but some might not. If you move your code to
another machine, you might find that there's no strdup()
there.

apps[n] = (int) malloc(sizeof(int));

Here is your second serious error. The variable `apps'
is a pointer-to-int, so apps[n] is an int (assuming n is
in a reasonable range). Read that again: apps[n] *is* an
int. Trying to allocate something extra is pointless: you
already have what you need, and can go ahead and use it.

apps[n] = 1;

.... just as you do here.

n++;
if (n == max) {
max += DELTA;
words = (char **) realloc(words, max);

Two errors here. First, you don't want a memory region
that is `max' bytes long, you want a region with enough room
for `max' pointers. That is, you want `max * sizeof *words'
bytes in all, and that's what you should request.

The second error is that realloc() might be unable to
find the amount of memory you've asked for, in which case it
will return NULL. As above, you *must* check for failure!

apps = (int *) realloc(apps, max);

Same remarks.

}
}
word = strtok(NULL, " ");

Same observation as for the earlier strtok() call.

}
}

for (i=0; i<n; i++)
printf("Word '%s' appears %d time(s).\n", words[i], apps[i]);

return 0;
}

--
Eric.Sosman@xxxxxxx

.



Relevant Pages

  • Re: Could you please give me some advice on GC in C?
    ... char *ptr; ... pointer to the memory area. ... The access by *ptr will fail. ... It must consider the one-past-the-end pointer to be a valid ...
    (comp.lang.c)
  • Re: Is this math test too easy?
    ... > communications glitch; one of the more laughable cartoons ... it was loaded into physical memory and, ... > Or one can interpret the character string as one of the values ... A pointer to an integer? ...
    (sci.math)
  • Re: grow list by tail, pointer example recipe -- please comment
    ... manufacturing a pointer with that address. ... the next cons cell. ... believe these lists are in consecutive memory locations. ...
    (comp.lang.lisp)
  • Re: some unanswered questions on C
    ... A pointer variable that's never been given a value. ... you don't know what memory you're modifying. ... >what i want to ask is that when i declare my buffer for fgets as ... "char *buffer" creates a pointer, ...
    (comp.unix.programmer)
  • Re: "Mastering C Pointers"....
    ... all means go ahead and dive right into the C language. ... Memory is a separate unit which just stores bits. ... A pointer at the hardware level _is an integer_. ... since loops make your logic more much ...
    (comp.lang.c)