Re: Question regarding fgets and new lines



mellyshum123@xxxxxxxx wrote:
> I need to read in a comma separated file, and for this I was going to
> use fgets. I was reading about it at http://www.cplusplus.com/ref/ and
> I noticed that the document said:
>
> "Reads characters from stream and stores them in string until (num -1)
> characters have been read or a newline or EOF character is reached,
> whichever comes first."
>
> My question is that if it stops at a new line character (LF?) then how
> does one read a file with multiple new line characters?

One line at a time. Read a line, process it as you see fit,
and then proceed to the next line. Lather, rinse, repeat.
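
A minimal sketch of that read-process-repeat loop might look like the
following. The names read_all_lines() and process_line() are made up for
illustration, the "processing" is just a line count, and the 256-byte
buffer is assumed to be big enough (a longer line would arrive in
several pieces and be counted more than once).

```c
#include <stdio.h>

/* Hypothetical per-line handler: this sketch only counts the lines. */
static void process_line(const char *line, long *count)
{
    (void)line;          /* a real program would parse the fields here */
    ++*count;
}

/* Read the stream one line at a time until fgets() returns NULL,
 * which happens at end-of-input or on an I/O error. */
long read_all_lines(FILE *fp)
{
    char line[256];      /* assumed large enough for this sketch */
    long count = 0;

    while (fgets(line, sizeof line, fp) != NULL)
        process_line(line, &count);
    return count;
}
```

Each successful fgets() call picks up where the previous one stopped,
so nothing special is needed to get past a newline.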

> Another question. The syntax is:
>
> char * fgets (char * string , int num , FILE * stream);
>
> but you have to allot a size for the string before this. Would you just
> use the same num as used in the fgets? So char stringexample[num] ?

Yes. The problem of how big to make `num' can be a
vexing one: If you make it 80 you can handle lines of up
to 78 "payload" characters plus a newline and a '\0', but
if the input stream supplies a longer line you've got a
bit of a problem. You could make `num' 1000000, but do you
really want to spend a megabyte as insurance against long
lines? (And there's still the nagging possibility that the
input might hold a 1000001-character line ...)

One plausible way to proceed is to make `num' moderately
larger than the longest line you expect to encounter, call
fgets(), and then check whether the buffer contains a '\n'.
If it does not (and if neither end-of-input nor an I/O error
occurred, which you can test with feof() and ferror()), then
the file contains a longer-than-anticipated line. The first
part of that line has been stored in the buffer, and the tail
end is still "pending," available to be read.
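
The check described above could be sketched roughly as follows. The
function name read_line() and the three status values are invented for
this example; only fgets(), strchr(), feof() and ferror() are standard.
A final line that lacks a newline is treated as complete here, which is
one reasonable choice among several.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative status codes, not part of any library. */
enum line_status { LINE_COMPLETE, LINE_TRUNCATED, LINE_END };

/* Read up to size-1 characters and report whether a whole line arrived. */
enum line_status read_line(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return LINE_END;              /* end of input or I/O error */

    if (strchr(buf, '\n') != NULL)
        return LINE_COMPLETE;         /* the whole line fit */

    if (feof(fp) || ferror(fp))
        return LINE_COMPLETE;         /* last line, no trailing newline */

    return LINE_TRUNCATED;            /* the tail is still waiting to be read */
}
```

After LINE_TRUNCATED, the next read from the stream starts at the first
character that did not fit.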

What to do next? If you were expecting lines of up to
around 100 characters and you used a 1000-character buffer
just to be on the safe side and you ran into a line longer
than 1000 characters -- more than ten times what you thought
the maximum length would be -- you might well conclude that
there's something wrong with the input: Maybe the file you've
been handed really isn't a CSV file at all. It would be
perfectly plausible to blurt out an error message and stop
processing, or to blurt an error and throw the offending line
away (remember to "drain" the unread tail by reading until
you get '\n' or EOF).
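
Draining the unread tail takes only a few lines. This is a sketch with
an invented name, drain_line(); it simply discards characters up to and
including the next newline.

```c
#include <stdio.h>

/* Throw away the rest of the current line: read and discard characters
 * until a '\n' has been consumed or EOF (or an error) is reached. */
void drain_line(FILE *fp)
{
    int c;   /* int rather than char, so EOF stays distinguishable */

    while ((c = getc(fp)) != EOF && c != '\n')
        continue;
}
```

Call it only after fgets() has returned a buffer with no '\n' in it;
calling it after a complete line would discard the following line.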

If you've used malloc() to obtain memory for the buffer,
another possibility is to use realloc() to make the buffer
larger (preserving the already-read portion) and call fgets()
again to read the tail of the line into the tail of the expanded
buffer. If necessary, you can expand again and again until you
finally get a big enough buffer (or run out of memory). In my
opinion it's a little easier to implement this scheme by using
getc() to read a character at a time instead of using fgets()
to read a batch of characters, but either way it's fairly
straightforward.
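
For what it's worth, here is a rough sketch of the getc() variant. The
name read_long_line(), the starting size of 16 and the doubling policy
are all arbitrary choices made for this example. The function returns
a malloc()ed string without the trailing newline, or NULL when there is
nothing left to read or memory runs out; the caller must free() the
result.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line of any length into a buffer that grows as needed. */
char *read_long_line(FILE *fp)
{
    size_t cap = 16;                  /* arbitrary starting capacity */
    size_t len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {         /* keep one byte free for '\0' */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {     /* out of memory: give up cleanly */
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {       /* no characters were available */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Assigning realloc()'s result to a separate pointer first keeps the
original buffer reachable (and freeable) if the expansion fails.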

--
Eric Sosman
esosman@xxxxxxxxxxxxxxxxxxx