Re: Null-terminated strings: the final analysis.



Keith Thompson wrote:
Flash Gordon <smap@xxxxxxxxxxxxxxxxx> writes:
Harald van Dijk wrote:
On Sun, 12 Apr 2009 22:57:30 +0100, Flash Gordon wrote:
Harald van Dijk wrote:
On Sun, 12 Apr 2009 13:32:39 -0700, Keith Thompson wrote:
On the systems I use, if I write a '\a' character (ASCII BEL) to a
text file, I can reasonably expect to see a '\a' character when I read
it back. The same is not true of '\0' if I use fgets() to read it
(though I think I can see the '\0' if I use fgetc()).
If you use fgets, you can see any '\0' that you had previously written,
but you've got to be careful to make sure you don't treat it as a
terminator. You cannot reliably determine whether the '\0' is a
terminator,
I think you can almost all the time, but it takes a little work...
[snip pseudo-code]
I stand corrected. It may even be simpler than you suggested: ignoring the
possibilities of EOF and errors (which you've already handled), after
prefilling the buffer and calling fgets, you can scan the buffer backwards
to find the last '\0' byte. Everything before it, including any other
'\0' bytes, was read from the file.
My excuse for the long and convoluted method is to show I was handling
each possible case ;-)

I agree that filling with '\0' then back searching for the first
non-'\0' is better. Error still has to be handled separately as it
leaves the buffer indeterminate. I think end-of-file after a byte has
been read is OK, but it is not explicitly stated as being OK.
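
For anyone following along, here is a rough sketch of the variant described
earlier in the thread: prefill the buffer with a non-zero byte, call fgets,
then scan backwards for the last '\0'. The helper name read_line_len and its
return convention are made up for illustration. It also assumes fgets leaves
the bytes after its terminator untouched, which the thread relies on but the
standard does not spell out.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not from the thread: read one line with fgets and
 * report in *len how many bytes came from the stream, embedded '\0'
 * bytes included. Returns 0 on success, -1 on end-of-file with nothing
 * read or on a read error (the buffer is then not to be trusted). */
int read_line_len(char *buf, size_t size, FILE *fp, size_t *len)
{
    size_t i;

    assert(buf != NULL && fp != NULL && len != NULL);
    assert(size >= 2 && size <= INT_MAX);

    /* Prefill with a non-zero byte so that any '\0' found afterwards was
     * either read from the stream or written by fgets as the terminator. */
    memset(buf, 0xFF, size);

    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    /* Scan backwards: the last '\0' is the terminator (assumption: fgets
     * left the bytes after it alone). */
    for (i = size; i-- > 0; ) {
        if (buf[i] == '\0') {
            *len = i;
            return 0;
        }
    }
    return -1; /* no terminator found; should not happen */
}
```

With the length in hand, the caller can process buf[0] through buf[len-1]
with memchr or memcpy instead of the str* functions, so embedded '\0' bytes
are not mistaken for the end of the line.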

Whether a text file requires a new-line character on the last line is
implementation-defined. If it's not required, and if the last line
happens to end with '\0', it's going to be very difficult to tell just
what happened after calling fgets().

It is clear that NUL in a text stream corrupts the string. Period.

Handling the last line without '\n' is trivial: If fgets returns
non-null, and the stream itself contained no NUL, the first '\0' in the
buffer will have been placed there by fgets after the last character
read from the stream. Find it with:

char *s;
s = strchr(buffer, '\0');

Check for '\n' with (the s != buffer test guards against an empty string):

if (s != buffer && *(s-1) == '\n')
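
Put together as a small function, the check might look like the sketch
below. The name strip_newline is hypothetical, and stripping the newline
is an extra step added here for illustration. Like the snippet above, it
assumes no '\0' came from the stream.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: if the string that fgets stored in buffer ends
 * with '\n', remove it and return 1. Return 0 otherwise, which after a
 * successful fgets means either the line did not fit in the buffer or
 * it was a last line without a newline. */
int strip_newline(char *buffer)
{
    char *s;

    assert(buffer != NULL);

    s = strchr(buffer, '\0');          /* points at the terminator */
    if (s != buffer && s[-1] == '\n') { /* guard against an empty string */
        s[-1] = '\0';
        return 1;
    }
    return 0;
}
```

A return of 0 right after a successful fgets, followed by feof() being
true, is how the final unterminated line would show up.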

--
Joe Wright
"Memory is the second thing to go. I forget what the first is."