Re: Read only last line-
- From: websnarf@xxxxxxxxx
- Date: 27 Feb 2006 05:46:35 -0800
Eric Sosman wrote:
Jordan Abel wrote On 02/22/06 14:37,:
On 2006-02-22, Mark McIntyre <markmcintyre@xxxxxxxxxxx> wrote:
On 19 Feb 2006 21:20:00 -0800, in comp.lang.c , websnarf@xxxxxxxxx
wrote:
No value of offs can lead
to UB so long as fp is a valid and active file.
pipes too?
A "request that cannot be satisfied" results in a nonzero return, not
UB.
Well, in my proposal the error return is specifically -1. I hadn't
considered file streams like stdin and stdout where clearly you can't
fseek, but obviously they would just return with -1 -- certainly not
UB.
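A minimal sketch of that contract, using only standard fseek. The helper name try_seek is hypothetical (it is not part of the proposal or of any library); it simply folds fseek's "nonzero on failure" result into the -1 convention described above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: report a seek that cannot be satisfied as -1.
   The failure is an ordinary return value; nothing here is undefined
   as long as fp is a valid open stream. */
int try_seek(FILE *fp, long offs, int whence)
{
    assert(fp != NULL);
    return fseek(fp, offs, whence) == 0 ? 0 : -1;
}
```

A caller would test the result, e.g. `if (try_seek(stdin, 0L, SEEK_SET) != 0)`, and fall back to sequential reading. Whether stdin is seekable depends on what it is attached to: typically not for a pipe or terminal, but often yes for a redirected file.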
[...] It is arguable that this also applies to a call of fseek on a text
stream with a value that does not correspond to a position in the file
which ftell might have returned.
My proposal is for two new functions fseekB and ftellB which are not
ftell or fseek compatible.
The value returned is the exact offset in the file corresponding to
the next byte to be read/written.
and you've tested this for sparse files, databases, etc? Files with
multiple read/write operations permitted? Files with lockable
sections?
Again, failure is not the same as UB. What is a specific case that you
think invokes UB?
Keep in mind that we're speaking of text streams, where
the number of characters written to a stream need not be the
same as the number of bytes written to the file. A familiar
example is putc('\n', stream) on Windows, where one character
generates two bytes.
So what? If you read that back on Windows, you also get just one
character. What does this mean? It means that it has to count as 1
character (so long as you read the file in text mode.) It doesn't
count *underlying byte representation*, it counts offset in the units
of "characters" or whatever it is that is being written to the file.
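To illustrate the counting being described, here is a small sketch that writes text in text mode and counts what a text-mode read delivers. The function names and the idea of a scratch file are assumptions made for the example only.

```c
#include <assert.h>
#include <stdio.h>

/* Write `text` to `path` in text mode. Returns 0 on success, -1 on failure. */
int write_text(const char *path, const char *text)
{
    FILE *fp;
    int ok;

    assert(path != NULL && text != NULL);
    fp = fopen(path, "w");              /* "w" without 'b' = text mode */
    if (fp == NULL)
        return -1;
    ok = (fputs(text, fp) != EOF);
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Count the characters a text-mode read of `path` delivers.
   Returns -1 if the file cannot be opened or a read error occurs. */
long count_text_chars(const char *path)
{
    FILE *fp;
    long n = 0;

    assert(path != NULL);
    fp = fopen(path, "r");              /* "r" without 'b' = text mode */
    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        n++;
    if (ferror(fp))
        n = -1;
    fclose(fp);
    return n;
}
```

Writing "a\nb\n" and counting it back should give 4 on any hosted implementation, whether the file holds one byte or two for each newline, because the translation is undone on the way in.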
[...] There are also systems where writing a
newline produces no bytes in the file, systems where a file
contains both data bytes and metadata bytes, and systems that
use state-dependent encodings for extended character sets.
Underlying file system details do not affect what I have specified. If
you put the contents of a file into an array, then that specifies an
offset to data mapping. That's the mapping you have to support. It's
not impossible, and it's not even very hard. Not if your system
supports faithful read-write turnaround, and fgetpos/fsetpos.
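One way to picture that mapping is the slow but portable sketch below. It is not the proposed fseekB itself: the name fseekB_sketch, the intmax_t parameter, and the read-from-the-start strategy are all assumptions for illustration. It only works on streams opened for reading or update, and it costs time proportional to the offset.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Position fp so that the next character read or written is number `offs`
   (counting from 0) in the sequence a program would see when reading the
   stream from the start. Returns 0 on success, -1 on any failure; on a
   failure after the initial checks the original position is restored. */
int fseekB_sketch(FILE *fp, intmax_t offs)
{
    fpos_t orig, target;
    intmax_t i;

    assert(fp != NULL);
    if (offs < 0)
        return -1;
    if (fgetpos(fp, &orig) != 0)            /* remember where we were */
        return -1;
    if (fseek(fp, 0L, SEEK_SET) != 0)       /* fails on unseekable streams */
        return -1;
    for (i = 0; i < offs; i++) {
        if (getc(fp) == EOF) {              /* offset lies beyond the data */
            clearerr(fp);
            (void)fsetpos(fp, &orig);
            return -1;
        }
    }
    /* A positioning call lets the next operation be a read or a write. */
    if (fgetpos(fp, &target) != 0 || fsetpos(fp, &target) != 0) {
        (void)fsetpos(fp, &orig);
        return -1;
    }
    return 0;
}
```

A real implementation could presumably cache fpos_t values at intervals to avoid rereading from the start; that refinement is left out here.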
It's not so much a problem of U.B., but of failure that
doesn't produce a reliable indication. Seek to a position
that happens to be in the middle of a multi-byte character
or in the middle of a stretch of metadata,
How does that happen for a file opened in text mode?
[...] and the problem
may be difficult to detect: a byte in a file does not always
stand alone, but may require prior context (at an arbitrary
separation) for proper interpretation. Here's the stuff of
a nightmare or two: Imagine opening a stream for update,
seeking to the middle of a stretch of metadata, successfully
writing "Hello, world!" there, and only later discovering
that the successful write has corrupted the file structure
and made the entire tail end unreadable ...
Well, explain to me how that happens -- remember I am mapping from
offsets of the original data, as if it were all coming from an array to
positions in the underlying file (that we know *exists* because of the
existence of fgetpos, fsetpos functions). So what bad thing is
supposed to happen?
It would be nice if one could do meaningful arithmetic on
file position indicators in text streams,
You mean it's nice to know that it is well defined and possible. (You
need a good definition of intmax_t, of course.)
[...] but given the rich
variety of file encodings that exist in the world it is not
always possible to do so.
It might be slow, but it's always possible.
[...] The C Standard recognizes this
difficulty, and so does not attempt to guarantee that seeking
to arbitrary positions in text files will work as desired.
Even though it presents an API that clearly implies that it does.
The Standard is cognizant of imperfections in reality, and
does not insist that reality rearrange itself for the Standard's
convenience.
If that were a true and complete description of the standard that would
at least be a defensible and credible stance. But it's not. If they
took this stance, ftell() and fseek() would be gone, since
fgetpos/fsetpos already give you the weaker semantics.
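For comparison, the weaker semantics look like this in practice: fgetpos hands back an opaque fpos_t that can only be passed to fsetpos later, with no arithmetic on it. The helper name peek_chars is hypothetical, chosen just for this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Read up to n characters from the current position into buf, then put the
   stream back where it was. Returns the number of characters read, or -1 if
   the position could not be saved or restored. buf is not NUL-terminated. */
long peek_chars(FILE *fp, char *buf, long n)
{
    fpos_t here;
    long got = 0;
    int c;

    assert(fp != NULL && buf != NULL);
    if (fgetpos(fp, &here) != 0)            /* opaque bookmark */
        return -1;
    while (got < n && (c = getc(fp)) != EOF)
        buf[got++] = (char)c;
    if (fsetpos(fp, &here) != 0)            /* return to the bookmark */
        return -1;
    return got;
}
```

The bookmark works on text streams, but it cannot be turned into "three characters further on" without actually reading those three characters.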
--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/