Re: Text mode fseek/ftell
- From: Ben Bacarisse <ben.usenet@xxxxxxxxx>
- Date: Fri, 31 Mar 2006 17:41:55 +0100
On Fri, 31 Mar 2006 10:21:47 -0500, Kenneth Brody wrote:
I recently ran into an "issue" related to text files and ftell/fseek,
and I'd like to know if it's a bug, or simply an annoying, but still
conforming, implementation.
The platform is Windows, where text files use CR+LF (0x0d, 0x0a) to
mark end-of-line. The file in question, however, was in Unix format,
with only LF (0x0a) at the end of each line.
First, does the above situation already invoke "implementation defined"
or "undefined" behavior? Or is it still "defined"?
No, you should be OK. You will be able to do more things with fseek if you
open the file as a binary file, but opening it as text should also work if
you keep to the restrictions imposed by the standard (see later).
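
To illustrate the binary-mode alternative, here is a minimal sketch. The
helper names (write_unix_sample, binary_offset_after) and the sample data
are made up for illustration; they are not from the original program. For
a binary stream the standard defines ftell's result as the number of
characters from the start of the file, so the value can be treated as a
real offset:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes two LF-terminated lines in binary mode, so
   the bytes on disk are exactly these 12, whatever the platform. */
int write_unix_sample(const char *path)
{
    static const char data[] = "12345\n12345\n";   /* 12 bytes, LF only */
    FILE *fp;
    size_t n;

    assert(path != NULL);
    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    n = fwrite(data, 1, sizeof data - 1, fp);
    if (fclose(fp) != 0 || n != sizeof data - 1)
        return -1;
    return 0;
}

/* Hypothetical helper: reads `count` bytes in binary mode and returns what
   ftell reports afterwards, or -1L on any failure. */
long binary_offset_after(const char *path, size_t count)
{
    char buf[64];
    long pos = -1L;
    FILE *fp;

    assert(path != NULL && count <= sizeof buf);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1L;
    if (fread(buf, 1, count, fp) == count)
        pos = ftell(fp);        /* binary stream: a genuine byte count */
    fclose(fp);
    return pos;
}
```

Calling write_unix_sample and then binary_offset_after(path, 12) should
give 12 here. The price of binary mode is that the program sees whatever
line endings the file really contains and has to handle them itself.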
The problem comes in how ftell() reports the current position. (And,
subsequently fseek()ing back to the same position is wrong.)
Suppose that you have fread() the following 12 characters, starting at
the beginning of the file:
'1' '2' '3' '4' '5' 0x0a '1' '2' '3' '4' '5' 0x0a
(Remember, this file is in Unix format, with a single 0x0a for end-of-
line.)
While you are now at offset 12 within the file, ftell() will return 14,
because it assumes that those '\n' newlines are really CR+LF, and that
the CR was stripped off when read. (Had this file been in Windows
format, you really would be at offset 14 after reading those 12
characters.) For each 0x0a returned by fread(), ftell() will assume you
have advanced two characters in the file.
Actually, you can't say anything about the numbers. For a text file,
ftell does not give you the offset. It returns a code that can only be
used by fseek. You may be right about how your implementation is encoding
the data, but you get a clearer understanding of the restrictions imposed
by the standard if you take it at face value -- ftell returns something
you can do nothing with except pass it to fseek.
The net result here is that a subsequent fseek() to the same position
will be wrong.
For a text stream, the standard allows only two kinds of fseek call: one
with an offset of 0, or one with SEEK_SET and the result of a previous
call to ftell on the same file. If that is all you have done, and you did
not get back to where you expected, then it would seem that you have a
non-compliant library.
If you used your own idea of the stream position (not the result from
ftell), or a non-zero offset with SEEK_END or SEEK_CUR, then all bets are
off.
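
As a rough sketch of the permitted usage, the function below treats the
ftell result purely as a bookmark: it saves it, reads ahead, seeks back
with SEEK_SET, and checks that the same data comes out again. The name
text_roundtrip_ok and its parameters are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: skips `skip` characters of a text file, bookmarks
   the position with ftell, and checks that seeking back to the bookmark
   yields the same data. Returns 1 if it does, 0 if not, -1 on error. */
int text_roundtrip_ok(const char *path, int skip)
{
    char first[32], second[32];
    size_t n1, n2;
    long pos;
    int i, result = -1;
    FILE *fp;

    assert(path != NULL && skip >= 0);
    fp = fopen(path, "r");              /* text mode */
    if (fp == NULL)
        return -1;

    for (i = 0; i < skip; i++)
        if (fgetc(fp) == EOF)
            goto done;

    pos = ftell(fp);                    /* opaque: do no arithmetic on it */
    if (pos == -1L)
        goto done;

    n1 = fread(first, 1, sizeof first, fp);
    if (fseek(fp, pos, SEEK_SET) != 0)  /* the only portable way back */
        goto done;
    n2 = fread(second, 1, sizeof second, fp);

    result = (n1 == n2 && memcmp(first, second, n1) == 0);
done:
    fclose(fp);
    return result;
}
```

Note that the code never compares pos with the number of characters read
and never adds to it; either would rely on the encoding of the value.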
So, have I invoked undefined behavior by reading a Unix text file in a
Windows environment? Or is the compiler allowed to return the "wrong"
value as part of an "implementation defined" restriction? Or is this
a bug in the compiler's runtime library?
An example program with what you expect and what happens might make
everything clearer.
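
Something along these lines would do as a starting point. It is only a
guess at the scenario described, not the original poster's code, and the
function name and third line of sample data are invented. It writes an
LF-only file in binary mode, reads the first 12 characters in text mode,
and checks whether fseek to the ftell value returns to the same place:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical reproduction. Returns 1 if the seek landed where expected,
   0 if it did not, and -1 on an I/O error. */
int reproduce_lf_only_case(const char *path)
{
    static const char data[] = "12345\n12345\nabcde\n";
    char buf[12];
    long pos;
    int expected, actual, result = -1;
    FILE *fp;

    assert(path != NULL);

    /* Binary mode, so only 0x0a is written even on a CR+LF platform. */
    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    if (fwrite(data, 1, sizeof data - 1, fp) != sizeof data - 1) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "r");              /* now read it back as text */
    if (fp == NULL)
        return -1;
    if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
        goto done;
    pos = ftell(fp);
    if (pos == -1L)
        goto done;
    expected = fgetc(fp);               /* character after the bookmark */
    if (expected == EOF)
        goto done;
    if (fseek(fp, pos, SEEK_SET) != 0)
        goto done;
    actual = fgetc(fp);                 /* should be the same character */
    printf("ftell returned %ld; expected %d, got %d\n",
           pos, expected, actual);
    result = (expected == actual);
done:
    fclose(fp);
    return result;
}
```

On a system whose text files use a bare LF this should return 1. What it
returns on a CR+LF platform is exactly the point in question, so no
particular output is claimed for that case.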
--
Ben.