Re: EOF location?




Not so Clever Monkey wrote:

A single 0x1A (control-Z in ASCII) is used to indicate EOF,

That is only true if the particular program tests specifically for that
character. Any file can have Ctrl-Zs scattered through it, and what
happens is entirely dependent on the program.
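
To make that concrete, here is a rough sketch, assuming Python 3 and binary-mode reads. The helper names (make_sample, read_everything, read_until_ctrl_z) are made up for illustration and are not any system API. The same bytes on disk give two different results purely because of what each reader chooses to do with 0x1A.

```python
import os
import tempfile

CTRL_Z = b"\x1a"


def make_sample():
    """Create a temporary file with two embedded Ctrl-Z bytes; return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"one\x1atwo\x1athree")
    return path


def read_everything(path):
    """Read the whole file in binary mode; 0x1A is just another byte."""
    with open(path, "rb") as f:
        return f.read()


def read_until_ctrl_z(path):
    """Mimic a program that *chooses* to stop at the first 0x1A (CP/M style)."""
    head, _, _ = read_everything(path).partition(CTRL_Z)
    return head


if __name__ == "__main__":
    path = make_sample()
    try:
        data = read_everything(path)
        print(len(data), data.count(CTRL_Z))   # all bytes, Ctrl-Zs included
        print(read_until_ctrl_z(path))         # only what precedes the first Ctrl-Z
    finally:
        os.remove(path)
```

The truncation in the second reader is a decision made in the program, not something the file system imposes.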

I carefully noted in the rest of my reply that this is an API
convention.

You keep talking about 'API', but the DOS/Win32 API does not have that
convention, nor does POSIX. Neither knows nor cares about Ctrl-Z in
disk files. Certain application programs may still recognize Ctrl-Z,
and they can deal with it in any way they see fit.

The only 'convention' was with CP/M programs 30 years ago and that was
_not_ part of the 'API'.

I never suggested that the EOF char had to be unique, but
the very tools that were used upthread would, in fact, never allow one
to "see" the EOF.

Actually you never even tried a Ctrl-Z with those tools. The example
file DID NOT HAVE A CTRL-Z in it or after it. You merely assumed that
there was one and that the tools or the 'API' were hiding it. Now go
back and hexedit in a couple of x'1a's and see what the tool does about
'hiding' it.

They use APIs that look for it and consume it.

You still use 'API' as if you had a clue about it. In DOS/Win32 'End
of File' is _not_ a character, it is the state of reading past the size
of the file. You don't see a character for 'End of File' because there
isn't one, not because the API 'hides' it.
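
A small sketch of 'EOF is a state', again assuming Python 3 in binary mode (read_byte_by_byte and demo are hypothetical helper names). The loop ends only when read() comes back empty, which happens once the position reaches the file size; the embedded Ctrl-Z is delivered like any other byte.

```python
import os
import tempfile


def read_byte_by_byte(path):
    """Return (bytes_seen, size_on_disk); stop only when read() returns b''."""
    seen = bytearray()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(1)
            if chunk == b"":      # end of file: nothing left, no marker byte read
                break
            seen += chunk
    return bytes(seen), os.path.getsize(path)


def demo():
    """Write a 7-byte file with a Ctrl-Z in the middle and read it back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"abc\x1adef")
        return read_byte_by_byte(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    seen, size = demo()
    print(seen, len(seen), size)
```

The number of bytes read matches the size recorded for the file, and no extra 'EOF character' ever shows up in the data.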

It may be that some language systems, those derived from CP/M perhaps,
treat a Ctrl-Z as an 'End of File', but as I indicated, even C
buffered input will happily read and process Ctrl-Zs in a file and will
count them.

similar to EOT in POSIX environments.

Except that EOT is 'End of Transmission' and has no effect on files.

Except for stdin and friends, which are files (or have file semantics)
on POSIX systems.

No. stdin does not recognise EOT. It is certainly true that the
console input handler does, and a Ctrl-D (EOT) will create a situation
where the stdin stream has reached the end of its input. That is, the
console handler does not pass the EOT to stdin at all.

This is easy to demonstrate by creating a file on disk with embedded
Ctrl-D and Ctrl-Z characters and then redirecting it to stdin of cat or
similar. The program will read the Ctrl-D and Ctrl-Z and will continue
to read the rest of the characters.
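
For anyone who wants to try it without a shell, here is a hedged sketch of the same experiment, assuming Python 3. A one-line child process that copies stdin to stdout stands in for cat (that substitution is mine), and write_sample and cat_file_via_stdin are hypothetical helper names.

```python
import os
import subprocess
import sys
import tempfile

# Child program: copies stdin to stdout byte for byte, like a minimal cat.
CHILD = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"

SAMPLE = b"before\x04middle\x1aafter\n"   # embedded Ctrl-D (0x04) and Ctrl-Z (0x1A)


def write_sample():
    """Write SAMPLE to a temporary disk file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(SAMPLE)
    return path


def cat_file_via_stdin(path):
    """Redirect the file to the child's stdin; return what the child wrote."""
    with open(path, "rb") as f:
        result = subprocess.run(
            [sys.executable, "-c", CHILD],
            stdin=f,
            stdout=subprocess.PIPE,
            check=True,
        )
    return result.stdout


if __name__ == "__main__":
    path = write_sample()
    try:
        out = cat_file_via_stdin(path)
        print(out == SAMPLE)   # True when both control bytes pass straight through
    finally:
        os.remove(path)
```

With stdin coming from a disk file there is no console handler in the path, so nothing interprets either control character before the program sees it.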

You also failed miserably to demonstrate anything useful at all because
you were trying to show what DOS/Win32 would do by using Unix. Not
that either would work the way you claim.

Since there is no special character marking EOF on
most POSIX systems they are not equivalent. But the chars are used
similarly in other regards.
