Re: Reading LAST line from text file without iterating through the file?

On Sun, 27 Feb 2011 15:19:15 -0500, Arne Vajhøj wrote:

> Actually I think it is a bit weird to test if a file consists of text
> lines without the program being line aware.

It's not testing if a file consists of anything. It's just copying it.

> And the code is rather bad:
> - you are not calling close on rdr and wtr but those are easy to fix.

Your *opinion* has been noted. The operating system will close the file
handles promptly anyway since the program terminates at that point; and
it was a quick toss-off for a newspost, not intended for actual
production use.
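
For readers who do want the handles closed explicitly, here is a minimal sketch of a character-by-character copy. It is written in Python rather than the language of the program under discussion, and everything except the names rdr and wtr (the helper copy_chars, the file names, the sample text, the UTF-8 encoding) is an assumption made for illustration, not the code from the earlier post.

```python
import os
import tempfile

def copy_chars(src_path, dst_path):
    """Copy a text file one character at a time, closing both handles on exit."""
    # newline="" turns off newline translation so characters pass through unchanged.
    with open(src_path, "r", encoding="utf-8", newline="") as rdr, \
         open(dst_path, "w", encoding="utf-8", newline="") as wtr:
        while True:
            ch = rdr.read(1)
            if not ch:          # empty string signals end of file
                break
            wtr.write(ch)

# Small demonstration with throwaway files (hypothetical names and content).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")
    dst = os.path.join(tmp, "out.txt")
    with open(src, "w", encoding="utf-8", newline="") as f:
        f.write("foo\nbar\nbaz\n")
    copy_chars(src, dst)
    with open(dst, "r", encoding="utf-8", newline="") as f:
        copied = f.read()
```

The with statement closes both files even if the loop raises, which is the part the quoted complaint is about.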

> And I am happy to inform you that the above program actually works with
> VMS variable length files.

For values of "works" that allow for copying "normal" text files but not
ones with hidden data as I described previously.

The fact is, it cannot stuff 257 objects (8-bit characters plus record
boundaries) or 65537 (16-bit characters plus boundaries) into 256 or
65536 "char" values, respectively -- that's the pigeonhole principle at
work, an ironclad law of mathematics, unless someone somewhere is
cheating by converting into an escape-sequence-using format or binhexing
or something similar.
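
The counting argument for the 8-bit case can be sketched in a few lines. The BOUNDARY marker, the choice of 0x0A as the colliding byte, and the ESC-based scheme are all assumptions made for illustration; nothing here claims to be what any real system does.

```python
BOUNDARY = "BOUNDARY"                      # hypothetical stand-in for a record boundary
symbols = list(range(256)) + [BOUNDARY]    # 256 byte values plus the boundary = 257 objects

def naive(sym):
    """One byte per symbol: the boundary has to reuse some existing byte (here 0x0A)."""
    return bytes([0x0A]) if sym == BOUNDARY else bytes([sym])

ESC = 0x1B                                 # arbitrary escape byte chosen for this sketch

def escaped(sym):
    """Escape-based scheme: unambiguous, but some symbols now take two bytes."""
    if sym == BOUNDARY:
        return bytes([ESC, ord("B")])
    if sym == ESC:
        return bytes([ESC, ESC])
    return bytes([sym])

naive_codes = {naive(s) for s in symbols}
escaped_codes = {escaped(s) for s in symbols}

print(len(symbols), len(naive_codes), len(escaped_codes))
print(max(len(code) for code in escaped_codes))
```

The one-byte mapping yields fewer distinct codes than there are symbols, so two symbols must share a code. The escaped mapping keeps every symbol distinct only by letting some codes grow past one byte.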

> 007A6162 00030072 61620A6F 6F660007 000000

Does not compute.

007A6162 00030072 61620A6F 6F660007 -- four sets of four octets = 16 bytes -- sixteen text characters = 16 bytes


> 007A6162 00030072 61620A6F 6F660007
> .. . f o o . b a r . . . b a z.

00 is unprintable, then lowercase o, then lowercase b. Lowercase b is
later 6F. Etc. Obviously this is not any straight mapping from octets to
character values, whether ASCII or EBCDIC or any similar system.
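
A sketch for checking the dump by hand: parse the sixteen octets and render each one as ASCII, with a dot for anything unprintable. The reading direction is an assumption. Some dump utilities print the hex column in the opposite order to the text column, and the post does not say which applies here, so the sketch renders both orders. The names DUMP, octets and render are introduced for illustration only.

```python
DUMP = "007A6162 00030072 61620A6F 6F660007"   # the sixteen octets quoted above

octets = bytes.fromhex(DUMP.replace(" ", ""))

def render(data):
    """Show printable ASCII as-is and everything else as a dot."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)

left_to_right = render(octets)         # octets in the order they are printed
right_to_left = render(octets[::-1])   # same octets, taken in the opposite order

print(len(octets))
print(left_to_right)
print(right_to_left)
```

Comparing the two renderings against the text column of the dump shows which reading order, if either, gives a plain octet-to-ASCII correspondence.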

If your claims were correct it would be two probably-unprintable bytes to
encode a length, followed by foo\nbar, followed by another encoded
length, then baz. That ought to be 14 bytes. There's an extra byte
between bar and baz and an extra byte at the end (the latter could be an
explicit EOF marker, though) and there's the more serious matter that it
is not a simple letter-substitution code like ASCII or EBCDIC. The extra
byte between bar and baz and the non-one-to-one character->byte nature of
the encoding seem pretty clearly to prove that some sort of escaping or
other conversion is taking place -- i.e., that you cheated.
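
The 14-byte figure can be checked with a short sketch of the layout described above: a two-byte length, the record text, and nothing else. The helper pack_record and the little-endian byte order are assumptions made for illustration; the actual on-disk format is not established in this thread.

```python
import struct

def pack_record(text, byte_order="<"):
    """Hypothetical layout: 2-byte unsigned length prefix followed by the ASCII text."""
    data = text.encode("ascii")
    return struct.pack(byte_order + "H", len(data)) + data

expected = pack_record("foo\nbar") + pack_record("baz")

print(len(expected))     # 2 + 7 + 2 + 3
print(expected.hex())
```

Any padding, alignment or terminator bytes a real format adds would account for a total above 14.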

I proved you can't map those VMS thingys one-to-one onto any character
set, and rather than simply admit it you tried to sustain your untenable
claim by attempting what is known in the vernacular as "a fast one", it
would seem. But either you faked the output (and unconvincingly!) or you
didn't, and it betrays the very fact you were arguing against: that it
can't be mapped one-to-one onto normal strings.

I also find it interesting that the first two bytes are 00 7A. That's
either 122 or 31232, depending on endianness. The actual length of the
first record, if the first record is "foo", a 0x0A character, and "bar",
is 7. So either the first two bytes are not a correct character count of
the field, or the system does not use one's- or two's-complement integer
storage at all(!), or you're just plain lying.
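
The two readings of 00 7A are easy to verify. This is only the arithmetic, taking the bytes in the order they appear in the dump; the variable names are illustrative.

```python
prefix = bytes.fromhex("007A")                   # the first two bytes as printed

big_endian = int.from_bytes(prefix, "big")       # 0x007A
little_endian = int.from_bytes(prefix, "little") # 0x7A00
record_length = len("foo\nbar")                  # "foo", one 0x0A character, "bar"

print(big_endian, little_endian, record_length)
```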
