Re: Is reverse reading possible?

From: Anthony Liu (antonyliu2002_at_yahoo.com)
Date: 03/14/04


Date: Sun, 14 Mar 2004 11:10:47 -0800 (PST)
To: py <python-list@python.org>

Jeff and Ivo,

Thank you very much for your solution. Yes, I can see
that it is easy for ASCII text.

The files (nearly 4 MB each) I am gonna read are a
mixture of Chinese and English, where each Chinese
character takes 2 bytes and each English (ASCII)
character takes 1 byte, although the majority of the
text is Chinese.

So, if we reverse-read it byte by byte without taking
the 2-byte Chinese characters into account, the two
bytes of each Chinese character come out in the wrong
order - unreadable!

So, it might be tedious to read such a text in reverse
unless you have a smarter idea. Do you?
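
For files of this size, one simple option is to read the whole file, decode it, and reverse by character rather than by byte. The following is only a minimal sketch in present-day Python 3. The encoding is not stated anywhere in this thread, so "gb2312" is an assumption, and reverse_chars is a hypothetical helper name.

```python
def reverse_chars(raw, encoding="gb2312"):
    """Reverse text character by character, not byte by byte.

    `encoding` is an assumption: substitute whatever the files really use.
    """
    # Decoding first turns each 1-byte or 2-byte sequence into one character,
    # so the slice below can never split a Chinese character in half.
    return raw.decode(encoding)[::-1]


sample = "Python 中文 ok".encode("gb2312")
print(reverse_chars(sample))  # prints: ko 文中 nohtyP
```

With a real file, the bytes would come from open(path, "rb").read(); at about 4 MB per file, holding everything in memory should not be a problem.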

--- Jeff Epler <jepler@unpythonic.net> wrote:
> on seekable, byte-oriented binary files, sure.
>
> import errno
>
> SEEK_SET, SEEK_CUR, SEEK_END = range(3)
>
> class Reversed:
>     def __init__(self, f):
>         self.f = f
>         self.f.seek(-1, SEEK_END)
>         self.at_eof = 0
>
>     def read_byte(self):
>         if self.at_eof: return ''
>         byte = self.f.read(1)
>         try:
>             self.f.seek(-2, SEEK_CUR)
>         except IOError, detail:
>             if detail.errno == errno.EINVAL:
>                 self.at_eof = 1
>             else:
>                 raise
>         return byte
>
>     def __iter__(self): return self
>
>     def next(self):
>         r = self.read_byte()
>         if r == '': raise StopIteration
>         return r
>
> >>> "".join(Reversed(file("/etc/redhat-release", "rb")))
> '\n)worraY( 1 esaeler eroC arodeF'
>
> Jeff
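
The backward-seeking idea quoted above can be extended to 2-byte characters, provided the encoding makes character boundaries recognisable from the end. The following is a hedged sketch in present-day Python 3, not Jeff's code. It assumes EUC-CN style GB2312, where ASCII bytes are below 0x80 and both bytes of a Chinese character are 0x80 or above. It would not be safe for encodings such as GBK, whose second byte can fall in the ASCII range. reversed_chars is a hypothetical name.

```python
import io


def reversed_chars(f, encoding="gb2312"):
    """Yield characters from a seekable binary file, last character first.

    Assumes EUC-CN style GB2312: a byte >= 0x80 met while walking backwards
    is always the second byte of a 2-byte character.
    """
    pos = f.seek(0, io.SEEK_END)      # seek() returns the new position
    while pos > 0:
        pos -= 1
        f.seek(pos)
        chunk = f.read(1)
        if chunk[0] >= 0x80 and pos > 0:
            # High byte: step back once more and take both bytes together.
            pos -= 1
            f.seek(pos)
            chunk = f.read(2)
        yield chunk.decode(encoding)


demo = io.BytesIO("Fedora 中文\n".encode("gb2312"))
print(repr("".join(reversed_chars(demo))))
```

Seeking one character at a time is slow, so this is mainly worthwhile when the file is too large to decode in one go; reading in larger blocks from the end would be the obvious refinement.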
