Re: Unicode is driving me nuts!
From: Anthony Liu (antonyliu2002_at_yahoo.com)
Date: 03/13/04
- Maybe in reply to: Anthony Liu: "Unicode is driving me nuts!"
- Next in thread: Skip Montanaro: "Re: Unicode is driving me nuts!"
Date: Sat, 13 Mar 2004 00:35:45 -0800 (PST)
To: py <python-list@python.org>
Thank you, Skip. You know what, I guess I'll give up
using Unicode, since you also mentioned that it used to
give you headaches.
I'll probably just read the file byte by byte and check
whether each byte starts a Chinese character. If it does,
I'll read 2 bytes instead. What do you think? This way,
I hopefully won't end up with a lot of unreadable characters.
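
Here is a minimal sketch of that byte-by-byte idea. It assumes the corpus is GB2312 (EUC-CN), which this thread never confirms (the script only refers to `myencoding`), and it is written for Python 3, where `bytes.decode` plays the role of `unicode(raw_str, myencoding)`. The function name `split_gb2312` is made up for illustration.

```python
def split_gb2312(data: bytes) -> list:
    """Split raw bytes into 1-byte ASCII units and 2-byte Chinese units.

    Assumption: the data is GB2312 (EUC-CN), where both bytes of a
    two-byte character have the high bit set and ASCII bytes do not.
    """
    units = []
    i = 0
    while i < len(data):
        # A byte >= 0x80 is taken to be the lead byte of a 2-byte character.
        width = 2 if data[i] >= 0x80 else 1
        # If the data ends on a lone lead byte, this slice is 1 byte long
        # and will not decode; the caller has to handle that case.
        units.append(data[i:i + width])
        i += width
    return units


sample = "Python 中文".encode("gb2312")
units = split_gb2312(sample)
decoded = [u.decode("gb2312") for u in units]
```

Each element of `units` is a complete character, so decoding them one at a time cannot hit a split sequence as long as the input really is GB2312 and is not truncated. Other Chinese encodings (GBK, Big5, GB18030) use different byte ranges, so the high-bit test would need adjusting for them.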
--- Skip Montanaro <skip@pobox.com> wrote:
>
> Anthony> str = unicode(raw_str, myencoding)
>
> Anthony> This works just fine with a small sample Chinese document.
>
> Anthony> But when I attempted to run the script on the entire corpus, I
> Anthony> get the typical "incomplete multibyte sequence error" or
> Anthony> "UnicodeEncodeError: 'ascii' codec can't encode characters in
> Anthony> position 0-23: ordinal not in range(128)"
>
> Can you craft a small example which demonstrates the error but which you
> think is correctly encoded?
>
> Anthony> I am at my wit's end, so frustrated at handling
> Anthony> non-ascii texts.
>
> Unicode creates lots of problems for the uninitiated. I pulled my hair out
> for a long time. It took me a couple tries to get my system to work
> (more-or-less) with Unicode. It's still got the occasional problem.
>
> Skip
>
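
As a small example of the kind asked for above, the sketch below shows one way each of the two quoted errors can arise from correctly encoded text. It assumes GB2312 as the encoding (the thread only says `myencoding`) and uses Python 3 syntax; it may well not match what actually happens in the corpus script.

```python
import codecs

text = "中文"
raw = text.encode("gb2312")  # 4 bytes, 2 per character

# Decoding a chunk that ends in the middle of a 2-byte character fails,
# even though the full byte string is valid.
try:
    raw[:3].decode("gb2312")
    truncated_error = None
except UnicodeDecodeError as exc:
    truncated_error = exc

# Encoding Chinese text with the ASCII codec fails; this can happen
# implicitly when text is written to a stream that defaults to ASCII.
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# An incremental decoder keeps the dangling byte until the rest arrives,
# so the same data can be decoded chunk by chunk.
decoder = codecs.getincrementaldecoder("gb2312")()
pieces = [decoder.decode(raw[:3]), decoder.decode(raw[3:], final=True)]
```

If the corpus is read in fixed-size chunks, the first case would explain why a small sample works while the full run fails. The second case points at the output side (printing or writing) rather than at the decoding step.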