Re: python encoding bug?



garabik-news-2005-05@xxxxxxxxxxxxxxxxxxxxxxxx wrote:

>
> I was playing with python encodings and noticed this:
>
> garabik@lancre:~$ python2.4
> Python 2.4 (#2, Dec 3 2004, 17:59:05)
> [GCC 3.3.5 (Debian 1:3.3.5-2)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> unicode('\x9d', 'iso8859_1')
> u'\x9d'
> >>>
>
> U+009D is NOT a valid unicode character (it is not even a iso8859_1
> valid character)

It *IS* a valid Unicode and ISO-8859-1 character, so the behaviour of the
Python decoder is correct. The range U+0080 - U+009F is used for various
control characters (the C1 controls). There's rarely a valid use for these
characters in documents, so you can be pretty sure that a document using
them is really windows-1252. Such bytes are still valid ISO-8859-1, but for
a heuristic guess it's probably safer to assume windows-1252.
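
As a rough illustration of that heuristic, here is a minimal sketch in
present-day Python 3 (not the 2.4 interpreter from the session above). The
function name guess_single_byte_encoding is made up for this example, and
the rule is only the one described above: any byte in 0x80-0x9F tips the
guess towards windows-1252.

```python
def guess_single_byte_encoding(data):
    """Guess between ISO-8859-1 and windows-1252 for a bytes object.

    Hypothetical helper: bytes 0x80-0x9F are C1 controls in ISO-8859-1,
    which real documents rarely contain, so their presence suggests
    windows-1252 instead.
    """
    if any(0x80 <= byte <= 0x9F for byte in data):
        return "cp1252"
    return "iso8859_1"


# Curly quotes saved by a Windows editor use bytes 0x93 and 0x94.
sample = b"\x93quoted\x94"
encoding = guess_single_byte_encoding(sample)
text = sample.decode(encoding)
```

This only picks a name. Python's cp1252 codec leaves a few bytes in that
range undefined (0x9D, the byte from the original question, is one of
them), so decoding with the guessed encoding can still raise
UnicodeDecodeError.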

If you want an exception to be thrown, you'll need to implement your own
codec, something like 'iso8859_1_nocc' - mmm.. I could try this myself,
because I do such a test in one of my projects, too ;)
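
A codec along those lines might look roughly like the sketch below. It is
written for present-day Python 3 rather than 2.4, it is not the code from my
project, and the helper names (decode_nocc, _search_nocc) are invented for
the example. Only the codec name 'iso8859_1_nocc' comes from the suggestion
above. It always rejects C1 bytes, whatever error handler is requested.

```python
import codecs

CODEC_NAME = "iso8859_1_nocc"


def decode_nocc(data, errors="strict"):
    """Decode ISO-8859-1 but refuse bytes in the C1 range 0x80-0x9F."""
    data = bytes(data)
    for pos, byte in enumerate(data):
        if 0x80 <= byte <= 0x9F:
            # Simplification: raise regardless of the 'errors' argument.
            raise UnicodeDecodeError(
                CODEC_NAME, data, pos, pos + 1,
                "C1 control character not allowed",
            )
    # Everything else is plain Latin-1; reuse the built-in decoder.
    return codecs.latin_1_decode(data, errors)


def _search_nocc(name):
    """Codec search function: answer only for our own codec name."""
    if name == CODEC_NAME:
        return codecs.CodecInfo(
            name=CODEC_NAME,
            encode=codecs.latin_1_encode,  # encoding is left unrestricted
            decode=decode_nocc,
        )
    return None


codecs.register(_search_nocc)
```

With that registered, b'caf\xe9'.decode('iso8859_1_nocc') works as usual,
while b'\x9d'.decode('iso8859_1_nocc') raises UnicodeDecodeError, which is
what an encoding-guessing heuristic wants.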

> The same happens if I use 'latin-1' instead of 'iso8859_1'.
>
> This caught me by surprise, since I was doing some heuristics guessing
> string encodings, and 'iso8859_1' gave no errors even if the input
> encoding was different.
>
> Is this a known behaviour, or I discovered a terrible unknown bug in
> python encoding implementation that should be immediately reported and
> fixed? :-)
>
>
> happy new year,
>

--
Benjamin Niemann
Email: pink at odahoda dot de
WWW: http://www.odahoda.de/