Re: non standard path characters
- From: "Martin v. Löwis" <martin@xxxxxxxxxxx>
- Date: Thu, 31 May 2007 23:12:08 +0200
> thanks for that. I guess the problem is that when a path is obtained
> from such an object the code that gets the path usually has no way of
> knowing what the intended use is. That makes storage as simple bytes
> hard. I guess the correct way is to always convert to a standard (say
> utf8) and then always know the required encoding when the thing is to be
> used.
Inside the program itself, the best thing is to represent path names
as Unicode strings as early as possible; later, information about the
original encoding may be lost.
If you obtain path names from the os module, pass Unicode strings
to listdir in order to get back Unicode strings. If they come from
environment variables or command line arguments, use
locale.getpreferredencoding() to find out what the encoding should
be.
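A minimal sketch of both cases, assuming Python 3 (where str is already
a Unicode string, so passing a str to os.listdir gives str names back).
The helper names decode_external_path and list_unicode_names are
hypothetical, chosen only for illustration; they are not part of the
standard library.

```python
import locale
import os


def decode_external_path(raw, encoding=None):
    """Decode a byte-string path taken from the environment or argv.

    Hypothetical helper: falls back to the locale's preferred encoding
    when the caller does not know a better one.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding()
    return raw.decode(encoding)


def list_unicode_names(directory):
    """Return the entries of `directory` as Unicode strings.

    Passing a str (Unicode) argument asks os.listdir for str results;
    passing bytes would give bytes back instead.
    """
    return os.listdir(str(directory))


if __name__ == "__main__":
    for name in list_unicode_names("."):
        print(repr(name))
```

On Python 3, os.environ and sys.argv are already decoded for you, so a
helper like decode_external_path is only needed when the raw bytes are
in hand (for example from os.environb on Unix).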
If they come from a zip file, Tijs already explained what the encoding
is.
Always expect encoding errors; if they occur, choose to either skip
the file name, or report an error to the user. Notice that listdir
may return a byte string if decoding fails (this may only happen
on Unix).
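The skip-or-report choice could be sketched roughly as follows. This
assumes a list that may mix already-decoded names with raw byte strings
(as described above for listdir on Unix); decode_names is a hypothetical
helper, and the utf-8 default is only an assumption for the example.
Note that Python 3 behaves differently from the listdir described here:
with a str argument it returns str names and marks undecodable bytes
with surrogate escapes instead of returning byte strings.

```python
def decode_names(raw_names, encoding="utf-8"):
    """Split file names into (decoded, undecodable).

    Names that are already str are kept as they are. Byte strings are
    decoded with `encoding`; those that fail are collected unchanged so
    the caller can skip them or report them to the user.
    """
    decoded = []
    undecodable = []
    for raw in raw_names:
        if isinstance(raw, str):
            decoded.append(raw)
            continue
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            undecodable.append(raw)
    return decoded, undecodable


if __name__ == "__main__":
    good, bad = decode_names([b"ok.txt", b"caf\xe9.txt", "already.txt"])
    for name in bad:
        # Report rather than silently drop; skipping is the other option.
        print("cannot decode file name: %r" % (name,))
```

Keeping the failed names as bytes means the files can still be opened
or reported later, since no information about the original name is lost.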
Regards,
Martin