Re: unicode by default
- From: Ian Kelly <ian.g.kelly@xxxxxxxxx>
- Date: Thu, 12 May 2011 10:17:33 -0600
On Thu, May 12, 2011 at 1:58 AM, John Machin <sjmachin@xxxxxxxxxxx> wrote:
> On Thu, May 12, 2011 4:31 pm, harrismh777 wrote:
>> So, the UTF-16 UTF-32 is INTERNAL only, for Python
>
> NO. See one of my previous messages. UTF-16 and UTF-32, like UTF-8 are
> encodings for the EXTERNAL representation of Unicode characters in byte
> streams.

Right. *Under the hood* Python uses UCS-2 (which is not exactly the
same thing as UTF-16, by the way) to represent Unicode strings.
However, this is entirely transparent. To the Python programmer, a
unicode string is just an abstraction of a sequence of code points.
You don't need to think about UCS-2 at all. The only times you need
to worry about encodings are when you're encoding unicode characters
to byte strings, or decoding bytes to unicode characters, or opening a
stream in text mode; and in those cases the only encoding that matters
is the external one.
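
To make that concrete, here is a minimal sketch (Python 3 assumed; the sample string and variable names are only for illustration). It shows that the same sequence of code points can be written out under different external encodings, and that the choice only matters at the encode/decode boundary or when a text-mode stream is opened:

```python
import io

# Six code points: 'c', 'a', 'f', U+00E9, ' ', U+20AC.
text = "caf\u00e9 \u20ac"

# Encoding: code points -> bytes. The byte count depends on the
# external encoding, not on how the interpreter stores the string.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# Decoding: bytes -> code points. Either encoding round-trips
# to the same string.
from_utf8 = utf8_bytes.decode("utf-8")
from_utf16 = utf16_bytes.decode("utf-16-le")

# A text-mode stream does the encoding for you; the encoding you
# name when opening it is the external one. An in-memory buffer
# stands in for a real file here.
buffer = io.BytesIO()
writer = io.TextIOWrapper(buffer, encoding="utf-8", newline="")
writer.write(text)
writer.flush()
stream_bytes = buffer.getvalue()

print(len(text), len(utf8_bytes), len(utf16_bytes))
print(from_utf8 == from_utf16 == text)
print(stream_bytes == utf8_bytes)
```

At no point does the code need to know whether the interpreter stores the string as UCS-2 or anything else.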