Re: unicode by default



On Thu, May 12, 2011 at 1:58 AM, John Machin <sjmachin@xxxxxxxxxxx> wrote:
> On Thu, May 12, 2011 4:31 pm, harrismh777 wrote:
>
>> So, the UTF-16 UTF-32 is INTERNAL only, for Python
>
> NO. See one of my previous messages. UTF-16 and UTF-32, like UTF-8 are
> encodings for the EXTERNAL representation of Unicode characters in byte
> streams.

Right. *Under the hood*, CPython stores Unicode strings as UCS-2 or
UCS-4, depending on how it was built (UCS-2 is not exactly the same
thing as UTF-16, by the way). However, this is largely transparent.
To the Python programmer, a unicode string is just an abstraction of
a sequence of code points. You don't need to think about the internal
representation at all. The only times you need to worry about
encodings are when you're encoding unicode characters to byte
strings, decoding bytes to unicode characters, or opening a stream in
text mode; and in those cases the only encoding that matters is the
external one.
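
To make those three cases concrete, here is a minimal sketch. It assumes
Python 3 syntax (where str is the unicode type); the sample string and the
temporary file name are made up purely for illustration.

```python
import os
import tempfile

# A str is a sequence of code points; no encoding is involved yet.
text = "na\u00efve \u20ac5"   # "naïve €5", 8 code points

# Case 1: encoding code points to bytes. The byte count depends on
# the external encoding you pick, not on anything internal to Python.
utf8_bytes = text.encode("utf-8")       # 11 bytes
utf16_bytes = text.encode("utf-16-le")  # 16 bytes

# Case 2: decoding bytes back to code points, using the same encoding.
assert utf8_bytes.decode("utf-8") == text
assert utf16_bytes.decode("utf-16-le") == text

# Case 3: a text-mode stream. The encoding argument names the external
# encoding of the file; the program only ever sees code points.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")  # hypothetical file
with open(path, "w", encoding="utf-8") as f:
    f.write(text)
with open(path, "r", encoding="utf-8") as f:
    round_tripped = f.read()

print(len(text), len(utf8_bytes), len(utf16_bytes), round_tripped == text)
```

In all three cases the only encoding you name is the external one.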