Unicode literals and byte string interpretation.



If I create a new Unicode object u'\x82\xb1\x82\xea\x82\xcd', how does
the creation process interpret the bytes in the byte string? Does it
assume the string represents a UTF-16 encoding, a UTF-8 encoding,
etc.?

For reference, the string is これは in the 'shift-jis' encoding.
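
A minimal sketch of what seems to happen, as far as I understand Python's literal rules: inside a unicode literal, each \xNN escape names the code point U+00NN directly, so no encoding is assumed at all. To get これは, the values have to start out as a byte string and be decoded explicitly. The variable names below are only illustrative, and 'shift_jis' is the standard codec name I am assuming matches the 'shift-jis' mentioned above.

```python
# In a unicode literal, \xNN is the code point U+00NN; no codec is applied.
text = u'\x82\xb1\x82\xea\x82\xcd'
code_points = [ord(ch) for ch in text]  # six code points, all below U+0100

# To treat the same values as Shift-JIS bytes, start from a byte string
# and decode it explicitly.
raw = b'\x82\xb1\x82\xea\x82\xcd'
decoded = raw.decode('shift_jis')       # three characters

# A unicode object built the first way can be recovered through latin-1,
# which maps U+0000..U+00FF one-to-one onto bytes 0x00..0xFF.
recovered = text.encode('latin-1').decode('shift_jis')

print(len(text), len(decoded), recovered == decoded)
```

The latin-1 round trip is only a workaround for text that was already built from the wrong kind of literal; decoding the byte string directly is the more straightforward route.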