Re: Character Encodings and display of strings



"JKPeck" wrote:

I am trying to understand why, with non-Western strings, I sometimes get
a hex display and sometimes get the string printed as characters.

With my Python locale set to Japanese and with or without a # coding of
cp932 (this is Windows) at the top of the file, I read a list of
Japanese strings into a list, say, catlis.

With this code
for item in catlis:
    print item
print catlis
print " ".join(catlis)

the first print (print item) displays Japanese text as characters.
The second print (print catlis) displays a list with the double-byte
characters in hex notation.
The third print (print " ".join(catlis)) prints a combined string of
Japanese characters properly.

According to the print documentation,
"If an object is not a string, it is first converted to a string using
the rules for string conversions"

but the result is different with a list of strings.

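a list is not a string, so it's converted to one using the standard list representation
rules -- which is to do repr() on all the items, and add brackets and commas as
necessary.  repr() of a byte string writes every non-ASCII byte as a \xNN escape,
which is the hex notation you see.  "print item" and " ".join(catlis) both hand
print a plain string, so the bytes are written as they are.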
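
the rule can be checked with a small sketch. this is not code from the thread: it uses Python 3 syntax, where bytes objects stand in for the Python 2 byte strings discussed here, and the sample cp932 bytes and the helper name list_as_text are made up for illustration.

```python
# Python 3 syntax; bytes objects play the role of Python 2 byte strings.
# The two items are assumed sample cp932 byte sequences, not the poster's data.
catlis = [b"\x93\xfa\x96\x7b", b"\x8c\xea"]


def list_as_text(items):
    """Rebuild a list's string form by hand (hypothetical helper):
    repr() of every item, joined with commas, wrapped in brackets."""
    return "[" + ", ".join(repr(item) for item in items) + "]"


as_list = str(catlis)           # what print does with a list
rebuilt = list_as_text(catlis)  # the same rule applied manually

print(as_list)             # non-ASCII bytes show up as \x escapes
print(as_list == rebuilt)  # True
```

joining the items yourself, as in the third print, skips the repr() step, which is why that output keeps the characters.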

for some more tips on printing, see:

http://effbot.org/zone/python-list.htm#printing

</F>


