Re: Character set woes with binary data



Michael B. Trausch wrote:

I never said it did. It just happens to be the context with which I am
working. I said I wanted to concatenate materials without regard for
the character set. I am mixing binary data with ASCII and Unicode, for
sure, but I should be able to do this.

The problem is that Unicode has no default representation for mixing
with binary data and ASCII. What you should therefore ask yourself is,
"Which encoded representation of Unicode should I be using to mix my
text with those things?" Then, you should choose an encoding, call the
encode method on your Unicode objects, take the result, and mix away!
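As a minimal sketch of that advice, written in Python 3 terms (where the "Unicode objects" of this thread are str and the byte strings are bytes): the sample values and the choice of UTF-8 below are assumptions for illustration only, not anything taken from the original poster's code.

```python
# Assumed sample data; none of these values come from the thread.
binary_blob = b"\x00\xff\x10\x80"   # raw bytes, not valid UTF-8 on their own
ascii_header = b"LEN:"              # ASCII that is already stored as bytes
text = "caf\u00e9"                  # text: a sequence of characters, not bytes

# Choose an encoding explicitly (UTF-8 here), then encode the text to bytes.
encoded_text = text.encode("utf-8")

# Everything is now bytes, so concatenation is well defined.
message = ascii_header + encoded_text + binary_blob
```

Whichever encoding you pick, whoever reads the data back has to know it in order to decode the text portion again.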

[...]

In short: How do I create a string that contains raw binary content
without Python caring? Is that possible?

All strings can contain raw binary content without Python caring.
Unicode objects, however, work on a higher level of abstraction:
characters, not bytes. Thus, you need to make sure that your Unicode
objects have been converted to bytes (i.e., encoded to strings) in order
for the content to be workable at the same level as that binary
content.
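To illustrate the two levels of abstraction, here is a hedged sketch in Python 3, where the distinction is enforced strictly: mixing str with bytes raises TypeError. (In Python 2, the same mix would attempt an implicit ASCII decode and typically fail with UnicodeDecodeError on non-ASCII bytes.) The variable names and sample values are made up for the example.

```python
# A byte string can hold every possible byte value without complaint.
raw = bytes(range(256))

text = "na\u00efve"  # characters, one of them non-ASCII

# Mixing the two levels directly is refused in Python 3.
try:
    combined = text + raw
    mixed_directly = True
except TypeError:
    mixed_directly = False

# Encode first, so both operands live at the byte level.
payload = text.encode("utf-8")
combined = payload + raw

# Only the text portion is decoded on the way back; the raw bytes stay bytes.
recovered = combined[:len(payload)].decode("utf-8")
```

The slice works here only because the length of the encoded text is known; in a real format you would presumably store that length or use a delimiter.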

Paul
