Re: bytes, chars, and strings, oh my!
- From: Thomas Fritsch <i.dont.like.spam@xxxxxxxxxxx>
- Date: Thu, 06 Oct 2005 11:37:56 GMT
David N. Welton wrote:
[...]
That might or might not raise new problems, because the system default encoding may vary from system to system.
Ok - then I could also use this to transform the bytes into a String by then doing new String(bytes, "some encoding, possibly the system one") for regular text files, right?
I had somewhat similar conceptual problems when I tried to interpret PostScript files from Java. (PostScript is a language that doesn't distinguish between byte and char, because it was invented back in the 1980s.)
My solution there was to choose the "ISO-8859-1" (aka ISO-Latin-1) encoding. "ISO-8859-1" is essentially a no-op encoding. Its byte->char conversion simply adds a zero high byte. Its char->byte conversion drops the zero high byte and treats all chars beyond '\u00FF' as illegal (i.e. it converts them to byte 63, which is '?').
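A minimal sketch of that round trip, assuming Java 7 or later for StandardCharsets (on older JVMs the string name "ISO-8859-1" can be passed instead). The class name Latin1Demo and the helpers decode/encode are made up for illustration and are not from the original code:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
class Latin1Demo {
    // byte -> char: each byte 0x00..0xFF becomes the char with the same value.
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // char -> byte: chars up to '\u00FF' map back to a single byte;
    // anything beyond is unmappable and is replaced by '?' (byte 63).
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Build every possible byte value once.
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }

        String s = decode(all);
        System.out.println(s.length());                    // 256
        System.out.println(Arrays.equals(all, encode(s))); // true
        System.out.println(encode("\u20AC")[0]);           // 63
    }
}
```

Because the charset is named explicitly, the result should not depend on the system default encoding, so arbitrary binary data survives the trip through a String unchanged.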
--
"Thomas:Fritsch$ops:de".replace(':','.').replace('$','@').