Re: Transmitting strings via tcp from a windows c++ client to a Java server
- From: "Chris Uppal" <chris.uppal@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: 28 Feb 2006 11:52:21 GMT
Roedy Green wrote:
> I have rewritten the essay and written an experiment explorer program
> to back up much of what I say.
> see http://mindprod.com/jgloss/utf.html
Thanks for making the changes.
I haven't actually checked the code -- it seems safe to assume it does
what you say it does -- but with that proviso it seems pretty much OK.
I still think you could usefully make it clearer that your example
en/decoding code is not actually useful (because incomplete). I know
you /do/ say that, but it's buried away and (IMO) gives the impression
that it "doesn't really matter".
However, there is still one major error. It's near the bottom under
"Exploring Java's UTF Support". First off, it still isn't plain that two
of the four options you mention (1 and 3) have /nothing at all/ to
do with UTF-8. The so-called "modified UTF-8" format is not compatible
(upwards or downwards) with UTF-8. So I don't think you should mix
references to the two together, and certainly not intermingle them as
if they were all of comparable relevance. Specifically, the page
states (slightly further up, under "DataOutputStream.writeUTF()") that
the length is "followed by a standard UTF-8 byte encoding of the
String"; that is simply not true. You note already that Quasi-UTF-8
encodes 0x0 differently from UTF-8, which all by itself is enough to
make writeUTF() useless for interoperability with standards-compliant
encodings. However, there is also a major difference in how it encodes
characters off the BMP. E.g. the Unicode character:
U+10302
will encode in UTF-8 as (taken from the Unicode Standard 4.0.1, table
3.3):
0xF0 0x90 0x8C 0x82
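whereas under Sun's scheme, which encodes each half of the UTF-16
surrogate pair (0xD800, 0xDF02) as its own three-byte sequence, it
encodes as: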
0xED 0xA0 0x80 0xED 0xBC 0x82
(I'm using unsigned bytes here).
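
For anyone who wants to check those byte sequences themselves, here is a
minimal sketch. It is not taken from Roedy's explorer program; the class
name Utf8Compare and its helper methods are made up for illustration, and
it assumes a JDK recent enough to have java.nio.charset.StandardCharsets.
It encodes U+10302 once with String.getBytes() in UTF-8 and once with
DataOutputStream.writeUTF(), skipping the two-byte length prefix that
writeUTF() puts in front.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares standard UTF-8 with writeUTF() output.
public class Utf8Compare {

    // Format bytes from 'offset' onwards as unsigned hex, e.g. "0xF0 0x90".
    static String hex(byte[] bytes, int offset) {
        StringBuilder sb = new StringBuilder();
        for (int i = offset; i < bytes.length; i++) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    // Standards-compliant UTF-8 encoding of the string.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Raw output of writeUTF(): 2-byte length, then "modified UTF-8" bytes.
    static byte[] writeUtfBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // U+10302 is off the BMP, so it is a surrogate pair in a Java String.
        String s = new String(Character.toChars(0x10302));

        System.out.println("UTF-8:      " + hex(standardUtf8(s), 0));
        // Skip the 2-byte length prefix to show only the encoded characters.
        System.out.println("writeUTF(): " + hex(writeUtfBytes(s), 2));
    }
}
```

On the Java side, the practical upshot is that a server talking to a
non-Java client should decode with new String(bytes, StandardCharsets.UTF_8)
(or an InputStreamReader set to UTF-8) rather than readUTF(), and handle
the length framing itself.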
BTW, you also express some opinions on the (non-)value of the >16-bit
Unicode characters. I have no problem with your expressing your
opinions on your own webpages. I just wanted to add that I don't agree
with them.
-- chris