Re: Transmitting strings via tcp from a windows c++ client to a Java server



qqq111 wrote:

We have a C++ client which runs on Windows and that needs to transmit
char* / wchar* strings to and from a Java server.

The client should correctly handle both 'standard' languages & east Asian
languages (i.e. using wchar).

The obvious options are:

Use UTF-8.
Advantages: Compact /if/ you send mostly ASCII text. Easily readable (for
debugging) /if/ you send mostly ASCII text. No byte-order issues.
Disadvantages: Consumes more bandwidth if you send mostly non-ASCII. Requires
explicit en/de-coding on the Windows box (perfectly possible, but you have to
write the code for it).

Use UTF-16LE.
Advantages: Compact in the cases where UTF-8 is not. Requires no special
handling in the Windows code (since that's the native format for a wstring).
You always have to specify an encoding at the Java end anyway, so it makes no
difference which encoding you use from the Java point of view.
Disadvantages: Consumes more bandwidth if you send mostly ASCII text.

Without knowing your requirements, I can't guess which option would be best
for you, but I don't think any other options make sense.
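
To put rough numbers on the bandwidth trade-off, here is a minimal Java sketch that measures the encoded size of a string both ways. The class and method names are made up for illustration; the sample strings are only examples.

```java
import java.nio.charset.StandardCharsets;

public class EncodingSizes {
    // Number of bytes the string occupies on the wire as UTF-8.
    public static int utf8Size(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the string occupies on the wire as UTF-16LE (no BOM written).
    public static int utf16leSize(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String cjk = "\u65E5\u672C\u8A9E"; // three CJK characters
        System.out.println(utf8Size(ascii) + " vs " + utf16leSize(ascii)); // 5 vs 10
        System.out.println(utf8Size(cjk) + " vs " + utf16leSize(cjk));     // 9 vs 6
    }
}
```

Running something like this over a sample of your real traffic would tell you more than any general rule.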

Some other points to consider.

If you choose UTF-8 then don't use java.io.DataInputStream.readUTF() or the
corresponding write method. They don't do what the method names suggest: they
use a length-prefixed "modified UTF-8", not standard UTF-8.
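
To see the difference for yourself, the following sketch compares the bytes from DataOutputStream.writeUTF() with a plain UTF-8 encode. The class and helper names are hypothetical; the string with an embedded NUL is chosen because that is one place where the two formats visibly disagree.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class WriteUtfDemo {
    // Bytes produced by writeUTF: a 2-byte length prefix followed by "modified UTF-8".
    public static byte[] viaWriteUTF(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Bytes produced by a plain, standard UTF-8 encode (what a C++ client would expect).
    public static byte[] viaStandardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "a\u0000b";
        // writeUTF: 00 04 61 C0 80 62 (length prefix, NUL written as two bytes)
        System.out.println(viaWriteUTF(s).length);     // 6
        // standard UTF-8: 61 00 62
        System.out.println(viaStandardUtf8(s).length); // 3
    }
}
```

On the Java side, reading the raw bytes and calling new String(bytes, StandardCharsets.UTF_8) avoids the problem.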

If you choose UTF-16LE then you should consider whether a BOM (byte order mark)
is forbidden, tolerated, or required by your protocol. Alternatively you could
mandate merely UTF-16 (either byte order) and /require/ a BOM -- that would give
you flexibility if you anticipate creating non-Windows clients (which I doubt).
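
As an illustration of the "tolerated" policy, here is a sketch of a Java-side decoder that honours a BOM if one is present and otherwise assumes little-endian. The class name and the little-endian default are my assumptions, not anything your protocol has to adopt.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf16Bom {
    // Decode a UTF-16 payload: use the BOM if present, else assume little-endian.
    public static String decode(byte[] data) {
        Charset cs = StandardCharsets.UTF_16LE; // assumed protocol default
        int offset = 0;
        if (data.length >= 2) {
            int b0 = data[0] & 0xFF;
            int b1 = data[1] & 0xFF;
            if (b0 == 0xFF && b1 == 0xFE) {        // little-endian BOM
                offset = 2;
            } else if (b0 == 0xFE && b1 == 0xFF) { // big-endian BOM
                cs = StandardCharsets.UTF_16BE;
                offset = 2;
            }
        }
        // Skip the BOM bytes so they don't appear in the decoded string.
        return new String(data, offset, data.length - offset, cs);
    }
}
```

The explicit skip matters because Java's UTF-16LE and UTF-16BE decoders do not strip a leading BOM; it comes through as the character U+FEFF at the start of the string.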

If you choose UTF-8 then you should consider whether a BOM is forbidden or
tolerated by your protocol.
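
If you decide to tolerate a UTF-8 BOM, the Java end has to remove it itself, since the standard UTF-8 decoder passes it through as U+FEFF. A minimal sketch, with a hypothetical class and method name:

```java
import java.nio.charset.StandardCharsets;

public class Utf8Bom {
    // Decode UTF-8, silently dropping a leading BOM (EF BB BF) if there is one.
    public static String decodeTolerant(byte[] data) {
        int offset = 0;
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            offset = 3;
        }
        return new String(data, offset, data.length - offset, StandardCharsets.UTF_8);
    }
}
```

A "forbidden" policy would use the same test but reject the message instead of skipping three bytes.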

If your choice between UTF-8 and -16 is significantly swayed by bandwidth
considerations, then it might be worthwhile considering using zlib compression.
Java already understands that, and it's easy to use the ZLIB1.DLL from Windows
code.
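
For the Java half of that, here is a sketch using java.util.zip. With default settings Deflater emits the zlib format, which is what zlib's own compress() and uncompress() functions use, so the two ends should interoperate. The class and method names are mine, and how you frame the compressed block on the wire is left to your protocol.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ZlibRoundTrip {
    // Compress a byte array into zlib format.
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress a complete zlib-format byte array.
    public static byte[] decompress(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                // No progress and not finished: the input is truncated or needs a dictionary.
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported zlib data");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt zlib data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

Bear in mind that very short strings can come out larger after compression, so this is only likely to pay off for longer messages.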

If your protocol is of the form:
<character count><character data>
then you should be very clear about what you mean by a "character", especially
if you use UTF-16 (where there may be more 16-bit wchars / Java chars than
actual Unicode characters). Is the BOM (if any) included in the count?
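
To make the ambiguity concrete, this sketch computes the different things a "count" could mean for the same Java string. The class and method names are hypothetical; the sample string is a letter followed by one character outside the Basic Multilingual Plane.

```java
import java.nio.charset.StandardCharsets;

public class CountUnits {
    // Number of 16-bit code units (Java chars, Windows wchar_t).
    public static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points (a surrogate pair counts once).
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes when encoded as UTF-8.
    public static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00"; // 'a' followed by U+1F600 as a surrogate pair
        System.out.println(utf16Units(s)); // 3
        System.out.println(codePoints(s)); // 2
        System.out.println(utf8Bytes(s));  // 5
    }
}
```

A byte count is usually the easiest for both ends to agree on, since the receiver needs to know how many bytes to read before it can decode anything.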

-- chris

