Re: Send string to IP address



On 2009-02-23 21:23:18 -0500, Dirk Bruere at NeoPax <dirk.bruere@xxxxxxxxx> said:

> Peter Duniho wrote:
>> On Mon, 23 Feb 2009 16:55:27 -0800, Dirk Bruere at NeoPax <dirk.bruere@xxxxxxxxx> wrote:
>>
>>> It's just an ascii string, or plain hex byte
>>
>> Well, which is it?
>>
>> "Plain hex" implies something formatted as text, but doesn't answer the question of encoding.
>>
>> There's no "just" as far as "an ASCII string" is concerned. ASCII is every bit as legitimate an encoding as any other, and just as problematic too.
>>
>> "Byte" seems to imply that you might be dealing with binary data, but you're unclear as to how you're dealing with it. Do you want to transmit the bytes in their original form? Or do you want them formatted as "plain hex" and transmitted as a string?
>>
>> Pete
>
> I'll probably hand translate the ascii to hex and code as a byte string

Welcome to the future. Characters are not bytes and bytes are not characters. The sooner you stop conflating them in your head, the better off you'll be.

IO in general, including network IO, is done in bytes through the InputStream and OutputStream categories of classes. Strings are done in characters through the Reader and Writer categories. In order to translate a character sequence into a byte sequence, you use an encoding. "ASCII" is one such encoding, as are "ISO-8859-15" and "UTF-8". There is no such thing as "no encoding"; even a simple hand-hacked one based on the binary representations of characters in Java is going to be close to, but subtly incompatible with[0], UTF-16.
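To make the "no such thing as no encoding" point concrete, here is a minimal sketch that encodes the same five characters with three different encodings and prints the resulting bytes as hex. The class and method names (EncodingDemo, encode, toHex) are made up for illustration, and it assumes Java 7 or later for StandardCharsets. ISO-8859-1 stands in for ISO-8859-15 here only because it is guaranteed to be available; the two agree on the character used.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: same characters, different bytes per encoding.
final class EncodingDemo {
    // Translate a character sequence into a byte sequence using the given encoding.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Render bytes as space-separated two-digit lowercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "w\u00f6rld"; // "wörld", written with an escape to avoid source-encoding issues
        System.out.println(toHex(encode(text, StandardCharsets.ISO_8859_1))); // 77 f6 72 6c 64
        System.out.println(toHex(encode(text, StandardCharsets.UTF_8)));      // 77 c3 b6 72 6c 64
        System.out.println(toHex(encode(text, StandardCharsets.UTF_16BE)));   // 00 77 00 f6 00 72 00 6c 00 64
    }
}
```

Five characters come out as five, six, or ten bytes depending on the encoding, which is why the receiver has to know which one the sender used.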

Fortunately, Java provides some extremely high-quality tools for encoding and decoding character data to and from byte data. For simple, stream-based applications, there are the InputStreamReader and OutputStreamWriter classes. A simple example that writes strings to a byte stream might look like this:

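// assume the existence of OutputStream out (needs java.io.OutputStreamWriter and java.io.Writer)
Writer writeOut = new OutputStreamWriter (out, "UTF-8");
writeOut.write ("Hello, wörld.");
writeOut.flush (); // the writer buffers encoded bytes; flush pushes them to out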

Normally you'd create the Writer once at the same time as you create the underlying stream, rather than every time you write some text, obviously. There's a similar pattern for InputStreamReader.
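For the reading side, a sketch of the matching InputStreamReader pattern might look like the following. It uses in-memory byte streams so it runs on its own; the class and method names (RoundTripDemo, writeText, readLine) are invented for the example, and wrapping the reader in a BufferedReader to read a line at a time is one choice among several, not a requirement.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: characters -> bytes -> characters, all via UTF-8.
final class RoundTripDemo {
    // Encode text to bytes by writing it through an OutputStreamWriter.
    static byte[] writeText(String text) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            writer.write(text);
            writer.flush(); // make sure buffered bytes reach the underlying stream
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decode the first line of text from bytes by reading through an InputStreamReader.
    static String readLine(byte[] encoded) {
        try {
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(new ByteArrayInputStream(encoded), StandardCharsets.UTF_8));
            return reader.readLine(); // line terminator is not included in the result
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = writeText("Hello, w\u00f6rld.\n");
        System.out.println(bytes.length + " bytes on the wire");
        System.out.println(readLine(bytes));
    }
}
```

Both ends name the same encoding explicitly. Relying on the platform default on either side is where most mangled-character bugs come from.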

To tie this back to your original question, you can use this in tandem with the Socket getInputStream and getOutputStream methods. The socket ensures that bytes written to the output stream are transmitted to the peer at the other end; your code is responsible for translating the data you want to send into bytes, and since you're just trying to move unstructured strings around, an OutputStreamWriter is a perfect adapter.
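Put together, sending a string to an IP address could be sketched roughly as below. Everything here beyond the java.net and java.io classes is an assumption made for the example: the names SocketStringDemo, sendLine, receiveLine and loopback are invented, the newline-terminated framing is just one simple convention, and the loopback helper exists only so the sketch can be run without a real peer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one UTF-8 encoded line sent over a TCP socket.
final class SocketStringDemo {
    // Connect to host:port and send one newline-terminated line as UTF-8.
    static void sendLine(String host, int port, String text) throws IOException {
        try (Socket socket = new Socket(host, port);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            out.write(text);
            out.write('\n');
            out.flush();
        }
    }

    // Accept one connection and decode one UTF-8 line from it.
    static String receiveLine(ServerSocket server) throws IOException {
        try (Socket peer = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    // Self-test helper: send text to a listener on the loopback interface and read it back.
    // The message is small enough to sit in the OS buffers until accept() is called.
    static String loopback(String text) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            sendLine(server.getInetAddress().getHostAddress(), server.getLocalPort(), text);
            return receiveLine(server);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loopback("Hello, w\u00f6rld."));
    }
}
```

In a real program only sendLine would be needed on the sending side, with the actual address and port of the device in place of the loopback listener.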

You almost certainly want to use UTF-8. For characters whose code points are between U+0000 and U+007F, UTF-8 and ASCII encodings are byte-for-byte identical. However, UTF-8 can encode the entire Unicode code space (U+0000 through U+10FFFF, far more than 65,536 code points), whereas ASCII can *only* encode the first 128 code points.
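A quick way to check these claims yourself is sketched below. The class and method names (Utf8VsAscii, sameBytes, asciiCanEncode, utf8Length) are made up for the example. It compares the ASCII and UTF-8 encodings of a string, asks an ASCII encoder whether it can represent the text at all, and counts the UTF-8 bytes of a character outside the 16-bit range.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class comparing what ASCII and UTF-8 can encode.
final class Utf8VsAscii {
    // True when encoding as US-ASCII and as UTF-8 yields identical bytes.
    static boolean sameBytes(String text) {
        return Arrays.equals(text.getBytes(StandardCharsets.US_ASCII),
                             text.getBytes(StandardCharsets.UTF_8));
    }

    // True when every character of the text is representable in US-ASCII.
    static boolean asciiCanEncode(String text) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(text);
    }

    // Number of bytes the text occupies once encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(sameBytes("Hello, world."));      // true: pure ASCII text
        System.out.println(sameBytes("w\u00f6rld"));         // false: ASCII substitutes '?' for the o-umlaut
        System.out.println(asciiCanEncode("w\u00f6rld"));    // false
        System.out.println(utf8Length("\uD83D\uDE00"));      // 4: U+1F600, beyond U+FFFF
    }
}
```

Note that String.getBytes silently substitutes a replacement byte for characters the encoding cannot represent, so data loss with ASCII does not show up as an exception.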

Joel on Software has a surprisingly good explanation of all of this that's worth reading: http://www.joelonsoftware.com/articles/Unicode.html

Cheers,
-o

[0] Probably.
