Re: Reading text files with java.nio.*

From: Chris Smith (cdsmith_at_twu.net)
Date: 02/03/04


Date: Tue, 3 Feb 2004 08:41:02 -0700

Hugh Mackay wrote:
> I am trying to read in lines from a text file which will then be
> written to another file with xml tags and transmitted down a modem.
> I can move the text file, and get the size but when I read the
> contents and print to screen I get meaningless rubbish.
> The text file contains lines like:
> 20040202 165204, 0000.0.001, 00000000h, S Test
>
> but java spits out this:
> ????????ë?????ë?????????????????ë?????ë?????????????????ë?????ë?????????????????
> ë?????ë?????????????????ë?????ë?????????????????ë?????ë?????????
>

> str.append(((ByteBuffer)(buf.flip())).asCharBuffer().toString());

> Any ideas? Am I missing something blindingly obvious? (you can
> probably guess I am not a great java programmer!)

Yep, you're missing something blindingly obvious. asCharBuffer() does no
decoding at all: it reinterprets each pair of bytes as one Java 'char',
that is, a 16-bit UTF-16 code unit, using the buffer's byte order
(big-endian by default). Unless your text file happens to be encoded in
UTF-16BE, this won't be what you want.
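
To make the failure concrete, here is a minimal sketch of what happens
when single-byte text goes through asCharBuffer(). The class and method
names are made up for illustration, and it assumes a JDK recent enough to
have java.nio.charset.StandardCharsets (Java 7 or later).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class AsCharBufferDemo {
    // Hypothetical helper: views the bytes as chars WITHOUT decoding them.
    static String viaAsCharBuffer(byte[] bytes) {
        // wrap() yields a big-endian buffer positioned at 0, so each pair
        // of bytes becomes one 16-bit char.
        return ByteBuffer.wrap(bytes).asCharBuffer().toString();
    }

    public static void main(String[] args) {
        // Four ASCII bytes: 0x32 0x30 0x30 0x34
        byte[] ascii = "2004".getBytes(StandardCharsets.US_ASCII);
        String wrong = viaAsCharBuffer(ascii);

        // Four bytes collapse into two unrelated chars.
        System.out.println(wrong.length()); // prints 2
        System.out.printf("U+%04X U+%04X%n",
                (int) wrong.charAt(0), (int) wrong.charAt(1)); // prints U+3230 U+3034
    }
}
```

Those two code points are nowhere near the digits that were in the file,
and a console that can't display them will typically show '?' instead.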

Instead, you probably want to look into java.nio.charset.CharsetDecoder,
which has a method called decode(ByteBuffer) used to convert a
ByteBuffer into a CharBuffer. That allows you to decode the text
according to the encoding actually used in the source file.
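
A rough sketch of the decoder approach follows. The charset name
"ISO-8859-1" is only a guess at what the source file uses, and the class
and helper names are invented for the example; substitute whatever
encoding the file is really in.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

public class DecodeDemo {
    // Hypothetical helper: decodes the remaining bytes of buf using the
    // named charset. The buffer must already be flipped for reading.
    static String decode(ByteBuffer buf, String charsetName)
            throws CharacterCodingException {
        CharsetDecoder decoder = Charset.forName(charsetName).newDecoder();
        CharBuffer chars = decoder.decode(buf);
        return chars.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a buffer filled from a FileChannel.
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.put("20040202 165204, 0000.0.001, 00000000h, S Test"
                .getBytes("ISO-8859-1"));
        buf.flip(); // switch from writing to reading

        System.out.println(decode(buf, "ISO-8859-1"));
    }
}
```

One caveat: the one-argument decode(ByteBuffer) treats the buffer as a
complete input. If you read the file in chunks and the encoding is
multi-byte, a character can straddle two chunks, in which case you'd need
the three-argument decode(ByteBuffer, CharBuffer, boolean) form, or simply
an InputStreamReader with an explicit charset.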

(Alternatively, you could treat System.out as a byte stream and skip the
text decoding and encoding altogether. That would work if you just need
to copy data between streams, but probably won't work if you intend to
parse or otherwise process the data.)
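
For that byte-for-byte case, something along these lines would do. It is
only a sketch using plain java.io streams rather than channels; the class
name, helper name, and 4096-byte chunk size are arbitrary choices for the
example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class RawCopyDemo {
    // Hypothetical helper: copies bytes unchanged, with no charset involved.
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n); // write only the bytes actually read
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a FileInputStream on the text file.
        byte[] data = {0x53, 0x20, 0x54, 0x65, 0x73, 0x74, 0x0A};
        // System.out is a PrintStream, which is itself an OutputStream.
        copy(new ByteArrayInputStream(data), System.out);
    }
}
```

Since nothing is ever turned into a char, whatever encoding the file has
passes straight through to the destination.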

-- 
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.
Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

