Re: Character Encoding
From: John C. Bollinger (jobollin_at_indiana.edu)
Date: 02/21/05
- Next message: Antti S. Brax: "Re: how to code to avoid SQL insertion attacks"
- Previous message: John: "Re: HTML & Java"
- In reply to: Fred: "Character Encoding"
Date: Mon, 21 Feb 2005 11:30:34 -0500
Fred wrote:
> I've been using java.net.URLEncoder to encode text coming from a form
> on a web page before I store it in my database, and java.net.URLDecoder
> to decode the text when I read it from the database so I can display it
> to the user. I'm using UTF-8 character encoding.
>
> I recently had a problem where a user copied and pasted text from the
> Attachmate terminal emulator into a textarea and submitted the form.
> The text was stored successfully, but when it came time to decode it,
> the URLDecoder class started throwing errors. I'm guessing that some
> characters that were UTF-8 incompatible came along for the ride,
> because I've had similar problems with Attachmate in the past.
There are no characters incompatible with UTF-8 -- it is a
general-purpose charset covering all of Unicode. Moreover, if you
successfully _encode_ the characters with UTF-8 (in the process of
URL-encoding them) then there is absolutely no reason that you should
not be able to reverse the process. (You do, however, need to specify
UTF-8 at both encoding and decoding time.)
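For illustration, here is a minimal sketch of a round trip that names UTF-8 on both sides. The class and method names are hypothetical and are not taken from Fred's application; only java.net.URLEncoder and java.net.URLDecoder are the real classes under discussion.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper for illustration only; not code from the original post.
class Utf8RoundTrip {

    // URL-encode the text, explicitly naming UTF-8 as the charset.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    // Reverse the process, again explicitly naming UTF-8.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Accented letters, a euro sign, and characters that are special in URLs.
        String original = "na\u00EFve caf\u00E9 \u20AC 100% & more";
        String stored = encode(original);
        String restored = decode(stored);
        System.out.println(stored);
        System.out.println(original.equals(restored));
    }
}
```

If a round trip like this succeeds for your test data while your application still fails, the text is most likely being altered somewhere between the encode and decode steps, or one of the two calls is not actually using UTF-8.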
If you post a small, self-contained, compilable example that exhibits
the problem, preferably with test data, then we can probably point you
to where the problem lies. You would also get much better advice if you
showed the actual stack traces for the exceptions thrown. The problem
is not that the classes you are trying to use are broken; it is that you
are not using them according to specs.
Do note, by the way, that you have _two_ encoding/decoding pairs to
worry about here, and so far you have only discussed one. You also need
to worry about the encoding and decoding involved in sending the
form from the client to your application. Since you say you've had
trouble with Attachmate before, I tend to suspect that your
application's character handling is not as robust as you think it is.
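As a rough sketch of why that second pair matters, the following simulates a client and a server that disagree about the charset of a submitted form field. Everything here is an assumption made for illustration: the class and method names are hypothetical, and URLEncoder merely stands in for what a browser does when it submits a form. It shows silent corruption rather than an exception, so it is not a reproduction of the reported failure.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical simulation for illustration only; not code from the original post.
class FormCharsetMismatch {

    // Stand-in for a browser submitting a form field in clientCharset.
    static String submit(String text, String clientCharset) {
        try {
            return URLEncoder.encode(text, clientCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Stand-in for the server decoding the field with serverCharset.
    static String receive(String wire, String serverCharset) {
        try {
            return URLDecoder.decode(wire, serverCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String typed = "caf\u00E9";

        // Both sides agree: the text survives.
        System.out.println(typed.equals(receive(submit(typed, "UTF-8"), "UTF-8")));

        // Client sends UTF-8, server assumes ISO-8859-1: two wrong characters
        // appear where the single accented letter was.
        System.out.println(typed.equals(receive(submit(typed, "UTF-8"), "ISO-8859-1")));

        // Client sends ISO-8859-1, server assumes UTF-8: the byte is not valid
        // UTF-8 on its own, so the original character is lost.
        System.out.println(typed.equals(receive(submit(typed, "ISO-8859-1"), "UTF-8")));
    }
}
```

Text damaged at this stage is already wrong before it reaches the URLEncoder call that precedes the database write, so a clean encode/decode pair around the database cannot repair it.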
-- John Bollinger jobollin@indiana.edu