Re: Unescaping Unicode code points in a Java string
- From: "Oliver Wong" <owong@xxxxxxxxxxxxxx>
- Date: Thu, 31 Aug 2006 13:47:30 GMT
"Greg" <greghe@xxxxxxxxxxx> wrote in message news:1157007079.550984.122030@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
My Java program reads in (from an external source) text that contains
the same sort of Unicode character escape sequences as Java source
code. For example, one such string might be:
"En Espa\u00f1ol"
Naturally, I would like to convert the six-character subsequence,
"\u00f1", into the single character codepoint (hex 00F1) that those
characters actually represent:
"En Español"
I've been browsing the J2SE 1.5 docs hoping to find a convenient method
to perform this kind of conversion, but so far have not found one. Does
anyone have any suggestions?
Iterate through the characters of the String, looking for the sequence "\u". When you find it, skip those two chars and read the next 4 chars. Parse those 4 characters as an integer in hexadecimal notation, cast that integer to a char, and use the resulting char in place of the six-char escape sequence. Since Strings are immutable in Java, build the result in a StringBuilder rather than trying to delete from or insert into the original String.
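
The steps above could be sketched roughly as follows. This is only a minimal sketch, not a library method: the class name UnicodeUnescaper and the method name unescape are made up for illustration. It assumes every escape is exactly a backslash, a single 'u', and four hex digits, and it copies anything malformed through unchanged. It does not try to reproduce the full rules for Java source code (escaped backslashes, repeated 'u' characters).

```java
public class UnicodeUnescaper {

    // Hypothetical helper, not part of the JDK: replaces each
    // backslash + 'u' + four-hex-digit escape with the char it encodes.
    public static String unescape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            // An escape needs six chars in total: backslash, 'u', 4 hex digits.
            if (c == '\\' && i + 5 < input.length() && input.charAt(i + 1) == 'u') {
                int code = 0;
                boolean valid = true;
                for (int k = i + 2; k < i + 6; k++) {
                    // Character.digit returns -1 for a non-hex character.
                    int d = Character.digit(input.charAt(k), 16);
                    if (d < 0) {
                        valid = false;
                        break;
                    }
                    code = code * 16 + d;
                }
                if (valid) {
                    out.append((char) code);
                    i += 6; // skip the whole escape sequence
                    continue;
                }
            }
            // Ordinary char, or a malformed escape: copy it through as-is.
            out.append(c);
            i++;
        }
        return out.toString();
    }
}
```

Under those assumptions, calling UnicodeUnescaper.unescape on the text from the question should give back "En Español". Checking each digit with Character.digit, rather than handing the four chars to Integer.parseInt, avoids accepting a leading sign character as part of the number.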
- Oliver