Re: Unescaping Unicode code points in a Java string



Greg wrote:
My Java program reads in (from an external source) text that contains
the same sort of unicode character escape sequences as java source
code. For example, one such string might be:

"En Espa\u00f1ol"

Naturally, I would like to convert the five characters subsequence,
"\u00f1", into the single character codepoint (hex 00F1) that those
characters actually represent:

"En Español"

I've been browsing the J2SE 1.5 docs hoping to find a convenient method
to perform this kind of conversion, but so far have not found one. Does
anyone have any suggestions?

Long time ago I searched the Java API and sources for a method doing that kind of String decoding, but to no avail. The only thing I found was method
private String loadConvert(String)
in class java.util.Properties. But because it is private, it is not reusable outside Properties.

(You find the source in src.zip of JDK installation directory)

--
Thomas
.



Relevant Pages

  • RfD: Escaped Strings version 4
    ... the S" string can only contain printable characters, ... the S" string cannot contain the '"' character, ... as an escape character for the entry of characters that cannot be ... \b BS (backspace, ASCII 8) ...
    (comp.lang.forth)
  • RfD: Escaped Strings version 4
    ... the S" string can only contain printable characters, ... the S" string cannot contain the '"' character, ... as an escape character for the entry of characters that cannot be ... \b BS (backspace, ASCII 8) ...
    (comp.lang.forth)
  • Re: RfD: Escaped Strings
    ... the S" string can only contain printable characters, ... the S" string cannot contain the '"' character, ... \b BS (backspace, ASCII 8) ... \ ** escapes to characters much as C does. ...
    (comp.lang.forth)
  • Re: A note on computing thugs and coding bums
    ... code is valid for any character set that is legal in C (which is a ... characters in the required source character set ... A String, in C Sharp or Java, can be redefined. ... allow programmers to handle some other data format, ...
    (comp.programming)
  • Re: input & output in assembly
    ... [As you've not specified OS or assembler, ... using individual character I/O and handling the rest yourself in your ... it finds in that string, ... ENTER key is pressed (maximum buffer size: ...
    (comp.lang.asm.x86)