Re: HTML in utf8 and perl
From: Alan J. Flavell (flavell_at_ph.gla.ac.uk)
Date: 02/28/04
- Next message: fifo: "Re: using sed from with a perl script"
- Previous message: Bumble: "LWP User Agent/HTTP Request help needed!"
- In reply to: Pawel Niewiadomski: "Re: HTML in utf8 and perl"
- Next in thread: Andy Hassall: "Re: HTML in utf8 and perl"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Sat, 28 Feb 2004 18:45:26 +0000
On Sat, 28 Feb 2004, Pawel Niewiadomski wrote:
> "Alan J. Flavell" <flavell@ph.gla.ac.uk> wrote in
> > No, theoretically the second one should generate the Unicode character
> > which you specified. You're confusing Unicode values with their utf-8
> > encodings.
>
> That was the answer I was looking for. I didn't really quite understand
> the difference between the encoding of the character in utf8 and its
> value in Unicode.
Glad it helped.
Of course, now that you know the answer, it should be easy to find it in
the documentation. :-}
The Unicode "code points" (the term used in perluniintro) are
encoded in different ways (different bit-patterns) in utf-8, utf-16 or
indeed other applicable Unicode encodings, but they still represent
the same "code point". It just so happens that Perl chose internally
to represent characters by using utf-8 representation, but the ord()
values of the Unicode characters are still their code point values,
and, as you've now seen, the wide character constant is represented by
\x{...} using its code point value, the same as is tabulated in the
character code charts at the Unicode site,
http://www.unicode.org/charts/
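
To make the distinction concrete, here is a minimal sketch, assuming a Perl with the core Encode module (5.8 or later). The character U+0105 is only an arbitrary example picked for illustration, not the one from the original question. It shows that ord() and \x{...} deal in the code point, while the bytes differ from one encoding to the next.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character, written by its code point.
# U+0105 is an arbitrary choice for this illustration.
my $char = "\x{105}";

# ord() reports the code point, whatever Perl uses internally.
my $code_point = sprintf "U+%04X", ord $char;

# The same code point becomes different byte sequences
# depending on the encoding chosen for output.
my $utf8_hex  = join " ", map { sprintf "%02X", $_ }
                unpack "C*", encode("UTF-8", $char);
my $utf16_hex = join " ", map { sprintf "%02X", $_ }
                unpack "C*", encode("UTF-16BE", $char);

print "code point: $code_point\n";
print "length:     ", length($char), "\n";
print "UTF-8:      $utf8_hex\n";
print "UTF-16BE:   $utf16_hex\n";
```

The code point printed is U+0105 and the length is 1 character, yet the utf-8 bytes are C4 85 and the utf-16 (big-endian) bytes are 01 05. Writing "\xC4\x85" in the source would instead give two separate characters, which is the confusion discussed above.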
all the best