HTML::Entities issue

From: Jan Eden (lists_at_janeden.org)
Date: 04/29/04


Date: Thu, 29 Apr 2004 11:57:18 +0200
To: Perl Lists <beginners@perl.org>

Hi all,

I have the following script (just a test):

---
#!/usr/bin/perl -w
use strict;
use HTML::Entities;
my $string = 'Alfred D&ouml;blin: Berlin Alexanderplatz';
my $string2 = 'Alfred Döblin: Berlin Alexanderplatz';
$string  = decode_entities($string);
print $string, "\n", $string2, "\n";
---
This prints the following in my terminal:
Alfred D?blin: Berlin Alexanderplatz
Alfred Döblin: Berlin Alexanderplatz
Now the perldoc for HTML::Entities says:
>decode_entities( $string )
>This routine replaces HTML entities found in the $string with the
>corresponding ISO-8859-1 character, and if possible (under perl 5.8
>or later) will replace to Unicode characters.  Unrecognized entities
>are left alone.
I do have Perl 5.8.1, so I'd expect the decode_entities function to return a Unicode character string. Why doesn't it do that?
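
One way to narrow this down is to look at the code point of the decoded character instead of relying on how the terminal renders it. The following is only a minimal sketch. It assumes the terminal expects UTF-8, which is not stated above, and the variable $char and the binmode call are additions for illustration, not part of the original script.

```perl
#!/usr/bin/perl -w
use strict;
use HTML::Entities;

# Assumption: the terminal expects UTF-8, so encode output explicitly.
binmode(STDOUT, ':utf8');

my $string = decode_entities('Alfred D&ouml;blin: Berlin Alexanderplatz');

# The decoded character sits at index 8 ("Alfred D" is 8 characters long).
my $char = substr($string, 8, 1);

# Print the code point rather than the raw character, so the result
# does not depend on how the terminal interprets bytes.
printf "U+%04X\n", ord($char);    # prints U+00F6

# With the :utf8 layer in place, the character is written as UTF-8.
print $string, "\n";
```

If the first line of output is U+00F6, the entity was decoded to the right character and the "?" probably comes from how the bytes are written to the terminal. Note that under the same layer, a literal such as $string2, typed as UTF-8 bytes in a source file without "use utf8", would be encoded a second time.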
Thanks,
Jan
-- 
A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools.