XML::DOM Encoding UTF-8 and ISO-8859-1

From: Addy (aclaure_at_zethon.net)
Date: 02/18/04


Date: 18 Feb 2004 11:54:52 -0800

I'm a little confused as to why I'm getting these results. Consider
the XML file:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<foo>
  <string>Sécurité</string>
</foo>

Through a CGI script, I load up the file, grab the encoding and put in
the CGI header:

use CGI qw(header);
use XML::DOM;

# Parse the file and reuse the encoding from its XML declaration
my $parser   = XML::DOM::Parser->new;
my $doc      = $parser->parsefile('foo.xml');
my $encoding = $doc->getXMLDecl()->getEncoding();
print header(-charset => $encoding);

However, when I traverse the XML and print out the above
"string" element, I see garbled text like "Sécurité".

If I change the CGI header encoding to UTF-8 like so:

print header(-charset => 'UTF8');

The text shows up properly. I would have expected the text to
display properly when the HTML page uses the same encoding as the
XML file. That is what happens with other encodings, namely
'x-sjis-cp932'.
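
To make the symptom concrete, here is a minimal sketch of one possible explanation. It assumes that XML::DOM, which sits on top of XML::Parser/expat, hands element text back as UTF-8 regardless of the encoding declared in the file; that assumption is not confirmed anywhere in this post. The sample byte string and the hex_bytes helper are hypothetical, and only the core Encode module is used. Under that assumption, the text would have to be transcoded to the declared charset before printing for the ISO-8859-1 header to match:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: show a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

# Assumption: the parser returned the element text as UTF-8 bytes
# (each accented e is the two-byte sequence C3 A9).
my $utf8_bytes = "S\xC3\xA9curit\xC3\xA9";

# Encoding taken from the XML declaration in foo.xml.
my $declared = 'ISO-8859-1';

# Decode to Perl characters, then encode to the declared charset.
my $text         = decode('UTF-8', $utf8_bytes);
my $latin1_bytes = encode($declared, $text);

print 'as returned: ', hex_bytes($utf8_bytes),   "\n";
print 'transcoded:  ', hex_bytes($latin1_bytes), "\n";
```

If the assumption holds, a browser told the page is ISO-8859-1 would read each two-byte UTF-8 sequence as two separate characters, which would give exactly the garbled "Ã©" seen above, and it would also explain why a UTF-8 header displays the text correctly.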

Could someone help me understand what I'm overlooking?

Thank you,
Addy


