Re: [XML::Simple-2.12] problems parsing non ASCII strings
- From: Michel Rodriguez <mirod@xxxxxxxxx>
- Date: Tue, 12 Jul 2005 19:16:53 +0200
Jul wrote:
module: XML::Simple-2.12 (also tried 2.14)
perl version: 5.00503
Wahouh! Do you know how old this is? 5, 6 years old?
I need to parse and write an XML configuration file which contains non-ASCII characters (like 'é', in French). I've chosen XML::Simple with XML::Parser for these tasks, but everything works fine if and only if I do not include any special character in the file; otherwise the hash returned by XMLin() is totally messed up.
What is the encoding of your file? My guess is that it is either ISO-8859-1 (or -15) or some kind of windows-12nn code page.
What happens is that the data is read, probably by expat, and converted to UTF-8. The "totally messed up" characters are in fact perfectly valid UTF-8 characters that your terminal (or whatever you use to display them) is not set up to display.
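To make that concrete, here is a minimal sketch of what happens to a single 'é'. It assumes Perl 5.8 or later, because it uses the core Encode module (which 5.005_03 does not have), and it assumes the file is ISO-8859-1, which is only my guess from above.

```perl
use strict;
use warnings;
use Encode qw(decode encode);   # core since Perl 5.8; not available on 5.005_03

# 'é' as the single byte it occupies in an ISO-8859-1 file (assumed encoding)
my $latin1_bytes = "\xE9";

# In effect what the parser does: decode the declared encoding,
# then hand the text on as UTF-8
my $char       = decode('ISO-8859-1', $latin1_bytes);
my $utf8_bytes = encode('UTF-8', $char);

# Hex dumps of both byte sequences
my $latin1_hex = uc unpack('H*', $latin1_bytes);
my $utf8_hex   = uc unpack('H*', $utf8_bytes);

print "ISO-8859-1 bytes: $latin1_hex\n";   # E9
print "UTF-8 bytes:      $utf8_hex\n";     # C3A9
```

A terminal set to ISO-8859-1 shows those two UTF-8 bytes as "Ã©" instead of "é", which is probably the kind of mess you are seeing.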
If XML::Simple can read the file at all, then the encoding must already be declared in the XML declaration, at the very beginning of the XML file.
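For illustration, a small sketch of a document with such a declaration being read by XMLin(). The document content and the element names (config, title) are made up, ISO-8859-1 is an assumption about your file, and the exact form of the returned string depends on your Perl version and on which parser XML::Simple ends up using.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Made-up document: the declaration names the encoding, and the
# 'é' in the content is the single ISO-8859-1 byte 0xE9
my $xml = qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n}
        . qq{<config><title>caf\xE9</title></config>};

# XMLin() treats a string containing markup as XML rather than a file name
my $config = XMLin($xml);

# Print the code points of what came back, independent of any terminal
my $title = $config->{title};
print join(' ', map { sprintf('U+%04X', ord $_) } split //, $title), "\n";
```

Dumping code points like this, rather than printing the string itself, takes the terminal out of the picture when you are trying to work out whether the data or only the display is wrong.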
Your choices are either to convert those characters back to the original encoding (look at the Unicode::* modules on CPAN), or to bite the Unicode bullet and learn how to work with UTF-8 data. In the long run the second option makes more sense, but YMMV.
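A rough sketch of the first option, for a newer Perl: it uses the core Encode module (Perl 5.8 or later) rather than the Unicode::* modules, and to_latin1 is a hypothetical helper name of mine, not part of XML::Simple. It walks a nested structure like the one XMLin() returns and re-encodes every string as ISO-8859-1, again assuming that is the original encoding.

```perl
use strict;
use warnings;
use Encode qw(decode encode);   # core since Perl 5.8

# Hypothetical helper: re-encode every key and value of a nested
# hash/array structure from Perl character strings to ISO-8859-1 bytes.
sub to_latin1 {
    my ($data) = @_;
    if (ref $data eq 'HASH') {
        my %out;
        $out{ to_latin1($_) } = to_latin1($data->{$_}) for keys %$data;
        return \%out;
    }
    if (ref $data eq 'ARRAY') {
        return [ map { to_latin1($_) } @$data ];
    }
    # Characters with no ISO-8859-1 equivalent are replaced by Encode's
    # substitution character ('?') under the default check mode
    return defined $data ? encode('ISO-8859-1', $data) : undef;
}

# Stand-in for what XMLin() hands back: text decoded from UTF-8
my $config = {
    title => decode('UTF-8', "caf\xC3\xA9"),
    words => [ decode('UTF-8', "\xC3\xA9t\xC3\xA9") ],
};

my $latin1 = to_latin1($config);
print uc unpack('H*', $latin1->{title}), "\n";   # 636166E9
```

The catch with this route is that anything outside ISO-8859-1 is lost on the way back, which is one more reason the second option tends to pay off in the long run.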
But really, processing XML with perl 5.00503 seems like a bad idea to me.
-- mirod