Reading XML file - chars being dropped



Hello people,

I have a PHP script that parses an XML file, and I am having a problem
when the character data contains extended characters (such as é). The
ef_characterData function is the character data handler for the XML
parser. When I feed it an XML file like the one below, $ef['title']
ends up containing only "é to the White House"; everything before the
é is lost for some reason.

If I echo the $data variable right where the assignment to
$ef['title'] occurs, the output shows the entire string
("Attaché to the White House"). It looks as if the assignment itself
is truncating the string.

I figure this is because PHP strings are limited to single-byte
characters (256 possible values), but does anyone have a workaround?

<php code>
function ef_characterData($parser, $data) {
    global $curTag, $ef;
    $titleKey = "^ROOT^TITLE";
    if ($curTag == $titleKey) $ef['title'] = $data;
}
</php code>

<xml file>
<Root>
<title>The local Attaché to the white house</title>
</Root>
</xml file>
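
One possible explanation, offered as a guess rather than a confirmed diagnosis: the PHP manual notes that the character data handler can be called several times for the text of a single element (for example around non-ASCII characters). If that is what happens here, each call overwrites $ef['title'], so only the last piece survives, while an echo inside the handler prints every piece in turn and looks complete. The sketch below appends with .= instead of assigning. The ef_startElement and ef_endElement handlers, and the way $curTag is built, are hypothetical, since the original script does not show them; the input is assumed to be UTF-8.

```php
<?php
// Sketch only: ef_startElement/ef_endElement and the $curTag bookkeeping
// are assumptions, not taken from the original script.
$curTag = '';
$ef = ['title' => ''];

function ef_startElement($parser, $name, $attrs) {
    global $curTag;
    // Case folding is on by default, so element names arrive upper-cased.
    $curTag .= '^' . $name;
}

function ef_endElement($parser, $name) {
    global $curTag;
    // Drop the trailing "^NAME" segment from the tag path.
    $curTag = substr($curTag, 0, strrpos($curTag, '^'));
}

function ef_characterData($parser, $data) {
    global $curTag, $ef;
    $titleKey = "^ROOT^TITLE";
    // Append rather than assign: the handler may receive the text in pieces.
    if ($curTag == $titleKey) $ef['title'] .= $data;
}

// Assumed to be UTF-8, matching the sample file above.
$xml = "<Root>\n<title>The local Attaché to the white house</title>\n</Root>";

$parser = xml_parser_create('UTF-8');
xml_set_element_handler($parser, 'ef_startElement', 'ef_endElement');
xml_set_character_data_handler($parser, 'ef_characterData');
xml_parse($parser, $xml, true);

echo $ef['title'], "\n";
```

Because the pieces are concatenated, the result should not depend on where the parser happens to split the text. If the file can contain more than one title element, $ef['title'] would need to be reset in the start-element handler.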

Thanks in advance!

MW