Re: XML SAX parser bug?



mitsura@xxxxxxxxx wrote:

> I think I ran into a bug in the XML SAX parser.
>
> Part of my program consists of reading a rather large XML file (about
> 10 MB) containing a few thousand elements.
> I have the following problem: sometimes the SAX parser misreads a
> line.
> Let me explain: the XML file contains a few thousand lines like this:
> "
> <TargetRef>WINOSSPI:Storage@@n91c90a.cmc.com</TargetRef>
> "
> where 'n91c90a.cmc.com' is the name of a system and thus changes per
> system.
> In a few cases, the SAX parser misreads the line. The parser sometimes
> splits the line's characters into:
> "WINOSSPI:Storage@@n" and "91c90a.cmc.com".
> I put a 'print characters' line in the 'characters' method of the
> parser; that is how I found out.
> It only happens for a few of the thousand lines but you can imagine
> that is very annoying.
>
> I checked for errors in the XML file but the file seems ok.
>
> Is this a bug or am I doing something wrong?

it's not a bug; the parser is free to split up character runs (due to buffering,
entities or character references, etc.), so characters() may be called more than
once for a single text node. it's up to you to merge character runs into strings.
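
A minimal sketch of merging character runs, assuming Python 3 and the standard xml.sax module. The class name TargetRefHandler and the helper parse_targets are hypothetical, chosen only for this illustration; the element name TargetRef comes from the question. The handler collects every chunk passed to characters() while inside the element and joins them in endElement(), so it does not matter where the parser splits the text.

```python
import xml.sax


class TargetRefHandler(xml.sax.ContentHandler):
    """Collects the full text of each <TargetRef> element (hypothetical name)."""

    def __init__(self):
        super().__init__()
        self.targets = []    # one complete string per <TargetRef>
        self._chunks = None  # list of text pieces while inside <TargetRef>

    def startElement(self, name, attrs):
        if name == "TargetRef":
            self._chunks = []

    def characters(self, content):
        # May be called several times for one text node; just buffer the pieces.
        if self._chunks is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "TargetRef" and self._chunks is not None:
            # Only here is the text known to be complete.
            self.targets.append("".join(self._chunks))
            self._chunks = None


def parse_targets(xml_bytes):
    """Parse a bytes document and return the text of every <TargetRef>."""
    handler = TargetRefHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.targets
```

For example, parse_targets(b"<r><TargetRef>WINOSSPI:Storage@@n91c90a.cmc.com</TargetRef></r>") should return the whole string as a single list entry, whether or not the parser delivered it in one piece.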

</F>


