Re: Parsing Large XML files

From: Derek Fountain (nospam_at_example.com)
Date: 03/01/05


Date: Tue, 01 Mar 2005 19:25:38 +0800

doug wrote:

> How can I parse a large XML file that is too large for memory? I am
> currently using PHP 5.0.3 and the libxml parser. I would like to read
> it incrementally from a file, but the parser takes the entire contents
> as a string.

The standard XML solution to this problem is to use a SAX parser instead of
a DOM one. PHP's bundled xml extension (the xml_parser_create() family) is
event-based in the SAX style, and xml_parse() accepts data in chunks, so
that may be enough. Another option appears to be:

http://www.engageinteractive.com/mambo/index.php?option=content&task=view&id=3628&Itemid=10159

Google might help find others. Or maybe use an external SAX-based tool to
boil the XML down to something smaller that you can manipulate from PHP?
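
To make the SAX route concrete, here is a minimal sketch using the
xml_parser_* functions from PHP's xml extension, fed one chunk at a time
with fread(). It assumes that extension is enabled and a PHP recent enough
for closures and type declarations (7.0 or later). count_elements() and the
8192-byte chunk size are made-up names and values for illustration, not
anything from the original question.

```php
<?php
// Sketch only: count element names in an XML file without loading it whole.
// count_elements() is a hypothetical helper, not part of any library.
function count_elements(string $path, int $chunkSize = 8192): array
{
    $counts = [];

    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Called once for every opening tag.
        function ($parser, string $name, array $attrs) use (&$counts) {
            $counts[$name] = ($counts[$name] ?? 0) + 1;
        },
        // Called once for every closing tag; nothing to do in this sketch.
        function ($parser, string $name) {
        }
    );

    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }

    while (!feof($fh)) {
        $chunk = fread($fh, $chunkSize);
        if ($chunk === false) {
            fclose($fh);
            throw new RuntimeException("Read error on $path");
        }
        // The third argument tells the parser whether this is the last chunk.
        if (!xml_parse($parser, $chunk, feof($fh))) {
            $message = sprintf(
                'XML error at line %d: %s',
                xml_get_current_line_number($parser),
                xml_error_string(xml_get_error_code($parser))
            );
            fclose($fh);
            throw new RuntimeException($message);
        }
    }

    fclose($fh);
    return $counts;
}
```

Only one chunk is held in memory at a time, so memory use should depend on
what the handlers keep (here, a small array of counts) rather than on the
size of the file. Calling count_elements('big.xml'), with big.xml standing
in for your own file, returns an array mapping element names to counts.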

-- 
The email address used to post is a spam pit. Contact me at
http://www.derekfountain.org

