Re: XML parsing with python



On Aug 18, 11:24 am, Stefan Behnel <stefan...@xxxxxxxxx> wrote:
inder wrote:
On Aug 17, 8:31 pm, John Posner <jjpos...@xxxxxxxxxxx> wrote:
Use the iterparse() function of the xml.etree.ElementTree package.
http://effbot.org/zone/element-iterparse.htm
http://codespeak.net/lxml/parsing.html#iterparse-and-iterwalk
Stefan
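
For reference, a minimal sketch of the iterparse() approach suggested above. It assumes a document made of <book> elements with a <title> child (element names as used elsewhere in this thread). The sample data and the helper name iter_titles are invented for illustration and are not from the original posts.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; a real run would pass a filename instead.
SAMPLE = b"""<library>
  <book><title>First</title></book>
  <book><title>Second</title></book>
</library>"""


def iter_titles(source):
    """Yield the text of each book/title as soon as its <book> is complete."""
    # "end" events fire once an element and all of its children are parsed.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "book":
            yield elem.findtext("title")
            # Discard the children and text of the finished <book>.
            elem.clear()


titles = list(iter_titles(io.BytesIO(SAMPLE)))
print(titles)
```

Clearing each element after use is what makes this suitable for large files: the parser still reads the whole document, but processed subtrees are not kept in memory.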
iterparse() is too big a hammer for this purpose, IMO. How about this:

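  from xml.etree.ElementTree import ElementTree
  tree = ElementTree(None, "myfile.xml")
  # './/' matches at any depth; a path starting with '/' triggers a
  # FutureWarning in newer ElementTree versions
  for elem in tree.findall('.//book/title'):
      print(elem.text)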

-John

Thanks for the prompt reply.

Let me try using iterparse. Will it be slower than SAX parsing?
Ultimately I will have a huge XML file to parse.

If you use the cElementTree module, it may even be faster.
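
A small sketch of how the C-accelerated module can be picked up without tying the script to one Python version. It relies on xml.etree.cElementTree having been added in Python 2.5 and removed in Python 3.9; from Python 3.3 on, xml.etree.ElementTree uses the C accelerator automatically when it is available. The sample XML is invented for illustration.

```python
try:
    # C implementation, available from Python 2.5 through 3.8.
    import xml.etree.cElementTree as ET
except ImportError:
    # Python 3.9+ no longer ships cElementTree; the plain module
    # already uses the C accelerator when it is available.
    import xml.etree.ElementTree as ET

# Hypothetical sample document.
root = ET.fromstring("<library><book><title>First</title></book></library>")
first_title = root.find("book/title").text
print(first_title)
```

Both modules expose the same API, so the rest of the code does not need to know which one was imported.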

Another question: I will also need to validate my XML against an XSD.
I would like to do this validation through the parsing tool itself.

In that case, you can use lxml instead of ElementTree.

http://codespeak.net/lxml/

Stefan
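
A minimal sketch of XSD validation with lxml, assuming lxml is installed (it is a third-party package, not part of the standard library). The schema, the sample documents, and the helper name is_valid are invented for illustration.

```python
try:
    from lxml import etree  # third-party; install separately
except ImportError:
    etree = None  # the sketch below is skipped when lxml is missing

# Hypothetical schema: a <book> must contain exactly one <title>.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

GOOD = b"<book><title>First</title></book>"
BAD = b"<book><author>Nobody</author></book>"


def is_valid(xml_bytes, xsd_bytes=XSD):
    """Return True if xml_bytes conforms to the schema in xsd_bytes."""
    schema = etree.XMLSchema(etree.fromstring(xsd_bytes))
    doc = etree.fromstring(xml_bytes)
    return schema.validate(doc)


if etree is not None:
    print(is_valid(GOOD), is_valid(BAD))
```

To validate during parsing rather than afterwards, the schema object can instead be passed as schema= to etree.XMLParser, in which case an invalid document raises an error at parse time.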

Hi,

Is lxml part of the standard Python distribution? I have Python 2.5.

I might not be able to use any packages beyond the standard Python
library. Could you please suggest something that is part of the
standard library?

Thanks