Re: Parsing Rdf (Rewrite)



On 5/31/07, Brandon McGinty <brandon.mcginty@xxxxxxxxx> wrote:
I would think that I could do:
etexts=tree.findall('pgterms:etext')
(or something like that), which would pull out each etext record in the
file.
I could then do:
for book in etexts:
    print book.get('id')
This isn't yielding anything for me, no matter how I write it.
Any thoughts on this?

I know very little about ElementTree, but a bit of experimentation
shows that the following seems to work:

import xml.etree.cElementTree as et

tree = et.parse("C:/temp/catalog.rdf")
root = tree.getroot()
etexts = tree.findall("{http://www.gutenberg.org/rdfterms/}etext")
for book in etexts:
    print book.get("{http://www.w3.org/1999/02/22-rdf-syntax-ns#}ID")

I see some comments on namespace issues here:
http://effbot.org/zone/element.htm#xml-namespaces if that helps.
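
For what it's worth, here is a rough sketch of the same lookup using a
prefix map, which more recent Python versions accept as a second
argument to findall(). It runs against a tiny made-up sample rather than
the real catalog.rdf, and the names SAMPLE, NS and etext_ids are my own
placeholders, not anything from the catalog or from ElementTree. Note
that the prefix map only applies to the tag path; as far as I can tell,
the attribute name passed to get() still has to be spelled out with the
full namespace URI in braces.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature stand-in for catalog.rdf (the real file is much larger).
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:pgterms="http://www.gutenberg.org/rdfterms/">
  <pgterms:etext rdf:ID="etext1"/>
  <pgterms:etext rdf:ID="etext2"/>
</rdf:RDF>
"""

# Prefix-to-URI map; the prefixes here are local shorthand and only
# the URIs have to match the ones declared in the document.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "pgterms": "http://www.gutenberg.org/rdfterms/",
}


def etext_ids(root):
    """Return the rdf:ID value of every pgterms:etext child of root."""
    # Attribute lookups do not use the prefix map, so build the
    # "{uri}name" form by hand.
    id_attr = "{%s}ID" % NS["rdf"]
    return [book.get(id_attr) for book in root.findall("pgterms:etext", NS)]


root = ET.fromstring(SAMPLE)
ids = etext_ids(root)
print(ids)
```

For the real file you would presumably swap ET.fromstring(SAMPLE) for
ET.parse("C:/temp/catalog.rdf").getroot() and leave the rest alone.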

--
Jerry