Re: xml.sax feature question

From: Martin v. Löwis (martin_at_v.loewis.de)
Date: 10/26/03


Date: 26 Oct 2003 10:56:50 +0100

christof hoeke <csad7@yahoo.com> writes:

> the problem i have is that if the xmlfile has a doctype declaration
> the sax parser tries to load it and fails (IOError of course).
> partly because the path to the DTD is just a simple name in the same
> dir e.g. <!DOCTYPE contacts SYSTEM "contacts.dtd"> and i guess the
> parser does not use the path os.path.walk uses (can i somehow give the
> parser this information?). but it also could be a DTD which should be
> loaded over a network which is not available at the time.

In XML, the SYSTEM identifier is a URI reference; in your case, it is
a relative URL. An XML processor must interpret this relative to the
URL of the main document. If you have the main document on a local
disk, the relative URL will be interpreted relative to the location of
that file. So you should put the DTD along with the document (in the
same directory).
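
To illustrate the resolution rule, here is a small sketch using
urllib.parse.urljoin from the standard library. The document URLs and
the helper name resolve_system_id are made up for the example; it only
shows how a relative reference combines with a base URL, and is not
necessarily the exact code path the parser takes.

```python
from urllib.parse import urljoin


def resolve_system_id(document_url, system_id):
    """Resolve a SYSTEM identifier against the URL of the main document."""
    return urljoin(document_url, system_id)


# Hypothetical document locations, chosen only for illustration.
local = resolve_system_id("file:///home/user/data/contacts.xml", "contacts.dtd")
remote = resolve_system_id("http://example.org/xml/contacts.xml", "dtd/contacts.dtd")

print(local)   # the DTD is looked up next to contacts.xml
print(remote)  # the relative path is appended to the document's directory
```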

> i guess to simply set a feature of the sax parser to not try to load
> any external DTDs should work. question is which feature do i have to
> disable?
> p = xml.sax.make_parser()
> p.setFeature('http://xml.org/sax/features/validation', False)
>
> i thought turning off the validation would stop the parser to load
> external DTDs, but it still tries to load them.

This just turns off validation. The parser you are using is not
validating anyway, so this has no effect. The parser still loads the
DTD, in order to expand entity references it may encounter.

> any other suggestions?

You need to turn off resolution of external general entities:

p.setFeature("http://xml.org/sax/features/external-general-entities", False)
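
A minimal, self-contained sketch of this, assuming the default
expat-based parser. The names NameCollector and collect_names and the
in-memory document are invented for the example; the document is
modelled on your DOCTYPE line and deliberately points at a DTD that
does not exist. feature_external_ges is the constant in
xml.sax.handler that holds the feature URL above.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges

# Hypothetical test document; contacts.dtd is intentionally missing.
DOC = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE contacts SYSTEM "contacts.dtd">\n'
    b'<contacts><contact/><contact/></contacts>'
)


class NameCollector(ContentHandler):
    """Counts how often each element name occurs."""

    def __init__(self):
        super().__init__()
        self.names = {}

    def startElement(self, name, attrs):
        self.names[name] = self.names.get(name, 0) + 1


def collect_names(data):
    parser = xml.sax.make_parser()
    # Do not fetch external general entities (this includes the external DTD).
    parser.setFeature(feature_external_ges, False)
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.names


print(collect_names(DOC))
```

One consequence to keep in mind: entities that are declared only in
the external DTD cannot be expanded when the DTD is not loaded.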

Alternatively, you can install an entity resolver (with
p.setEntityResolver) which then uses a different mechanism of
resolving the DTD (and other external entities).
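
As a rough sketch of the resolver approach, assuming the expat-based
parser: the class below answers every external entity request with an
empty in-memory stream, so neither the disk nor the network is
touched. EmptyEntityResolver, NameCollector and parse_with_resolver
are hypothetical names, and returning an empty stream is only one
possible policy; a real resolver might map system identifiers to local
copies of the DTDs instead.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical test document; contacts.dtd is intentionally missing.
DOC = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE contacts SYSTEM "contacts.dtd">\n'
    b'<contacts><contact/></contacts>'
)


class EmptyEntityResolver(EntityResolver):
    """Records each request and hands back an empty stream instead of a file."""

    def __init__(self):
        self.requested = []

    def resolveEntity(self, publicId, systemId):
        self.requested.append(systemId)
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(b""))  # nothing is read from disk or network
        return source


class NameCollector(ContentHandler):
    """Counts how often each element name occurs."""

    def __init__(self):
        super().__init__()
        self.names = {}

    def startElement(self, name, attrs):
        self.names[name] = self.names.get(name, 0) + 1


def parse_with_resolver(data):
    parser = xml.sax.make_parser()
    # The resolver is only consulted while external entities are enabled.
    parser.setFeature(feature_external_ges, True)
    resolver = EmptyEntityResolver()
    handler = NameCollector()
    parser.setEntityResolver(resolver)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.names, resolver.requested


names, requested = parse_with_resolver(DOC)
print(names, requested)
```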

Regards,
Martin


