Re: Problem round-tripping with xml.dom.minidom pretty-printer



The last line of p() calls itself: it is an unconditional recursive call
so, no matter what it does, it will never stop. And since p() also
prints something, calling it will print endlessly.

Sorry, I wasn't clear. I realize that this recurses endlessly. The
problem is that it also adds blank lines endlessly.

By removing this line, you get something like:

<?xml version="1.0" ?>
<a>
<b>
<c/>
</b>
</a>

That seems sensible, imo. Was that what you wanted?

Sure. That's fine unless you then re-parse this output and print it
again, in which case you get the behaviour you describe:

An additional thing to keep in mind is that toprettyxml does not print
XML identical to the original DOM tree: it adds newlines and tabs.
When parsed again, these whitespace characters are inserted in the DOM
tree as text nodes. If you toprettyxml an XML document twice in a row,
then the second pass will also add newlines and tabs around the newlines
and tabs added by the first. Since you call toprettyxml an infinite
number of times, it is expected that lots of blank characters appear.
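
A minimal sketch of that effect, assuming a small document shaped like
the output shown earlier in the thread (the names src, first and second
are made up for illustration and are not from the original p()):

```python
from xml.dom.minidom import parseString

# Hypothetical input with no whitespace between elements.
src = "<a><b><c/></b></a>"

# First pass: toprettyxml() adds newlines and tabs around each element.
first = parseString(src).toprettyxml()

# Second pass: the added whitespace is parsed back as text nodes, and
# toprettyxml() then indents those text nodes as well.
second = parseString(first).toprettyxml()

print(len(first.splitlines()), len(second.splitlines()))
```

The second result has more lines than the first, and each further
parse-and-print pass adds more.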

Right. That's the behaviour I'm asking about, which I consider to be
problematic. I would expect any module providing both a parser and a
pretty-printer (not just one for XML) to be able to round-trip
conservatively.

As far as I can see (and your comments back this up) minidom doesn't
have this property. Unless anyone knows how to get it to behave that
way...
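
One possible workaround, offered as a sketch and not as something
minidom provides: drop whitespace-only text nodes after parsing and
before printing. The helpers strip_whitespace_nodes and pretty are
hypothetical names, and this assumes whitespace-only text is never
significant in the document, which does not hold for every XML
vocabulary.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Iterate over a copy, since removeChild() mutates childNodes.
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)

def pretty(xml_text):
    """Parse, discard whitespace-only text nodes, then pretty-print."""
    doc = parseString(xml_text)
    strip_whitespace_nodes(doc)
    return doc.toprettyxml()

once = pretty("<a><b><c/></b></a>")
twice = pretty(once)
print(once == twice)
```

For element-only documents like this one the output is stable under
repeated passes. Mixed content (text alongside child elements) may
still change, since toprettyxml writes extra whitespace around such
text.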

Ben