Problem round-tripping with xml.dom.minidom pretty-printer



Hello

I have run into a problem using minidom. I have an HTML file that I
want to make occasional, automated changes to (adding new links). My
strategy is to parse it with minidom, add a node, pretty print it and
write it back to disk.
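
For concreteness, the strategy above could be sketched roughly as follows. This is only an illustration: the function name add_link and its parameters path, href and label are made up for the example, and it assumes the file is well-formed XHTML with a <body> element, since minidom cannot parse arbitrary HTML.

```python
import xml.dom.minidom as dom

def add_link(path, href, label):
    """Parse the file, append a link to <body>, and write it back."""
    d = dom.parse(path)
    # Assumption: the document has exactly the structure we expect.
    body = d.getElementsByTagName('body')[0]
    a = d.createElement('a')
    a.setAttribute('href', href)
    a.appendChild(d.createTextNode(label))
    body.appendChild(a)
    # Pretty-print and overwrite the original file.
    with open(path, 'w', encoding='utf-8') as f:
        f.write(d.toprettyxml())
```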

However, I find that every time I do a round trip, minidom's pretty
printer puts extra blank lines around every element, so my file grows
without limit. Normalizing the document doesn't make any difference.
Obviously I can fix the problem by doing without the pretty-printing,
but I don't really like producing HTML that isn't human-readable.

Here is some code that shows the behaviour:

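import xml.dom.minidom as dom

def p(t, rounds=3):
    # Parse, normalize and pretty-print, then feed the output back in.
    # Each round prints more blank lines than the one before.
    for _ in range(rounds):
        d = dom.parseString(t)
        d.normalize()
        t = d.toprettyxml()
        print(t)

p('<a><b><c/></b></a>')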

Does anyone know how to fix this behaviour? If not, can anyone
recommend an alternative XML tool for simple tasks like this?
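
One possible workaround, sketched here under the assumption that whitespace-only text nodes carry no meaning in the document: remove them before calling toprettyxml(), so that the indentation written by the previous pass is not re-parsed as content and padded again. normalize() does not help because it only merges adjacent text nodes. The helper names strip_whitespace_nodes and round_trip are invented for this example, and text mixed in with child elements would still pick up extra whitespace.

```python
import xml.dom.minidom as dom

def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Iterate over a copy, since children are removed during the loop.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)

def round_trip(text):
    """Parse, drop whitespace-only text nodes, and pretty-print."""
    d = dom.parseString(text)
    strip_whitespace_nodes(d)
    return d.toprettyxml()

once = round_trip('<a><b><c/></b></a>')
twice = round_trip(once)
print(once == twice)
```

With the whitespace nodes removed, a second pass over the pretty-printed output should give the same text as the first, which is what the final comparison checks.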

Thanks
Ben