Re: Html or Pdf to Rtf (Linux) with Python

From: Stephen Thorne (stephen.thorne_at_gmail.com)
Date: 12/17/04


Date: Fri, 17 Dec 2004 11:00:18 +1000
To: Axel Straschil <axel@straschil.com>, python-list@python.org

On Thu, 16 Dec 2004 19:30:37 +0000 (UTC), Axel Straschil
<axel@straschil.com> wrote:
> > That's easy. Load the HTML in MS Word, and save it as RTF. Script it
> > via COM using the python win32all (I think that's what it's now
> > called) package.
>
> As I wrote in my posting and the subject: linux ;-)
> I could try to do this with OpenOffice, but I'm afraid this will not
> be a performant solution ;-(
> I really spent hours on that; the only thing I found was a
> specification for rich text, maybe a good point to start a project ...

I've successfully gotten Konqueror to generate a PDF from an
HTML file via dcop. It's something along the lines of:
% dcop konqueror-25827 html-widget1 print 1
You can launch Konqueror in Xvfb (X Virtual Framebuffer), then communicate
via dcop to send commands to the browser (load this URL, print this
page, etc.).
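
A rough sketch of how the dcop call above could be scripted from
Python. It only wraps the command line shown above; the helper names
(dcop_command, call_dcop) and the ":99" display are my own
assumptions, and it presumes Xvfb is already serving that display with
Konqueror started on it. Untested against a real Konqueror session.

```python
import os
import shutil
import subprocess


def dcop_command(app_id, obj, function, *args):
    """Build the argv list for a dcop call, e.g.
    dcop konqueror-25827 html-widget1 print 1
    """
    return ["dcop", app_id, obj, function] + [str(a) for a in args]


def call_dcop(app_id, obj, function, *args, display=":99"):
    """Run the dcop call and return its stdout.

    Assumption: Konqueror was started on `display` (for example inside
    Xvfb), so the same DISPLAY value is passed along in the environment.
    """
    if shutil.which("dcop") is None:
        raise RuntimeError("dcop not found on PATH")
    env = dict(os.environ, DISPLAY=display)
    result = subprocess.run(
        dcop_command(app_id, obj, function, *args),
        env=env,
        check=True,          # raise if dcop exits with an error
        capture_output=True,
        text=True,
    )
    return result.stdout
```

The application id (konqueror-25827 above) contains the process id, so
it changes on every launch and has to be looked up each time.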

I've been investigating doing the same thing using JS/XUL/etc. in
Mozilla. It is probably possible. There's a lot of documentation on
the XPCOM API available at http://xulplanet.com/

As for converting to RTF, someone has already pointed out PyRTF.
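
To give an idea of what the RTF side involves, here is a minimal
sketch that writes RTF by hand. It does NOT use PyRTF (I'm not showing
its API here); the function names rtf_escape and make_rtf are made up
for illustration, and it only handles plain paragraphs in a single
font.

```python
import struct


def rtf_escape(text):
    """Escape plain text for use inside an RTF document body."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ch == "\n":
            out.append("\\line ")
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN? per UTF-16 code unit, where N is a
            # signed 16-bit decimal and '?' is the fallback character.
            raw = ch.encode("utf-16-le")
            for unit in struct.unpack("<%dh" % (len(raw) // 2), raw):
                out.append("\\u%d?" % unit)
    return "".join(out)


def make_rtf(paragraphs):
    """Return a complete RTF document containing the given paragraphs."""
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\n"
    body = "".join("%s\\par\n" % rtf_escape(p) for p in paragraphs)
    # \fs24 is the font size in half-points, i.e. 12pt.
    return header + "\\f0\\fs24 " + body + "}"
```

The result is pure ASCII, so it can be written out with something like
open("out.rtf", "w", encoding="ascii").write(make_rtf(["Hello"])).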

Regards,
Stephen Thorne


