Re: How do I Extract Attachment from Newsgroup Message
- From: kyosohma@xxxxxxxxx
- Date: 31 May 2007 07:14:04 -0700
On May 31, 8:54 am, "snewma...@xxxxxxxxx" <snewma...@xxxxxxxxx> wrote:
I'm parsing NNTP messages that have XML file attachments. How can I
extract the encoded text back into a file? I looked for a solution
with mimetools (the way I'd approach it for email) but found nothing.
Here's a long snippet of the message:
n.article('116431')
('220 116431 <D8PANK...@xxxxxxxxxxx> article', '116431',
'<D8PANK...@xxxxxxxxxxx>', ['MIME-Version: 1.0', 'Message-ID:
<D8PANK...@xxxxxxxxxxx>', 'Content-Type: Multipart/Mixed;', '
boundary="------------Boundary-00=_A5NJCP3FX6Y5BI3BH890"', 'Date: Thu,
24 May 2007 07:41:34 -0400 (EDT)', 'From: Newsclip <newsc...@xxxxxx>',
'Path: newsclip.ap.org!flounder.ap.org!flounder', 'Newsgroups:
ap.spanish.online,ap.spanish.online.business', 'Keywords: MUN ECO
PETROLEO PRECIOS', 'Subject: MUN ECO PETROLEO PRECIOS', 'Summary: ',
'Lines: 108', 'Xref: newsclip.ap.org ap.spanish.online:938298
ap.spanish.online.business:116431', '', '',
'--------------Boundary-00=_A5NJCP3FX6Y5BI3BH890', 'Content-Type: Text/Plain',
'Content-Transfer-Encoding: 8bit', 'Content-Description: text,
unencoded', '', '(AP) Precios del crudo se mueven sin rumbo claro',
'Por GEORGE JAHN', 'VIENA', 'Los precios
... (truncated for length) ...
'', '___', '', 'Editores: Derrick Ho, periodista de la AP en Singapur,
contribuy\xf3 con esta informaci\xf3n.', '', '',
'--------------Boundary-00=_A5NJCP3FX6Y5BI3BH890', 'Content-Type: Text/Xml',
'Content-Transfer-Encoding: base64',
'Content-Description: text, base64 encoded', '',
'PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPCFET0NUWVBFIG5pdGYgU1lT',
'VEVNICJuaXRmLmR0ZCI+CjxuaXRmPgogPGhlYWQ+CiAgPG1ldGEgbmFtZT0iYXAtdHJhbnNyZWYi',
'IGNvbnRlbnQ9IlNQMTQ3MiIvPgogIDxtZXRhIG5hbWU9ImFwLW9yaWdpbiIgY29udGVudD0ic3Bh',
'bm9sIi8+CiAgPG1ldGEgbmFtZT0iYXAtc2VsZWN0b3IiIGNvbn
This looks like what you might be looking for:
http://mail.python.org/pipermail/python-list/2004-June/265018.html
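For what it's worth, here is a minimal sketch of one way to do the extraction with the standard email package (mimetools no longer exists in Python 3). It assumes you already have the list of article lines, i.e. the last element of the tuple that n.article() returned in the snippet above. The function name extract_parts is made up for illustration, and the latin-1 fallback for str lines is an assumption about how the 8bit text part is encoded.

```python
import email


def extract_parts(lines):
    """Return (content_type, description, payload_bytes) for each leaf MIME part.

    `lines` is assumed to be the list of article lines (headers, blank line,
    body) without line endings, as either bytes or str.
    """
    # Rebuild the raw message; str lines are encoded as latin-1 (an assumption).
    raw = b"\r\n".join(
        line if isinstance(line, bytes) else line.encode("latin-1")
        for line in lines
    )
    msg = email.message_from_bytes(raw)

    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the Multipart/Mixed container itself
        parts.append((
            part.get_content_type(),           # lowercased, e.g. "text/xml"
            part.get("Content-Description"),   # may be None if absent
            part.get_payload(decode=True),     # undoes base64 when declared
        ))
    return parts
```

To get the XML back into a file, you could loop over the result, pick the entries whose content type is "text/xml", and write the payload bytes to a file opened in "wb" mode.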
Not sure if you'll need this, but here's some info on encoding and
decoding files:
http://www.jorendorff.com/articles/unicode/python.html
There are lots of ways to parse XML. I use the minidom module
(xml.dom.minidom) myself.
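As a rough illustration of the minidom route, the sketch below collects the name/content pairs from the meta elements that show up when the start of the base64 text above is decoded. The helper name meta_values is hypothetical, and since the posted message is truncated, the rest of the document structure is not known here.

```python
from xml.dom import minidom


def meta_values(xml_bytes):
    """Map each <meta name="..." content="..."/> element to a dict entry."""
    doc = minidom.parseString(xml_bytes)
    return {
        node.getAttribute("name"): node.getAttribute("content")
        for node in doc.getElementsByTagName("meta")
    }
```

Passing it the decoded bytes of the Text/Xml part should give a plain dict you can look things up in, e.g. the value stored under "ap-transref".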
Mike