Re: parse mime

From: Erik Rosenbach (heath10277_at_hotmail.com)
Date: 10/07/04


Date: Wed, 6 Oct 2004 21:26:53 -0700

Thanks for the answer, Edwin. Unfortunately, the process will not be that
simple. What I have to work with is about 80K records coming out of an
Oracle database. These records (message bodies) are from a Lyris newsgroup
server, and I am porting them into something else. One of the key problems
with the content and the embedded MIME is that some messages have multipart
headers for text, HTML, and attachments. I'm not worried about the
attachments; I just want to scrape the content. Not every message has
embedded MIME, either. Some are plain text, which is fine. The really
problematic messages are the ones that people have posted and then replied
to. Some of these messages also contain a long trail of "original message"
headers.
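
For the reply trails, one possible approach is to cut the text at the first
"original message" separator. The sketch below is in Python and assumes the
separator looks like the common "-----Original Message-----" line, possibly
prefixed with ">" quote marks. That format is an assumption, not something
confirmed for these records, and the name strip_reply_trail is made up for
illustration.

```python
import re

# Assumed separator format: a line such as "-----Original Message-----",
# optionally indented or prefixed with ">" quote characters.
ORIGINAL_MESSAGE = re.compile(
    r"^[ \t>]*-+[ \t]*original message[ \t]*-+[ \t]*\r?$",
    re.IGNORECASE | re.MULTILINE,
)


def strip_reply_trail(text: str) -> str:
    """Return the text that precedes the first "original message" separator."""
    match = ORIGINAL_MESSAGE.search(text)
    if match is None:
        # No separator found: leave the text untouched.
        return text
    # Keep everything before the separator and drop trailing blank lines.
    return text[:match.start()].rstrip()
```

Anything that uses a different reply convention (for example "On <date>,
<name> wrote:") would pass through unchanged and would need its own pattern.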

Ideally, I would like to find an email parser that I can pass the message
body to and have it return only the plain-text body.
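
As a rough illustration of that kind of parser, here is a minimal sketch in
Python using only its standard-library email package. It assumes each record
is available as a string and that any MIME headers sit at the very start of
the record. The function name plain_text_body is hypothetical. It returns the
first text/plain part that is not marked as an attachment, and it returns a
record with no MIME headers as-is.

```python
from email import message_from_string


def plain_text_body(raw: str) -> str:
    """Return the first non-attachment text/plain part of a raw message."""
    msg = message_from_string(raw)
    # walk() yields the message itself and then every nested MIME part.
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no text of their own
        if part.get_content_type() != "text/plain":
            continue  # skip HTML and binary parts
        if part.get_content_disposition() == "attachment":
            continue  # skip text files sent as attachments
        # decode=True undoes base64 / quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset label: fall back to a lossless byte mapping.
            return payload.decode("latin-1")
    return ""  # no plain-text part found (for example, an HTML-only message)
```

A record without headers is treated as text/plain by default, so it comes
back unchanged. One caveat: a plain-text record whose first line happens to
look like a header (a word followed by a colon) may be misread as one, so a
batch of 80K records would be worth spot-checking.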

Thanks,
Erik

"Edwin Martin" <e.j.martin@chello.nl> wrote in message
news:O988d.8051$jG2.4213@amsnews05.chello.com...
> Erik Rosenbach wrote:
>> I have a question about how to parse out mime from a message. I have
>> email messages that are stored in a database table and these messages
>> have mime headers embedded within them. How can I parse out this mime
>> content and retrieve only the text messages?
>
> You mean you want to get rid of the header of the message?
>
> The body of the e-mail message is the text after the first blank line.
>
> That's easy to program.
>
> Edwin Martin.
>
> --
> http://www.bitstorm.org/


