Re: deciphering emails in PERL

From: Bob Walton (see_at_sig.invalid)
Date: 10/22/04


Date: Thu, 21 Oct 2004 22:04:28 -0400

daniel kaplan wrote:

> i am using Net::POP3 to read from my email server, but am having some
> difficulties...
>
> for instance i can easily find the FROM and TO and SUBJECT
>
> but deciphering where the BODY of the message itself is kind of hard....so
> many headers, and they change depending on the format (HTML vs. NON) and
> the sending app...OE, AOL, etc
>
> i have to think, with the vast expanse of lib. i see out there for PERL,
> someone has written something? no?

Did you miss the FAQ "How do I parse a mail header"? (in perlfaq9). It
points you to some stuff on CPAN, which is an excellent place to look
for pre-written code to do common tasks in Perl.

The body of a mail message starts after the first empty line. You should
be able to separate the header from the body with something like:

   # Split at the first empty line (the \r? also tolerates CRLF endings).
   my ($header, $body) = split /\r?\n\r?\n/, $msg, 2;
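
For what it's worth, here is a slightly fuller sketch of the same idea
that also turns the header block into a hash of fields. The helper name
split_message() and the sample message are made up for illustration, and
it only keeps the first occurrence of each field, so treat it as a
starting point rather than a complete parser:

```perl
use strict;
use warnings;

# Split one raw message into a hash of header fields and the body text.
# split_message() is a hypothetical helper name, not part of any module.
sub split_message {
    my ($msg) = @_;

    # Normalise CRLF line endings so the empty-line test is reliable.
    $msg =~ s/\r\n/\n/g;

    # Header and body are separated by the first empty line.
    my ($header, $body) = split /\n\n/, $msg, 2;
    $header = '' unless defined $header;
    $body   = '' unless defined $body;

    # Unfold continuation lines: a newline followed by whitespace
    # continues the previous header field.
    $header =~ s/\n[ \t]+/ /g;

    my %field;
    for my $line (split /\n/, $header) {
        next unless $line =~ /^([^:\s]+):\s*(.*)$/;
        # Field names are case-insensitive; keep the first occurrence only.
        my $name = lc $1;
        $field{$name} = $2 unless exists $field{$name};
    }
    return (\%field, $body);
}

# Made-up sample message with CRLF endings and a folded Subject line.
my $msg = join "\r\n",
    'From: alice@example.com',
    'Subject: a long',
    ' folded subject',
    '',
    'Hello,',
    '',
    'this is the body.',
    '';

my ($field, $body) = split_message($msg);
print "Subject: $field->{subject}\n";
print "Body:\n$body";
```

With Net::POP3, get() returns a reference to an array of lines, so
something like join('', @{ $pop->get($n) }) should give you a single
string to pass in. Multipart (MIME) bodies still need a proper module
from CPAN.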

This assumes you have one message in $msg. In a standard (mbox) mail
file, each message starts with the characters "From " in column 1,
without the quotes but including the space (not to be confused with the
"From:" header line). Text lines starting with those characters are
prohibited in messages; this is commonly handled by replacing "From " in
the message body with ">From " if it occurs in column 1.
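
If you do end up with a whole mail file rather than a single message,
the "From " convention can be sketched roughly like this. The helper
name split_mbox() and the sample data are invented for the example, and
it assumes the simple escaping described above (only "From " becomes
">From "), which not every mail program follows exactly:

```perl
use strict;
use warnings;

# Split the text of an mbox-style file into individual messages.
# split_mbox() is a hypothetical helper name used only for this sketch.
sub split_mbox {
    my ($text) = @_;

    # Each message begins at a line starting with "From " (with the space).
    # The lookahead keeps that separator line with its message.
    my @messages = grep { length } split /^(?=From )/m, $text;

    for my $m (@messages) {
        # Drop the "From " separator line; it is not a real header.
        $m =~ s/\AFrom [^\n]*\n//;
        # Undo the escaping of body lines: ">From " back to "From ".
        $m =~ s/^>From /From /mg;
    }
    return @messages;
}

# Made-up two-message mail file.
my $mbox = <<'END';
From alice@example.com Thu Oct 21 22:04:28 2004
From: alice@example.com
Subject: first

>From here on it is body text.

From bob@example.com Fri Oct 22 09:00:00 2004
From: bob@example.com
Subject: second

Second body.
END

my @messages = split_mbox($mbox);
printf "%d messages\n", scalar @messages;
```

Each element of @messages can then be split into header and body at the
first empty line, as shown earlier.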

HTH.

-- 
Bob Walton
Email: http://bwalton.com/cgi-bin/emailbob.pl

