Parsing an email

From: Dan (dan_hoffard_at_hailmail.net)
Date: 09/27/04


Date: 26 Sep 2004 19:09:35 -0700

What is the best way to get the body of the following email message
into a file? The following code extracts the From and Subject fields
nicely, but I can't figure out how to get the body:

my ($summary, $i) = ('', 0);

    for (file_read "$f_email_html") {
        print "$i";
        # The sender's name sits between the two &quot; entities.
        if (/<b>From:<\/b> <a href='mailto: &quot;(.+?)&quot;/) {
            $i++;
            $summary .= "From: $1;\n ";
        }
        # Skip the space after the label so $1 starts at the subject text.
        elsif (/<b>Subject:<\/b>\s*(.+)<br>/) {
            $i++;
            $summary .= "Subject: $1;\n ";
        }
    }

    file_write "$f_email_summary", $summary;

Here is the .html file I am trying to parse:

(01) <a name='10962432060' href='#top'>Back to Index</a> , <a
href='#top'>Previous</a> , <br><b>Date:</b> Sun 09/26/04 19:00:06<br>
<b>To:</b> &lt;dan_hoffard@hailmail.net&gt;<br>
<b>From:</b> <a href='mailto: &quot;Dan Hoffard&quot;
&lt;dan_hoffard@hailmail.net&gt;'>Dan Hoffard</a><br>
<b>Reply to:</b> <a href='mailto:'></a><br>
<b>Subject:</b> test<br>
<blockquote><pre>This is a multi-part message in MIME format.

------=_NextPart_000_0039_01C4A3F7.606C81F0
Content-Type: text/plain;
        charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

test
asdf1
asdf2
asdf3
asdf4
asdff
Dan Hoffard
dan_hoffard@hailmail.net

------=_NextPart_000_0039_01C4A3F7.606C81F0
Content-Type: text/html;
        charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
</pre>
<html><p>
<HEAD>
<META http-equiv=3DContent-Type content=3D"text/html; =
charset=3Diso-8859-1">
<META content=3D"MSHTML 6.00.2800.1106" name=3DGENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=3D#ffffff>
<DIV><FONT face=3DArial size=3D2>test</FONT></DIV>
<DIV><FONT face=3DArial size=3D2>Dan Hoffard<BR><A=20
href=3D"mailto:dan_hoffard@hailmail.net">dan_hoffard@hailmail.net</A><BR>=
</FONT></DIV></BODY></HTML>

------=_NextPart_000_0039_01C4A3F7.606C81F0--

</blockquote><br><hr>
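
For reference, here is a rough sketch of one way the body could be pulled out of a page laid out like the one above. It is only an illustration under several assumptions: the raw message always sits between <blockquote><pre> and </blockquote>, the first line starting with "--" is the MIME boundary, and the text/plain part is the one wanted. The helper name extract_plain_body is made up for this example, and HTML entities in the body are not decoded. MIME::QuotedPrint ships with Perl.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);

# Hypothetical helper: return the text/plain part of a message page
# laid out like the sample above, or undef if no such part is found.
# Expects the whole page in a single string.
sub extract_plain_body {
    my ($page) = @_;

    # Assumption: the raw message sits between <blockquote><pre> and </blockquote>.
    my ($raw) = $page =~ m{<blockquote><pre>(.*?)</blockquote>}s
        or return;

    # Assumption: the first line that starts with "--" is the MIME boundary.
    my ($boundary) = $raw =~ /^(--\S+)\s*$/m
        or return;

    # Split on boundary lines; the closing boundary carries a trailing "--".
    for my $part (split /^\Q$boundary\E(?:--)?[ \t]*\r?\n/m, $raw) {
        # Part headers end at the first blank line.
        my ($headers, $body) = split /\r?\n\r?\n/, $part, 2;
        next unless defined $body
            && $headers =~ m{^Content-Type:\s*text/plain}mi;

        # Undo quoted-printable encoding (=3D, =20, soft line breaks) if declared.
        $body = decode_qp($body)
            if $headers =~ /^Content-Transfer-Encoding:\s*quoted-printable/mi;

        $body =~ s/\s+\z/\n/;    # collapse trailing blank lines to one newline
        return $body;
    }
    return;
}

# Example use, with $page holding the full contents of the .html file:
#   my $body = extract_plain_body($page);
#   print defined $body ? $body : "no text/plain part found\n";
```

Because the boundary can span several lines of context, this works on the whole file at once rather than line by line, so the lines would need to be joined into one string before calling it. A dedicated MIME parser would be more robust than these regexes if the pages vary much from the sample.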


