Re: how do I write a regex that looks for 'X' 'NOT Y' 'Z'

From: Harry Putnam (reader_at_newsguy.com)
Date: 03/30/04


To: beginners@perl.org
Date: Tue, 30 Mar 2004 09:37:56 -0600

Bram Mertens <M8ram@linux.be> writes:

> Somebody suggested to use a rule like:
> /From\:\s".*"\s*<my_e-mail-address>/i
>
> And another rule to catch the 2 exceptions. But the .* means that the
> parser might test the entire e-mail making the test slow and heavy on
> memory-usage.
> Something like:
> /From\:\s".{0,20}"\s*<my_e-mail-address>/i prevents this but I'd like to
> know if there's a better solution.

You may get some REALLY inventive regexes here... I have before.
But I think you're working too hard and should resort to
SpamAssassin's own tools. This can be done with a push/pull technique.

Investigate SpamAssassin's `meta' keyword. (In place of the more usual
`header' keyword)

I think most of the documentation is in Mail::SpamAssassin::Conf, but
I remember needing to have it spelled out for me.

It works like this:

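(NB: The underscores are important. SpamAssassin treats a rule whose
name starts with a double underscore as a sub-rule: it gets no score
of its own and exists only to be combined in a `meta' rule.)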

> I'm trying to write a rule for SpamAssassin that looks for the following
> in a message:
> "From: " followed by "anything BUT 'Mertens Bram' or 'Bram Mertens'"
> followed by "<my_e-mail-address>"

(Please note this is untested and may be done more simply)

First define the push and the pull using underscores as shown

header __To_Me_First From =~ /[^\@]*(Mertens Bram|Bram Mertens)/
header __Just_My_Email From =~ /<my_e-mail-address>/

Now the meta combo

(Note the bang, which means "not")
meta Not_My_Name_Just_My_Email __Just_My_Email && !__To_Me_First
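
If you want to check the logic outside SpamAssassin first, here is a
rough Perl sketch of the same idea: two simple matches combined with
"and not". It is only an illustration, not how SpamAssassin evaluates
rules internally. The address me@example.com and the sub name
forged_from are made up for the example; put your real address in.

```perl
use strict;
use warnings;

# Hypothetical placeholder address; substitute the real one.
my $my_address = 'me@example.com';

# The same two patterns as the sub-rules above.
my $to_me_first   = qr/[^\@]*(Mertens Bram|Bram Mertens)/;
my $just_my_email = qr/<\Q$my_address\E>/;

# Mirrors the meta expression: __Just_My_Email && !__To_Me_First
sub forged_from {
    my ($from) = @_;
    return ( $from =~ $just_my_email && $from !~ $to_me_first ) ? 1 : 0;
}

# Try a few sample From: values (all invented for this test).
for my $from (
    '"Bram Mertens" <me@example.com>',
    '"Mertens Bram" <me@example.com>',
    '"Cheap Pills" <me@example.com>',
    '"Bram Mertens" <other@example.com>',
) {
    printf "%-40s %s\n", $from, forged_from($from) ? 'hit' : 'no hit';
}
```

Only the third sample should report a hit: it carries the address but
neither form of the name. Neither pattern needs a .* between the name
and the address, which was the original worry.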