How do I write a regex that looks for 'X' 'NOT Y' 'Z'?

From: Bram Mertens (M8ram_at_linux.be)
Date: 03/30/04


To: perl beginners-digest mailing list <beginners@perl.org>
Date: Tue, 30 Mar 2004 15:34:52 +0200

Hi

I'm trying to write a rule for SpamAssassin that looks for the following
in a message:
"From: " followed by "anything BUT 'Mertens Bram' or 'Bram Mertens'"
followed by "<my_e-mail-address>"

So these two shouldn't trigger the rule:
From: Bram Mertens <my_e-mail-address>
From: Mertens Bram <my_e-mail-address>

But something like this should trigger it:
From: "optometric" <my_e-mail-address>

This rule catches the above:
/from\:\s\"optometric\"\s<my_e-mail-address>/i

But the rule needs to catch other fake names as well.

I've tried, among others:
/From\:\s(?:(?:Bram\sMertens\s)|(?:Mertens\sBram\s))<my_e-mail-address>/i
/from\:\s(?!(?:Bram\sMertens\s)|(?:Mertens\sBram\s))<my_e-mail-address>/i
/from\:\s(?<!(?:Bram\sMertens\s)|(?:Mertens\sBram\s))<my_e-mail-address>/i
/from\:\s(^(?:Bram\sMertens\s)|(?:Mertens\sBram\s))<my_e-mail-address>/i
/from\:\s[^(?:Bram\sMertens\s)|(?:Mertens\sBram\s)]<my_e-mail-address>/i

This partly works:
/from\:\s(?!(?:Bram\sMertens\s)|(?:Mertens\sBram\s)<my_e-mail-address>)/i

However, this only looks for "From: " NOT followed by "Bram Mertens
<my_e-mail-address>" or "Mertens Bram <my_e-mail-address>".

So it will also trigger on 'From: "jack" <jack@home.com>' or even on a
bare 'From: ', which is not what I want.
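
To make that behaviour concrete, here is a small sketch that runs the partly working rule against the sample headers. It uses Python's re module as a stand-in, on the assumption that these lookahead constructs behave the same way as in Perl. The "strict" pattern is a hypothetical variant, not a confirmed fix: it keeps the name test inside the lookahead and requires the address outside it, with an arbitrary 40-character cap on the display name. Note also that, because "|" binds loosely, the address in the original lookahead only belongs to the second alternative.

```python
import re

# Pattern from the post, unchanged apart from Python's raw-string quoting.
partial = re.compile(
    r"from\:\s(?!(?:Bram\sMertens\s)|(?:Mertens\sBram\s)<my_e-mail-address>)",
    re.IGNORECASE,
)

# Hypothetical variant: the lookahead only rejects the two real names,
# and the address must then actually appear after the display name.
strict = re.compile(
    r"from\:\s(?!(?:Bram\sMertens|Mertens\sBram)\s<)"
    r"[^<\n]{0,40}<my_e-mail-address>",
    re.IGNORECASE,
)

headers = [
    "From: Bram Mertens <my_e-mail-address>",
    "From: Mertens Bram <my_e-mail-address>",
    'From: "optometric" <my_e-mail-address>',
    'From: "jack" <jack@home.com>',
]

partial_hits = [bool(partial.search(h)) for h in headers]
strict_hits = [bool(strict.search(h)) for h in headers]

for header, p, s in zip(headers, partial_hits, strict_hits):
    print(f"{header!r}: partial={p} strict={s}")
```

In this sketch the original rule fires on the last two headers, while the variant fires only on the "optometric" one. The variant would still fire on small changes to the real names, such as extra whitespace, so it is a starting point rather than a finished rule.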

Somebody suggested using a rule like:
/From\:\s".*"\s*<my_e-mail-address>/i

And another rule to catch the two exceptions. But the .* means that the
parser might test the entire e-mail, making the test slow and heavy on
memory usage.

Something like:
/From\:\s".{0,20}"\s*<my_e-mail-address>/i
prevents this, but I'd like to know if there's a better solution.
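
A minimal sketch comparing the two quoted-name rules, again using Python's re module as a stand-in for the Perl engine. The sample headers are made up for illustration. It also checks one detail that bears on the worry about .*: without the "dot matches newline" flag (/s in Perl, DOTALL in Python), "." stops at a line break.

```python
import re

unbounded = re.compile(r'From\:\s".*"\s*<my_e-mail-address>', re.IGNORECASE)
bounded = re.compile(r'From\:\s".{0,20}"\s*<my_e-mail-address>', re.IGNORECASE)

# Made-up sample headers.
samples = {
    "short": 'From: "optometric" <my_e-mail-address>',
    # 30 characters between the quotes, more than the {0,20} cap allows.
    "long": 'From: "' + "x" * 30 + '" <my_e-mail-address>',
    # A line break inside the quotes: "." does not match "\n" by default.
    "two_lines": 'From: "a\nb" <my_e-mail-address>',
}

results = {
    name: (bool(unbounded.search(text)), bool(bounded.search(text)))
    for name, text in samples.items()
}

for name, (u, b) in results.items():
    print(f"{name}: unbounded={u} bounded={b}")
```

The cap trades coverage for a hard limit on how far the engine scans: a fake name longer than 20 characters slips past the bounded rule.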

Perhaps testing against some characters, or character-combinations that
don't exist in 'Bram Mertens' or 'Mertens Bram'?

Is there a way to test how (in)efficient or demanding a certain rule is?
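
One rough way to compare rules is to time them against sample input. The sketch below does this with Python's timeit module; in Perl, the core Benchmark module serves the same purpose. The sample text is a made-up near-miss (many quotes, no address), chosen on the assumption that failed matches are where backtracking costs show up. The numbers depend on the machine and the engine, so they only indicate relative cost.

```python
import re
import timeit

rules = {
    "unbounded": re.compile(r'From\:\s".*"\s*<my_e-mail-address>', re.IGNORECASE),
    "bounded": re.compile(r'From\:\s".{0,20}"\s*<my_e-mail-address>', re.IGNORECASE),
}

# Made-up input: a long header line with many quotes and no address,
# so both rules have to give up after trying every candidate quote.
sample = 'From: "' + 'spam " ' * 200

# Time 1000 searches per rule; rx=rx binds each pattern to its own lambda.
timings = {
    name: timeit.timeit(lambda rx=rx: rx.search(sample), number=1000)
    for name, rx in rules.items()
}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds:.4f}s for 1000 searches")
```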

(Sorry for the long post.)

TIA

-- 
# Mertens Bram "M8ram"   <M8ram@linux.be>          Linux User #349737 #
# SuSE Linux 8.2 (i586)     kernel 2.4.20-4GB      i686     256MB RAM #
#  3:16pm  up 8 days 18:53,  8 users,  load average: 0.09, 0.19, 0.10 #

