Re: Fighting Spam with Python



On Thu, 25 Aug 2005 13:22:53 -0400, François Pinard wrote:
>[David MacQuigg]
>
>> The key new features needed in a spam filter are the ability to
>> extract the sender's identity (not that of the latest forwarder), and
>> to factor into the spam score the reputation of that identity.
>
>This will only work if your system is immune to forgeries, while being
>largely widespread.

Stopping forgery is what the new authentication methods are all about.
Getting these methods widely and effectively used is our big
challenge, and one that I hope to help meet with my efforts. There
are a number of pieces that need to work together more smoothly;
that's where Python comes in. There are also some challenging
constraints: for example, the system has to work without government
regulation. I've got a first draft of a website for open-mail.org,
temporarily at http://purl.net/macquigg/email/registry
Suggestions are welcome.

>> In the flow we envision, the spam filter is the final process, used
>> only on the 5% that is hard to classify. 80% will get an immediate
>> reject. 15% will get an immediate accept without filtering, because
>> the sender is authenticated and has a good reputation. Eventually,
>> all reputable senders will join the 15%, and the 5% will shrink to
>> where we can ignore it.
>
>It's fun to read statistics about a vision! :-)

The 80% is real (see http://messagelabs.com/emailthreats). How the
remaining 20% will split is a guess, but one that I think is
realistic. See http://www.spamhaus.org/effective_filtering.html for
comparable numbers using only IP blacklists and spam filtering.

The 5% still needing filtering will be those senders that don't offer
any authentication or that authenticate with an identity that has not
yet acquired a reputation.
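
To make the flow concrete, here is a minimal Python sketch of the
three-way split described above. It is only an illustration, not code
from any existing system: the names (triage, Verdict), the "pass" /
"fail" / "none" authentication results, the 0.0 to 1.0 reputation
scale, and both thresholds are assumptions made up for this example.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    REJECT = "reject"   # refused immediately, never reaches the filter
    ACCEPT = "accept"   # delivered immediately, never reaches the filter
    FILTER = "filter"   # hard to classify, handed to the spam filter


# Hypothetical thresholds on an assumed 0.0 (worst) .. 1.0 (best) scale.
GOOD_REPUTATION = 0.9
BAD_REPUTATION = 0.1


def triage(auth_result: str, reputation: Optional[float]) -> Verdict:
    """Decide what to do with a message before any content filtering.

    auth_result is assumed to be "pass", "fail", or "none" (the sender
    offered no authentication). reputation is None when the identity
    has not yet acquired a reputation.
    """
    if auth_result == "fail":
        # A forged identity is rejected outright.
        return Verdict.REJECT
    if auth_result == "none" or reputation is None:
        # No authentication, or no reputation yet: let the filter decide.
        return Verdict.FILTER
    if reputation <= BAD_REPUTATION:
        return Verdict.REJECT
    if reputation >= GOOD_REPUTATION:
        return Verdict.ACCEPT
    # Authenticated, but with a middling reputation.
    return Verdict.FILTER
```

With these assumed thresholds, triage("pass", 0.95) accepts without
filtering, while triage("none", None) and triage("pass", None) both
fall through to the spam filter.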

>> >You might find www.spambayes.org of interest, in several ways.
>
>Spambayes is surprisingly good as it already stands.

I haven't used Spambayes, but my experience with Spamnix (an offshoot
of SpamAssassin) is that statistical filters always have a few false
rejects. In my case, that's about two per week.

The solution to this problem is a reliable system allowing receivers
to determine the identity and reputation of an unknown sender. Then
we can safely ignore the spam.
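
As a rough illustration of factoring reputation into the spam score,
here is one possible way to blend the two in Python. The function
name, the 0.0 to 1.0 scales, and the linear weighting are all
assumptions for the sake of the example; a real filter could combine
them quite differently.

```python
from typing import Optional


def adjusted_spam_score(filter_score: float,
                        reputation: Optional[float],
                        weight: float = 0.5) -> float:
    """Blend a statistical spam score with the sender's reputation.

    filter_score: assumed spam probability from the filter, 0.0 .. 1.0.
    reputation:   assumed sender reputation, 0.0 (worst) .. 1.0 (best),
                  or None if the identity has no reputation yet.
    weight:       hypothetical share of the result given to reputation.
    """
    if reputation is None:
        # Nothing known about the sender: rely on the filter alone.
        return filter_score
    # A good reputation pulls the score down, a bad one pushes it up.
    return (1.0 - weight) * filter_score + weight * (1.0 - reputation)
```

Under this sketch, a message the filter scores at 0.8 from a sender
with a perfect reputation comes out at 0.4, which is the kind of
adjustment that could spare a reputable sender a false reject.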

-- Dave
