Ridiculous Amounts of Spam by Adam Shand

A couple of years ago, I almost gave up on email. My work with the Personal Telco Project meant that I often received legitimate messages from people I didn't know, with subjects like "I'd like to meet you" and "interesting!" Since those read just like spam subject lines, Bayesian filters were almost worthless, and my hatred of third-party blacklists was still fresh after many years of running mail servers for ISPs and dealing with other people's badly maintained blacklists.

It got so bad that I seriously considered putting an auto-responder on my email which simply told people that I no longer accepted email, and that if they wanted to get in touch they should call me, use my AIM account, or leave a message on my wiki.

I never quite had the balls to actually do it, and eventually I discovered the reputable SBL+XBL list maintained by Spamhaus and integrated it into my mail server. Combined with amavisd-new and SpamAssassin, it has consistently kept the spam which actually arrives in my inbox at a manageable level.
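
For anyone curious what "integrating" a DNS-based list involves, here is a minimal Python sketch of the lookup itself. It is not my actual mail server configuration (the mail server normally does this for you). The helper names are made up for illustration, and the zone name `sbl-xbl.spamhaus.org` is an assumption, so check the Spamhaus documentation for the zone to use. The idea is that the sender's IPv4 address is reversed, prepended to the list's zone, and looked up in DNS. An answer means the address is listed, and no such name means it is not.

```python
import ipaddress
import socket

# Assumption: the combined SBL+XBL zone name. Verify against Spamhaus docs.
DEFAULT_ZONE = "sbl-xbl.spamhaus.org"


def dnsbl_query_name(ip, zone=DEFAULT_ZONE):
    """Build the DNS name to query for an IPv4 address (hypothetical helper)."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def dnsbl_lookup(ip, zone=DEFAULT_ZONE):
    """Return the list's answer (a 127.x.x.x address) or None if not listed.

    Needs working DNS. Interpreting the specific return code is left to
    the caller, since the codes are defined by the list operator.
    """
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return None
```

Calling `dnsbl_query_name("192.0.2.1")` gives the name that would be queried, and `dnsbl_lookup` performs the query. A real mail server rejects the connection at SMTP time when the lookup returns an answer, which is why those messages show up as "rejected" rather than "received".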

Here is a graph showing the amount of rejected email my personal mail server (which is only used by myself and a handful of family and friends) has handled in the last week:

[[!img defaults size="443x" align="center" link=no]] [[!img mailgraph-rejected.jpg caption="Look at all that spam!"]]

Now compare that to the amount of remaining email that it handles:

[[!img mailgraph-received.jpg caption="Non-spammy mail rate stays fairly similar"]]

So that means that in the last week about 96.3% of the email which my server handled was outright spam.

Now consider that I have things configured very conservatively (I want to do everything I can to make sure that a legitimate message is never marked as spam). I'd say it's a fairly safe bet that, of the remaining 3.7% of "legitimate" email, at least half is still spam, and quite possibly a lot more than that.
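
To put a number on that guess, here is a small Python sketch of the arithmetic. The 96.3% figure comes from the graphs above, while the "half of what gets through" fraction is only my guess, and the function name is invented for illustration.

```python
def total_spam_share(rejected_share, hidden_spam_fraction):
    """Estimate the overall spam share of all mail the server handled.

    rejected_share: fraction of all mail rejected outright as spam.
    hidden_spam_fraction: guessed fraction of the accepted mail that is
        actually spam (an assumption, not a measurement).
    """
    accepted_share = 1.0 - rejected_share
    return rejected_share + accepted_share * hidden_spam_fraction


# 96.3% rejected, and a guess that half of the remaining 3.7% is spam.
estimate = total_spam_share(0.963, 0.5)
print(f"{estimate:.2%}")
```

With those inputs the estimate comes out at a little over 98%, and it only climbs from there if more than half of the accepted mail is spam.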

Unfortunately, my graphs only go back a year (one of the downsides of using RRDTool for this sort of thing), but look at this:

[[!img mailgraph-year.jpg caption="The last year"]]

The two flat bits in May and Aug/Sept are because my logging software crashed and I didn't notice (oops). The time before May isn't actually zero; it's just that the current levels have blown out the scale.

And remember, this is just my relatively unknown personal server. I can't even imagine what AOL or Gmail are dealing with.

journal posted on 18 Oct 2006

Copyheart 1994–2024 Adam Shand. Sharing is an act of love.