How many hosts is the SPAM coming from?
19 January 2007 23:48
To see which bad hosts on the Internet are sending all the SPAM to my mail server, I have collected 973 HAM mails and 3001 SPAM mails from 45 hours of mail traffic and run some home-built statistics on them to get a clearer picture of where the mails are coming from.

What I found was:

973 HAM mails and 3001 SPAM mails were received from 2543 different hosts on the Internet

3 of the hosts sent both HAM and SPAM
2226 of the hosts sent only SPAM and no HAM
314 of the hosts sent only HAM and no SPAM
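
The breakdown above can be reproduced with a few lines of code. This is only a minimal sketch, assuming the mail log has already been reduced to (host, label) pairs; the function name classify_hosts and the sample addresses are made up for illustration and are not my real data.

```python
from collections import defaultdict

def classify_hosts(records):
    """Split hosts into (both, spam_only, ham_only) sets.

    records: iterable of (host_ip, label) pairs, label is 'ham' or 'spam'.
    The input format is an assumption for this sketch.
    """
    seen = defaultdict(set)
    for host, label in records:
        seen[host].add(label)
    both = {h for h, labels in seen.items() if labels == {"ham", "spam"}}
    spam_only = {h for h, labels in seen.items() if labels == {"spam"}}
    ham_only = {h for h, labels in seen.items() if labels == {"ham"}}
    return both, spam_only, ham_only

# Made-up sample using documentation addresses (192.0.2.0/24)
sample = [
    ("192.0.2.1", "spam"),
    ("192.0.2.1", "spam"),
    ("192.0.2.2", "ham"),
    ("192.0.2.3", "ham"),
    ("192.0.2.3", "spam"),
]
both, spam_only, ham_only = classify_hosts(sample)
print(len(both), len(spam_only), len(ham_only))
```

The three sets never overlap, so their sizes should always add up to the total number of different hosts, just like 3 + 2226 + 314 = 2543 above.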

As you can see in the graphic below, a lot of the bad remote hosts sent only 1 SPAM mail and then disconnected, which indicates that most SPAM is coming from mail bots and not from open mail relays.

That is why IP RBLs no longer work very well for stopping SPAM, as they normally only list bad mail servers.
It might be smart to use this to build my own automatic IP blacklist by recording who is sending mail to my server. For example, if a remote host has not sent any HAM mail to my server in 3 months and has sent more than 10 SPAM mails within one month, it gets blacklisted. I have to collect a lot more data to see if this is a good idea; 45 hours of mail is too little data to base it on.
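
The blacklist rule could look roughly like this. It is a sketch of the idea only, not tested on real traffic: "3 months" and "one month" are approximated as 90 and 30 days, and the function name should_blacklist and the per-host event list are assumptions made for the example.

```python
from datetime import datetime, timedelta

HAM_QUIET_PERIOD = timedelta(days=90)  # "3 months", approximated as 90 days
SPAM_WINDOW = timedelta(days=30)       # "one month", approximated as 30 days
SPAM_LIMIT = 10                        # blacklist only when MORE than this many

def should_blacklist(events, now):
    """Decide whether one host should be blacklisted.

    events: list of (timestamp, label) for a single host,
            label is 'ham' or 'spam' (assumed input format).
    """
    # Any HAM inside the quiet period protects the host
    for ts, label in events:
        if label == "ham" and now - ts <= HAM_QUIET_PERIOD:
            return False
    # Count SPAM mails inside the shorter window
    recent_spam = sum(
        1 for ts, label in events
        if label == "spam" and now - ts <= SPAM_WINDOW
    )
    return recent_spam > SPAM_LIMIT

now = datetime(2007, 1, 19)
spammer = [(now - timedelta(days=d), "spam") for d in range(11)]
print(should_blacklist(spammer, now))
```

A host that has sent even one HAM mail in the last 90 days is never listed here, which is what should keep real mail servers with an occasional bad mail off the list.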

Phishtank is out, another new cool anti-spam app is in.
19 January 2007 18:26
After testing the Phishtank URL data feed as a way to stop SPAM by matching the URLs in the mails against the data feed, I have found that this approach is simply a waste of CPU time, as the URLs were too random and Phishtank does not take that into account.

But I am now testing another well-known anti-spam application as a plug-in for Mailsweeper, and it is working very, very well, in fact better than I had anticipated. Best of all, it is free. I still have a lot of work to do before I make the information public here on my website. :-)

PhishTank is not worth the CPU time it uses
4 January 2007 21:05
Follow-up on the blog post "Tooms is diving into the Phish Tank"

After some more testing I have not been able to improve the hit rate; there are simply too many random things in the URLs for this to work well.
For this to get better, there must be some standard for handling the random parts, for example: if the hostname in a URL is detected as random, it is changed to the hostname random.domain.tld,
so a URL like "http://some.thing.domain.tld/" will be changed to "http://random.domain.tld/".
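
Here is a rough sketch of what such a rewrite could look like. Everything in it is an assumption for illustration: looks_random is only a crude placeholder heuristic (long labels with digits or very few vowels), not a real detector, and the registered domain is naively taken as the last two labels, which is wrong for names like example.co.uk.

```python
from urllib.parse import urlsplit, urlunsplit

def looks_random(label):
    """Crude placeholder heuristic, NOT a real randomness detector."""
    letters = [c for c in label.lower() if c.isalpha()]
    if len(label) < 8 or not letters:
        return False
    vowels = sum(c in "aeiou" for c in letters)
    has_digit = any(c.isdigit() for c in label)
    return has_digit or vowels / len(letters) < 0.2

def normalize_url(url):
    """Replace a random-looking subdomain part with the label 'random'."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    if len(labels) <= 2:
        return url  # no subdomain to rewrite
    # Naive assumption: the last two labels are the registered domain
    sub, domain = labels[:-2], labels[-2:]
    if not any(looks_random(label) for label in sub):
        return url
    new_host = ".".join(["random"] + domain)
    # Keep an explicit port if there is one (user info is dropped)
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )

print(normalize_url("http://xk7qz9wplm.domain.tld/"))
print(normalize_url("http://www.domain.tld/"))
```

Both the URLs in the mail and the URLs in the data feed would have to go through the same rewrite before matching, otherwise the normalized form never gets a hit.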

So the idea needs more work before it will be good.

In fact, some years ago I had a home-built plug-in for Mailsweeper that matched URLs against a local database, and the database was filled with URLs from some honeypot accounts on the same server. It worked well, but the database got very big and it cost a lot of CPU time. So it can work, but it needs more work to build smart code that can detect random things in the URLs. So the idea is not dead; maybe some day I will have a public plug-in that matches URLs against a database.

