Here at LinuxMagic, we have developed and worked with many Anti-Spam technologies, and participated in both Anti-Spam Task Forces and Email Best Practices discussion groups. This has led us to some basic philosophies, and to developing and working with tools that deal with Spam problems worldwide. This is our opportunity to share these philosophies and tools with the general public, and to allow contributions from the same, so that we can all gain the benefits.
We believe in several fundamental concepts.
Okay, I see the pundits lining up. What do you mean "filtering is bad"? Well, let me start by quoting a source: "Filtering is a game between the spammers and the Anti-Spam vendors, put in a rule, get around it...". This is an oversimplified statement, but it does have some merit. Filtering is a very complex endeavor, and projects such as SpamAssassin and Bayesian filtering have to be commended for their work in this area, but it also requires large overheads, and it only applies to Email Processing. Our philosophy is that we can 'choose' to control the 'privileges' that allow other mail servers or persons to send us mail, and that this decision does not need to look first at the contents of what is being sent. (Disclaimer: Don't get us wrong, we still use filtering ourselves, but we concentrate on pre-filtering technologies.)
A digression: first of all, there are really only 3 types of Spam.
Determining whether a sender is in one of the above groups does not depend on the contents of what he sends, simply that it is unwanted or undesirable bulk mail. We can use other, pre-data techniques to determine if a sender falls into one of the above groups and, if so, we simply do not allow them the privilege of sending email to us. But let's look at the above a little more. First of all, commercial spammers are handled most efficiently by documenting them when we find them and refusing to allow them to connect to our servers. This is most effectively done by 'blacklisting' (See Blacklisting Discussions and BMS Technology). We can even go so far as to stop connections from networks known to willingly harbour such content. Another technology that can often help pro-actively in this area is 'rate limiting' (See Rate Limiter).
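As a rough illustration of deciding at connect time, before any message content is seen, the sketch below combines a network blacklist check with a simple per-sender rate limit. It is not the BMS or Rate Limiter implementation; the address ranges (reserved documentation networks), the limits, and the function names are all assumptions made up for this example.

```python
import ipaddress
from collections import defaultdict, deque

# Hypothetical blacklist, using reserved documentation ranges only.
BLACKLISTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),     # a whole network known to harbour spammers
    ipaddress.ip_network("198.51.100.7/32"),  # a single documented spam source
]

MAX_CONNECTIONS = 3    # assumed limit per sender per window
WINDOW_SECONDS = 60.0  # assumed window length

_recent = defaultdict(deque)  # client IP -> timestamps of accepted connections


def is_blacklisted(client_ip: str) -> bool:
    """True if the connecting address falls inside a blacklisted network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLACKLISTED_NETWORKS)


def allow_connection(client_ip: str, now: float) -> bool:
    """Grant or refuse the 'privilege' of connecting, without looking at content."""
    if is_blacklisted(client_ip):
        return False
    stamps = _recent[client_ip]
    # Forget connections that have fallen out of the time window.
    while stamps and now - stamps[0] >= WINDOW_SECONDS:
        stamps.popleft()
    if len(stamps) >= MAX_CONNECTIONS:
        return False  # sender is connecting too fast: rate limited
    stamps.append(now)
    return True
```

A mail server would call something like `allow_connection(ip, time.time())` when a client connects, and refuse the session if it returns `False`.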
But the second type is getting to be the worst behavior, as the mail comes from innocents. However, we can take the high road and say that no one should get the privilege of sending us mail unless they know how to, and are operating, a 'Best Practices' Email server.
We have some technologies to control that, and in particular anything we can do to perform 'Mail Server Profiling' assists us in determining whether the connecting machine conforms to that requirement. One of the main ways to determine that would be to list all the network addresses that should not be running mail servers. Attempts to do so have achieved varying success, from having network operators submit such addresses to DUL lists, to more radical approaches such as outbound Port 25 blocking. However, in many cases very large networks have taken neither approach. It would be nice to just say that we will refuse traffic from those networks, but this does not work in the real world.
But there are other approaches to mail server profiling, and one main technique is comparing the DNS addressing schemes to known regular expressions, to see if the address is a generic access point connection, e.g. 1.1.1.192.some-domain.com. Best practice for operating a mail server is that the operator should ensure they have a more informative reverse entry, showing info on the responsible party, e.g. mail.some-domain.com or gateway.some-domain.com. To assist in this, we have initiated an Open Project to record examples of such naming conventions that represent 'general access points' instead of legitimate mail servers. (See Dynamic DNS Regex Project)
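To make the idea concrete, here is a minimal sketch of matching a reverse DNS name against regular expressions for generic access point naming. The two patterns and the function name are hypothetical examples written for this page, not entries taken from the Dynamic DNS Regex Project, and a real pattern list would be far larger.

```python
import re

# Hypothetical patterns, for illustration only.
GENERIC_PATTERNS = [
    # Four numeric groups at the start of the name, as in 1.1.1.192.some-domain.com
    re.compile(r"^(\d{1,3}[.-]){3}\d{1,3}\."),
    # Common access-pool keywords used as a label or label prefix
    re.compile(r"(^|[.-])(dsl|dial(up)?|dhcp|dyn(amic)?|pool|ppp|cable)([.-]|\d)"),
]


def looks_generic(reverse_name: str) -> bool:
    """True if the reverse DNS name resembles a generic access point."""
    name = reverse_name.strip().rstrip(".").lower()
    return any(pattern.search(name) for pattern in GENERIC_PATTERNS)


for host in ("1.1.1.192.some-domain.com", "mail.some-domain.com",
             "gateway.some-domain.com"):
    print(host, looks_generic(host))
```

With these assumed patterns, the first name is flagged as generic, while mail.some-domain.com and gateway.some-domain.com are not.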
The 3rd example is more difficult, and can arise for many reasons, from sys admins who do not monitor outbound email traffic, to compromised servers and open relays, to companies that are not able to control the issue. Often very prestigious ISPs fall into this category, including many well known 'Free Email Services' that are abused by Spammers. In this case, it is very difficult to take the high road and just block the whole ISP, as many people may want to receive legitimate mail from such sources. You COULD block them and allow users to whitelist, but this is often not the preferred approach, so this is where filtering mechanisms tend to be the fallback technology, and where encouraging ISPs to adopt outbound rate limiting etc. can help. It is in this area that the most work needs to be done, and where applying 'privilege' rules is more difficult.
However, using 'privilege' based controls such as RBLs, blacklists, and Mail Server Profiling can reduce the amount of incoming Spam by upwards of 90%, so that IF you need to use filtering technology, at least you can achieve a significant performance improvement.
(Note: One of the best examples was a recent client who switched to our technologies and claimed to have reduced the inbound mail that his servers had to handle from 2 million messages per day to 60 thousand messages per day.)
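For reference, the client's figures can be turned into a percentage with a quick calculation. The helper name below is just for illustration.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage drop in daily message volume."""
    return 100.0 * (before - after) / before


# Client figures quoted above: 2 million messages per day down to 60 thousand.
print(reduction_percent(2_000_000, 60_000))  # 97.0
```

That works out to a 97% reduction, which is in line with the "upwards of 90%" figure.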