In the heady days after “A Plan for Spam”, developing anti-spam software was an exciting field full of rapid developments, as giants of computer science and mathematics like Tim Peters, John Graham-Cumming, and Bill Yerazunis came up with new ideas to identify spam. In those days, it felt like a constant race with the spammers – they would come up with a new trick to fool our software, we’d figure out a counter to the trick, and so on – building better mousetraps all the way down.
These days, everything seems a lot tamer. Spam is still a big problem, but effective filtering gets rid of most of the bad mail, and to be a great product, it’s critical to offer not just filtering but a wide range of additional functionalities, like integration, configurability, flexibility, and responsive support. Spammers still come up with the occasional new trick, but the innovation from the bad guys has really moved to other areas, like social media.
So does this mean there are no longer any risks in the email protection industry? Not at all.
Sender IP Reputation
One mechanism for detecting spam is considering the reputation of the sending server’s IP address (through DNSBLs and the like). With storage getting cheaper and faster every year, it’s quite feasible to hold information about all four billion IPv4 addresses, and extremely efficient to do a check like this. But wait – that’s IPv4; with IPv6, we need to consider roughly 3.4 × 10^38 addresses. Even if you treat a full /48 as a single location, you’re still looking at around 281 trillion (2^48) locations that a spammer can send from, each containing over a septillion addresses.
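To put rough numbers on that comparison, and to show what a DNSBL-style check looks like for each address family, here is a minimal Python sketch. The zone name `dnsbl.example` and the function name are placeholders rather than any real list's API, and the reversed-octet and reversed-nibble query format follows the common DNSBL convention described in RFC 5782. The sketch only builds the query name; it does not perform a DNS lookup.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str = "dnsbl.example") -> str:
    """Build the DNS name a DNSBL-style lookup would query for an address.

    `zone` is a hypothetical list; substitute a real one to experiment.
    """
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        # IPv4: reverse the four decimal octets.
        labels = str(addr).split(".")
    else:
        # IPv6: reverse the 32 hex nibbles of the fully expanded address.
        labels = list(addr.exploded.replace(":", ""))
    return ".".join(reversed(labels)) + "." + zone


# Address-space arithmetic behind the paragraph above.
ipv4_total = 2 ** 32              # about 4.3 billion addresses
ipv6_total = 2 ** 128             # about 3.4 x 10^38 addresses
ipv6_slash48_prefixes = 2 ** 48   # distinct /48 "locations"
addresses_per_slash48 = 2 ** 80   # addresses inside a single /48

if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.1"))
    print(dnsbl_query_name("2001:db8::1"))
    print(f"IPv4 addresses:      {ipv4_total:,}")
    print(f"IPv6 addresses:      {ipv6_total:.3e}")
    print(f"/48 prefixes:        {ipv6_slash48_prefixes:,}")
    print(f"Addresses per /48:   {addresses_per_slash48:.3e}")
```

A table with one entry per IPv4 address is a few gigabytes at most; nothing comparable is possible for IPv6, even at /48 granularity.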
IPv6 has been around for nearly 20 years now – but in recent years, we’ve seen more systems starting to actively deliver mail (both good and bad) from IPv6 addresses. As more servers transition over to IPv6, negative sender IP reputation becomes less and less effective as a method of identifying spam.
The industry is lucky here that the IPv6 transition is going so slowly, as it gives us time to figure out ways to combat this vulnerability. In particular, positive sender IP reputation is still a useful indicator, so we can move from recognizing mail as spam to recognizing mail as not being spam.
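As a rough illustration of what a positive-reputation check could look like, the Python sketch below keys IPv4 senders by individual address and IPv6 senders by their enclosing /48. Everything here is an assumption made for the example: the `KNOWN_GOOD` entries use documentation address ranges, the /48 granularity is just one plausible choice, and the function names are hypothetical rather than taken from any existing product.

```python
import ipaddress

# Hypothetical allowlist: one IPv4 address and one IPv6 /48 (documentation ranges).
KNOWN_GOOD = {"192.0.2.25", "2001:db8:1234::/48"}


def reputation_key(ip: str, v6_prefix_len: int = 48) -> str:
    """Map a sender address to the key used for reputation lookups.

    IPv4 addresses are tracked individually; IPv6 addresses are collapsed
    to their enclosing prefix (a /48 by default, which is an assumption).
    """
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        return str(addr)
    network = ipaddress.ip_network(f"{addr}/{v6_prefix_len}", strict=False)
    return str(network)


def is_known_good(ip: str) -> bool:
    """Return True if the sender falls under an allowlisted key."""
    return reputation_key(ip) in KNOWN_GOOD


if __name__ == "__main__":
    for sender in ("192.0.2.25", "2001:db8:1234:5678::1", "2001:db8:9999::1"):
        print(sender, is_known_good(sender))
```

The point of the design is that the list only has to cover senders that have earned a good reputation, so its size no longer depends on how large the address space is.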
Unfortunately, initiatives to formalize this reversal (like ipv4whitelist.eu) haven’t been successful so far. It looks like a decentralized, automated approach – likely one that is very challenging for newcomers – will be what ends up in place in most systems.
We are the problem
In a way, one of the largest vulnerabilities comes from a conflict between the services that companies provide. Although botnets of end-user systems still exist, the hardening of end-user operating systems, and the move away from unprotected dial-up to connections behind a router that probably has at least a basic firewall, mean that this is much less of a problem than it once was.
Today, the bulk of unwanted mail comes from compromised servers. Unfortunately, it’s simple to set up a basic install of a product like WordPress without having the necessary security procedures in place (like rapid security updates) and end up an unwilling source of bad mail. Companies that provide hosting facilities need to allow their clients to send legitimate mail, and to install software, but also need to ensure that they don’t end up a source of spam or malware. A large hosting company will often be a significant source of spam, while at the same time they are actively filtering out inbound spam.
In many ways, the onus of filtering is moving from the recipient end to the sender end. You’d think this was much simpler – no more need to track down some bad player in a country you’ve only heard of in pub quizzes. However, when filtering mail at the sender’s side, there’s a lot less information available, so distinguishing between good and bad mail becomes much more challenging.
In addition, these servers aren’t generally sending out the personal, unique messages that we’re all happy to receive (and indeed, a lot of those are now coming through Facebook, Slack, Twitter, or Instagram). Instead, they are sending commercial mail, mailing lists, notifications about the latest new startup, and product announcements. Most organizations are doing a good job of ensuring this mail only goes to people who have asked for it, but it does look a lot like the mail that people don’t want, and it’s often mail that one person will want and another will not. Deciding what to do with this at the sender’s end is much harder than at the recipient’s.
Invasive tracking, data leaks, and spies – oh my!
Email protection isn’t just about preventing people from sending or receiving spam and malware. Increasingly, organizations are being asked (by governments, generally) to keep more information (either logs and other metadata or the actual full messages themselves) while at the same time coming under sophisticated attacks from a range of sources (some unabashedly bad, and some that believe that they are acting in the world’s best interests).
ESPs and companies that provide email services, especially those with a SaaS model, need to be more and more focused on their own security, both software and processes, to be sure that everything possible is being done to ensure that user data is never exposed, and that obligations to retain information are fulfilled. With every email leak, more users migrate away from email to a walled garden alternative, or decide that using an external service provider is an unaffordable risk.
(Increased security can make filtering harder, too. So far, end-to-end email encryption, such as with GPG, has never caught on, as it’s too complex for most users. However, we’ve seen that in modern environments, where the majority of devices are mobile and more restricted, it only takes one company to decide to make an improvement – like defaulting to TLS for all HTTP traffic – to have a phenomenal impact. If Apple or Google decided to make email encryption good enough to lick, there would be much less information available for a SaaS filtering solution.)
One person’s vulnerability is another’s fascinating challenge …
I do miss the old days of rapidly chasing (and overtaking!) spammers – the challenge of building a new filtering technique every week was invigorating. However, the challenges that confront the industry now that it has matured (and the above are just three) are a lot harder in many ways, and I’m glad that I’m working on those, along with many talented colleagues here and elsewhere, as an older and wiser person.