In this article, we’re presenting “passthrough,” a powerful technique for e-mail filtering I’ve never seen described by others. I particularly like it because it’s so easy to “retrofit” on existing mail servers. I hope passthrough benefits you as much as it has my clients and me.
Here’s the problem passthrough most often solves: A system administrator wants to filter a fraction of incoming e-mail, while handling the bulk of it as normal. The administrator knows a criterion by which he or she can discard spam, autorespond to information queries, log periodic receipts of updates, or otherwise process the selected items. How, though, can he or she extract only those items from the incoming message stream, and let all the others pass through for normal delivery?
Unix workers have a traditional solution. Most of the world’s e-mail is handled at some point by industrial-strength Unix servers, where several distinct processes cooperate to route and deliver individual items. A typical combination has sendmail routing messages between servers, and procmail delivering messages to individual mailboxes on a particular server. Dave Anderson, president and chief executive officer of Sendmail, Inc., estimates, “Sendmail is the industry’s dominant Internet messaging server, routing two-thirds of the world’s email.” Even Microsoft-based e-mail networks are often gatewayed through a sendmail-based firewall.
As a complement to sendmail, procmail is an 11-year-old open-source application that offers a wealth of filtering capabilities. It’s already in wide use on large e-mail servers, and it’s possible that the largest share of e-mail filtering on the Internet as a whole is currently performed by procmail.
Arcane and Inflexible
So procmail solves the problem, right? Not entirely. While procmail has the virtues of being fast, widely available, and relatively secure, it’s also regarded as intimidating to newcomers.
Moreover, its own “recipe” language for filtering is inextensible; in particular, it has no provision for “attachments” or other modernizations of e-mail, which rely on the Multipurpose Internet Mail Extensions (MIME). Procmail is great for sorting through messages with a subject line of “This is spam,” but getting it to segregate only attachments with an extension of .vb has always been regarded as deep wizardry.
Until now. Rather than try to wedge more capabilities into procmail, my approach is to define a recipe that hands e-mail items over to an external processor. The sidebar shows a small but complete example of using procmail’s ‘w’ flag to pass e-mail through a Tcl filter. I often use the Tcl programming language for such e-mail filters because Tcl is concise, less resource-intensive than competing scripting languages, easy to learn for other administrators with whom I work, and has good libraries for the kinds of processing I need. However, the passthrough technique is “language-neutral,” and I’ve also employed Python, Perl, Ruby, C, Java, and several less well-known languages. In fact, many of the smallest and fastest filters can be written with just the Unix shell (/bin/*sh).
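To make the idea concrete, here is a minimal sketch of what such an external processor might look like in Python, one of the other languages mentioned above. It is an illustration rather than the filter from my own servers: the script path, the names `BLOCKED_SUFFIXES`, `should_consume`, and `main`, and the choice to match on a .vb attachment name are all assumptions made for the example. The recipe in the opening comment relies on the ‘w’ flag, which tells procmail to wait for the program and check its exit code. An exit status of 0 marks the item as handled, while a nonzero status lets it pass through to the recipes that follow.

```python
# Hypothetical procmail recipe that would call this script (path is an assumption):
#
#   :0 w
#   | /usr/local/bin/vbfilter.py
#
# Exit status 0     -> procmail treats the item as delivered (the filter took it).
# Exit status non-0 -> procmail carries on with its remaining recipes (passthrough).

import sys
from email import message_from_bytes, policy

# Assumption for this sketch: only attachments whose names end in .vb are selected.
BLOCKED_SUFFIXES = (".vb",)


def should_consume(raw: bytes) -> bool:
    """Return True if any MIME part carries a filename with a blocked suffix."""
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        filename = part.get_filename()
        if filename and filename.lower().endswith(BLOCKED_SUFFIXES):
            return True
    return False


def main(stream=None) -> int:
    """Read one message and return the exit status procmail should see."""
    if stream is None:
        stream = sys.stdin.buffer
    raw = stream.read()
    # A real filter would quarantine, log, or autorespond here before returning 0;
    # this sketch only makes the decision.
    return 0 if should_consume(raw) else 1
```

To run it from the recipe, the script would finish with a line such as `sys.exit(main())`. With a lowercase ‘w’, procmail is likely to log a “Program failure” message each time the filter declines an item; the uppercase ‘W’ flag is documented as suppressing that message. Since an item the filter accepts is considered delivered, a filter that should keep a copy has to save it itself.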
Does It Matter?
Passthrough is important; every e-mail administrator should know about it. Procmail is in wide use, not just with sendmail, but also as a “delivery agent” for such other popular e-mail routers as Postfix, Exim, and Qmail. Several FAQs on procmail discuss filtering with external processes; see this link for an example.
I frequently come across workers who need more programmability in their filtering, and yet I’ve never seen “w” passthroughs used or explained. In writing this article, I did discover a PGP key retrieval recipe described by Adam Shostack that has used passthrough since the mid-’90s. Apart from that, the only use I’ve observed of “w” is in a procmail recipe idiom for certain kinds of transformations applied to messages. I’d welcome information about other instances.
Passthrough’s greatest disadvantage is that it limits performance. A straightforward passthrough recipe spawns an extra process for each e-mail item it examines. Heavily loaded e-mail servers can’t afford this.
However, passthrough is so simple and reliable to implement, and gives such important benefits, that I use it often. Its flexibility in rationalizing e-mail processing often means that overall system responsiveness improves when I use it. On the one occasion that it turned out to constrain performance, I was able to transform it into a “trickier,” but higher-performance, pipeline. Write me at this address with news of how it works for you.