A system for sifting: mining email

Valuable data lies in your email archives, and it's maddening that something you once came into direct contact with can be so hard to retrieve. Sadly, there's no dynamite to help you dig through the rock.

Pebble by pebble, your email inbox has collected enough important information to amount to a small mountain of data. So much so that when you're looking for one particular item, you sometimes feel like you're chipping away at that mountain with a spoon. It gets harder still when the information you're looking for is an attached file: all too often the accompanying email has no keywords or subject line related to the file.

First off, we need to address the inefficient tools on hand: searching and browsing.

Searching for keywords with the search engine native to our mail reader, even Gmail's, which is backed by the world standard that is Google, seems to yield poor results. Search requires us to know in advance the precise context of the information we want to extract. Yet it's common for an attachment to be included in a nondescript email, or for unique info to be casually dropped into the body of an unrelated email.

Browsing is a time-intensive method, and demands the patience of panning a stream for gold nuggets. All this looking around has beneficial byproducts, however: we often find related or additional information before reaching the target, and sometimes we strike upon breadcrumbs as we prospect our way, hinting at better search parameters.

Both of these tools, then, have their own pros and cons. But just as in actual mining you cannot hope to strip a mountain of its rich veins with a pickaxe alone, in this endeavor you must also set up an infrastructure to gain access to the deeper strata.

Labeling context – Most email clients offer some system for attaching native labels to emails, or even cataloging them by color. You might have fifty emails concerning the same subject, but devoted to different projects. An extra second spent tagging them according to their context will open up tunnels into that mountain, and enable you to reach your information nuggets faster.

Subject line flagging – When it comes to attached files, the email's subject line must have very clear keywords pointing to its contents. Granted, you're mostly limited to the emails you compose on this one, but nothing stops you from rewriting the subject line when replying to an email with an attached file. These “canaries” are your warning sign that something may have been lost along the way, at the very least.
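For the technically inclined, the subject line check can be automated. What follows is only a rough sketch, assuming you have your messages available as parsed Python `email` objects (for instance from an exported mailbox); the helper names `attachment_names`, `keywords` and `needs_better_subject` are made up for this illustration and are not part of any email client. It flags messages that carry an attachment whose filename shares no word with the subject line.

```python
import re
from email.message import EmailMessage, Message


def attachment_names(msg: Message) -> list[str]:
    """Return the filenames of every part of the message that has one."""
    return [part.get_filename() for part in msg.walk() if part.get_filename()]


def keywords(text: str) -> set[str]:
    """Lowercased runs of three or more letters or digits found in the text."""
    return set(re.findall(r"[a-z0-9]{3,}", text.lower()))


def needs_better_subject(msg: Message) -> bool:
    """True when some attachment's filename shares no keyword with the subject."""
    subject_words = keywords(str(msg.get("Subject", "")))
    for name in attachment_names(msg):
        stem = name.rsplit(".", 1)[0]  # drop the file extension
        if not keywords(stem) & subject_words:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical message: a spreadsheet buried in a nondescript reply.
    msg = EmailMessage()
    msg["Subject"] = "Re: lunch"
    msg.set_content("See attached.")
    msg.add_attachment(
        b"placeholder bytes",
        maintype="application",
        subtype="octet-stream",
        filename="budget_2024.xlsx",
    )
    print(needs_better_subject(msg))
```

The keyword match is deliberately crude: a file named `scan0001.pdf` will always be flagged, which is arguably the point, since neither its name nor the subject tells you what is inside.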

Granted, all these flourishes seem to take up valuable work time, but that cost doesn't compare to losing information that is actually just waiting to be retrieved. Information extraction doesn't have to be a jump into a dark pit, as long as you don't forget your battery lights.