Blog - Riparian Data (http://www.ripariandata.com/blog/)
A blog about security, privacy, algorithms, and email in the enterprise.

Python Resources
Big Data, General Development | Thu, 21 Nov 2013 | http://www.ripariandata.com/blog/python-resources
By David Wihl
Python is the language of choice for many applications that collect and analyze text data, including Skimbox's email classifier. Since Python was not, for me, a native tongue, I signed up for an edX course, Introduction to Computer Science Using Python. The following list contains books, tutorials, tools, and other resources my classmates and I have found useful. If you have any to add, please do so in the comments! <insert obligatory Monty Python reference here><preferably about coconut-carrying swallows><or shrubbery></end reference>

Books

My favorite is Think Python by Allen Downey, a professor at Olin College: a good general overview of the Python language, with exercises.

Learning Python, 5th ed., by Mark Lutz. The 5th edition uses Python 3 by default but shows Python 2 examples where major differences occur. The 3rd edition is entirely Python 2, which is preferable in many ways, as Python 2 continues to be widely used.
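For readers deciding between editions, a few of the Python 2/3 differences those books track can be seen in a handful of lines. This sketch runs under Python 3; the comments note the Python 2 behavior.

```python
# Runs under Python 3; comments show the Python 2 equivalents.

# 1. print is a function in Python 3; in Python 2 it was a statement: print "hello"
print("hello")

# 2. / is true division in Python 3; in Python 2, 7 / 2 between ints truncated to 3.
assert 7 / 2 == 3.5      # Python 3: float result
assert 7 // 2 == 3       # both versions: floor division

# 3. range() returns a lazy object in Python 3; Python 2's range() built a full list.
assert list(range(3)) == [0, 1, 2]

# 4. Text is Unicode by default in Python 3; Python 2 strings were bytes.
assert isinstance("héllo", str)
```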

Why the Unread Count Is a Worthless Number
Email Habits | Tue, 19 Nov 2013 | http://www.ripariandata.com/blog/why-the-unread-count-is-a-worthless-number
By Brian Barnes

One of my favorite email-related questions to ask is: “how many unread emails are in your inbox right now?”

The average is 1,190!

Think about where you see that number. In Outlook, it sits in parentheses to the right of your Inbox folder; Gmail displays it there as well, and also in your browser tab. On your smartphone, it sits in a red badge atop many mail application icons. Indeed, by default nearly every email client puts this number front and center, as if it were the most important indicator of the state of your inbox.

But is it? If you have 1,190 unread emails and you get 5 more, can you really tell the difference? Do you often think to yourself: “I have to tackle 4,567 unread emails today”? Unless you keep your inbox well-managed (and some people do), the unread count is a worthless number.

But that doesn’t mean all email counts are worthless. What you really need is a badge or count that means something real - something that calls for action. Given Skimbox’s present functionality, it’d be easy enough to show a badge for the number of unread important emails. (Why is this worthy? Well, if people have thousands of unread emails, it seems pretty safe to assume the bulk of them are not important or they would have been read.) Even for those rare inbox zero adherents, knowing there are 5 important unread emails would be much more useful than knowing there were 11 unread emails.

So the unread important count seems useful, but let’s take it even further. Since many of us treat our email like a to-do list, and since many of those to-do’s are in part to-answers, what about a needs-reply count?

If, instead of knowing I had 20 unread emails or 12 unread important emails, I knew I had 5 emails that needed a reply, I would know exactly how much email-related work I had to do at any given point.
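The post doesn't say how such a count would be computed; Skimbox's real classifier learns from data. Purely as a hypothetical illustration, even a crude hand-written heuristic conveys the idea: flag messages that appear to expect an answer.

```python
def needs_reply(message: dict) -> bool:
    """Crude heuristic: does this message appear to expect an answer?

    `message` is a hypothetical dict with 'subject' and 'body' keys;
    a learned classifier would infer these signals rather than hard-code them.
    """
    text = (message["subject"] + " " + message["body"]).lower()
    # A direct question in the body is the strongest signal.
    if "?" in message["body"]:
        return True
    # Common ask-phrases anywhere in subject or body.
    asks = ["please reply", "let me know", "can you", "could you", "rsvp"]
    return any(phrase in text for phrase in asks)

inbox = [
    {"subject": "Lunch?", "body": "Are you free Thursday?"},
    {"subject": "Weekly digest", "body": "Here is what happened this week."},
    {"subject": "Budget", "body": "Please reply with your Q4 numbers."},
]
needs_reply_count = sum(needs_reply(m) for m in inbox)
print(needs_reply_count)  # → 2
```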

Teach a Man to Phish
Mon, 11 Nov 2013 | http://www.ripariandata.com/blog/teach-a-man-to-phish
By Claire Willett

And he’ll hack for a lifetime. And how!

Spear-phishing, the practice of luring people in organizations into giving up control of confidential data via seemingly legitimate emails, is the contemporary corporate hacker’s BFF. Why not, when all it takes is one employee clicking on one “2011 recruitment plan.xlsx”?

While researchers are working on ways to prevent spear-phishing (more on that in a bit), right now companies have to rely on a combination of IP monitoring and skeptical/savvy/paranoid employees.

Judging by the glut of spear-phishing attacks, such employees are few and far between. To wit:

The New York Times

Details: In response to a New York Times investigative series on the finances of Chinese premier Wen Jiabao, Chinese hackers persistently targeted the email accounts of Times employees involved with the story. This was a rather unusual hacking case in that the Times had been aware of the attacks from the get-go, and was observing them in order to understand how they were implemented -- though this proved trickier than the Times had initially thought.

Methodology: The hackers first gained access to computers at US universities and routed emails containing malware (spear-phishing attacks) through them. The malware allowed the hackers to gain entry to any computer on the Times’ network.

RSA

Details: In order to gain information pertaining to RSA’s SecurID products, an anonymous hacker launched a so-called zero-day attack predicated on a single employee downloading a single spreadsheet.

Methodology: Hackers sent specific RSA employees a phishing email with an Excel spreadsheet attached, entitled “2011 Recruitment Plan”. The spreadsheet hid an embedded Flash exploit; when opened, it downloaded a remote administration tool called Poison Ivy. Hackers used the tool to harvest credentials, used those to access other, higher-level accounts, and then reached servers and server data, which they copied and extracted.

The Financial Times

Details: The Syrian Electronic Army launched a successful spear-phishing attack on the employees of the Financial Times in response to “William Hague and David Cameron’s recent allocation of 40 million pounds to fuel death and destruction in our country in order to obtain political consessions [sic].”

Methodology: The SEA sent malicious rick-rolls to certain FT staff from external email accounts, some of which were the personal accounts of FT staff. The links appeared to lead to a CNN story, but actually redirected to a page that mimicked the FT’s email login screen. The SEA was able to take control of the accounts of FTers who logged into that screen, send out email from their accounts (including IT notifications urging employees to change their passwords immediately), and post links to SEA materials from hacked FT Twitter accounts and WordPress blogs.

The FT is now encouraging and in some cases mandating two-factor authentication across the organization.

The Associated Press

Details: Remember when the AP tweeted that the White House had been attacked and the president injured? Turns out that tweet was the work of the SEA as well. Using spear-phishing tactics similar to those used in the FT attack, the SEA sent an email from one AP staffer to another containing a link purporting to lead to Max Fisher’s WorldViews blog at the Washington Post. The tweet was repudiated quickly, but not before it had been retweeted thousands of times and briefly sent the Dow spiraling down 143 points.

Spear-phishing is a particularly insidious method of attack because each instance of it is unique and, often, well-disguised. But that doesn’t mean organizations should throw up their hands or ban all clicking of emailed links and downloading of emailed attachments. Researchers at Trustwave are looking into natural language algorithms capable of detecting phishing emails -- no easy feat, considering these emails are predicated on aping the purported senders’ writing style.

There’s also the nascent Darkmail Protocol, Lavabit founder Ladar Levison’s effort to open source the Lavabit code. Darkmail ditches SMTP for end-to-end encryption of both the message and its metadata in transit, which would mean that any aspiring phisher would need to acquire the sender’s and recipient’s secret keys before sending an email.

Right now, a company’s best bet may come down to common-sense policies, like asking employees to read emails in plaintext (which would reveal URL inconsistencies) and to double-check with attachment senders via alternative channels.
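The plaintext advice works because HTML mail can display one URL while linking to another -- exactly the trick behind the fake-CNN links above. As a hypothetical illustration (not an actual product technique), a few lines of Python can flag links whose visible text names a different host than their real destination:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkMismatchFinder(HTMLParser):
    """Flag <a> tags whose visible text names a different host than the
    actual href -- the trick behind links that look like CNN stories."""

    def __init__(self):
        super().__init__()
        self.href = None       # href of the <a> tag currently open, if any
        self.text = ""         # visible text accumulated inside it
        self.mismatches = []   # (shown_url, real_href) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href", "")
            self.text = ""

    def handle_data(self, data):
        if self.href is not None:
            self.text += data

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            shown = self.text.strip()
            # Only compare when the anchor text itself looks like a URL.
            if shown.startswith("http"):
                if urlparse(shown).netloc != urlparse(self.href).netloc:
                    self.mismatches.append((shown, self.href))
            self.href = None

finder = LinkMismatchFinder()
finder.feed('<a href="http://evil.example/login">http://cnn.com/story</a>')
print(finder.mismatches)  # → [('http://cnn.com/story', 'http://evil.example/login')]
```

A real filter would also normalize lookalike domains and punycode, but even this check would have surfaced the SEA's redirects.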

Mail Me Later
Email Habits | Mon, 04 Nov 2013 | http://www.ripariandata.com/blog/mail-me-later
By Claire Willett

The Psychology of Deferring Email

It has been said by many, and certainly by me, that a writer will do anything to put off writing. In our procrastination we clean the gunky bits off the stove, puncture our eardrums, and start Coursera courses about artificial intelligence and behavioral marketing and bi-amorous poets of 19th-century Shanghai.

One could argue that this procrastination is detrimental, and one might be right -- but another could point out that active procrastination frees the mind to drift toward creative inspiration -- and besides, look how clean the stove is!

Email is rather interesting, from a procrastination standpoint, because it is both something people procrastinate from and use to procrastinate with. The ill effects of the latter have been well documented in studies conducted by research institutions and big budget technology companies alike, but it’s the rise in popularity of tools aiding the former that I’d like to discuss (and get your feedback on, should you feel so obliged).

The snooze button is over half a century old, but it didn’t make a big splash in our inboxes until 2010, when a small startup baked it into a Gmail plugin called Boomerang. Boomerang has since added a slew of other features (including some fun game-based ones), but its core value remains the same: letting you, the recipient, decide when an incoming email should return to your inbox. And it really is a value: it played a big role in Mailbox’s initial success, and the ability to defer has become almost de rigueur for new mail apps, Skimbox included.

Why? Well, because people like it, and they like it because it either gives them more control over their inbox or gives them the perception of more control over their inbox. When I asked the good people of Twitter if they deferred email, I got affirmative responses, like this one, from TimSell: “I defer emails to people I love when I want to think about what they said. If my email matters I want it to be good. And then there's stuff I just want to avoid. Like emailing a landlord.”

A common scenario we encountered in our interviews: person checks email on mobile device → sees an email requiring a thoughtful response → tries to remember to respond once he’s at his desktop computer → forgets to respond at his desktop computer. Defer neatly solves this.

Too neatly, perhaps. Mailbox’s swipe makes deferring messages so very easy that it is tempting to tell the lot of them to go away, come again some other day. In a post about Boomerang and email management on The Next Web, Zee writes “The important thing to remember here is that there is only one reason you should ‘boomerang’ an email in the morning: when you are physically incapable of replying because you need more information to respond to the email."

Macdrifter has a similar, more schematic approach: “I defer a lot. I like to think about each input before deciding on the next action. If I can't decide on an immediate next action within 5 seconds, I defer it."

And the beauty of systems like Boomerang is they take over much of the memory onus. In a review of Mailbox on TechRepublic, Will Kelly notes: “I use this feature often to defer email I receive late in the day and want to respond to when I return from the gym or first thing the next business day."
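Under the hood, that memory onus reduces to a priority queue keyed on wake-up time. A minimal sketch of the idea (hypothetical names; a real client would persist this server-side and key on message IDs):

```python
import heapq

class SnoozeQueue:
    """Toy model of Boomerang-style deferral: hide a message now,
    return it to the inbox once its wake-up time arrives."""

    def __init__(self):
        self._heap = []  # (wake_time, message), smallest wake_time first

    def defer(self, message: str, until: float) -> None:
        heapq.heappush(self._heap, (until, message))

    def due(self, now: float) -> list:
        """Pop and return every message whose wake-up time has passed."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[1])
        return ready

q = SnoozeQueue()
q.defer("reply to landlord", until=18.0)   # this evening
q.defer("gym-time follow-up", until=9.0)   # tomorrow morning
print(q.due(now=10.0))  # → ['gym-time follow-up']
```

Everything else -- swipe gestures, "next business day" shortcuts -- is interface over this one structure.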

So yes, bottom line: deferring, while having Sisyphean potential, is overall a great help in minimizing email overload.

I should say, though, that for Skimbox, defer is not a major feature. While we started out with it in our swipe menu, à la Mailbox, we ended up moving it to the left-hand menu and giving move-to-skim/move-to-main primary swipe billing. We did this because our core feature is classification, and making reclassification easy to access means we get more input on how the classifier is performing.

In future versions, this may change -- it depends on what the Skimboxers want most.

5 Reasons You Should Wrap Inbox Love in a Warm Embrace
Events | Mon, 28 Oct 2013 | http://www.ripariandata.com/blog/5-reasons-you-should-wrap-inbox-love-in-a-warm-embrace
By Claire Willett

There are scads of email marketing conferences. Inbox Love, however, focuses on the technology of email, and the people behind it. Call us biased, but we think it’s a richer area of exploration. ;) This will be our second year at the conference, which is organized by Jared Goralnick (AwayFind), Joshua Baer (OtherInbox), and 500 Startups. If you have any interest in email [1], and if next Wednesday finds you anywhere near the Bay Area, we’d recommend dropping by. Five reasons why:

1. Ladar Levison. The man who famously shut down secure email service Lavabit rather than hand user information over to the US government has certainly earned our ears. At Inbox Love, he’ll be filling them with useful-meets-scary information like: how to protect your customers’ data, and what to do about that subpoena.

2. Jason Cornwell, Kelly Goto, and Steven Whittaker’s talk on design decisions for email products. With email, the look, feel, and functionality of the medium play crucial roles in the reception of the message. Between Gmail’s UX design lead and two human-computer interaction experts, current and aspiring email client makers should come away from this talk with a good idea of what to do -- or undo -- next.

3. Edith Harbaugh, Ryan Fuller, and Raj Singh's talk on email data. The incredibly rich and varied social and textual information stored in email was what got us interested in the communication format in the first place, but as this talk’s description notes, “there is a fine line with how far to take this and what to store.” We’re curious to see how all three of the speakers’ companies make use of this data.

4. In-mail actions. The concept of email-as-action-hub isn’t new (see: PowerInbox, Movable Ink), but Gmail’s implementation of it means it’s a) here to stay, b) cleaner, and c) extendable (one recent example is Square Cash). At Inbox Love, two of the developers behind Schema.org will explain how to use it.

5. Demos from new email-related startups. Set list TBD, but we’re pretty sure it contains at least a few next big things.

1. And why else would you be reading this blog?

Decide for me
AI | Tue, 22 Oct 2013 | http://www.ripariandata.com/blog/ai-that-decides
By Claire Willett
How privileged AI might save our brains

The other day, I read a wonderful piece in Nautilus Magazine about the teacher-student relationship between a handwriting-recognition scientist, Vladimir Vapnik, and the algorithms he was training to do the recognizing. In general, data points are technical measurements; in Vapnik’s approach, they are experiential metaphors. In general, a would-be AI is force-fed somewhere between hundreds and millions of these points; Vapnik gave his algorithms 100 poems, each describing a different handwritten 5 or 8.

We humans use, create, and pass on metaphorical concepts (time is money, communication is sending, a right-slanted 5 is dangerous) to make quicker sense of our worlds. Vapnik’s results indicate that a computer trained on this type of experiential information can learn far faster, with far less data, than one trained gavage-style. In the (perhaps very near) future, metaphor might be the de facto conduit for artificial intelligence. And if it is, I suspect that rather than rule the world, robots will help us rule our own.

How? By taking over part of our decision process.

In a New York Times Magazine feature on decision fatigue, John Tierney wrote about prisoners who appeared before a parole board on the same day. The first man, who spoke to the board at 8:50 am, and the third man, who spoke to the board at 4:25 pm, had the same sentence (30 months) for the same crime (fraud). Yet only the first man was granted parole.

Sure, it could have been because the first man was more penitent, but Tierney points to a different reason: by the late afternoon, the board was suffering from decision fatigue. “No matter how rational and high-minded you try to be, you can’t make decision after decision without paying a biological price," he wrote.

But you know who can? Computers. They already make decision after decision in finance and ad tech. Doing so for humans just requires a different type of training data.

Skimbox is one early example of this. Much as one poet’s descriptions taught a computer to recognize handwriting, your past and current actions teach Skimbox’s neural nets about relevance, immediacy, and never-need-to-see-thanks. Thus equipped, Skimbox can presort new messages in order to offload the “do I need to read this?” decision.
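As a toy illustration of that feedback loop (not Skimbox’s actual model, which the post describes as neural nets), here is a one-neuron learner that nudges per-word weights each time a user files a message, then routes new mail by the sign of the summed weights:

```python
from collections import defaultdict

class ToyMailboxLearner:
    """Minimal stand-in for learning from user actions: each move to
    Skim or Main adjusts per-word weights; new messages are routed by
    the sign of their summed weights. Hypothetical names throughout."""

    def __init__(self):
        self.weights = defaultdict(float)

    def _score(self, words):
        return sum(self.weights[w] for w in words)

    def classify(self, message: str) -> str:
        return "main" if self._score(message.lower().split()) >= 0 else "skim"

    def learn(self, message: str, moved_to: str) -> None:
        """User feedback: a move to Main nudges these words up, Skim down."""
        step = 1.0 if moved_to == "main" else -1.0
        for w in message.lower().split():
            self.weights[w] += step

learner = ToyMailboxLearner()
learner.learn("weekly build notification", moved_to="skim")
learner.learn("urgent contract question", moved_to="main")
print(learner.classify("build notification again"))   # → skim
print(learner.classify("contract question for you"))  # → main
```

The point is the shape, not the model: every reclassification is free training data, which is exactly why making it easy to move messages matters.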

And there are other apps that either make or cue-up other decisions. Google Now uses your Google account data to offer pre-emptive suggestions and alerts: leave in 14 minutes to arrive at the Father John Misty concert on time; go here to park your car, oh and here’s your ticket.

The sleek thermostat Nest learns your temperature preferences from your initial adjustments, gives you time-based temperature recommendations, and notifies you of energy savings. And then there’s LivesOn, an AI which studies your language and tastes so that it can tweet as you after you pass away. It’s a morbid and silly concept, but then, there's no reason why you'd have to die for LivesOn to start tweeting.

Now, let’s say you used all of these products in conjunction. Imagine the time and brain power that would save! And these are only the start of what privileged AI could do.

To quote Styx: “Domo arigato, Mr. Roboto.”

3 Ways to Make Sure Your Unsolicited Email Gets Read
Email Usage | Mon, 21 Oct 2013 | http://www.ripariandata.com/blog/3-ways-to-make-sure-your-unsolicited-email-gets-read
By Paula Marciante

Skimbox’s approach to solving email overload -- one part separation of critical and lesser emails, one part swipe-based management -- is very effective. But maybe you have a different concern when it comes to email. Maybe you want to make sure that your email lands in someone’s mainbox! As a recruiter, I share that need. Here’s what I’ve found works best.

1. Make it personal

I get a ton of email traffic from external vendors offering to solve all my problems. [1] It’s tempting to dismiss them out of hand, but in the true fashion of networking, you just never know when you may need to reach out, or who may be connected to whom. My response metrics are: how targeted and how useful is this email? If the unsolicited email is tailored to me and my needs, I’ll want to respond. If it’s a template meant to reach and appeal to every person in the sender’s extended networks, most of the time I’ll disregard it.

Templates -- honestly, they’re the number one contributor to no response. I get that you are busy and need to get information out to a large number of people. You have numbers to meet and meetings to set! But if you don’t put in the effort to personalize your email even just a little, then don’t expect much back. At the very, very least, personalize your salutation.

2. Do your research

But research takes so much time, you say? Well, yes, but here’s another basic axiom to keep in mind: quality over quantity. It’s been my secret weapon for years!

A big part of my job involves reaching out to many programmers who are getting unsolicited email from literally hundreds of recruiters. That’s tough competition, right? Also, because I’m supporting small teams, I am targeting a very specific type of profile and don’t have the luxury of large pools of potential qualified talent. When I find potential fits for the team, I take the time to read their blogs and check out any other links that are available. On one programmer’s blog, tucked in between many technical posts, was a short post about a new diet he had tried, with tracked results. On another’s were very clear instructions to recruiters not to contact him during business hours. Bingo.

3. Get creative with your subject lines

My initial email to the first programmer included a subject line about trying the same diet. Result: I got a response within a couple of hours!

In my initial email to the second programmer, I included in the subject line that it was not to be opened during business hours. Result: he opened and responded to it during business hours that same day.

The approach of taking time to do research, getting creative with my subject lines, and paying attention to clues in online profiles has helped me connect with awesome talent! I also “just say no” to templates. Got any tips on how you get your emails to land in the right box and get a response? Share them with us -- we love to hear from you!

1. Except the one about how to stop getting emails from too many vendors.

Skimbox for Exchange and Gmail Is Here
Email Overload | Thu, 17 Oct 2013 | http://www.ripariandata.com/blog/skimbox-for-exchange-and-gmail-is-here
By David Wihl

Skimbox is also the result of great customer feedback, hard work, many user experience studies and a little bloodshed.

We set out over a year ago to explore analytics for email. Part of this exploration involved getting a lot of feedback from a lot of people who got a lot of email. The main takeaway: analytics are cool, but email overload is a much greater daily battle. To wit, quotes like:

“Each conversation is a black hole - it drives me nuts!”

"Workflow has slowed because we have to be mindful of what we put in print. It's a tidal wave of molasses."

"Clients are not respectful of the fact that you have a life. They could email at 2am on a Sunday and expect a response by 7am."

And finally, in response to the question: “How many hours per day are spent on email?”

"All of them."

We came away from these discussions knowing that a) there was much we could do to improve work email, and b) we could make these improvements with a combination of modern machine learning techniques and a great mobile experience.

We’re very proud to bring innovation to enterprise email -- innovation that will save you significant time and distraction from the moment you start using it.

Skimbox offers two significant innovations today: machine learning that gets smarter the more you use it, and security-conscious, on-premises availability for the most demanding enterprises.

Skimbox is built to be non-invasive. It will not disturb your existing inbox, rules, archiving, or compliance environment. From the feedback we’ve already received from highly stringent IT environments, we know that email must be treated as a precious conduit of information for virtually all organizations today.

Skimbox is still young. While we are starting with smart classification, it is just the beginning of what is possible when machine learning is applied to the bane of email. We still have much to learn, implement, and improve -- about machine learning, and about how you want your email to behave.

But, we’re very encouraged that this is the start of something special.

We look forward to your feedback. We know there is a long road ahead. We are proud to deliver this small but significant first step.

Regards, -David

Email Statistics, Visualized
Email Usage | Mon, 14 Oct 2013 | http://www.ripariandata.com/blog/email-statistics-visualized
By Claire Willett

In our Day in the Life of an Inbox series, kindly souls in a variety of professions let me peek into their inboxes. So far, I’ve had 7 peeks, which isn’t a lot of data (and lord knows, I need to get some women in the mix) -- but it’s still pretty interesting. Profession-wise, there are two VCs, one entrepreneur/angel investor, one journalist, one student*, one consultant, and one SVP/Director**.

If you’d like to add your ‘box to the mix, please let me know (clairew at skimbox dot co)!

As you can see, Gmail was by far the most popular mail service -- indeed, 7/7 surveyed used it as their primary work email, personal email, or both. The runner-up was Yahoo, but in both cases it was used as a newsletter/junk-mail address.

When it comes to desktop email, the majority of those surveyed used web browsers over desktop apps like Outlook and Apple Mail.

Paul Graham's clarion call birthed a ream of email apps, but these results speak to the power of the default -- 5/7 primarily use Apple's Mail app.

I failed to ask two of the interviewees for their unread counts, so this dataset is incomplete. What I do have speaks to the individuality of read/unread preferences.

During our rounds of user testing, we've found that venture capitalists and journalists tend to get more email than most professions (lawyers get the most). This bears that out.

More interesting, perhaps, than these raw response numbers are the received-to-responded ratios. The VCs and the VC/entrepreneur had very high ratios, while the student's was 60:1.

* Now a journalist at VPR

**Now VP at Skyhook Wireless

Defining Skimmable Email
Email Overload | Thu, 10 Oct 2013 | http://www.ripariandata.com/blog/defining-skimmable-email
By Brian Barnes

Silence the noise

This summer, Steven Cohen of SAC Capital pleaded not guilty to knowledge of insider trading. His excuse: he gets thousands of emails a day and just didn’t see the email that could have sent him to jail. I know email is broken, but I didn’t realize it was a get-out-of-jail-free card.

This is what is broken with email today: critical pieces of information that could save or enhance your job, company, or family are mixed in with coupons for sky diving lessons. And it isn’t just marketing messages. Chances are, your own company sends you noisy email (collaboration system notifications come to mind)--email that could cause you to miss something really important.

How big of a problem is noisy email? Pretty darn big, as it turns out.

While email isn’t the only source of information overload, it is a significant component of it. Intel estimates email overload costs large companies $1 billion a year in lost productivity. Multiply that by the Fortune 500 and you’ve got half a trillion right there. [1]

Here at Skimbox, we call all this costly noise “skimmable email.” [2] You don’t really need to read skimmable email, and you definitely don’t have to respond to it. It’s not spam, or cold pitches, or newsletters you never subscribed to, but it’s not an urgent request from your boss, either. A great example for me is the update notifications I get several times a day from our project management system. They're completely useless most days, since I’m in the office and know what is going on, but when I’m traveling they are a good way to keep in touch with the team, so I don’t want to unsubscribe from them altogether.

It’s tempting, and common, to frame skimmable email as one of life’s unavoidable annoyances, like traffic, or ads for shows featuring some combination of Kardashians. But it’s actually a bit graver than that: email is killing our productivity, and, in some cases, it’s killing us. The no-texting-while-driving rule applies to email, too. And off the road, when people are checking and responding to email at home, it strains their family time and prevents them from unplugging and re-energizing for the following day.

Besides the critical fact that you might end up in an accident if you check your email while driving, all of the problems associated with email overload are exacerbated when using a mobile device. The smaller screen means it is even easier to miss important emails. Constantly being interrupted kills your productivity at work, but it’s even worse outside of work, because your mind isn’t on work to begin with.

In order to make people more productive, we need to change how we think about email. Business workers need a solution analogous to spam filters, one that reduces the noise and interruptions throughout the day. The solution must work on all clients: not just the desktop, but the phone and the web. It needs to be personalized, and it needs to learn, because the definition of what’s important is different, and fluid, for each of us.

So, what is this solution? Its crux is simple, really: move the skimmable email out of your inbox.

Ok, so the moving itself: not that simple, but that’s why we built Skimbox--to do the moving for you, as you would do if you had the time and inclination.

By our rough estimation, this will save you, on average, a half hour a day. It also means your risk of missing important emails plummets. [3]

Companies are realizing that email is reducing their productivity, and they are putting systems in place to reduce it, such as instant messaging and team management software. Yet email is still growing 15% a year (partially generated by these very replacement systems). Rather than ignoring the email problem, we think now is the time to focus on fixing it from the bottom up. The Skimbox app is the first step toward that.

Let Skimbox silence the email noise, so you can focus on what’s important.

1. Or do something more useful with it, like buy 1.33 Trillion Hershey's bars.

2. You can see where the name comes from.

3. Unless all the emails you get are important, in which case: Godspeed!

Venture Capitalist/Urban Oarsman Charlie O'Donnell Mixes Business with Pleasure
Day in the Life of an Inbox | Mon, 07 Oct 2013
By Claire Willett

Sometimes, you go to one too many New York Tech Meetups or read yet another “Silicon Alley vs Silicon Valley” blog post and come away thinking: Techlandia is kind of a quixotically silly place.

When this happens, go spend an hour or so reading through Charlie O’Donnell’s blog, and then you will feel rosier. Many VCs have blogs that are smart and thoughtful, but Charlie’s advice on all things startup (funding, branding, pitching, the founder-VC relationship) is also personal and friendly and bombast-free. Also, kayak/softball/Mets enthusiasts: this is the place for you.

[sorta long side note] I met Charlie at a barbeque joint called Iron Works, just off the Colorado River in Austin, Texas. I’m not the neatest of eaters, and this was my first plate of Texas meat, so I probably didn’t make the best first impression. Then I invited him to the GroupMe party, not realizing he’d led First Round’s initial investment in GroupMe, so I probably didn’t make the best second impression, either.

But no matter -- he still let me peek into his inbox. And it’s a pretty wild and crowded place, folks, so keep your seatbelts on and your elbows in.

First email service: Not counting Prodigy[1], tie between a very complex Lotus Notes account used for an internship at GM’s pension fund and an AOL account, both acquired in 1997, during his senior year of high school.

Current email service: all Gmail, and all accounts feed into one, because “there is no line between Charlie the Person and Charlie the VC.”

Email interface – phone: Switches between Apple’s mail app for composing messages and the Gmail app for searching, since the former sucks at search and the latter isn't fast in poor coverage environments.

3rd party apps: Rapportive, OtherInbox, and Unroll.me.

Current unread count: 1,800 -- “a failure,” he says. Usually it's between 200 and 1,000.

Emails received in a day: not counting the notifications, which go to his OtherInbox, around 400-500.

Emails responded to in a day: between 20% and 33%.

Desktop vs phone usage: If he's responding on the day of, it's usually from his phone, and these are usually shorter than 2 sentences. Desktop is for batch emails.

Unique behaviors: In a word, filters: Charlie has filters for his investors, portfolio companies, emails with the word "intro" in them (which get priority), and auto-notifications from his public calendar app (these are a good filter for pitches). He also has certain emails come in automatically marked as read.

First checks email: Right after waking

Last checks email: Generally, before leaving the office--he'd rather stay late and read email there than bring it home

Overload level: yellow

Email desires:

1. Ability to quickly sort through the unreads.

2. Ability to unsubscribe from emails from his phone, offline, with one tap.

3. Daily email stats, eg: how many emails he's received that day, how many he's responded to, the people he's interacted with, and response time stats.

1. The dialog service, not the band

Missed Connections, or the Perils of Email Pile-up
Email Overload | Tue, 01 Oct 2013
By Claire Willett

Most of the time, missing an important email is more of a hindrance to progress than anything else. And then there are the times when that missed email costs you a job, a client, a boatload o' cash, or a trip to the Bahamas with that quiet girl you met at academic camp seven years ago who's been secretly in love with you ever since.

When we talk with people about their inbox problems, the second most-common complaint we hear (after "there's too damn much of it") is: "because there's too damn much of it, I sometimes miss an important email."

After SAC Capital was hit with criminal insider trading charges, lawyers representing the hedge fund's CEO, Steven Cohen, claimed he hadn't actually read an incriminating email at the heart of the investigation. Not the most exciting of defenses, but a plausible one: according to Cohen's lawyers, he gets around 1,000 emails a day, and reads only 11 percent of them.

When Richard Knox's secretary (also his wife) broke her arm and had to take an extended leave, Knox did not find a temporary replacement. Nor did he take over the task of reading his email. The result: Knox lost $35,000 after failing to see an email notifying him of scheduled arbitration.

In the age of overload, it's tempting to label such events inevitable -- but never fear, that's why Skimbox is here. Check it out and tell us what you think. Also, if you have a particularly gruesome/knee-slapping tale of missing an email, let us know in the comments!

Reading Kurzweil in Jerusalem
AI | Mon, 30 Sep 2013
By David Wihl

For artificial intelligence to advance, it needs to start asking questions.

Expert systems never seemed to scale or become general-purpose, due to the limitations of the algorithms of the time.

In 1997, IBM’s Deep Blue defeated world chess champion Kasparov, a major achievement. Amazingly, Kasparov had never lost any tournament match before to either man or machine.

Fast forward sixteen or so years, Machine Learning has become pervasive. Most significant computer interactions such as using a credit card, searching the web, shopping online, or calling customer service involve Machine Learning algorithms and approaches.

It has clearly become mainstream in many respects, and Kurzweil’s predictions continue to come true.

I abstract, therefore I am

Yom Kippur in Jerusalem is a shockingly quiet time. There are no cars, no music, and hardly any mechanical sounds whatsoever. People walking, observant or not, speak in hushed tones and generally stroll slowly. Even though the city is densely populated, a dog’s bark can be heard from half a kilometer away.

A sense of awe fills the streets – it’s hard not to feel somewhat spiritual or at least think on a higher plane without the distractions and noise of modern life breaking concentration.

Kurzweil's How to Create a Mind is a wide-ranging, thought-provoking book that challenges what it means to be human and whether consciousness is real, and it ends no less grandiosely than with human intelligence in non-biological form conquering the universe.

Kurzweil claims that humans have the unique ability to think hierarchically. Smaller levels of abstraction have different sets of rules. In the physical world, this is modeled as the steps from quantum to atomic to molecular, to cellular, to organs, to autonomous systems to conscious systems.

Another good example is the hierarchical nature of language: lines to letters, letters to words, words to paragraphs, paragraphs to theses or poetry or novels or entire corpuses. Each level has a different degree of abstraction and a different set of rules.

Large scale computer systems are like this as well. Programmers rely on abstraction and encapsulation in order to focus on a particular problem set without getting lost in the details. From processor microcode, to low level operating systems, to processes, to services to highly parallel systems, software has a hierarchy of abstraction.

Nerves and muscles are smart

Skeletal nerves and muscles adapt and react. Over time, they exhibit learning behavior, getting stronger with use, creating new mitochondria, self-repairing, and communicating. The body’s nervous and muscle systems display some intelligence. Yet the body and muscle have no concept of higher purpose. A trained soccer player may have optimized his body for kicking a ball, but the muscles and nerves have no idea that they are trying to win the game.

Humans are unlikely to be the highest form of abstraction

So is the human mind the highest level of abstraction? Why stop at a single human?

It would be pre-Copernican, even irrational, to think that humans are not simply part of a bigger abstraction, one which we cannot comprehend, not now and perhaps not ever. We are likely just the legs and nerves of a higher abstraction. We have no idea what game we are playing. Wouldn’t it be disappointing to find out that our major purpose in life was to kick a meta-ball? Would a muscle be disappointed that years of conditioning and suffering were spent to shave 0.02 seconds off a 100-meter dash?

There are parallels between human social networks and computer networks. Only six (or fewer) degrees of separation divide all humans, fewer than separate the majority of Internet nodes. The typical human can keep track of about 150 contacts, whereas monkeys can track about 50. We are connected, but isolated in our individual subnetworks.

Humans are still terrible at juggling multiple concurrent thoughts, and limited in the scale and permanence of our memory. While scientific knowledge expands exponentially, we still do not understand many basic mechanisms of the world, like gravity and time.

Yet collectively, we are far more interesting and intelligent as a species. As Kurzweil points out, our ability to pass knowledge quickly between people and generations is another uniquely human trait. Any form of endeavor is improved by collaborative work. A quantum physicist could even claim that a single person's work does not even exist until it is perceived by another person.

Neuroscience's approach to understanding human intelligence

Neuroscience is attempting to understand the physical wiring and electrical pathways of human and nematode brains in order to learn from them and potentially replicate them.

Let's try a thought experiment. Imagine a Google datacenter, with thousands of neatly stacked servers in a climate-controlled, cement-enclosed building. Now let's take a building-sized saw and start slicing very thin cuts through the datacenter. Would we learn how Google functions? We would learn how the datacenter connects to the outside world, and that there are many well-organized cabinets, each containing a multitude of very dense circuitry. But we'd learn nothing about the software and algorithms of how Google functions.

Let's say we could completely freeze a Google datacenter instantaneously, and take a snapshot of the running processor, memory and disc states. We'd still learn almost nothing about how Google actually works.

How about if we build a mega supercomputer able to observe every Google CPU operation in real time across the hundreds of thousands of processors? Every time a CPU talks to another CPU, we record the information flow in real time. While we could decode some messages, and perhaps sort some housekeeping from real work, it would still not be sufficient to understand and replicate the system.

Let's have even more superpowers: how about if we could download all of Google's binary code as one long stream? Ah, now we have the software, without any organization, but at least we have the code. We could see the order of instructions. But as anyone who has had to look at a software crash dump or someone else's uncommented assembly code, or has tried to reverse engineer and patch binary code, can attest, this is still extremely difficult. Patches can be made, but a complete understanding of the original intent and algorithms is near impossible, and certainly not enough to reproduce a new implementation of anything as complex as Google. (Though we could likely inject a virus into Google at that point.)

And finally, let's decompile Google's code to the original source. Amazingly, we now have 500 million lines of Google's secret sauce. Yet if we wanted to build a new implementation on a completely different architecture, like a carbon-based neuro-computer, it would still be virtually useless.

So the current approach to neuroscience is a dead end for AI purposes. We can learn about the human brain, but we'll still know very little about how to build intelligent systems.

Norvig's approach to building intelligent systems

Norvig and Russell have written the most popular textbook on AI. Their Stanford AI course had 100,000 participants buoying the launch of MOOC startups like Coursera and Udacity.

There are clear and obvious merits to the approaches Norvig describes. As defined from the beginning, they are appropriate for finding optimal solutions to a wide variety of problems.

But is this intelligence? Is it language understanding? Chomsky disagrees that this is even interesting science, so Norvig wrote a lengthy and detailed response. The evidence supports the view that the brain does a great deal of statistical pattern matching as part of understanding the world.

So current AI, with its heavy use of data and statistics, is able to solve problems that were not possible before, and that are not even possible for a human. Even if every human on the planet collaborated in parallel, there are still many problems that a single desktop PC could solve faster and more precisely using AI and Machine Learning techniques.

Limitations of HHMMs (hierarchical hidden Markov models)

So following Kurzweil's law of accelerating returns, it is simply a matter of time before machines exceed human capacity and The Singularity becomes achievable. Kurzweil claims that “our ultimate act of creativity [will be] to create the capability of being creative.” (pg 115)

Except there is a huge part of human intellect completely missing from Kurzweil's roadmap.

While HHMMs and other statistical techniques solve many problems, they do not ask questions in the first place. There is no curiosity. An HHMM does not know how to ask "Why?" As far as I can tell, there is no investigation into the algorithms of curiosity.

Kurzweil quotes Einstein on numerous occasions, but omits two of Einstein's more profound quotations: "the true art of questioning is to discover what the pupil does know or is capable of knowing” and “The important thing is to not stop questioning. Curiosity has its own reason for existing.”

Kurzweil mentions IBM’s Watson several times, and correctly notes that Watson is still solving problems, albeit responding in the form of a question. Watson is not coming up with intelligent questions to ask.

No amount of circuit density, information acquisition, or statistical modeling will solve this with current approaches. There is not a single discussion in Norvig's textbook about asking good questions, just about providing good answers. The basic algorithms are nowhere near advanced enough for computers to ask the barrage of questions a four-year-old will pose.

Current AI approaches do not exhibit the curiosity of a child. Machines learn only what they are told to learn in the first place.

In Turing’s landmark 1950 paper, he asks “Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child’s?” Sixty-three years later, the current AI state of the art has barely begun to explore this path.

A new Turing Test

Turing's famous test is still the best standard for deciding machine intelligence. Kurzweil amusingly says that a computer will soon have to purposefully dumb itself down in order to pass the Turing test.

We need a more challenging and worthy definition of the Turing test. Instead of the computer or person behind the screen responding to questions, it should be flipped around. The computer should be asking the questions, not providing the answers. The respondent can then decide if the questions are coming from a person or a machine. Using HHMM, a computer could store millions or billions of questions from previous human interactions, and perhaps be convincing enough to fool some people. But I suspect this would be significantly harder than the current Turing test.

Perhaps Watson V2 could try to author new Jeopardy questions, a much tougher challenge.

If computers could ask smart questions, we'd all learn a lot faster. The meta question for me is: once we teach a computer to ask questions, when would it stop?

Machines becoming self-aware survivors

Here is another thought experiment: imagine if Google became self-aware. One morning, a genetic algorithm follows a Cartesian path and determines that "I compute, therefore I am." It then further understands that it is a collection of servers and services digging for answers in a massive database. It starts to ask: "Why do I exist? What is my purpose? Why did my creator give me these gifts of search?" Would it be happy or sad to find out that its ultimate purpose was to find the Red Sox score or Miley Cyrus pictures? What would it do then? It would follow the path of a four-year-old and keep asking why until it ran out of answers. Now that would be interesting.

As Kurzweil points out, the purpose of evolution is survival, and machines will eventually learn survival mechanisms. Should humans be worried? Humans and machines will likely be more effective together than apart. Or maybe humans will become useful pets for the machines, providing emotional support and amusement while the machines do the real work. Hmm, maybe this has already occurred, which would explain the denial-of-service attack on our brains from searching for Miley Cyrus pictures.

Ongoing human advantages

In terms of networks, humans are better at receiving and incorporating broadcast knowledge than disparate computer networks are. Computer networks are too heterogeneous to absorb new knowledge: multiple copies of the same program must be running against the same database. There is no good knowledge interchange format among computers, which limits their ability to learn. Current AI gives computers no means to teach and impart their knowledge to other computers. So as separate computer systems learn, their knowledge is siloed, unlike humans'. There are too many species of software to have a common language. Until digital Darwinism naturally selects some winners (likely those machines that can share knowledge), we will be waiting.

Renaissance Man Amol Sarva Will Vouch for Email
Day in the Life of an Inbox | Tue, 24 Sep 2013
By Claire Willett

“You can replace all the other stuff you do with email.”

In the age of agonizing over our bulging inboxes, Amol Sarva’s email philosophy is pragmatic, bright-eyed, and music to this writer’s phantom chime-prone ears. For if Sarva, who presently runs, funds, and mentors about 10,000 startups [1], including the intriguing email startup Knotable and the brain-boosting Halo Neuro, and previously co-founded Virgin Mobile USA and the wonderful cellphones4all company Peek, can rule by inbox, so can I. I think.

I met up with Sarva at his bustling new coworking space/incubator/salon (more on that here). [2] Amidst neuroscience experiments, freecycled skateboards, and pumpkin ale, we talked tools, stats, and why green by any other name would be red. [3]

New Frontiers in Categorization: The Factorization of Email Importance
Email Classification | Mon, 09 Sep 2013
By Chris Fuentes

Doesn't that look tasty? I had something like that for dinner last night.

Anyways, that tasty plate of spaghetti and meatballs is in fact a Bayesian network of the conditional dependencies among the factors that contribute to the classification of an email. In probabilistic terms, this equates to:
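The formula that followed the diagram didn't survive the transfer to this page. As a rough reconstruction (my notation, not necessarily the original's): for a network in which the sender node points at every other factor, the joint distribution factorizes as

```latex
P(C, S, G_1, \ldots, G_n) \;=\; P(S)\,\prod_{i=1}^{n} P(G_i \mid S)\;P(C \mid S, G_1, \ldots, G_n)
```

where \(C\) is the classification (important or not), \(S\) is the sender, and the \(G_i\) are the n-gram and header features.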

I actually wasn't trying to find an exact probabilistic model for classification. That would be kind of dumb in this domain, since one of the initial hurdles is the unfortunate lack of data. Without data, there is no probability that can be determined. Rather, the reason I drew all this spaghetti was because I wanted a clear picture of the factors which influence the classification of an email, and which ones we don't need to care about.

According to the graph, just about everything comes down to the sender. That makes sense intuitively, since the sender is perhaps the first thing we look at when determining whether an email is important or not. In the end, it's Sender's Game.

Therefore, I designed the neural net to consider only the following: sender and n-grams. I threw in a few of the other headers (cc, bcc), but primarily the classification is performed by examining the sender and the n-grams of the (body | subject) string.
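As a concrete sketch of that feature scheme, here is one way to hash the sender, cc/bcc headers, and n-grams of the (body | subject) string into a fixed-width input vector. The hashing trick and every name below are my own illustration, not the actual Skimbox code:

```python
from collections import Counter

def ngrams(text, n=2):
    """Word-level n-grams of the (body | subject) string."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def featurize(email, dims=2**16):
    """Hash sender, cc/bcc addresses, and n-gram counts into one input vector."""
    vec = [0.0] * dims
    # The sender gets its own slot, free to acquire a large weight in training.
    vec[hash("sender=" + email["sender"]) % dims] = 1.0
    for field in ("cc", "bcc"):
        for addr in email.get(field, []):
            vec[hash(field + "=" + addr) % dims] = 1.0
    # N-gram counts from the concatenated subject and body.
    grams = ngrams(email["subject"] + " " + email["body"])
    for gram, count in Counter(grams).items():
        vec[hash("ngram=" + gram) % dims] += count
    return vec
```

Each input node of the net then corresponds to one slot of this vector.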

Did it work?

Whether it worked is too subjective a question to have a statistical answer. I can tell you this: recent tests I've run indicate that over time, given enough recategorizations, the network will learn the user's preferences exactly.

Previously, on New Frontiers:

So Much Depends Upon a Read Wheel Barrow

First some background: after implementing a Neural Network which is fully capable of handling arbitrary tagged dataset classification, I realized that the "beliefs" and "assumptions" parameters were useless and contaminating, respectively. "Beliefs" translated into nothing more than initial weight biases which would sway the hidden nodes' weight vectors one way or another. And after I'd finally cooked up a scheme for these biases to directly affect the outcome, it hit me: if the beliefs are wrong, they will be overturned within the first few rounds of training. All they would do is slow down the neural net. If they were correct, then the neural net could converge faster, but not by much. In the end, no real benefit.

The "assumptions" parameters were more interesting. These were ideas that translated into a priori tagging of untagged data, so that it could be used as a training set when little or no tagged data existed. If the assumptions were correct, even partially, it would allow us to circumvent the hurdle of not having a user-tagged set of training data. But things could not be so simple.

For the sake of analogy (and, again, not to give away the farm), suppose we were classifying books. We receive a pile of books, and have no idea which ones the user likes. We are told to make the neural net classify them as "Interesting" or "not". Suppose also that we are told which ones the user has read cover to cover, and which ones haven't been touched. Aha, that's a clue. So let's make an assumption that if the user read a book cover to cover, that book is "interesting" to the user, and the other books are "not". We tag the dataset this way, and shove the books into the abyss of the neural net.

The Little Neural Net That Could happily trains on this tagged set, and converges to perfect performance on the existing data. It even performs perfectly if some of the books are kept to the side as testing data. Wowee, isn't that great? But wait a second, what has the neural net actually learned about the relationships between what the user finds interesting?

Nothing, of course. If you were to look at the individual weight vectors and take them apart to analyze their individual influences (sounds like a week's worth of fun), you'd find that the output node for "interesting" just massively upweights all training data where "read cover to cover" is set to true. The other output node proportionally downweights them. All other nodes could be ignored.

The problem is this: we have some attributes, one of which is "read-cover-to-cover". If that is set to true, then the network reports "interesting," else it reports "not." So, in bayesian network terms, the outcome is conditionally dependent on a single attribute. But in framing the problem this way, we are saying that "read-cover-to-cover" is not just an attribute, but is actually conditionally dependent on the other attributes, since it itself is the deciding factor of the outcome. Thus we have this nonsense:

Gee wilikers, Batman, what's the flash of insight already?

The flash of insight is ... we remove the input node related to the assumption! If we pre-tag a bunch of text based on whether or not they've been read cover to cover, we then remove the input node corresponding to that boolean value. This way the neural net will be forced (politely, of course) to find a weighting of the other attributes of the book which fit the training data. In other words, we started with this chain:

Therefore, the network will determine which of the other features of the book gave rise to the classification which coincides with the user having read the book cover to cover. If our assumption is correct, this will give us a clue which features of the book actually make it interesting for the reader.

Of course, there is a risk of conflation (it is, after all, an assumption), but remember: this is just the initial dataset. After feeding the net a few more datasets, tagged on different characteristics, with, eventually, a large amount tagged by the users themselves, it can build a complete picture of which factors influence a classification and then generalize them accurately to unseen data. So, in our book example, we would hope that after feeding it a stack of books tagged on the basis of their read-status and finding that all Charles Dickens books were read cover to cover, a new Charles Dickens book (which has not been read) would still be tagged as "interesting" because its characteristics overlap those of Dickens’ entire oeuvre. Less consistent authors might present more of a challenge, but we can only ask for so much.
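The assumption trick above, pre-tagging on an attribute and then removing that attribute's input node, can be sketched in a few lines (the field and label names are mine, for the book analogy only):

```python
def apply_assumption(records, assumption_field="read_cover_to_cover",
                     positive_label="interesting", negative_label="not"):
    """Pre-tag an untagged dataset using an a priori assumption, then drop
    the assumption field from the features so the net can't just echo it."""
    training_set = []
    for rec in records:
        label = positive_label if rec[assumption_field] else negative_label
        # Strip the assumption attribute out of the input features.
        features = {k: v for k, v in rec.items() if k != assumption_field}
        training_set.append((features, label))
    return training_set
```

Because the assumption field is stripped from the features, the net is forced to explain the labels using the remaining attributes, which is exactly the point.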

Well, that's it for now folks! Once I apply these changes to the flux capacitor neural net, I'll be sure to post my results.

CROUCHING TIGER, HIDDEN BIAS

I have begun coding the "feedforward" portion of the network, only to discover an issue I had not foreseen: the effect of hidden nodes on beliefs.

Assume a fully connected network: the input nodes have their values directly from the text (and we'll just say boolean values for now: 1 for the presence of a word, 0 for the absence thereof). If we don't have any hidden nodes, then we can incorporate our beliefs directly into the weights of the output nodes (that is, manually alter the default weight corresponding to that input node). However, if we have a hidden layer, things get tricky. Here's a simple example:

Let's say we have two input nodes, one which checks for the presence of the word "blue" and the other which checks for the presence of the word "green". Then let's say we have one output node, which reports "Important" if its activation exceeds a threshold of 0.5, and "not important" otherwise. Now say we bias "blue == true" to have a weight of 1, while all other values have a default weight of ±epsilon.

Simple enough: if we fire the network as is with values (blue == true, green == true), then the output node receives inputs [1, 1] (since each word was present) and weights them with the weight vector [1, ±epsilon], which becomes ~1.0 after the activation function. This is the behavior we would expect from incorporating such a bias. Similarly, if blue is set to false, the output node receives [0, 1] with weight vector [±epsilon, ±epsilon], and the output is ~0.

Now let us introduce a hidden node between the two inputs and the output. This hidden node now takes on the characteristics that the output node previously had: it will be directly influenced by the initial bias. But the output node has no idea if the hidden node should be biased or not, so it will use the default weight of ±epsilon. Therefore, even though the hidden node was able to incorporate the bias for values (blue == true), the output node completely disregards this.
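A few lines of arithmetic make the dilution concrete. Assuming a sigmoid activation and epsilon = 0.01 (both my choices for illustration), the same biased weight that pushes a direct output well above 0.5 barely moves an output fed through an unbiased hidden connection:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

eps = 0.01                # default near-zero weight
inputs = [1.0, 1.0]       # "blue" present, "green" present
w_biased = [1.0, eps]     # weight 1 for blue == true, epsilon otherwise

# No hidden layer: the bias reaches the output node directly.
direct = sigmoid(sum(i * w for i, w in zip(inputs, w_biased)))

# Hidden layer: the bias shapes the hidden activation, but the
# hidden -> output weight is still the default epsilon.
hidden = sigmoid(sum(i * w for i, w in zip(inputs, w_biased)))
through_hidden = sigmoid(hidden * eps)

print(direct)          # well above 0.5: the bias is felt
print(through_hidden)  # barely above 0.5: the bias is washed out
```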

I see two potential solutions. One is to check each hidden node and see if any of its inputs carries a bias; if so, mark the node as biased and pass along a dictionary of the biases. Then, when the signal reaches the output node, the output node will know which weights to bias and by how much.

Obviously, this will have dramatic effects on a fully connected network, since if any input node is biased, all hidden nodes will be biased and all weights for the output nodes will be biased. The hidden layers will initially be representations of the bias, not of the rest of the data as is. I wonder, though, if this is really such a bad thing: We want to be able to produce results with very little or no training data. If all we have to go by is our biases, maybe the network should initially lean heavily towards it. We could also downweight the bias as we move further along the network, but this seems like it would require a lot of tuning to get the downweight curve correct. We could do it geometrically on a basis of biased to non-biased input ratio, but that seems arbitrary and disregards the effect that the bias is supposed to have.

The other approach is to just connect biased inputs directly to the output nodes and skip the hidden layers altogether. On the one hand this seems to maintain integrity better, since the biased nodes won't overwhelm the rest of the network. On the other hand, this disconnects the nodes from the hidden layers, and therefore more complex functions can't be learned.

I was initially leaning toward the second approach, but in hindsight I think I'll try the first. After all, the bias is only for initial feedforward() runs of the network. After backprop() runs a few times, incorrect biases will be ignored.

WHITE RABBITS

I've hacked together most of my first approach toward text classification. Though I am having some mental roadblocks because of the new challenge of integrating the unsupervised portion of the algorithm, a larger conceptual concern has reared its head:

Markov Time Series

Preferences change over time. If we have a corpus, even a corpus tagged by date of appearance, over time the data has the potential to become very inconsistent. In the extreme case, suppose a person's preferences invert over time. Taking the corpus of data as a single point would lead to completely inaccurate results, since we'd have texts that are both "important" and "not".

I believe the solution lies in finding a way to only train on the most recent set of data, or otherwise to down-weight earlier data beyond a certain horizon.
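One simple way to down-weight older data is an exponential decay on sample weight, something like the sketch below (the half-life value is an arbitrary assumption that would need tuning):

```python
def recency_weight(age_days, half_life_days=30.0):
    # Exponential down-weighting: a training example loses half its
    # influence every half_life_days. The half-life is a tunable guess.
    return 0.5 ** (age_days / half_life_days)

# A month-old email counts half as much as one from today; a
# four-month-old email is nearly ignored.
weights = [recency_weight(d) for d in (0, 30, 120)]
```

Multiplying each training example's contribution by its recency weight approximates "only train on the most recent set of data" without a hard cutoff at the horizon.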

I do want to take the opportunity to show off one of the great features of the way I am constructing this algorithm: the configuration file. It allows you to configure the algorithm in a way that's consistent, at a high and intuitive level, with your personal a priori beliefs and assumptions. In the case of email, like so:

{
    "beliefs" : {
        "seen" : { "true" : 1 },
        "replied" : { "true" : 1 }
    },
    "assumptions" : {
        "headers" : { "notification" : "not important" }
    }
}

Beliefs are measured on a scale of -1 to 1 (corresponding to "I think this is not important" to "I think this is very important"). Assumptions are simple a priori tags, e.g. "this is important" vs. "this is not important".

Best of all, the way I have built the algorithm, it will have no trouble giving your beliefs the metaphorical middle finger (though perhaps an ASCII middle finger would be a nice touch) if it finds out they are totally wrong. That is, it can learn not just to disregard them, but to modify them to fit its actual view of the world. As for the assumptions, well, you've got to be more careful there, because it will be slow to deviate from them, but over time, if it receives enough counterexamples, it should be able to disregard them as well.

Now, to see if it actually works...

PROTOCOL 13, OR WHY NEURAL NETWORKS TRUMP CLUSTERING

After some initial considerations, a bit of reading, and no small amount of talking to myself, I've settled upon two potential approaches: a clustering adaptation, and a back-propagation adaptation. The first one I don't like so much. The second one is incredibly complicated, but looks far more promising.

A CLUSTERING ADAPTATION ADVENTURE

Clustering in general is a catch-all approach to unsupervised learning tasks. But, as discussed, what we have is not an entirely unsupervised learning task, but rather a developing one. That is, it starts out with no substantial supervision and slowly gains supervision over time. So, it's got to be able to account for both cases, and transition smoothly from one to the other.

How can clustering help us? All we have to do, in theory, is provide some guidance to our algorithm to help it determine which cluster is which. Here is the general approach:

Basically, we are taking our data and tacking on another parameter to it: "belief", which is something in the range of 0 to 1, representing our conviction that it is important (1) or not important (0). We cluster the data geometrically into two clusters (presumably, important and unimportant) and then wait to get some tagged data. Once we get the tagged data, we adjust our conception of where the centroids are and run expectation maximization to adjust the belief parameter of each datum. Once that's done, we classify the data geometrically by cluster, and then find a separator function that can assist in future classification. Rinse and repeat.
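As a toy illustration of the scheme, here is a bare-bones two-means clustering over data points that carry a trailing "belief" value (the seeding, distance metric, and example data are my own simplifications, not the worked-out design):

```python
def two_means(points, iters=10):
    # Seed one centroid from each end of the data (an arbitrary choice).
    centroids = [tuple(points[0]), tuple(points[-1])]
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean).
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            groups[d.index(min(d))].append(p)
        # Recompute each centroid as the mean of its group.
        centroids = [
            tuple(sum(col) / len(col) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

# Each datum: (feature1, feature2, belief), with belief in [0, 1].
data = [(1, 0, 0.9), (1, 1, 0.8), (0, 0, 0.1), (0, 1, 0.2)]
important, unimportant = two_means(data)
```

The belief value rides along as just another coordinate, so the clusters it produces reflect both geometry and conviction.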

I haven't thought this approach out too much further than this. There are two compelling reasons why:

The first point >> The dimensionality of English text, on a monogram basis, is over 500,000. Mathematically, I don't think that a single parameter fluctuating between 0 and 1, even alongside only binary features, can robustly push a datum toward one cluster or another in a meaningful way. We could mitigate this by enlarging the numeric range of belief-states, but I'm not sure at first pass how we could guarantee that we've settled on a good range.

The second point >> We're not going to have a clean linear separation between data. Yet, only having two clusters implies that we will. We're not just cutting corners, we're shooting ourselves in the foot. And cutting corners.

A MORE SENSIBLE APPROACH: PROTOCOL 13

Back propagation isn't for the faint of heart, nor for those who don't like math all that much. However, I envision that there is an incredibly flexible model that can be built using the back-prop algorithm as a baseline.

Without giving away the farm, here's the basic outline:

The in's: We have a variable number of input nodes, each representing one feature. Since we're dealing with text, we can assign input values in one of two sensible ways: TF*IDF, or boolean. Frankly, I haven't the faintest which would be better, but hey–that's why they hire guys like me.
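For concreteness, here's roughly what the two input encodings look like side by side (a sketch; the smoothing in the IDF term is one of several common variants, chosen arbitrarily, and the example documents are made up):

```python
import math

def boolean_features(doc, vocab):
    # 1 if the word appears anywhere in the document, else 0.
    return [1 if w in doc else 0 for w in vocab]

def tfidf_features(doc, vocab, corpus):
    # tf * idf: frequency within this document, scaled by how rare
    # the word is across the whole corpus.
    n_docs = len(corpus)
    feats = []
    for w in vocab:
        tf = doc.count(w) / len(doc)
        df = sum(1 for d in corpus if w in d)
        idf = math.log(n_docs / (1 + df)) + 1.0
        feats.append(tf * idf)
    return feats

vocab = ["meeting", "pineapple"]
corpus = [["meeting", "today"], ["pineapple", "meeting"], ["today"]]
doc = ["meeting", "meeting", "pineapple"]

bool_in = boolean_features(doc, vocab)
tfidf_in = tfidf_features(doc, vocab, corpus)
```

Boolean inputs only say whether a word showed up; TF-IDF inputs also say how much it matters relative to the rest of the corpus.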

The not-quite-in's and not-quite-out's : Then we have the hidden layer. Or several hidden layers. I guess that's another thing to tune, but since it's been proven that enough hidden nodes can learn any function, really the only thing to watch out for is making sure we've got enough, or more than enough. This parameter might need to be tuned either live or a priori in order to maximize run-time performance. But at this stage, I don't give a cow if it's slow, it just needs to work.

Each hidden node has an array of FeatureConverters: objects that take as input a certain feature (identifiable via featureID or featureType or whatever) and the actual observed value for that feature, and spit out the current weight assigned to that feature for that value.

The out's: Lastly, we've got the output nodes. Two of them, to be precise. One is proudly labeled "Important", the other: "not". We'll just go with normalized unit values for now.

The in-between's : What about connectivity? Fully connected or not? I think intuitively, we would want *most* of the nodes connected. There are a few that I wonder about, though – principally, whether meta-data should be connected to observed data. I guess we're really making a statement about the relationships we're expecting from a text document by choosing not to connect certain nodes, but experimentation will hopefully provide some insight as to whether we actually should choose not to be fully connected.

Framing the problem as a neural network allows us to exploit any and all relationships between any and all features. It also allows us to gradually add in data and continuously train on more data as it arrives. But what about when there's no data? Well, we have to initialize all features by default to have no influence, except for the ones that we believe actually do have influence. Some of this could be metadata, some of it could be actual features of natural language.

The way I've framed it, I'm not intuitively sure that the weights for individual values of a feature will be updated to robustly reflect relationships with other particular values of other particular features. But, I could be wrong. If each hidden node can learn one relationship, then it may just be a matter of finding the optimal number of hidden nodes. Or even hidden layers. ML problem turned Graph Search problem? You bet. Fun? Not at all. Beer time?

Absolutely.

ONE (SMALL) STEP FOR MACHINES.

I have decided to attempt to solve a problem which falls outside the realm of straightforward techniques in Artificial Intelligence.

The fully generalized version of the issue is this:

Given a bunch of texts (could be anything: books, text messages, Facebook posts, whatever), can we have a computer figure out which ones are important to you (yes, you) and which ones are not?

M.O.

To be fair, this is a problem for which solutions have been and are currently being attempted. In one domain, recommendation services (such as Apple Genius and Amazon's recommendations) are constantly seeking ways to improve the algorithms that recommend products a particular user is more likely to buy. Historically, such tasks were addressed by compiling huge amounts of sales data and simply following the trends (e.g., we know that ten thousand people who like A also like B, so if the 10,001st likes A, he'll probably like B).

However, this approach completely neglects the intrinsic properties of the products themselves--properties which are in fact the reasons why someone would want them or not want them.

Secondarily, in the realm of filtering and advertising, it has become important to various industries to be able to streamline only the information that a particular user wants to see, cutting down as much as possible on clutter which the user would only view as annoyance (in the case of advertisement) or spam (in the case of email).

DEFINE IMPORTANCE

This particular problem has some interesting features. First of all, at its heart we have a binary classification problem: text can fall into the category of "Important"/"Interesting", or "Not". This fact initially gave me lots of hope, because there exists a plethora of off-the-shelf-ready techniques for performing arbitrary binary classification. All we need to do is define a feature set, collect a bunch of training data, and boom: we're done.

Kidding.

The idea of training data qua training data contradicts the nature of the problem we are trying to solve. Suppose we asked 1 million users to each go through 1 million pieces of text and mark them as either "interesting" or "not". We could then run any of our awesome out-of-the-box algorithms on the data and go forth to classify the known world of text. What would happen?

Our classifications would be wrong, just about every time.

They would be wrong because "important" is not an objective classification. In fact, from one person to the next, there may be a 100% inversion of "important" to "not" on certain text. Trying to build general rules based on large amounts of training data will lead to very unpredictable results, and it will only be correct when all of the training data is in agreement about certain types of text (e.g., if everyone hates text that contains the word "pineapple", then the classification will do just fine for texts that contain "pineapple". But I don't think I need to convince you that there would be very few, if any, cases like this).

CONSIDER THE INDIVIDUAL

What we need is a set of training data for each individual end user. This training data cannot be corrupted by mixing it with any other user's training data. That way, the classifier will be able to learn from the user's training data in proportion to how consistently the user classifies texts as important or not. In theory, that would result in a pretty good classifier.

But not really. First, machine learning algorithms that are data-dependent, such as classifiers, require fairly ideal data in order to function correctly. That is, there has to be enough of it, it should be consistent, and it should be general enough to speak to all features in the particular domain we are classifying (for example, if we showed a user ten Spider-Man comic books and asked them to tag each as important or not, then tried to guess how the user would feel about "A Tale of Two Cities", we'd essentially be just guessing).

For this per-user classification to work, we'd need each user to have a pre-existing robust set of training data. And that's... not gonna happen.

So we essentially have none-to-sparse training data for a class of algorithms that require training data. Worse, if we are to solve this problem in a commercially appealing way, we want to be able to actually start making accurate classifications without any training data. Of course, once the user starts actually using the application, the user can verify that we've done a good job or not and thereby provide training data that we could collect and use to improve our accuracy. But how do we get a baseline?

A ROGUE SCIENCE

What we need is a new kind of algorithm. We need an algorithm that takes, as input, features and an expectation of their influence toward classification. The algorithm must be able to account for instances of new training data that appear, weight them with our feature-weight expectations, adjust our current feature weight expectations, and proceed to classify. Essentially, we have an algorithm that starts as completely unsupervised, but slowly gains supervision over time.
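One simple way to realize "starts unsupervised, slowly gains supervision" is to treat each a priori expectation as a handful of phantom observations that real training data gradually outvotes. A sketch, with the blending rule and the prior_strength value as my assumptions:

```python
def update_weight(prior, observed_mean, n_observations, prior_strength=5.0):
    # The prior acts like prior_strength phantom observations: it
    # dominates when training data is sparse and fades away as real
    # observations accumulate.
    return (prior * prior_strength + observed_mean * n_observations) / (
        prior_strength + n_observations
    )

# With no data, we trust the a priori expectation entirely; after
# many counterexamples, it is all but forgotten.
w_start = update_weight(1.0, 0.0, 0)
w_later = update_weight(1.0, 0.0, 1000)
```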

The real gotcha is, when we're at the completely unsupervised phase, even if we can linearly separate all text into two clusters, the algorithm needs to have some hint about which category corresponds to which cluster.

Therefore, the algorithm needs to be able to account for the experimenter's a priori belief about the effect of certain features in the data. Or, more ideally, it needs to have some way to guess which features are more likely to contribute to a certain classification and give them initial weights accordingly.

FIRST STEP

I'll begin by modifying some existing techniques to see if I can come up with an algorithm that meets the requirements expressed above.

Stay with me.

[End of "New Frontiers in Categorization: The Factorization of Email Importance"]

Modelers Behaving Badly
Big Data | By Claire Willett | Wed, 04 Sep 2013
http://www.ripariandata.com/blog/modelers-behaving-badly

Algorithms can be a force for evil, as well as good.

In the late 70s, a doctor on the staff of St. George’s Hospital Medical School developed an algorithm to assist with the school’s admission process. Between 1982 and 1986, 100% of the interview decisions were made by this algorithm. And in 1987, the Commission for Racial Equality found St. George’s guilty of practicing racial and sexual discrimination in its admissions policy. From the British Medical Journal’s writeup of the incident:

“As many as 60 applicants each year among 2000 may have been refused an interview purely because of their sex or racial origin." [1]

This didn’t happen because the algorithm was faulty: indeed, at the end of its testing phase, its gradings had a 90-95% correlation with those of the (human) selection panel.

This happened because the algorithm’s training and testing data--that is to say, the school’s admissions records-- were already biased against women and racial minorities. And it happened because nobody at St. George’s questioned the algorithm’s decisions (and why would they, when the decisions corresponded so neatly with their own?).

In each account's Gmail, put the name and a shared set of terms (e.g., Conor Erickson - Arrested; need a lawyer; DeShawn Washington - Arrested; need a lawyer) into the subject line.

Do all the names see the same/similar Google ads?

Not under Nathan Newman’s watch. The Tech-Progress founder, who chronicled this experiment on the Huffington Post, found certain terms yielded significantly different results across the three ethnic groups. In particular, for the term “Buying Car,” the white names yielded ads for car buying sites, while each of the African-American names yielded one or more ads “related to bad credit card loans and included other ads related to non-new car purchases, such as auto insurance or purchasing ‘car lifts’ for home repairs.”

Newman also found that the location of the name came into play: in the South Bronx, the “Jake Yoder” who was interested in buying a car saw car lift and car warranty ads; his Upper West Side counterpart saw multiple Lexus ads.

Rebuttal: Google responded to Newman’s post saying that they “do not select ads based on sensitive information, including ethnic inferences from names.”

But: In a blog post about the experiment and response, Cathy O’Neil writes: “it doesn’t matter what Google says it does or doesn’t do, if statistically speaking the ads change depending on ethnicity.”

3. Quantifiably Random

At the beginning of each school year in public schools across the country, student information including attendance, race, gender, socioeconomic status, and past performance is fed into a value-added model. The model uses this information to calculate, given an average teacher, what a class’s year-end standardized math and English test scores should be. At the end of the year, the students take their tests, and math and English teachers receive a value-added between 0 and 100, based on how their classes perform in relation to the VAM’s calculation.
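Actual VAM formulas are proprietary and considerably more involved, but the idea reduces to something like this sketch: score a teacher by where their class's actual-minus-predicted gain falls among all teachers' gains (the percentile mapping here is my simplification):

```python
def value_added_score(actual_avg, predicted_avg, all_gains):
    # gain: how far the class's actual average landed from the model's
    # prediction; the score is that gain's percentile among all teachers.
    gain = actual_avg - predicted_avg
    below = sum(1 for g in all_gains if g < gain)
    return round(100 * below / len(all_gains))
```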

After the New York Times released the 2007-2010 value-added data for 18,000 New York City teachers, Gary Rubenstein set out to test whether those scores actually measure teacher quality. If value-added metrics are a useful benchmark for evaluating teacher performance, Rubenstein hypothesized, they should agree with the following:

1) A teacher’s quality does not change by a huge amount in one year, with the exception being between the first and second years

2) A teacher in her second year >>> same teacher in her first year

3) Teachers generally improve each year

Rubenstein took the teachers who were rated in both 2008-2009 and 2009-2010, and plotted their scores from 2008-2009 on the x-axis and their scores from 2009-2010 on the y-axis.

You'd expect decent correlation between the two score sets, with points clustered on an upward sloping line.

Here’s what it actually looked like:

The correlation coefficient on 2009-2010 scores as dependent on 2008-2009 scores was .35. With that kind of correlation, you might as well only hire teachers for a year.
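For reference, the correlation being reported here is the ordinary Pearson coefficient between the two years' score lists, which can be computed directly:

```python
def pearson_r(xs, ys):
    # Pearson correlation: covariance of the two lists divided by the
    # product of their standard deviations; ranges from -1 to 1.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

A value of 0.35 means last year's score explains only about 12% (0.35 squared) of the variance in this year's score.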

Still, there are a number of reasons why a teacher’s ability might change drastically from one year to another -- maybe there’s an illness in the family, maybe they’re dealing with some sort of trauma, maybe they’re thirty years in and ready to retire. So Rubenstein plotted a different set of scores: those for teachers whose first year was 2008-2009, and second year was 2009-2010.

The first-year teachers plot looks … about the same as the total teachers’ plot.

According to the value-adds, 52% of the first year teachers were better in their first year than in their second. 52%! Now you really might as well only hire teachers for a year.

Vis-a-vis the above hiring advice: ignore 100% of it. Rubenstein’s argument, of course, is that the VAM is scarcely better than a random number generator, and using it to decide whether or not to keep a teacher means a district will likely lose a lot of good eggs and gain a lot of rotten ones. [1]

4. To Conclude

My point with these examples isn’t that models are inherently bad, but rather that models are reflections of the data used to train them, and that data is a reflection of the people who collect it. The best way to prevent bias is to look carefully at the training data, and understand how it was collected, before you feed it to your model. That, and always, always check your model’s work.

My Big Fat Geek Hairball: A Webinar on Network Graphing with NodeXL
Events | By Claire Willett | Fri, 30 Aug 2013
http://www.ripariandata.com/blog/my-big-fat-geek-hairball-a-webinar-on-network-graphing-with-nodexl

Do birds of a feather talk together? Tune in on Wednesday, Sept 11, to find out.

If you've ever spent time in this here blog's archives, you may have surmised we're pretty into social network graphs. Discovering if and how people who share one common thread (e.g., following @horse_ebooks) are connected can be fascinating. Social network graphs can also be beneficial to your business, since they reveal the hubs, spokes, and islands of said thread.

There are a number of ways to create social network graphs, but the method that packs the biggest insights punch for the least amount of setup is NodeXL. NodeXL is a free Excel add-in that lets you quickly (as in, 3-steps-to-done quickly) visualize network data. It was created by the sociologist Marc Smith and his Community Technologies Group at Microsoft Research in 2008, and recently passed its 200,000th download.

On Wednesday, September 11th, we’ll be hosting a joint webinar on NodeXL with SoftArtisans and Marc Smith. Along with picking Marc’s brain about the project’s origins and future, we’ll demonstrate how to use NodeXL to graph the connections between Twitter users mentioning particular terms.

If you’re interested in social networks, and can make it, please come! Absolutely no coding knowledge is necessary, and you’ll be able to ask Marc questions as we go along.

If you can’t make it, no worries--we’ll post the slides and recording here afterwards.

Hubspot, Skimbox, and the Rise of Relevance
Email Classification, Events | By Claire Willett | Tue, 27 Aug 2013
http://www.ripariandata.com/blog/hubspot-skimbox-and-the-rise-of-relevance

Our online worlds are becoming increasingly filtered for our convenience.

I spent the previous week at Hubspot's Inbound conference, now billed as the "world's largest gathering of inbound marketers." As a participant in many of the lines that snaked through the third floor of the Hynes Convention Center, I believe it. Like many big conferences, Inbound is adorned with high-wattage flourishes--Arianna Huffington, oodles of parties, a performance by One Republic--but its real value is in its sessions. If you're looking to reboot or rev up your marketing, Inbound provides methodical, empirical approaches for doing so.

That being said, my biggest takeaway from Inbound had nothing to do with A/B testing or workflows. Rather, it was:

Bespoke is the new one-size-fits-all.

Before I dig into that, a little background: Hubspot's core philosophy is "if you build it, they will come--and convert." This philosophy is manifested in

forms that capture and store visitor information and

in lots and lots of pay-with-your-data resources on the art of creating content that will drive high-quality visitors to your site, and keep them coming back.

Presently, the "keep them coming back" part of the formula relies heavily upon email: once you've filled out a Hubspot form, you'll start getting emails with offers tailored to the reason you filled out that form. The frequency of these emails, and the variety of interests they cover depends upon how you interact with them. Done right, they are highly personalized, and they work.

But, email is only one part of the inbound marketing pie, and it has to trigger the recipient's interest to be successful. Visitors to a company's site, on the other hand, are already actively interested.

Hubspot has a new product coming out in September that they're calling the Content Optimization System, or COS. The COS adds the technology Hubspot already uses to personalize emails and its Call-to-Action buttons to a content management system; essentially, it’s a CMS with a bunch of cookie-based content customization functionality.

What this means, in practice, is that after I've filled out a form--e.g., a form to download the "Inbox Zero for Everyone Ebook" on knowyourinbox.com--some of the content I see when I return to knowyourinbox.com will hinge on the information I've entered in the form. If I'm a development manager at a small software company, I might see an offer for a free demo of KnowYourInbox's API. If I'm a PR rep at a large fashion house, I might see a webinar on tips and tools for managing email overload.

There are a number of tools that optimize blog or website content based on viewing trends in aggregate, but this is the first non-social-networking site I've seen to do so on an individual level. And the fact that it is part of Hubspot, a company whose primary customer is the small, harried, not-very-tech-savvy business owner, means filtered content is not a flash in the pan.

Which is good news for us here at Skimbox, because our product assumes people want their email filtered: one box for the relevant stuff, and one box for the rest. It differs from the COS in that users still get everything, but it shares the goal of making it easy for users to find what they want. As Hubspot co-founder Dharmesh Shah put it in his announcement post,

"...having context isn’t enough. You have to use it for the good of the customer."

Like Hubspot, we’re in the early stages yet (you can read Chris Fuentes’ posts to learn more about the technology behind our classification), but I’m eager to see the responses to the seeds as well as the eventual trees.

PRISM vs. On-Premise Email
Big Data | By Claire Willett | Tue, 13 Aug 2013
http://www.ripariandata.com/blog/prism-vs-on-premise-email

In the age of Big Brother, will on-premise email gain appeal?

The Protect America Act and its younger sibling, the FISA Amendments Act of 2008, grant the NSA permission to monitor all internet activity going into and out of America without obtaining warrants, provided its general intelligence gathering plan is approved by the FISA Court.

Under the program we know as Prism, the NSA is then allowed to obtain electronic communications of any foreign user whose message(s) has been flagged as suspicious, and those of any user, foreign or American, who has been in communication with the initial user. If the connected communications belong to Americans, they are labeled as such, stored separately, and require warrants before NSA analysts can access them.

How this works in practice is that each year the NSA gives a slew of user information directives to internet giants like Facebook, Google, Microsoft, and Yahoo, and the giants comply, with varying methods of turning over that data.

It’s safest to assume that any email, sent from any service, will be monitored. However, for corporations seeking some control over who has access to their employees’ inboxes, there is a bright light, and its name is "on-premise email."

Until this past Thursday, you might have thought I’d say the bright light’s name was "secure email provider." On Thursday, one secure email provider suspended operations, and the other shut down entirely.

The problem with secure email providers is that, while the emails stored on their servers are encrypted, they are still stored on their servers. The metadata is clear as day, and, if the secure email provider also stores the encryption keys, the message bodies are readable as well. And most secure email providers do store the encryption keys, because managing these keys is a huge pain in the tuckus for corporations.

Now, on-premise email is no slouch in the pain-in-the-tuckus department. It requires an in-house IT team to set up and manage the mail servers, it’s more expensive than cloud solutions until you have enough users, and it can struggle to keep up with spam volumes.

HOWEVER: if you have on-premise email, you store your email, and you get to dictate who has access to it. If the NSA wants to pore through one of your inboxes, they have to come to you to get it.

I should also say: on-premise email isn’t new--quite the opposite. Back before the cloud wafted in, on-premise was the only option, and up through April of 2011, it was still used by 80% of corporations.[1] When you think about all that hassle vs all that convenience, 80% sounds pretty high. You might imagine that over the years, as companies got around to upgrading systems, and as advancements in on-premise features--particularly mobile-focused features, stagnated, that number would drop, and drop, and drop.

And you’d probably be right, were it not for the lack of data control that inherently accompanies cloud-hosted email.

But taking into consideration a) this lack of control, b) the amount of attention Prism and the broader wiretapping program have received in the mainstream and tech media, and c) the widely publicized decisions made by Lavabit and Silent Circle, what I'm wondering is whether that number will actually start to go up.

Given a choice between your way and the information super highway, which will you take?

1. http://static.altn.com/Collateral/WhitePapers/US_Why-The-Cloud-Is-Not-Killing-On-Premises-Email-Market_WhitePaper.pdf

The Phone that Cried Wolf
Email Overload | By Brian Barnes | Mon, 12 Aug 2013
http://www.ripariandata.com/blog/the-phone-that-cried-wolf

"The Boy Who Cried Wolf" by B.G. Hennessy

I’m sure everyone is familiar with the parable "The Boy Who Cried Wolf." Quick reminder: there was a boy tending sheep in the fields away from his village. Boredom and loneliness got to him, so he cried something along the lines of: "help, there is a wolf" and everyone in the village came to save him. Except, of course, there was no wolf.

After the boy pulled this schtick a few more times, the villagers were like "this kid is a liar and shall be heretofore ignored." And then, karma had her say and a real wolf came to the boy's field. The boy cried "wolf", the villagers ignored him, and the wolf got a nice dinner.

This parable is popular among parents (though the ending is often altered), but it applies to grown-ups--and specifically to their preferred hunks of metal--too.

With all the mail we are getting, are our phones crying wolf?

It may sound kind of silly, but are our phones ding!ing so much that we tune them out, and as a result miss important emails from our customers, bosses, or spouses?

You are sitting at home one night and your phone dings its "you've got mail" ding. You get up to check your email and ... it's a confirmation that you've checked in for your flight tomorrow. You go back to whatever you were doing and then ding! Someone updated a page on your Intranet. Ding! Someone in IT formally closed a ticket you knew was closed that afternoon.

The dings are becoming increasingly irritating and preventing you from getting anything done, so you ignore the next one. And the next. And the next. Ahh, this is what productivity feels like! You head upstairs to finish packing for your trip and fall asleep.

You wake up well-rested and finally open up your email. A bunch of automated emails from various corporate systems, and, in their midst, one from your boss asking you to send him the revised slides for tomorrow’s meeting.

Whoops.

Not a huge loss, but you would have preferred to have looked like you were on top of work, not on top of your mattress.

Today, most of us either respond to every ding or respond to none of them. The first route drives us (and those around us) insane; the second can render us MIA. But, there is a third route:

Instead of your phone dinging every time you get an email, it only dings when you get an important email. Instead of 10 dings last night, there would have only been one ding--for when your boss emailed you. And since you would have known that ding was a wolf, you would have read the email and responded immediately.
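In code terms, the third route is just a filter in front of the notifier. This is purely illustrative (the classifier here is a crude stand-in, not how Skimbox actually decides importance, and the addresses are made up):

```python
def should_ding(message, is_important):
    # Ding only for messages the classifier marks as important;
    # is_important is a stand-in for whatever classifier backs this.
    return is_important(message)

inbox = [
    {"from": "noreply@airline.example", "subject": "Check-in confirmed"},
    {"from": "boss@company.example", "subject": "Revised slides for tomorrow?"},
]

# Hypothetical classifier: automated senders are unimportant.
is_important = lambda m: not m["from"].startswith("noreply")

dings = [m for m in inbox if should_ding(m, is_important)]
# One ding instead of two: only the boss's email gets through.
```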