I've said very nice things about your spam filter in the past, but I'm afraid I am going to have to take it all back. I'm currently going through the spam for the last week, and have gone through about a third of it.

Something you did recently has been an unmitigated disaster. Of the roughly 1000 spam threads I've gone through so far, 228 were incorrectly marked as spam.

That's not the 0.1% false positive rate you tried to make such a big deal about last week. No. That's over 20% of my spambox being real emails with patches and pull requests. Almost a quarter!

I don't know how to even describe the level of brokenness in those kinds of spam numbers. There were a few pages of email (I've got it set up so it shows me 50 threads per page) where more than half of the "spam" wasn't.

Quite frankly, that sucks. It's not acceptable. Whatever you started doing a few days ago is completely and utterly broken.

It's actually at the point where I'm noticing missing messages in the email conversations I see, because gmail has been marking emails in the middle of the conversation as spam. Things that people replied to and that contained patches and problem descriptions.

They didn't try to sell me a bigger penis or tell me about how somebody is cheating on me. Really.

You dun goofed. Badly. Get your shit together, because a 20% error rate for spam detection is making your spam filter useless.

[ Edit: looks like it started four days ago. As of July 13, it looks like a big swath of lkml has been marked as spam for me. ]

[ Edit 2: final numbers: out of around 3000 spam threads, I had to mark 1190 threads as "not spam". So the numbers actually got worse: close to 40% of my spam-box wasn't actually spam. It started around 1pm on Monday, July 13th. The problem really is so clear that I can tell pretty much exactly when it started. ]

[ Edit 3: it wasn't just patches, and it's not just lkml. There were things like Junio's recent git v2.5.0-rc2 announcement etc. The new gmail spam filter hates any mailing list emails, apparently. In the time it took me to write the last note, I got seven more emails marked as spam, two of which weren't. ]
