It’s hard to find just one or two things to excerpt from Jessica Leeder’s great investigation into the large amount of global crime that has grown up around something as simple as honey. It turns out that, in response to U.S. and E.U. trade rules designed to keep antibiotics out of the honey supply, a variety of middlemen have turned up in parts of Asia to conceal the origins of honey–a practice that has been met with an equal amount of money spent on tracking down the honey launderers.

Most honey comes from China, where bee­keep­ers are noto­ri­ous for keep­ing their bees healthy with antibi­otics banned in North Amer­ica because they seep into honey and con­t­a­m­i­nate it; pack­ers there learn to mask the acrid notes of poor qual­ity prod­uct by mix­ing in sugar or corn-based syrups to fake good taste.
None of this is on the label. Rarely will a jar of honey say “Made in China.” Instead, Chi­nese honey sold in North Amer­ica is more likely to be stamped as Indone­sian, Malaysian or Tai­wanese, due to a grow­ing mul­ti­mil­lion dol­lar laun­der­ing sys­tem designed to keep the end­less sup­ply of cheap and often con­t­a­m­i­nated Chi­nese honey mov­ing into the U.S., where tar­iffs have been imple­mented to staunch the flow and pro­tect its own strug­gling industry.

Much later in her article, Leeder notes that since the honey laundering started in earnest about a decade ago, several countries that produce very little honey enjoy very large honey exports.

Despite the arrests, the honey indus­try has been watch­ing sus­pect import num­bers climb.
They are par­tic­u­larly incensed by three coun­tries that, ten years ago, exported zero honey to the U.S., accord­ing to Depart­ment of Com­merce data. India, Malaysia and Indone­sia are mys­te­ri­ously on pace to ship 43 mil­lion kilo­grams of honey into the U.S. by year’s end.
“It is widely known those coun­tries have no pro­duc­tive capac­ity to jus­tify those quan­ti­ties,” said Mr. Phipps, the honey mar­kets expert.

The rest of the arti­cle, which is well worth read­ing in full, points out dif­fer­ent meth­ods for con­ceal­ing honey’s ori­gins, strate­gies for com­bat­ting the fraud, and a sort of legal back and forth that seems out of place for what feels like a pretty ordi­nary food item.
In read­ing this, though, I was reminded of sig­nals sug­gest­ing that honey may not be the only food sub­ject to sim­i­lar sorts of fraud attempts. For exam­ple, in late 2009, a group of stu­dents decided to use DNA analy­sis to try to ver­ify the ori­gins of their foods–and found that 11 of the 66 foods they tested were mis­la­beled. Not sur­pris­ingly, the mis­la­beled stuff was expensive–sheep’s milk was actu­ally reg­u­lar old milk, stur­geon caviar was really Mis­sis­sippi Pad­dle­fish.
And DNA testing–the cost of which keeps dropping–isn’t the only tool at a consumer’s disposal for testing food origins and chemicals. A group of Canadian chemists has developed a little strip–sort of like a piece of paper for testing pH levels–to see if a food item contains pesticides, for example.
As of now, most of these stories about food fraud have received relatively little public attention. But it’s interesting to imagine what would happen if stories about honey laundering and the like started gaining traction–and what sorts of reactions they could spur. Certainly, we’d see consumers examining their Florida orange juice, California cheese and so on a lot more closely. And, of course, we’d also see food companies responding by engaging in a lot of desperate marketing to demonstrate the authenticity of their foods. And many more middlemen trying to conceal their supply chains.
At some level, I think that scenario of increasing fears of food fraud is only a matter of when. Over time, with pesticide test strips, cheap DNA sequencing and the like, we really won’t need large governments to invest millions of dollars to track down the origins of our foods.

How we did it

How do you gather 22,000 horo­scopes? Obvi­ously you could man­u­ally cut and paste them from one of the many online Zodiac pages. But that, we cal­cu­lated, would take about a week of solid work (84.44 hours). So we engaged the ser­vices of arch-coder Thomas Win­nigham to do a bit of hacking.

Yahoo Shine kindly archive their daily pre­dic­tions in a sim­ple and very hack­able for­mat (exam­ple). Thank you! So Thomas wrote a Python script to screen-scrape 22,186 horo­scopes into a sin­gle mas­sive spread­sheet. Screen-scraping is pulling the text off a web­site after it’s dis­played. Python is a pro­gram­ming lan­guage. You can use it to write scripts that only gather the spe­cific text you want. Then you run it mul­ti­ple times so it mines an entire website.

Well, it’s not quite that easy. Big sites like Yahoo have ‘rate-limiting’ on their servers. That means if you access a page too many times too quickly, the server thinks you’re a hacker and deploys all kinds of anti-hacking countermeasures. Initially, Thomas set his scraping speed too high (one request every tenth of a second) and his IP got instantly banned from Yahoo for 24 hours. After some experimenting (and more bans), he found that a two-second delay between scrapes prevented the defense mechanisms from kicking in. The script was set to run in the background (while we smoked cigars and discussed the empire). 12 hours later, we had our 22,000 horoscopes in a single file!
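For the curious, here is a minimal sketch of what a rate-limited scraper along those lines might look like. It is not Thomas's script. The function names (`fetch_page`, `scrape_politely`, `save_rows`, `estimated_hours`) are made up for illustration, and it saves whole pages rather than picking out the horoscope text. The only detail taken from the story is the two-second pause between requests.

```python
import csv
import time
import urllib.request


def fetch_page(url):
    """Download one page and return its body as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_politely(urls, fetch=fetch_page, delay_seconds=2.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests.

    `fetch` and `sleep` are parameters so the loop can be tried out
    without touching the network or actually waiting.
    """
    rows = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause before every request except the first
        rows.append((url, fetch(url)))
    return rows


def save_rows(rows, path):
    """Write (url, text) pairs to a CSV file that a spreadsheet can open."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["url", "text"])
        writer.writerows(rows)


def estimated_hours(page_count, delay_seconds=2.0):
    """Time spent on the pauses alone, ignoring download time."""
    return page_count * delay_seconds / 3600
```

As a sanity check, `estimated_hours(22186)` comes out at a little over 12, which fits the 12-hour run described above.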

We can’t share the 9.5MB spread­sheet with you because it’s Yahoo’s copy­right. But here are the Python scripts should you feel like recre­at­ing the experiment.

Fil­ter­ing it down

So every dif­fer­ent type of horo­scope got sucked up – career, teen, love, daily overview. Who knew there were so many? It was felt, though, that career & love pre­dic­tions would have their inter­nal biases i.e. lots of men­tions of work, career, love, mar­riage etc. So we opted to just analyse the generic daily horo­scopes for each sign. A total of 4,380 (365 per star sign).

Word Analy­sis Ver­sion 1

We used an online tool called TagCrowd to find the most com­mon words. I pre­fer it to Wor­dle. You’ve got bet­ter con­trol over any ‘noise’ in the sig­nal, because you can not only fil­ter com­mon words (“and”, “for”, “is” etc) but also a spe­cial ‘sto­plist’ of words you’ve chosen.

So we broke down the most com­mon 50 words to see if there are any pat­terns of unique words. This is what was revealed:

Word Analy­sis 2

It struck me that sev­eral words in the top 50 – like “some­one”, “really”, “quite” – were just qual­i­fiers and not really that reveal­ing. You’d find them in any Eng­lish word analysis.

So we stripped those kinds of words out (see our sto­plist). And lo! A fresh set of unique, reveal­ing and more accu­rate words appeared in the top words per sign.
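We did this step in TagCrowd, but the same idea can be sketched in a few lines of Python. This is an illustration only, not what TagCrowd does internally. The `COMMON_WORDS` set is a tiny stand-in for a real common-word filter, and `CUSTOM_STOPLIST` holds just the three qualifiers mentioned above, not our full stoplist.

```python
import re
from collections import Counter

# Tiny stand-in for a built-in common-word filter (assumption, not TagCrowd's list).
COMMON_WORDS = {"and", "for", "is", "the", "a", "to", "of", "you", "your"}

# Only the three qualifiers named in the text; the real stoplist is longer.
CUSTOM_STOPLIST = {"someone", "really", "quite"}


def top_words(text, limit=50, stoplist=frozenset()):
    """Return the `limit` most frequent words, skipping filtered ones."""
    words = re.findall(r"[a-z']+", text.lower())
    ignored = COMMON_WORDS | set(stoplist)
    counts = Counter(word for word in words if word not in ignored)
    return counts.most_common(limit)
```

Calling `top_words(text)` gives the version 1 style of count, and `top_words(text, stoplist=CUSTOM_STOPLIST)` gives the version 2 style with the qualifiers stripped out.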

Can I just say that I have no per­sonal inter­est in horo­scopes. I don’t know what the var­i­ous char­ac­ter­is­tics of each star sign are meant to be. So you’ll have to tell me if any of this cor­re­sponds to folklore.

Meta-Prediction

One more thing though. This analysis appears to reveal something. The bulk of the words in horoscopes (at least 90%) are the same. That’s not a full, proper statistical analysis. (If you are a statistician and you want to do a proper analysis, please get in touch.)

The cool thing is, once you’ve iso­lated the most com­mon words, you can actu­ally write a generic, meta pre­dic­tion that would apply to all star signs, every day of the year. Here it is.

The Future

As ever, I’ve laid out my whole process and all the data here: http://bit.ly/horoscoped.
That way it’s all bal­anced and you can make up your own mind. Typ­i­cal Libran!

On Saturday, we bumped into the prolific Mr. Hawtin, a digital DJ pioneer and the guy who originally paved a road for much of our work today. Coincidentally, January 2011 is the 10-year anniversary of the groundbreaking announcement that really kicked off the digital DJ movement: Final Scratch 1, co-developed by John Acquaviva and Richie Hawtin. We took a few minutes to talk to Richie on camera, and he got into the above interesting conversation with Ean about the anniversary of DVS, what’s next for him and what’s next for the DJ world.

Check back in with us over the com­ing days as our team con­tin­ues to com­pile video, pho­tos and data into a series of arti­cles on our favorite moments from NAMM.

Jessica Hische has created this lovely flowchart that makes it easier for creatives to decide whether or not they should work for free. As you may have guessed, the answer is “No” in most cases, but there are some exceptions to this rule. You’ve just got to love Jessica’s honesty in answers like this:

Did they promise you “expo­sure” or “a good port­fo­lio piece”? ➔ This is the most toxic line of bull­shit any­one will ever feed you.

The flow­chart is com­pletely crafted in HTML / CSS and can be trans­lated using Google Trans­late. You can also down­load the JPG ver­sion or wait for the prints to come.

Jessica Hische is a Brooklyn-based designer, illustrator and typographer. Find more of her work on jessicahische.com.

In what is prov­ing to be a NAMM week bonanza for lovers of hard­ware effects, Korg’s Kaoss Pad Quad may be the best bang-for-the-buck. You can con­trol up to four effects simul­ta­ne­ously, all via the trade­mark KAOSS-style touch­pad, trig­ger­ing effects you want via single-button tog­gles. (In fact, this device reminds me in a good way of the superb but sadly now-defunct Entrancer KPE-1 video device, in that every­thing is neatly accessible.)

Plug in your input from an exter­nal source or use the onboard mic input, then con­trol effects from the touch­pad with multi-color LED effects for visual feed­back. There are four basic mod­ules – looper, mod­u­la­tion, fil­ter, and delay/reverb – each with vari­a­tions, so that Korg promises 1,295 com­bi­na­tions. (That’s an utterly mean­ing­less num­ber to me, but I’ll take their word for it.)
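One guess at where 1,295 might come from: if each of the four modules offers five effects plus an "off" position, that makes 6 × 6 × 6 × 6 = 1,296 settings, and dropping the one where everything is off leaves 1,295. The five-effects-per-module figure is an assumption on my part, not something stated above. A short Python sketch that counts it by brute force:

```python
from itertools import product

MODULES = ["looper", "modulation", "filter", "delay/reverb"]
EFFECTS_PER_MODULE = 5  # assumption: not stated in the text above


def count_combinations(module_count, effects_per_module):
    """Count settings where at least one module has an effect selected."""
    options = range(effects_per_module + 1)  # 0 means the module is off
    settings = product(options, repeat=module_count)
    return sum(1 for setting in settings if any(setting))


print(count_combinations(len(MODULES), EFFECTS_PER_MODULE))  # prints 1295
```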

There’s also a “freeze” effect for each mod­ule, so you can lock its set­tings in place. Some effects:

Multi-mode looper with reverse and loop slicing.

Vinyl break.

Duck­ing compressor.

Auto­matic BPM. But real men and women use the onboard tap tempo instead, so pre­tend you didn’t read that.

Pitch shifter, grain shifter.

Reverb, delay, tape echo.

All that’s miss­ing, really, is MIDI input – it’s intended as a self-contained device, and any sync will be up to its auto BPM fea­ture or tap­ping in tempos.

If you’re in my house, you’re not allowed to use the fake vinyl break effect. Sorry, them’s the rules. (Keep it for the next time you need to score an MTV reality show.) But otherwise, this looks useful. And at this price, with this kind of ready-to-play control, the whole device looks pretty irresistible. Korg’s ability to keep churning out KAOSS stuff people love is kind of ridiculous.