Twitter, Tweet Nest and the Data Protection Act

Unfortunately, as the nerds out there might be aware, Twitter allow access to
just your most recent 3200 tweets online. The only way to get a copy of your
older tweets is to send Twitter a request under the Data Protection Act
1998. Folks elsewhere in the EU should have similar legislation to
help them, but I’m not sure about other countries. To their credit, Twitter
complied with the request fully (as far as I can tell) and without making it too
painful for me (depending on how painful you consider having to send a fax).

Here are the first twenty lines of the alexmuller-tweets.txt file they sent:

While it was good of them to supply so much information, there was really only
one thing I was interested in: the list of status_ids of all my tweets.
Twitter provide the tweet text as part of this request, but strip other
interesting metadata such as location and source.

With that, I used a rubbish bit of Python (with tweepy) to pull every bit of
JSON I could out of the Twitter API and save it to a file:

WARNING: this code is so wonderfully breakable that it will probably set
your machine on fire. You’ve been warned.

Basic auth calls are limited to 150 an hour, so I left this running overnight to
complete. You can easily turn the result into an array (wrap it in []
and add commas to each line, also known as the poor man’s way to code) and then
use Bryan Veloso’s beautiful script to import it into Tweet Nest.
It’s worth importing your older tweets before setting up a recurring job
to pull in new ones, or else you’ll have to do some MySQL funkery that I can
explain in more detail if you need (yell on Twitter).
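
If you’d rather not do the wrap-it-in-brackets trick by hand, here’s a rough sketch of the same thing in Python. It assumes the dump has exactly one JSON object per line, which is what the add-commas approach implies; the function names and the file names in the usage note are made up for illustration and aren’t part of my original script or of Tweet Nest.

```python
import json


def json_lines_to_array(lines):
    """Parse one JSON object per line and return them all as a list."""
    tweets = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines rather than failing on them
        tweets.append(json.loads(line))
    return tweets


def convert_file(src_path, dst_path):
    """Read a one-object-per-line dump and write it back out as one JSON array.

    Both paths are hypothetical; point them at wherever your dump lives.
    Returns the number of tweets written.
    """
    with open(src_path, encoding="utf-8") as src:
        tweets = json_lines_to_array(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(tweets, dst)
    return len(tweets)
```

Something like `convert_file("tweets.jsonl", "tweets.json")` would then give you a single file containing a valid JSON array. Parsing each line properly, instead of gluing commas on, also means a truncated or malformed line fails loudly instead of sneaking into the import.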

This script highlights a problem that Twitter mentioned again last night: basic
auth will not work for much longer. This is a huge issue for Tweet Nest too, as it
doesn’t use OAuth. I’m going to (attempt to) add OAuth support to it in the near
future.

While you’re here, let’s have a quick look at what else they returned from the
DPA request. Apart from the normal stuff you shouldn’t be shocked to know
Twitter have access to (direct messages, favourites, followers and following),
two things surprised me, though not massively so:

A list of all the phone numbers and email addresses that were stored on my
phone when I first ran the official Twitter iPhone app (1222 in total).

A list of every IP address that has accessed my account for the last four
months (it’s fairly standard practice to store this).

For what it’s worth, I agree entirely with David Singleton, who
tweeted:

In case it wasn’t clear, I’m pretty worried about the twitter changes too. I
suspect many of them won’t be enforced, but still, dangerous.

No matter how much you trust Twitter, it would be prudent to store a copy of
your own data.