Where Have All The Old Tweets Gone?

Looking for old tweets? You won’t find any that are more than a week or so old through Twitter Search. Blame the massive number of new tweets coming in. The engines, they canna hold, captain! Or rather, the search index behind Twitter Search can’t hold it all.

To see this in action, consider a search for “happy new year,” date restricted to New Year’s Day 2010. You’d expect a huge number of matches on that day. You’ll find none.

In fact, in testing the date restriction command on Twitter Search’s advanced search page, I found that it doesn’t seem to be functional. I looked for any tweets containing “happy new year” since Monday and received only four matches. In contrast, if I remove the date restriction, I can see there have been at least 1,500 tweets with those words today alone.

So are the tweets still there in Twitter Search, just inaccessible because of a bad search function? No. I did a different test, a regular search for the word “pedometer.” To find the oldest tweet, I went back page-by-page as far as I could. Actually, I did the search, then changed the “page=” part of the URL to a higher page value. The oldest tweet is six days old. Any that are seven days or older don’t appear.
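For anyone who wants to repeat that paging trick without hand-editing the address bar, here is a minimal sketch in Python. It only rewrites the “page=” part of a URL. The search URL shown is a made-up placeholder, not Twitter Search’s real address, and the helper name with_page is my own.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def with_page(url: str, page: int) -> str:
    """Return `url` with its page= query parameter set to `page`."""
    parts = urlparse(url)
    # parse_qs gives a dict of lists, e.g. {"q": ["pedometer"], "page": ["1"]}
    query = parse_qs(parts.query, keep_blank_values=True)
    query["page"] = [str(page)]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


# Hypothetical search URL, used only for illustration.
base = "http://search.example.com/search?q=pedometer&page=1"
for page in (2, 50, 100):
    print(with_page(base, page))
```

Jumping straight to a high page number and then working back down is a quick way to find the last page that still returns results, and with it the oldest tweet the index holds.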

This matches what a reader asked about earlier this month. They’d emailed us:

Why am I the only person upset with Twitters cutback on their search engine results? Earlier this month I noticed that they were only displaying the results from previous two weeks. Today I see they have slipped to just showing results from previous 7 days? I can find no mention of this anywhere including their blogs.

Out Of Room In The Search Index

I checked with Twitter and got back this response from Doug Cook, Twitter’s director of search:

In response to your reader’s question, we weren’t growing our search index as fast as the tweet volume was increasing, so it started to represent a decreasing amount of time. In the last couple of days we increased our index size somewhat, so the amount of time will go back up, but there’s going to be a natural “yo yo” effect as the tweet volume increases in advance of our next index size “jump.” As you might guess, we’re working on making this far, far better.

Increasing indeed. This week Twitter cofounder Ev Williams tweeted that the service has seen its highest usage ever. All those tweets have to go somewhere, and that impacts the search index.

The search index is like a big book that contains all the information that the search engine looks through. With Twitter, when you’re searching for something like “pedometer,” it flips open that book and starts listing all the pages in it that mention that word, with those “pages” being individual tweets.

However, an index can run out of room, and that seems to be what’s happened in this case. There wasn’t enough room to keep all the tweets recorded in the index, easily accessible for searching, so only about a week’s worth seems to be present. That’s why you can’t “page back” beyond a certain date. The problems with date-restricted searching seem to be a combination of the index limitation and, probably, a bug.
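To make the “big book” idea concrete, here is a toy sketch in Python of an index with a fixed amount of room. It is not how Twitter Search is actually built. The class name, the capacity figure and the evict-the-oldest rule are all assumptions of mine, chosen only to show why a fixed-size index covers less time as tweet volume climbs.

```python
from collections import defaultdict, deque


class BoundedTweetIndex:
    """Toy inverted index that holds at most `capacity` tweets (illustrative only)."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.order = deque()               # tweet ids, oldest first
        self.by_id = {}                    # tweet id -> (day, text)
        self.postings = defaultdict(set)   # word -> ids of tweets containing it

    def add(self, tweet_id: int, day: int, text: str) -> None:
        # When the index is full, the oldest tweet is dropped to make room.
        if len(self.order) >= self.capacity:
            self._evict_oldest()
        self.order.append(tweet_id)
        self.by_id[tweet_id] = (day, text)
        for word in set(text.lower().split()):
            self.postings[word].add(tweet_id)

    def _evict_oldest(self) -> None:
        old_id = self.order.popleft()
        _, text = self.by_id.pop(old_id)
        for word in set(text.lower().split()):
            self.postings[word].discard(old_id)
            if not self.postings[word]:
                del self.postings[word]

    def search(self, word: str) -> list:
        """Return (day, text) pairs for tweets containing `word`, oldest first."""
        ids = self.postings.get(word.lower(), set())
        return [self.by_id[i] for i in sorted(ids)]

    def oldest_day(self):
        return self.by_id[self.order[0]][0] if self.order else None


# Room for six tweets, two tweets a day for five days.
index = BoundedTweetIndex(capacity=6)
tweet_id = 0
for day in range(1, 6):
    for _ in range(2):
        tweet_id += 1
        index.add(tweet_id, day, f"walked with my pedometer on day {day}")

print(index.oldest_day())               # 3: days 1 and 2 have fallen out
print(len(index.search("pedometer")))   # 6: only what still fits
```

Double the tweets per day with the same capacity and the oldest day moves forward faster. Raise the capacity and it moves back, which is the “yo yo” effect Cook describes.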

Meanwhile, Mashable reported this week that the number of tweets shown for some individuals appears to have gone down massively, when the figures should have gone up. Cook said that’s almost certainly unrelated to the index issue and is a separate problem Twitter is exploring.

Old Tweets Are Safe & Online — Twitter Search Just Can’t Find Them

Don’t worry that your old tweets have gone missing for good. They’re still out there, just not immediately accessible to locate through search or by working back page-by-page.

I stress the words “through search” to make another point. The old tweets do exist online, if you know where to find them directly. It’s just that Twitter Search itself can’t point you to them. For example, here’s a tweet I did from July 2009.

Some Search Alternatives For Twitter Archeologists

What if you do need to find an old tweet and don’t know where to look? Some suggestions:

The site:twitter.com portion makes Google show only things on the Twitter site (mostly updates people are making) for the words you specify (google os, in this example). After that, you can use the Show Option feature to choose “Recent Results,” then “Sorted By Date” to get the freshest posts at the top. Here’s an example:

The problem is that Google doesn’t always get the dates of a tweet correct, I’ve found. But it’s something to try.

By the way, if you want to search for tweets from a particular person, try it like this:

Just put the person’s Twitter name after the site:twitter.com/ part, as I’ve done for myself above. It works to focus the results to matches from just that person. You can also narrow further to get tweets from that person about a particular topic, such as this:

Over at Bing, the same techniques also work, though I’ve found Google tends to be more comprehensive and consistent in what it shows.

At Yahoo, they also work — though if you do site:twitter.com or site:twitter.com/username without following those with any search terms, you’ll be kicked over to the Yahoo Site Explorer area. You still get results, but it’s through a different interface.
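If you run these kinds of searches often, the query strings are easy to generate. Below is a minimal sketch in Python that assembles the site:twitter.com queries described above and turns them into a Google search link. The function names and the username “exampleuser” are placeholders of my own. Only the site: operator and Google’s standard search URL come from the search engines themselves.

```python
from typing import Optional
from urllib.parse import quote_plus


def twitter_site_query(terms: str = "", username: Optional[str] = None) -> str:
    """Build a site: query limited to twitter.com, optionally to one account."""
    site = "site:twitter.com"
    if username:
        # Accept "@name" or "name"; the site: operator wants the bare name.
        site += "/" + username.lstrip("@")
    return f"{site} {terms}".strip()


def google_search_url(query: str) -> str:
    """Turn a query string into a Google search link."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# Everything on Twitter mentioning "google os".
print(twitter_site_query("google os"))
# Everything from one (hypothetical) account.
print(twitter_site_query(username="@exampleuser"))
# That account's tweets about one topic, as a clickable link.
print(google_search_url(twitter_site_query("google os", "exampleuser")))
```

The same query strings can be pasted into Bing or Yahoo. Only the link-building helper is specific to Google.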

The best way I’ve found to dependably locate old tweets has been through FriendFeed. The caveat is that this only works for people who have imported their tweets into FriendFeed. Since I do this, FriendFeed works as a great backup service for me to locate my own posts.

That should get me any tweets from the FriendFeed user named “graywolf.” Instead, I get back matches from Google Reader. And the Graywolf on FriendFeed isn’t the same as the Twitter user Graywolf (Michael Gray). So if that was who I was after — and I was — I’m out of luck.