After the recent announcement that Groklaw will be archived at the Library of Congress, mjn writes with word that the push to archive more digital content continues: "The US Library of Congress announced a deal with Twitter to archive all public tweets, dating back to Twitter's inception in March 2006. More details at their blog. No word yet on precisely what will be done with the collection, but besides entering your friends' important updates on the quality of breakfast into the permanent archival record, the deal may improve access for researchers wanting to analyze and mine Twitter's giant database."

It's not like it takes a lot of space to archive them; it's just 140 characters per tweet. There's a lot of useless information in newspapers and books too, but those have been archived as well because some of that info is valuable or might become valuable.

O. M. G. This would fill more than half the hard disk space I have in my NAS...truly massive! (At my company, there was an April Fool's rumor going around on the day that Twitter would be going down for 10 minutes while their high school intern upgraded their "Tweet Storage Unit" (TSU) by adding an extra 2TB drive. Har har! To be fair, they store a good bit of metadata besides the tweet itsel

This is probably the best way to capture a snapshot of our current society. Sure, the barrier for entry is a little lower, but I think this will be invaluable for historians who look back and try to understand us.

I'm reminded of the Futurama episode where they go to a museum of the 20th century and everything there is ridiculously inaccurate because of how information tends to get lost and garbled over time. I can just imagine what a museum of the 21st century will look like if their primary source is old tweets. They'll probably think our self-imposed 140 character limit was due to some bizarre superstition and we worshiped someone known only as "aplusk" as a God whose wisdom came down to us in the form of what w

It's fun to think of historians as just attributing everything they learn about societies to religion and superstition, but the biggest reason we think pre-Enlightenment civilizations were obsessively religious is because the priest castes were generally among the most literate and the most concerned with preserving knowledge of the past. Much of what we know about history comes through their writings, and therefore their perceptions. They quite literally wrote history, to a large extent, and our understanding of their society is colored by their bias.

The Information Age has democratized knowledge to a huge degree. Historians centuries or millennia hence will have plenty of sources other than the lens of the Catholic Church. Given current trends, even just a decade from now a few consumer-grade storage devices could hold everything the Library of Congress or Archive.org contains today. As long as there are a few people in the world interested in preserving it, modern history should be safe.

Hey, that's interesting and insightful. I never thought of it that way before. I wonder if skeptics and nonbelievers were as common then as now. (In America I peg us at about 20% of the population, with about half of us being in the closet.)

Imagine an ancient ritual sacrifice of a virgin or something, and one fifth of the crowd is sort of rolling their eyes thinking "really? I mean, really? You guys think that stabbing a girl with a hymen is going to bring you blessings from magical beings in the sky? Get a

I live in Wisconsin, grew up in Alaska, and lived for a while in New Hampshire and Massachusetts.

Also, to be clear, by "skeptical nonbeliever" I refer not only to Christianity and similar easy-to-characterize religions, but also to the Eastern sorts of religions and the New Age sorts of religion (ghosts, "energy", pagan spirits).

Obviously, I hope you are right that we amount to greater numbers. Where do you live?

Or, it might cause historians to think there was a Little Dark Age in the early part of the 21st century.

Now, if the LOC would archive /., the historians would know there was a Little Dark Age in the early part of the 21st century (and this post would be evidence that the denizens of the Little Dark Age even knew they were living in such a time).

Now, if the LOC would archive /., the historians would know there was a Little Dark Age in the early part of the 21st century (and this post would be evidence that the denizens of the Little Dark Age even knew they were living in such a time).

When the historians of the 50th century unearth the records of /., they'll realize the Final Dark Age came upon humans in the early part of the 21st century, and that while many saw something happening, none realized the extent. And then they'll click their mandibles

You have to remember that the people usually shouting "wargarbal waste of money" at scientific efforts such as these aren't the type to give two shits about the generations that come after them, as we've all seen. :(

Future historians? These people are trying to burn history books today.

We learned more about ancient Egypt from their Twitter than from all the official records designed to survive the ages. Sure, sure, it's very interesting to read the "unbiased" record of a pharaoh in his own tomb, but it is from the "trash" notes that were recovered that we learned how the country itself worked, including such little details as that the pyramids were not made by slaves.

The official records of the US will be Fox news. Better pray that future researchers have access to some other source,

You're probably right. For one thing, the Library of Congress runs the Copyright Office, and registering a copyright means the LOC gets two copies anyway. For another, the Library of Congress is an agency of the Congress, which has the power under the Fifth Amendment to take any private property for public use in exchange for just compensation.

I think that the importance of a single tweet varies depending on who is sending it and who is reading it. If I tweet/twitpic about some activity my children are doing, you might think a giant yawn is being generous. Meanwhile, however, a family member or friend reading it might be genuinely interested in that information. To give another example, if @grantimahara tweets about an upcoming episode of Mythbusters and you are a fan of that show, you'd likely find it very interesting. However, someone else who

Historically, only popular news or writings were archived. Wouldn't it be interesting to see what someone else, normal people, said about Shakespeare or some kings, 1000 years from now? All we have now is what was archived: popular writings that governments agreed to.

They were probably too busy watching Medieval Idol to even realize who Shakespeare or the King was ;)

A jest, I know, but it does demonstrate a serious point.

Our history books are based on records maintained by the winners of wars, the leaders, the successful, etc. We know a lot about Shakespeare. We know relatively little about how his audiences actually felt about his work.

We largely speculate as to how life was for the ordinary folk during historical periods based on writings about them, not writings from them. The exception to this is diaries, and not many people maintain those any more. Twitter can help replace some of that perspective.

Admittedly, Twitter is not an ideal way to get a picture of a society, but you get to hear historical events told from a very different perspective. Actually, you get to hear them from LOTS of perspectives. They may not be an accurate portrayal of the events, but they are a snapshot of how a society reacts to and perceives events.

That in fact is an ideal reason to do this, and twitter is nearly the ideal forum. The only hole in it is that some people aren't represented. Those who are over- or under-represented can be identified and the weight of their observations adjusted. But those who simply are not recorded will not have had an opinion at all.

The real problem here is that the LoC is a government entity, and all my experiences with technology provided by government entities have left me less than impressed. Searching the LoC's arc

The exception to this is diaries, and not many people maintain those any more.

Maybe not in written paper form, but certainly many people maintain and update their own blogs, notes, and other status updates on things like Myspace, Facebook, and blogspot. Surely those resources would be a good source for the same type of information that is maintained in diaries. I suppose diaries had/have the added advantage of usually being considered private, so more information may be disclosed in them. However, it's become pretty apparent that there are still many netizens that don't think enough

I think Twitter is the ideal way to get a picture of a society. What people say on a daily, mundane level is pretty much what a society IS. The average schmuck doesn't give a rat's ass about what goes on on Capitol Hill (if they even know what Capitol Hill is). A society is made up of people, not leaders.

They were probably too busy watching Medieval Idol to even realize who Shakespeare or the King was ;)

Shakespeare was Renaissance English Idol, while Chaucer slammed the Medieval category.

Just because something is now stuffy 'literature' doesn't mean it wasn't wildly populist entertainment in its time. There's a reason why a lot of Shakespeare centers on drunks, crossdressing and hitting people with swords.

The only time I really actively used Twitter was during the recent LHC 3.5TeV event, because the webstream was completely overloaded. LoC preserving it? Future generations will look back and conclude that some people REALLY did have TOO much time and trivial stuff to share.

Future generations will look back and conclude that some people REALLY did have TOO much time and trivial stuff to share.

Sure, why not? You never know what sort of insights you'll get. What people do in their free time is just as important to historians as what they do when they're working. More so, sometimes, since the work is often ephemeral while the free time is an important insight into the culture as a whole.

Most of it's garbage, but garbage middens are one of anthropology's favorite data sources.

Future generations will look back and conclude that some people REALLY did have TOO much time and trivial stuff to share.

Which is why it's important that we store this information. We know what the history books are going to say. We know that the War on Terror will come out to be either a horrible atrocity that humankind should never attempt again, or it will be declared a huge success that ushered in a new era of peace and stability. People will ask "I wonder what was going through people's heads?"

And this is the PERFECT example. It will show that a lot of people didn't do anything, and they'll probably infer it to be Ap

The LoC isn't archiving URL shortener targets (yet, anyway), but the Internet Archive is on it [archive.org], which at least ups the likelihood that some future researcher will be able to decode what those links pointed to.

If they think tweets are worthy of being archived why not just archive every blog and comment in existence? Many of those offer far more worthwhile insight than 99% of tweets.

I remember in school students and sometimes teachers occasionally mocking the customs of past cultures. There was always that subtle arrogance that we're somehow more enlightened than people were 500, 1000 or 2000 years ago. The problem is that people confuse technological advancements for intellectual and philosophical advancement. I'

All 'useless twits' jokes aside, this is pretty interesting. But I wonder if they'd run into any copyright laws.

Reading the Twitter ToS turns up this:

You retain your rights to any Content you submit, post or display on or through the Services. By submitting, posting or displaying Content on or through the Services, you grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).

which looks to me like posters retain copyright, but Twitter retains the right to grant others the same license you've granted them (non-exclusive license to provide their service).

It seems like your reading is probably right, but I would hope they would at least anonymize the data. It seems like quite the invasion. Right now, one can only find tweets from a few weeks prior in Twitter's public search. Now anyone can request any prior tweet.

Even if you double or triple the data stored per tweet to account for other metadata, assuming the parent's math is correct, it still shouldn't matter because that's still a trivial amount of storage to manage.
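For anyone who wants to redo that back-of-the-envelope math, here is a rough Python sketch. Everything numeric in it is an assumption for illustration: the tweet count is a made-up placeholder (not an official figure), each character is treated as one byte, and the metadata multiplier is just the "double or triple" guess from above. The function names are hypothetical.

```python
def archive_size_bytes(tweet_count, bytes_per_tweet=140, metadata_factor=3):
    """Estimate raw archive size: tweets x bytes per tweet x metadata overhead."""
    return tweet_count * bytes_per_tweet * metadata_factor


def human_readable(n_bytes):
    """Format a byte count using decimal (power-of-1000) units."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit where the value drops below 1000,
        # or at the largest unit we know about.
        if value < 1000 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1000


# ASSUMPTION: 10 billion tweets is a placeholder, not a real statistic.
assumed_tweet_count = 10_000_000_000

print(human_readable(archive_size_bytes(assumed_tweet_count, metadata_factor=1)))
print(human_readable(archive_size_bytes(assumed_tweet_count, metadata_factor=3)))
```

Plug in whatever tweet count you believe; unless the real number is orders of magnitude larger than the placeholder, the result stays in "a few disks" territory, which is the point being made.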

Yeah, yeah, it's public. Agreed. And everybody knows there's no difference whatsoever between what some guy can read and an exhaustive, automated audit trail and connection map of everything that has ever been posted. That's why nobody uses search engines, after all.

Given that we can store almost 525 bytes [ksplice.com] of data in a single twit (I refuse to call them tweets), which is enough for a sector of data plus metadata, could it now mean we can store our data permanently at taxpayers' expense?

I call it TwitterShare, as a play on RapidShare, to send files easily... and now those files will be forever archived. Sounds like a good way to back up data to me! Other than letting everyone else in the world see your files...

...of archived gopherspace content I'm willing to donate to the LoC. Seems to me this dated mother lode of data would have far more historical significance and impact than thousands upon thousands of dissociated mindfarts.

I find it quite ironic that the Library of Congress would be spending time archiving totally useless things like twitter.com postings while ignoring the thousands (if not hundreds of thousands or millions) of books in their archive that they have yet to make public. I would say their first priority should be making sure that everything in their actual Library gets put online and made public; only after that work is done should they talk about doing other things. It is all a pretty big waste

I know you are joking, but this kind of stuff is actually very important to historians. For example, the only reason we are able to reconstruct how many hours a day people worked in the medieval era is by looking at court records - the judge would ask things like "what were you doing at five" and the person would respond with answers like "eating" or "sleeping" or "working", and by going through a lot of court records, we were able to guess at how people lived back then.

This will allow the historian of the future to guess much more accurately.

Trust me on this. There will be *way* more data than anyone needs to reconstruct "typical" examples of this information, even if 99% of the data created by present-day society disappears.

The obsessives worrying that we're about to enter a digital dark age forget about the massive loss of data, information, photos, etc. from the past, and also underestimate the stupid amount we're archiving (intentionally or otherwise) nowadays.

Seriously, why not? Mayhaps this will be a treasure trove for some unsuspecting social scientist in the 23rd Century. Really, the study of what boring, routine stuff people do day in and day out is important and can yield valuable insights into the past.

Soon after, he publishes a paper with his revolutionary new theory: People in the 21st century were so forgetful that they decided to record all details about their daily life in a central database so they could recover it if necessary.