Friday, September 19, 2003

Barry Talks! : Terabytes, Petabytes and Metadata

"Disk capacities double every year, and thus within a decade or so we will be seeing commodity 200 terabyte disks in home PC's. That is an enormous amount of data: by comparison, the entire contents of the United States Library of Congress requires something around a tenth of that, around 20 terabytes, give or take a few hundred gigs. (It's somewhat humbling to think that a person's lifework, say, the collected works of Shakespeare, can easily fit on a floppy disk.)
Put a different way, in 200 terabytes one can store the entire accumulated knowledge of the human race with much more than half the disk to spare.
All this capacity raises two fundamental questions: if we can put all this knowledge on a single commodity disk, how will we ever find anything? And if all that only requires a fraction of the available space, what will we use the rest for?
The answers, it turns out, are related; and, I think, as we outline them we will also begin to see the shape of the next great revolution in computing."
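
A quick sanity check on the arithmetic in the quote, as a rough sketch only: the essay does not say what capacity it starts from, so the 200 GB starting figure below is my own assumption for a commodity disk in 2003, and the function name is purely illustrative. Units are decimal (1 TB = 1000 GB).

```python
def projected_capacity_tb(start_gb: float, years: int) -> float:
    """Capacity in terabytes after `years` annual doublings (1 TB = 1000 GB)."""
    return start_gb * 2 ** years / 1000


START_GB = 200          # assumption: not stated in the essay
LIBRARY_OF_CONGRESS_TB = 20  # figure quoted in the essay

if __name__ == "__main__":
    for year in range(0, 11):
        capacity = projected_capacity_tb(START_GB, year)
        print(f"{2003 + year}: {capacity:.1f} TB")

    # Share of the projected disk taken by the quoted Library of Congress figure.
    final = projected_capacity_tb(START_GB, 10)
    print(f"Library of Congress share: {LIBRARY_OF_CONGRESS_TB / final:.1%}")
```

Under that assumption, ten doublings multiply the starting capacity by 1024, which lands close to the 200 terabytes the essay mentions, and the quoted 20 terabytes comes to roughly a tenth of the disk.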

A very thought-provoking and timely essay from Barry Briggs, although I disagree with his dismissal of SQL. While SQL is far from perfect, the underlying relational calculus will become more rather than less relevant over time as the ratio of metadata to data increases. An excellent reference in this context is Berthold Daum's Modeling Business Objects with XML Schema.