Aggregation and Accretion

Aggregation: a group or mass of distinct or varied things, persons, etc.

Accretion: the growing together of separate parts into a single whole.

Bonus definition! Excellence: the state or condition of being excellent.

Step One: Aggregation

I begin with the proposition that aggregation has been one of the key concepts of the past five years or so—a period that we might roughly call the “Web 2.0 era,” if we were so inclined.

You can’t discuss aggregation without touching on RSS. We’ve never seen the oft-forecast “year that RSS goes mainstream,” exactly, but that’s really because we spent a long time looking at the wrong metric. While the online population at large has never taken to RSS readers as a replacement for just visiting a bunch of sites, RSS feeds (often unrecognized as such) have become mainstream as a component of the Web sites we visit or even the engine driving those sites behind the scenes.
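The behind-the-scenes role of RSS is easy to sketch: take items from several feeds and merge them into one reverse-chronological stream. Here's a minimal version using only the standard library; the two feeds are hypothetical and inlined so the example runs without network access.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical RSS feeds, inlined for the example.
FEED_A = """<rss><channel>
  <item><title>Post one</title><pubDate>Mon, 01 Mar 2010 09:00:00 GMT</pubDate></item>
  <item><title>Post three</title><pubDate>Wed, 03 Mar 2010 09:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss><channel>
  <item><title>Post two</title><pubDate>Tue, 02 Mar 2010 09:00:00 GMT</pubDate></item>
</channel></rss>"""

def parse_items(feed_xml):
    """Yield (published, title) pairs from one RSS document."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        yield (parsedate_to_datetime(item.findtext("pubDate")),
               item.findtext("title"))

def aggregate(*feeds):
    """Merge items from all feeds, newest first."""
    items = [it for f in feeds for it in parse_items(f)]
    return [title for _, title in sorted(items, reverse=True)]

print(aggregate(FEED_A, FEED_B))
# → ['Post three', 'Post two', 'Post one']
```

Swap the inlined strings for fetched feed bodies and this is, in miniature, the engine behind a river-of-news site.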

Digg and its brethren (whether social, curated, or hybrid) represent another take on aggregation: you’re visiting Web sites, but those sites are themselves explicit aggregations of content from across the Web. Recognizing that the amount of stuff being created on the Web increases at a dizzying rate, we have invested in tools to aggregate what is most interesting (for an arbitrary value of “interesting”) from many different sources and present it all in a single place.

And while it might not be immediately obvious, the API boom of the mid- to late-oughts also gave the aggregation mindset a significant boost. Because developers could directly access the data and functionality of a wide variety of Web sites “behind the scenes,” we got a new kind of aggregation, most clearly expressed in the mashup: “oh, you want the data from site X overlaid on the map from site Y?” No problem. “Now you want to see the pictures from site A and videos from site B that are related to the topics discussed on site C?” Easy enough, here you go.
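The sites A/B/C pattern above can be sketched as a toy mashup: given the topics one "site" is discussing, pull related photos and videos from two others and merge them. The three services and their responses here are hypothetical stand-ins for real APIs, so the example runs offline.

```python
def site_c_topics():
    """The site whose discussion topics drive the mashup (hypothetical)."""
    return ["volcanoes", "jazz"]

def site_a_photos(topic):
    """A hypothetical photo-sharing API, keyed by topic."""
    return {"volcanoes": ["eruption.jpg"], "jazz": ["club.jpg"]}.get(topic, [])

def site_b_videos(topic):
    """A hypothetical video API, keyed by topic."""
    return {"volcanoes": ["lava.mp4"]}.get(topic, [])

def mashup():
    """For each topic on site C, aggregate media from sites A and B."""
    return {t: {"photos": site_a_photos(t), "videos": site_b_videos(t)}
            for t in site_c_topics()}

print(mashup())
# → {'volcanoes': {'photos': ['eruption.jpg'], 'videos': ['lava.mp4']},
#    'jazz': {'photos': ['club.jpg'], 'videos': []}}
```

The interesting part isn't the code, which is trivial; it's that each site only had to expose its own data for this recombination to become possible.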

Step Two: Accretion

Aggregation obviously has real value and isn’t going away, but I believe that in the coming five years we’re also going to be talking a lot more about accretion. Consider this: a few days ago I was in a room with about a dozen people from the New York tech/startup scene who were asked what they were interested in and working on. Three people (and I was not one of them) used some variation of the phrase “real time” in their answer.

I think that in the short term this means that a lot of people will be focusing on the “twitch reflex” end of real time, with tools that offer immediate, compact, and contextual feedback based on users’ input: think pitches along the lines of “it’s like {Twitter, Aardvark, Foursquare} but {for, with} {dog owners, movie times, hyper-local content, flavor crystals}.”

As we go forward, though, I expect that more attention will be paid to the idea that “right now” is only one facet of “real time,” and very probably not the most interesting one. We’re now used to aggregating data across services to the point where it’s almost taken for granted, but the growth of interest in (and technological capacity to support) near real-time services means that exploration of data accreted over time will become an increasingly big deal.

Any real time service must allow users to participate quickly and easily (reference once again Fred Wilson’s twenty seconds tweet), which means that each data chunk will be relatively small; what’s interesting about those little chunks, however, is that they aggressively blur the distinction between data and metadata. The concern for real time services is not getting users to provide a lot of data at any single point in time, but rather getting them to provide tiny bits of data relatively frequently.

Foursquare is the easy example here: is a foursquare checkin “data” or “metadata?” Many—possibly most—checkins have virtually no content in the traditional sense. You’re not asked to review or even comment on the venue you’re at, you don’t have to say why you’re there or who you’re with…it’s just user, timestamp, and location (and you don’t even explicitly provide the first two). By the standards of a 2004 Web service this would be metadata, not data.
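Just how thin that record is becomes obvious if you sketch it out. In the version below, the user explicitly supplies only a venue; the user id and timestamp come from the session and the server clock. The field names are illustrative, not Foursquare's actual schema.

```python
from datetime import datetime, timezone

def make_checkin(session_user, venue_id, lat, lon):
    """Build a checkin-style record; field names are hypothetical."""
    return {
        "user": session_user,                 # implicit: whoever is logged in
        "when": datetime.now(timezone.utc),   # implicit: server clock
        "venue": venue_id,                    # the one explicit user action
        "loc": (lat, lon),                    # implicit: device location
    }

checkin = make_checkin("alice", "cafe-42", 40.73, -73.99)
print(sorted(checkin))
# → ['loc', 'user', 'venue', 'when'] — four fields, no "content" at all
```

By 2004 standards, every one of those fields would have been metadata describing some absent piece of content. Here the metadata is the whole record.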

The value here comes in continuing use. A single checkin or single tweet doesn’t mean much outside of a very short time window, and that has been one of the common criticisms of real time services thus far: that the half-life of each piece of data is too short and the “content” too thin for this stuff to have real significance. And that’s true, as far as it goes.

It hasn’t gone far enough, though, because our first response has been to aggregate. Twitter’s trending topics and Social Great’s excellent trend tracking make a certain kind of sense out of this real time data, but I think that these are transitional. The big shift comes when we figure out how to make sense of the direction and velocity of change represented in these snippets of data that people are tossing out with a minimum of effort, consideration, and editing.
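One way to read "direction and velocity" out of accreted snippets is to bin mentions of each topic by time and look at the change between consecutive bins rather than the raw counts. A minimal sketch, with invented data:

```python
from collections import Counter

# (hour, topic) pairs accreted from many tiny posts (invented data)
events = [(0, "maps"), (0, "maps"), (0, "rss"),
          (1, "maps"), (1, "rss"), (1, "rss"),
          (2, "rss"), (2, "rss"), (2, "rss"), (2, "rss")]

def velocity(events):
    """Per topic, the change in mention count between consecutive hours."""
    counts = Counter(events)                  # (hour, topic) -> mentions
    hours = sorted({h for h, _ in events})
    topics = {t for _, t in events}
    return {t: [counts[(h2, t)] - counts[(h1, t)]
                for h1, h2 in zip(hours, hours[1:])]
            for t in topics}

print(velocity(events))
# → {'rss': [1, 2], 'maps': [-1, -1]} (up to dict ordering)
```

"maps" led the first hour on raw counts, but "rss" is the accelerating topic; a trending-topics view built on totals alone would be slow to notice.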

Web 2.0 brought us the 1% rule: of 100 people online, one will create something, ten will “interact” with that creation, and 89 people will just look at the results. If the next few years go as I hope and expect, those numbers are going to be upended. We’re going to be working with much, much more creation of much smaller things. We’ll still be interested in the aggregated “where are we all now?” but we’ll be paying just as much attention to the accreted “what routes did we all take to get here?”