Given that it’s 2007 now, the decision to work with old data might seem strange, but I want to do it to ensure compatibility with the analysis in the paper, which is based on the conversation in Nov 2003-Jan 2004. I’m pretty sure (based on non-systematic observation 😉) that blogging practices have changed since 2004, most likely with respect to the following things (mainly those affecting linking between weblog posts, which is at the core of our definition of a weblog conversation):

Relations between people have evolved and many conversations are moving from weblogs to other media. An example of that is in my paper with Andrea, but I guess many longer-term bloggers could tell similar stories.

The number of (relevant) weblogs has grown, so the reading practices of some people have changed (do you read the weblogs of others as closely and as consistently as in 2004? I don’t.)

The large-scale introduction of tagging and the evolution of categorisation-related features of weblog tools might have changed practices of organising one’s thinking in a weblog, so there is less need to rely on linking to one’s own posts.

Other issues with the dataset:

Community membership is defined in a way that attempts to be objective but is far from perfect. Some members are probably missing, and others would not necessarily belong to the community if it were defined in other ways (e.g. based on topical analysis).

We have only weblog posts (and not comments) in the dataset, which limits the analysis (e.g. we can’t do a proper comparison with the conversation analysed in the paper with Aldo, which included weblog posts and comments).

Some conversations may span the boundaries of a community, so those will either not be discovered or will be “truncated”.
