I've started a series of discussions at the Harvard Berkman Center
(Baker House), approximately every other Wednesday evening at 6pm.
The first one was on facts and the locus of authority on the Web.
The next one will be on November 3rd and will undoubtedly be on something
about the Web's effect on democracy, with a side conversation on how
to apply for a Canadian work visa.

The series is free and open to the public. Plus, we serve pizza.
See you there?

Election Chat

I set up an IRC chat during the last presidential debate, and about
50 people jumped on board, out-snarking one another about the candidates.
(Note: This was not a fair and balanced crowd.) Kevin Marks, one of the
participants, then surprised us by posting a QuickTime movie that plays the
audio of the debate and shows the chat, synchronized with the audio. Lots of
bad language and comments we regret. Ulp.

I plan on setting up another chat on election night. You're invited.
Check my blog for details.

The future of facts (and the rise of fact servers)

The Wikipedia had to freeze the George
W. Bush entry a few weeks ago because people were altering it to suit
their political viewpoints at an alarming rate. So, the editors pared the
page down to the non-controversial "core" of facts. There was still
a lot of information there — much more than merely "He was born,
he drank, he became president" — and occasional acknowledgements
of controversies, such as whether Bush satisfactorily completed his National
Guard service.

But, most interesting to me, towards the top, on the right, the Wikipedia
ran one of the staples of its biographical entries: A fact box.

I find this two-tiered view of facts, quite common in reference works, fascinating.
And in the context of a bottom-up work such as the Wikipedia, in the midst
of a dust-up over what constitutes a factual account of the life of W, you
have to ask: What's happening to facts?

I don't like facts and I never have. Psychologically, metaphysically and
sociologically, I'm uncomfortable in their stern, disapproving, Cheney-like
presence.

Psychologically, I freeze when I have to recite one. They are, for
me, simply opportunities to be wrong in public. My hesitation is noticeable,
leading people to think I must be struggling to make up the fact, which actually
is frequently the case. That's why JOHO has been 100% fact free since its
inception. That's my pledge to you.

I also have a metaphysical problem with facts. Of course I understand
that there's a real world that existed before I was born and into which I
will be buried (or smudged, depending on the cause of my demise). But facts
aren't the same thing as reality. They are one way reality — the way
the world is apart from our awareness of it — shows itself to us. Without
us, the universe would carry on fine, but facts wouldn't emerge from the darkness.
Because experience is cultural, facts are cultural artifacts: They're expressed
in language, they have a grammar, they are deeply contextual. Facts don't
like us saying that, but it's true: "The Titanic sank in 1912" is
only a fact because of a context that implicitly includes an understanding
of how names stand for things, a decision to mark time by trips around the
sun, a convention that numbers years from the birth of a guy I don't care
much about, and a historical-cultural context that says that the sinking of
a large ship is worth making an explicit proposition about.

Now, you probably snort at that line of
thought because you think I'm running from the pure, brutal "Look,
it happened!" that facts express. But I'm not. It was sad when
the great ship went down (down to the bottom of the...), and it happened
on a date we agree on. But facts are not context-free meteors that slam
into our planet unbidden. They are instead a way of conjuring up the
world in one of its infinite facets. They are a way of speaking, a form
of rhetoric, and thus should not be treated as if they are the end-all
of thought and discussion. But, sociologically, that's often
how they're used: They are the knuckle sandwich of rhetoric. Facts are,
of course, peculiarly important, but they are not the only peculiar
and important things we say to one another. And they are not quite as
reality-based, muscular and manly as they pretend. Inside every fact
is a value struggling to get out.

I Love Facts

To forestall rants about how I don't believe in facts
and think that, for example, the date the Titanic went down is subject
to debate, let me state for the record: The Titanic sank on April 15,
1912. We should reject any explanation of facts that lets someone claim
that the date of its sinking is up for grabs, relative or unknowable.
Facts are crucial in disciplines I care a lot about, including science
and journalism. Nevertheless, facts are a form of understanding and a
form of rhetoric, and thus they are always infected with slimy humanity.

So, when the Web started heating up the Internet, I was among those who thought
that we were going to see a merging of voice and facts, and, more particularly,
voice and objectivity. (Objectivity is the mood in which we get all factual.)
To a greater extent than I'd hoped, that's happening: Just read your 50 favorite
blogs. Many Big-Time Journalists go to absurd lengths to hide their political
sympathies — one editor boasts he doesn't even vote — but it's
reversed on blogs: If we don't know who you're voting for, how can we trust what
you write?

And yet...There are classes of facts I don't want wrapped in voice. If I
post a question about the battery life of a laptop, I'll trust the people
who write in response more than I trust the computer company's site, but I
trust the company site more for the dimensions of the machine. The company
is liable for its answer in a way that a random blogger isn't; if I have to
buy a new carrying case because the number was wrong, the blogger can say,
"Sorry, dude, I misread the measuring tape," whereas I'll expect
the company to compensate me one way or another.

Similarly, I count on mainstream newspapers to provide fact-based stories
that "cover" an event: I don't expect in the foreseeable future
to be counting on webloggers to tell me how many troops attacked Samara, how
this was coordinated with other simultaneous battles, or how many civilians
were killed. Of course I expect bloggers to fact check the media's ass but
good, which implies that I don't have full confidence in the media's ability
to deliver the facts. (PS: there's no such thing as "the" facts
because which facts are relevant is not itself a matter of fact.) But covering
events seems to require the type of centralization that only a news bureau
can provide. (Hint: Any sentence of mine that is of the form "only a _____
can provide" is likely to turn false particularly quickly.) Further,
news organizations stand behind their stories in a way that someone talking
over the virtual back fence doesn't have to. (Of course, sometimes the news
media stand behind their stories Rather longer than they should.)

The role of facts in discourse may look immutable, but it is exactly the
sort of thing that can change; I've been reading Foucault recently and it's
startling how such deep structures can transform rapidly. (It's also startling
how unbelievably brilliant Foucault was.) I don't know what will happen, but
my hunch is that we are heading towards commoditizing facts, driving down
their value so that they don't provide differentiating value. For example,
take the table of Bush facts at the Wikipedia. With the right API, the Wikipedia
could become a Fact Server that delivers the undisputed facts about any of
its 1,000,000+ topics to any application that asks politely, making facts
cheaper than popcorn.

Now, it would be irresponsible for a fact server to serve up dubious or putative
facts, but if it only serves the commoditized facts, it won't have all that
much value. So, perhaps fact servers will deliver facts along with metadata
about how reliable the facts are: It's 0.99 certain that Bush was born in
1946 but it's 0.4 that he completed his National Guard duty. Will this sharpen
the line between the two tiers of facts — the reliability of lower-class
facts will always be subject to argument while 0.99s are beyond serious
dispute — or will it tar all facts with the welcome brush of human fallibility?
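
To make the fact-server idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the Fact record, the serve_facts function and the min_certainty parameter are invented names, no such Wikipedia API is implied, and the certainty numbers are simply the made-up ones from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    attribute: str
    value: str
    certainty: float  # 0.0 to 1.0; illustrative numbers, not measurements


# Hypothetical fact store, seeded with the two examples from the text.
FACTS = [
    Fact("George W. Bush", "year_of_birth", "1946", 0.99),
    Fact("George W. Bush", "completed_national_guard_duty", "yes", 0.4),
]


def serve_facts(subject, min_certainty=0.0):
    """Return every fact about `subject` at or above `min_certainty`."""
    return [
        fact
        for fact in FACTS
        if fact.subject == subject and fact.certainty >= min_certainty
    ]


# An app that only wants the commoditized tier asks for a high threshold.
undisputed = serve_facts("George W. Bush", min_certainty=0.95)

# An app that wants the arguments too asks for everything.
everything = serve_facts("George W. Bush")
```

In this sketch the line between the two tiers is not drawn by the server at all. The requesting app picks the threshold, which is one possible answer to the question of who decides where "beyond serious dispute" begins.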

There are bunches of other questions, many of which take on an Hegelian cast.
For example, the Wikipedia fact box gives Bush's date of birth but not his
race. That's because our culture does not count race as relevant (haha!),
and, no, you can't always tell from the photo. The Wikipedia fact box also
does not state who W's parents are, yet in some cultures knowing your parentage
is as important as knowing the year you were born. But, if Wikipedia acts
as a fact server, it won't have to decide which 0.99 facts to include in the
fact box. It will simply serve up all facts the requesting app wants. Thus,
Bush's date of birth, race and parentage will show up as equal; if your culture
values parentage, your app will make a big deal of that. If some other culture
considers listing the date of birth to be a type of ageism, its apps will
ignore that datum. Undoubtedly, some app will find intense value in the 0.99
fact that Bush is white. So, the commoditization of facts may result in the
formation of cultural fact boxes that divide us on the basis of a consensus
core of 0.99s that we all agree on: Cultures united in a core of commoditized
facts from which they select the fact boxes that divide us. Weird. Or is it
the way the world has always implicitly worked?
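
Here is a rough Python sketch of how two apps might carve different fact boxes out of the same served facts. The SERVED dictionary, the field names and the fact_box function are all hypothetical, invented only to illustrate the selection step.

```python
# The server treats all 0.99 facts as equal; the field names are invented.
SERVED = {
    "year_of_birth": "1946",
    "race": "white",
    "parents": "George H. W. Bush and Barbara Bush",
}


def fact_box(served, wanted):
    """Build a fact box from only the fields an app asks for, in its order."""
    return [(field, served[field]) for field in wanted if field in served]


# An app for a culture that leads with parentage.
parentage_app = fact_box(SERVED, ["parents", "year_of_birth"])

# An app for a culture that considers listing the date of birth ageist.
no_age_app = fact_box(SERVED, ["parents", "race"])
```

Both apps draw on the same consensus core. What differs is which fields they ask for and in what order they show them.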

The delivery of facts with probabilities as part of them could lead to unpredictable
consequences. Building doubt into facts could transform their rhetorical and
social role. Will we recognize facts as being as perpetually subject to argument
as are opinions? Will their source of authority become an integral part of
them, as opposed to being an outside reference? Will the recognition that
they're socially conditioned degrade them so that all facts are equal, no
matter how contradictory or stupid — appending a huge "Whatever!"
to all factual discussions? Are we heading towards a more sophisticated, nuanced
way of thinking that will put facts in their place, or towards a new age of
stupidity and obstinacy? And in the new world of facts, what will be the sound
of voices conversing and voices testifying?

I believe we are currently inventing a new and important life for facts.
We just don't yet know what it will be.

Here's an idea for the book I am perpetually working on working on. (No,
that's not a typo. I've been working for over a year on a proposal that would
enable me to work on the book.)

There used to be a difference between data and metadata. Data was the suitcase
and metadata was the name tag on it. Data was the folder and metadata was
its label. Data was the contents of the book and metadata was the Dewey Decimal
number on its spine. But, in the Third Age of Order (see the previous
issue), everything is becoming metadata.

For example, imagine you're at a large corporation doing a Third Order treatment
of its digital library of research articles. Instead of (or, in addition to)
designing a large, complex, hierarchical taxonomy, you focus on adding enough
metadata to each article so that people will be able to sort and classify
them any which way they want. If someone wants to find all the articles that
talk about hydrocarbons written in Italian in 1965 and that have more than
30 footnotes, they'll be able to. If someone wants to make a browsable hierarchy
based not on topic but on gender or on the number of co-authors, they'll be
able to. You build enriched objects first so your users can forever after
taxonomize the way they want to, instead of the way you
think they'll want to.
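
As a minimal sketch of what "enriched objects first, taxonomy later" could look like, here is some Python. The article records, their field names and the find and browse_by helpers are all invented for illustration; a real library would have far more fields and a real search engine.

```python
from collections import defaultdict

# Hypothetical article records; every title and value here is made up.
ARTICLES = [
    {"title": "Article A", "topic": "hydrocarbons", "language": "Italian",
     "year": 1965, "footnotes": 42, "coauthors": 2},
    {"title": "Article B", "topic": "hydrocarbons", "language": "English",
     "year": 1965, "footnotes": 12, "coauthors": 1},
    {"title": "Article C", "topic": "polymers", "language": "Italian",
     "year": 1971, "footnotes": 35, "coauthors": 2},
]


def find(articles, predicate):
    """Return the articles for which the user's own test is true."""
    return [article for article in articles if predicate(article)]


def browse_by(articles, field):
    """Group article titles under whatever field the user wants to browse by."""
    groups = defaultdict(list)
    for article in articles:
        groups[article[field]].append(article["title"])
    return dict(groups)


# Hydrocarbons, in Italian, from 1965, with more than 30 footnotes.
hits = find(
    ARTICLES,
    lambda a: a["topic"] == "hydrocarbons"
    and a["language"] == "Italian"
    and a["year"] == 1965
    and a["footnotes"] > 30,
)

# A browsable grouping based on the number of co-authors instead of topic.
by_coauthors = browse_by(ARTICLES, "coauthors")
```

Nothing in the records says which field is the "real" way to classify them. The user supplies that at the moment of asking.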

Now take a closer look at these information objects. They look like contents
tagged with lots of metadata, but in fact they're all metadata. If I'm looking
for an article about hydrocarbons written by Barbara Rodriguez, then the article's
topic ("hydrocarbons") and author's name ("Rodriguez, Barbara")
are metadata, and the content is the data. But, I could just as well be trying
to remember the name of the author who wrote an article that included the
phrase "Hydrocarbons are the burros of the cosmos" sometime
in the 1960s, in which case the content and date are metadata and the author's
name is the data. What's data and what's metadata depends on the person doing
the asking.
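
The two Rodriguez queries can be sketched in a few lines of Python. The record layout and the ask function are assumptions made up for this example; the point is only that the same function serves both questions, with the roles of the fields swapped.

```python
# One hypothetical record; the field names are invented for illustration.
RECORDS = [
    {
        "author": "Rodriguez, Barbara",
        "topic": "hydrocarbons",
        "decade": "1960s",
        "content": "Hydrocarbons are the burros of the cosmos.",
    },
]


def ask(records, known, wanted):
    """Use the `known` fields as the handle and return the `wanted` field.

    Matching is a plain substring test, which is enough for this sketch.
    """
    return [
        record[wanted]
        for record in records
        if all(value in record.get(field, "") for field, value in known.items())
    ]


# Topic and author are the metadata; the content is the data.
content = ask(RECORDS, {"topic": "hydrocarbons", "author": "Rodriguez"}, "content")

# Content and date are the metadata; the author's name is the data.
author = ask(RECORDS, {"content": "burros of the cosmos", "decade": "1960s"}, "author")
```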

So, in the Third Age of Order, all data is metadata. Contents are labels.
Data is all surface and no insides. It's all handles and no suitcase. It's
a folder whose content is just another label. It's all sticker and no bumper.

Why does this matter? It changes the primary job of information architects.
It makes stores of information more useful to users. It enables research that
otherwise would be difficult, thus making our culture smarter overall. But,
most interestingly (at least to me), this does the ol' Einsteinian reverse
flip to Aristotle. Aristotle assumed that of the 10 categories by which one
could understand a thing, one must be primary: Where that thing fits into
the tree of knowledge. So, you could say that Alcibiades is made of flesh
or lived in Greece, but if you really want to understand him, you
have to say that he is an animal of a particular kind. But, now that everything
is metadata, no particular way of understanding something is any more inherently
valuable than any other; it all depends on what you're trying to do. The old
framework of knowledge, and of authority, is getting a pretty
good shake.

Right? Wrong? Old? Obvious? Pointless? Stop me before I make a fool of myself
to someone not as nice as you...

My friend Robert Morris, who teaches computer science at U. Mass Boston, and
who has always been unnecessarily generous to me with what he knows, says
that the above is pretty much old news:

The short answer is that in the business, nobody anymore contends there is
a difference between data and metadata other than in a context such as you
mention, namely
the metadata is usually that part which helps you locate and use the other
part and which you can often ignore if you already know those things.

Bob points to Life Science IDs (LSIDs) as an example of a standard that does
sort of distinguish
data from metadata.

An LSID is an immutable,
permanent, globally unique key to a piece of information. The LSID spec
requires that getData always return the same bytes for the entire future
of the universe, whereas getMetadata may return things about the information
that could change.
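
A toy Python sketch of that contract follows. It is not the real LSID API: the ToyLsidStore class, its method names and the example identifier are all invented here, and the only thing it tries to capture is the rule Bob describes, that the data bytes are fixed forever while the metadata may change.

```python
class ToyLsidStore:
    """In-memory stand-in for an LSID resolver; hypothetical, not the real API."""

    def __init__(self):
        self._data = {}
        self._metadata = {}

    def register(self, lsid, data, metadata):
        """Store data once; a second attempt for the same LSID is refused."""
        if lsid in self._data:
            raise ValueError(f"{lsid} already has data; its bytes may never change")
        self._data[lsid] = bytes(data)
        self._metadata[lsid] = dict(metadata)

    def get_data(self, lsid):
        """Always returns the same bytes for a given LSID."""
        return self._data[lsid]

    def get_metadata(self, lsid):
        """Returns a copy of the current metadata, which may differ over time."""
        return dict(self._metadata[lsid])

    def update_metadata(self, lsid, **changes):
        """Metadata, unlike data, is allowed to change."""
        self._metadata[lsid].update(changes)


store = ToyLsidStore()
LSID = "urn:lsid:example.org:articles:42"  # made-up identifier

store.register(LSID, b"Hydrocarbons are the burros of the cosmos.", {"title": "Draft title"})
before = store.get_data(LSID)

store.update_metadata(LSID, title="Final title")
after = store.get_data(LSID)
```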

Middle World Resources

It's no surprise that O'Reilly Publications is a cool company.
It's geeky and Tim O'Reilly is, IMO, a hero of the Web. Even
so, I'm often impressed with just how right they get just about everything
they touch. I don't want to rave about Foo Camp, the free-form weekend
camp-out for nerds and geeks, but one of the many reasons it succeeds
is that even though the company pays for it and lets us occupy its building
and grounds, Tim keeps the weekend free of overt O'Reilly commercial
messages. In true end-to-end fashion, O'Reilly gets out of the center
and allows the ends — the attendees — to connect.

Bit by bit, I'm replacing my desktop apps with open source ones. The
latest one to go has been PolderBits,
a fine sound recorder/editor for which I was happy to pay $29. But Audacity
is at least as good for my minimal needs. And Audacity is open source
and free.

I'm not doing sound editing, so I have no opinion about how Audacity
stacks up in that regard. But it's terrific for recording onto your
computer off a microphone and — more important — for recording
whatever sounds you're streaming. So, if you're listening to radio over
the Internet and they're playing a song you'd like to keep, just press
the Audacity "record" button. (That's known as the "analog
hole" to people who want to plug your every orifice with Digital
Rights Management controls.)

Then you can do a whole bunch of manipulation of the sounds, but I
don't.

What I'm Playing

In order to postpone the pleasure of Doom 3, I'm playing Far Cry, yet
another shooter. Lots of people like it more than I do. And there are
many elements to admire: The graphics are detailed and the island on
which it's set is beautifully drawn. The enemy AI is the best I've seen;
not only don't they get stuck running into palm trees, but when you
shoot at them from a distance, they do things you might do, assuming
you're not a pants-wetting civilian like me. I even don't mind their
we-know-best save system, especially since you can save anywhere you
want if you look up the code on the Web. But, I'm just not finding it
all that engaging. Painkiller, which I finally and regretfully finished,
has humor and imagination going for it. Far Cry has a beautifully rendered
tropical isle and not enough imagination.

Email corrections, additions, detractions and refutations

I got great mail from bunches of you about the piece in the previous
issue about Dewey. If I try to respond to it all here, I'll never get
this issue out, and in the new fast-paced world of the Web, I'm trying to
pare JOHO down so that it can come out more often than Punxsutawney
Phil.

Several of you took issue with my statement that Melvil Dewey was "a
progressive on social issues." Of course, you're right. But you're not
as right as some of you think. Although it's certainly true that he was forced
to resign as NY State Librarian in 1905, Wayne A. Wiegand paints a complex
picture in his biography, Irrepressible Reformer (1996). The anti-Semitism
that got him fired seems to have been rather conventional: He and his wife
created a gated, semi-utopian community at Lake Placid that casually excluded
everyone except white Christians. In his day-to-day dealings, he seems to
have expressed no hatred of Jews. He also was accused of being a sexual harasser,
although, in part because the language of the day was so circumspect,
it's hard to tell from Wiegand's book just how lecherous he was; that
he made at least some of the women who worked for him intensely uncomfortable
seems certain, and it may have been much worse than that. And, indeed, the
fact that women worked cheaper than men undoubtedly was important to him as
he staffed up. The reality seems at best disturbing.

Wiegand argues that Dewey's forced resignation was due not just to his anti-Semitism
and his abuse of women but also to the fact that he was egomaniacal and a
shady bookkeeper who made lots of enemies for good and bad reasons.

In short: Dewey was complex.

(Thanks for the correction.)

Bogus Contest: Name that entity!

You know those objects I talked about in the article above, the ones that
are all metadata and no data? I want to give them a name. It should be something
that businesspeople can talk about without embarrassment. At the moment, believe
it or not, the best I've come up with is extradata; at least that
would let me talk about data, metadata and extradata. So, you do better. You
might take it in a completely different direction. For example, you might
suggest "i-objects," "data monads" or "chrontent,"
which I'd then reject and possibly laugh at.

So, go ahead. I could use a good laugh.

That's it for JOHO. Sorry for the delay. There's just too much going on.
And don't forget that I'm writing absurd amounts of paranoid drivel over at
my blog. Just think how much worse it's going to get after November 2 when
I am terminally depressed. So,
read my blog now, before the Great Depression begins.

And, if you're American, don't forget to vote. Depending, of course.

Editorial Lint

JOHO is a free, independent newsletter written and produced
by David Weinberger. If you write him with corrections or criticisms, it will
probably turn out to have been your fault.

To unsubscribe, send an email to joho-request@freelists.org
with "unsubscribe" in the subject line. If you have more than one
email address, you must send the unsubscribe request from the email address
you want unsubscribed. In case of difficulty, let me know: self@evident.com

There's more information about subscribing, changing your
address, etc., at www.hyperorg.com/forms/adminhome.html.
In case of confusion, you can always send mail to me at self@evident.com. There is no need for harshness
or recriminations. Sometimes things just don't work out between people.

Dr. Weinberger is represented by a fiercely aggressive legal
team who responds to any provocation with massive litigatory procedures. This
notice constitutes fair warning.

Any email sent to JOHO may be published in JOHO and snarkily
commented on unless the email explicitly states that it's not for publication.