Discussion about my semantic web comments is going on over at Danny's place.

Here's my response to some of these :

On Microformats: Yep, I was using the term loosely, for any kind of home-rolled XML. (And maybe not even XML.) So I don’t necessarily want to restrict it to “rel” tags.

Reiterating my emphasis :

And I want to go beyond a debate about upper-case vs. lower-case semantic web if all that means is whether the format is XML-RDF vs. some other representation which can probably be translated into RDF. This does seem to me to be a rather pointless argument about file-formats.

I want to emphasize the distinction I made in the original post because this appears to be the real heart of the argument, and something that’s genuinely theoretically interesting.

That is, the distinction between documents whose elements get their semantics primarily from the fact that they are *in* those documents, vs. documents whose elements get their semantics primarily from the fact that they are linked to an external ontology.

Let’s call these “document-semantic” and “ontology-semantic” approaches rather than lower-case and upper-case SemWeb.

Now there is a practical question about which strategy to prefer. Document-semantics is, in a sense, what we’ve always had. And part of the pitch for the SW was that this was wrong and that ontology-semantics was a “better” way. It was going to make it easier (and more likely) for information from different sources to be combined than ad-hoc custom scraping of document-semantic documents ever could.

Without that claim, the SemWeb seems to me to be nothing more than a file-format.

Now, the value of ontology-semantics really only kicks in when you want to meaningfully combine data from documents without knowing (or caring) what kind of document they’re in.

This combination of elements from different documents is what Shirky called “syllogism” and said wouldn’t be very useful in practice - because the document is the more important context. Half the alleged rebuttals are just trying to say that Shirky is wrong because that isn’t the aim. Nevertheless, I can’t see any other reason to prefer ontology-semantics over document-semantics.

On dependency of centralized servers (and other substrates) :

Shelley has an interesting point about the “centralization” of tagging. I agree about del.icio.us and flickr, but I’m not sure it goes for Technorati tags. I think these are embedded in people’s blogs, and technorati is just a “scutter”. If it disappeared tomorrow, and someone found it useful to write a new scutter, the data would still be there.

Also, I’m not sure this has any semantic significance. You might as well argue that English has a lousy semantics because if all English speakers died, the words wouldn’t have meaning. But all symbol systems need to be physically instantiated. That’s not an argument against their meaningfulness.

Shelley also says : “Technology is just moving parts and a bit of syntax.”

I think most theories of language assume meaning is embedded in practice, and by analogy a semantic web’s meaning would be embedded in the practice of the software that uses it. Nothing has semantics without the moving parts.

Danny Ayers sends people to have a look at my rant. Of course, it's kind of embarrassing he highlights my RageBoy-style swearing. Just to re-iterate, no offence intended to anyone in the following piece. Really. It's all just stylistic sugar. ;-)

But frankly, all the alleged rebuttals are just shooting at a straw-man of their own.

The basic Clay Shirky critique of the SW is that the pain outweighs the potential benefits, and so it's not going to work. Instead, we're going to get machine-readable markup by small, self-interested increments rather than using the W3C solution. Two years on, that assertion looks pretty strongly backed up by events.

Shirky illustrated this generic complaint with two more specific criticisms :

1) that the SW was trying to build a monolithic ontology.

2) that the main touted benefit of the SW is that, because every semantic item has a unique URI, it should be possible to translate between different documents referring to the same things, and therefore combine the data they contain, producing inferences or "joins" between information in different places. And that this, in practice, will be too hard to be useful.

Of course, Shirky rather over-egged the critique of syllogisms. And so pointing out that they happen in relational databases is a useful corrective. But this doesn't, as I'll try to show in a moment, actually save the SW project.

So let's take each of the rebuttal responses and look at them.

First, that there is no monolithic ontology. Well, if you take "ontology" in its W3C technical sense, as a formal description of part of the world and the relations between the things it contains, then that's true. Each SW "ontology" is allowed to define its own things and relations. And W3C don't try to force everyone to use the same one.

But at a deeper level, there most certainly is an attempt to put all the things in the world into a single scheme. That is, everything has to have a URI. And URIs, by definition, need to uniquely individuate things.

Two things with different URIs have different identities in the SW, regardless of their context. While two things with the same URI are the same, regardless of context. If you look at Shirky's more recent obsession with tagging and folksonomies you'll see that he's discovering a contrasting world of useful meta-data that's being created without need for such unique identifiers.

In this sense, SW does demand a certain basic adherence to a universal standard that other, apparently more successful, markup schemes are not relying on.

I'll postpone the second claim, that "joins" in relational databases are proof that syllogism is valid, for a couple of minutes. Here I'll just ask if anyone knows of good examples of such joining being done in the wild using RDF. (Genuinely interested to hear of good, popular applications of this.)

More common is the "rebuttal" that argues Shirky is wrong because making joins between different documents is not what RDF is really about.

Which naturally raises the question : so what is the alleged benefit then?

Here's what it seems to be, according to the counter Danny linked this time.

Unlike vanilla XML, RDF vocabularies can be freely mixed together in data without prior agreement. So you often see ad-hoc combinations of Dublin Core, RSS1, MusicBrainz, RDF-calendar, FOAF, Wordnet, thesaurus, Geo-info etc etc frequently deployed together, despite the fact that the creators of those various vocabularies barely knew each other. This strikes me as the height of loosely-coupled pragmatism rather than a wide-eyed effort to build a monolithic universal category system.

In other words, that we can mix different information from different vocabularies in the same document without danger of ambiguity.

And this gives the key to what the SW really is, and why I think that it's not all that useful.

What's really going on here is a discussion about the context, or unit, of semantics - a debate between some sort of atomism and some sort of holism.

There's long been discussion in philosophy of language about what defines the meaning of text. What's the "unit" that defines semantics. Is meaning a property of words or of sentences? Or of larger contexts, of languages or cultures? There's an analogous problem in genetics, often called the unit of selection problem. Does the evolutionary selective pressure act upon - ie. do we give a semantic interpretation to - the individual gene? Or does the gene only have an effect and meaning in the context of the whole body?

For a long time, I've been puzzled by what exactly is so good about the ability to mix vocabularies in a single document. Let's consider the situation where I have a document mixing data from vocabularies V1 and V2. Now clearly, this document is meant to communicate between two programs, P1 and P2, which need to understand ideas from both V1 and V2. In other words, if P1 can produce the document, and P2 can consume it, then both P1 and P2 should really know about the kinds of things that V1 and V2 can describe.

But if both P1 and P2 need to know about these ideas, then they can choose whatever protocol they like to exchange them. They derive no great benefit from keying into a widely published vocabulary.
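To make the vocabulary-mixing story concrete, here's a minimal sketch of what "mixing" amounts to: triples whose predicates are full URIs drawn from two independently defined vocabularies. The Dublin Core and FOAF namespace URIs are real; the example subjects and data are invented for illustration.

```python
# Two independently defined vocabularies, each with its own namespace.
DC = "http://purl.org/dc/elements/1.1/"      # Dublin Core
FOAF = "http://xmlns.com/foaf/0.1/"          # FOAF

# One "document" freely mixing both vocabularies as (subject, predicate,
# object) triples. Because each predicate carries its namespace, there is
# no ambiguity about which vocabulary a term comes from.
triples = [
    ("http://example.org/post/1", DC + "title", "My post"),
    ("http://example.org/post/1", DC + "creator", "http://example.org/people/phil"),
    ("http://example.org/people/phil", FOAF + "name", "Phil"),
]

def objects(subject, predicate):
    """All objects for a given subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# A program that knows both vocabularies can pull out what it needs:
print(objects("http://example.org/post/1", DC + "title"))        # ['My post']
print(objects("http://example.org/people/phil", FOAF + "name"))  # ['Phil']
```

But note the catch the text describes: to do anything useful with the result, the consuming program still has to know what DC and FOAF terms mean.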

This is a discussion made concrete by Dave Winer's RSS 2.0. If both feed producers and feed consumers need to know about authors and published dates and posts etc., then any file format which can represent these things is viable. (And the simpler the better.)

The only story I could ever imagine that made sense of the claims for the virtue of mixing vocabularies that were defined elsewhere was that a program, P3, might not know about a particular file format (eg. a syndication feed) but might nevertheless know about the Dublin Core vocabulary, and could therefore extract this and do something useful with it from an RSS 1.0 feed.

To me, that looked absurd. It's analogous to the old joke about counting sheep by counting the legs and dividing by four. P3 doesn't know what a syndication document is but it can work out what it "means" by knowing what the sub-document vocabularies mean. And then it's supposed to do something useful with it?
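For what it's worth, the "P3" story is mechanically possible. Here's a sketch of a program that knows nothing about RSS but knows the Dublin Core namespace, scanning an arbitrary XML document for dc:* elements. The feed content is invented for illustration; whether the extracted facts are *useful* without the surrounding document context is exactly the point in dispute.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # the one namespace P3 knows

# An invented RSS 1.0-style fragment; P3 has no idea what <item> means.
feed = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://example.org/post/1">
    <dc:title>My post</dc:title>
    <dc:date>2005-05-26</dc:date>
  </item>
</rdf:RDF>"""

root = ET.fromstring(feed)
# ElementTree spells namespaced tags as {namespace}localname, so we can
# harvest every Dublin Core element regardless of the enclosing format:
dc_facts = [(el.tag.split('}')[1], el.text)
            for el in root.iter() if el.tag.startswith('{' + DC + '}')]
print(dc_facts)  # [('title', 'My post'), ('date', '2005-05-26')]
```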

I think this story really throws into relief what the Semantic Web is about, and what the arguments are all about.

The (capital S) Semantic Web is a bet that the appropriate unit of semantics is the Vocabulary or Ontology.

Anti-Semantic Web arguments are really assertions that this isn't the proper or most appropriate unit of semantics.

Let's suppose I have a string "John Smith". Is its meaning more crucially defined by attaching it to a global vocabulary, or is its meaning more crucially defined by its context, such as the document that contains it?

You can, of course, derive meaning from both contexts but, goes the Dave Winer argument, the document is normally sufficient context, so why pay for anything else?

The genetics analogy is instructive here. Even hardcore "atomists" or gene centred theorists have to accept that the body plays an important role, and they've introduced the term vehicle of selection to cover it.

In the same way, you can take the Winer argument as being that documents are the main "vehicles of semantics", whereas the SW camp are essentially the "atomists" here (pun not intended by me :-). The individual atoms have their meaning fixed by the uniqueness of the identifier (URI), and their "type" given by the ontology.

The idea that the document is a sufficient vehicle seems to be gaining traction, as the concept of micro-formats becomes more widespread. Essentially, hype about micro-formats is nothing more than an increasing number of people waking up and getting Winer's insight : "we don't need to be intimidated by this Semantic Web. It's not going to happen, or at least not soon enough to be worth waiting for. Let's create something where semantics are fixed by the local context of the document and the programs that use it, rather than a global context."

The second main front of the war against W3C atomism, is tagging. In this case, there are two things that fix the meaning of tags : the natural language of the users, and, once again, the local context defined by which application they're in. This markup is created by non-technical users, who naturally aren't in a position to formally define an ontology or RDF-schema before adding their markup. But they do have the shared standard of their natural language which they can hang their mark-up on. Here the contexts are wider than the scope of the W3C formally defined vocabularies.

OK, quick summary :

The argument in the Semantic Web is all about "semantics" and what most appropriately binds tokens in documents to their meaning. The W3C bet is that individual atoms - given unique identity via URIs, and types selected from global ontologies - are the best model for this. Opponents say there are better ways.

Two prominent fronts have opened up where rival representations are challenging the SW :

the "documents are vehicles of semantics" view, of which the argument between RSS 2.0 and Atom is the most prominent example. But where other micro-formats are also skirmishes.

the "human behaviour" model, where the semantics of tokens is bound by users and derived from their resemblance to words in everyday language. This is the tagging / folksonomic story. Here the "unit" or vehicle is the cultural practice.

Now, to get back to those relational databases. Joining within the database is easy, because the database is itself a unit of semantics. It's the local context from which all the items derive their meaning. On the other hand, importing and exporting from one database to another is traditionally hard, because that crosses the frontier of semantic definition.

Obviously advocates of RDF see this and think "if only we had a globally fixed semantics" it would be easy. But if getting things between one database and another is hard, defining good global standards is harder. And in practice is NOT happening much.
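Here's the "easy" case sketched with Python's built-in sqlite3 (schema and rows invented for illustration). The join works trivially precisely because both tables were defined in the same local context, so "author_id" means the same thing on both sides:

```python
import sqlite3

# One database = one unit of semantics: both tables share a local schema.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts   (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'John Smith');
    INSERT INTO posts   VALUES (1, 1, 'On the Semantic Web');
""")

# The join needs no negotiation: author_id was *defined* to mean authors.id.
rows = db.execute("""
    SELECT authors.name, posts.title
    FROM posts JOIN authors ON posts.author_id = authors.id
""").fetchall()
print(rows)  # [('John Smith', 'On the Semantic Web')]
```

The hard problem - the one the SW claims to solve - starts when the second table lives in someone else's database, built to someone else's schema.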

And it's not a response to this to say that the SW allows a plurality of rival ontologies which anyone can invent. Or that lots of people are inventing them. Either there is a single standard (as with the near ubiquity of Dublin Core) and inter-op is possible, or there isn't and inter-op isn't. But SW defenders often gloss over this, touting the two contradictory benefits of plurality and compatibility as twin virtues - as if you can have them at the same time.

In most cases, the benefits of defining the semantics globally rather than "vertically" within the application domain are marginal.

But it might, just, have been worthwhile if the cost wasn't so high due to the whole W3C implementation of the SW being so FUCKING botched!

But looking at this, something even more fundamental starts to increasingly bother me. Why did URIs have to look like URLs?

URLs describe both an online document and a transport protocol. URIs are nothing but unique labels for things which might or might not be documents and which might or might not be accessible over the internet.

I would, for example, be delighted to know whether friends at <xmlns:foaf="https://xmlns.com/foaf/0.1/"> are different from friends at <xmlns:foaf="http://xmlns.com/foaf/0.1/">. It's probably specified somewhere but I can't find the answer.
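Treated as the opaque identifiers that the RDF model itself works with, the two namespaces are simply different strings, so a naive tool would treat the two kinds of friends as unrelated. (Whether any particular toolkit normalises schemes is exactly the kind of detail that's hard to pin down.)

```python
# Two candidate FOAF namespace URIs, differing only in scheme:
a = "https://xmlns.com/foaf/0.1/"
b = "http://xmlns.com/foaf/0.1/"

# Compared character-by-character, as opaque identifiers, they differ:
print(a == b)  # False
```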

Basing URIs on URLs is, in retrospect, crazy. It's like deciding houses should be backwardly compatible with cars and have the same shape of door. Even though cars need to move and houses don't. Or, more charitably, it reminds one of the early days of cinema, which tried to apply the lighting techniques of theatre.

URLs and URIs are two different genres of reference making. And attempting to make them look similar has confused thousands of potential users. Marginally better would have been something like qualified names for classes in Java (eg. com.nooranch.myVocab.greeting). But even here there are some strange, unwanted, notions. Such as a clumsy attempt to classify the type of institution using the top-level domain such as ".com" or ".org". Why should it matter what the "type" of the organization is? Or the country that it comes from? Shouldn't we be suspicious that two vocabularies which define the same tags, but sit at .co.uk and .br, are treated as sui generis different things?

Of course, there's a reason why URIs are URLs. Sometimes there really are things you want to get at over the web. And you need to find a real URL to get them. But this is an example of that damned premature optimization in action.

This is just one of many examples why, in the final analysis, the W3C's implementation of the SW smells so bad. And why programmers with a sense of design aesthetics run a mile when they see it. RDF is pitched as some extremely high-level meta-language which can describe almost anything, yet in practice it's riddled with premature implementation commitments : to web-protocols, to XML standards etc. It's this mismatch between the claims for generality, and these awkward, intrusive implementation details that looks ugly and is so off-putting.

Hand rolled XML doesn't have this problem. Sure it's inflexible, local, situated. But it feels appropriate to the scale of the problem. Micro-formats too. And maybe there are notations for the SW which you can reason about at a level of abstraction appropriate to the problem you're trying to solve. Though given the URIs == URLs commitment they clearly don't escape entirely.

And this, I suggest, is unfixable. Even if you dump XML-RDF (which I suspect people within the community, who've invested (hundreds of?) thousands of hours of work in it, won't do) you can't dump the URI. That's the core commitment of W3C's SW. And that's an eternal, embarrassing reminder of the implementation leaking into something that was meant to be abstract. And it's what people cringe over when they see RDF and complain that "name-spaces are complicated".

Wow, this turned into a long rant ... quick summing up. In pure form, the SW is a hypothesis as to what's the "right" unit to fix the semantics of tokens in documents. And its value depends on that theory being right. Two rival notions of the correct unit of semantics seem to be thriving, and possibly showing that the SW hypothesis is wrong.

In practice, the SW looks ugly and off-putting because it failed to successfully distance itself from certain implementation details, as would befit the level of abstraction it aspires to. And this has left it with an awkward legacy of confusingness and complexity which is hard to fix.

Failure isn't inevitable. The SW may still be bulldozed through with enough hard work by those with sufficient ideological commitment and / or money. But the rivals are thriving because they are cheap, simple and immediately useful. And history tends to favour such things in the long-term.

From about 20 lines of Perl to about 150 lines of Python with several classes. Some of that is due to the slightly more complex way Python gets at CGI variables. And some is because I've made the program do more.

Now, instead of simply reading a text file and converting it to includable javascript, it reads one file containing links and tags, and a second, template, file which decides what to show. That means the same "data-base" of links can be presented differently on different sites.

The template allows three kinds of line :

Plain-text which is output unmodified. For example, the subheadings you see on the left such as "Synaesmedia" and "IRL" are straight HTML <H3>s.

Single links, selected by name. For example, the data-file contains a link called "Synaesmedia Home" which links to my home-page. That's included in the template with the line ?Synaesmedia Home

Groups of links selected by tag. For example, all the links shown under the IRL heading are links which are tagged "IRL", and the template contains a single line : * IRL which produces them in alphabetical order.
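The three template line types can be sketched in a few lines of Python. This is my reconstruction of the logic described above, not the actual script; the link names, URLs and data format are invented for illustration.

```python
# Hypothetical "data-base" of links: name -> (url, tags)
links = {
    "Synaesmedia Home": ("http://www.synaesmedia.net", ["Synaesmedia"]),
    "Active Bookmarks": ("http://example.org/bookmarks", ["IRL"]),
}

def render(template_lines):
    """Turn template lines into HTML: plain text passes through,
    '?Name' looks up a single link, '* tag' expands to a sorted group."""
    out = []
    for line in template_lines:
        if line.startswith("?"):                  # single link by name
            name = line[1:].strip()
            out.append('<a href="%s">%s</a>' % (links[name][0], name))
        elif line.startswith("*"):                # group by tag, alphabetical
            tag = line[1:].strip()
            for name in sorted(n for n, (u, ts) in links.items() if tag in ts):
                out.append('<a href="%s">%s</a>' % (links[name][0], name))
        else:                                     # plain HTML, unmodified
            out.append(line)
    return out

print(render(["<h3>IRL</h3>", "* IRL", "?Synaesmedia Home"]))
```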

There are a couple of odd things about the script. The sorting is obviously by first name, whereas my old, hand-sorted, file was ordered by surname. So you'll find the links in different places. I should probably change that.

The other is that the same link can appear more than once if it has more than one tag. That looks a bit funny occasionally, but I think it's right in principle. Multiple-tags mean that the link should be seen in different categories.

Obviously, something that was initially trivial has started to grow, and that raises more questions ... how far do I want to continue with this? Is it worth adding to it further, or are there already existing services which handle the next level of complexity such as online editing of blogrolls and OPML sharing?

I don't know. For example, I've been wondering how I can integrate it with Zbigniew's Active Bookmarks. Or with del.icio.us?

I recommend a listen to that Kode9 mix (linked below). Love the Ghost Town cover it starts with. And a lot of other good stuff.

On the other hand, if you're expecting Rolldeep, Riko and Lady Sovereign, then you're in the wrong place. Kode9 seems to be part of that whole "intelligent" grime / dub-step thing like the Rephlex "Grime" compilation last year.

I'm writing this at around 1.10 am. There's been a lot of shouting and running about by the gang of teenagers who usually hang about at night in the public space outside our apartment. As it's winter, they usually build a fire and sit around it, smoking dope. In the normal course of events they appear fairly peaceful. Some loud talking and occasional raucous laughing or screaming, or some crashing as they break up pieces of wood for their fire, but not much else.

But tonight they're very agitated, and Gisel, listening to their conversation at the window, says it sounds like they were just held up and robbed by someone with a gun.

There seems to be a lot of discussion and people shouting "Let's go there", so maybe this is some kind of gang thing and they know the people who did it. Or maybe it was just random.

Whatever. They don't seem inclined to disappear back home and hide (my natural reaction.) So maybe they're plotting some kind of retaliation? Or maybe they know it's some kind of personal thing. Or maybe they still feel safe hanging around out there.

Thursday, May 26, 2005

A question has started nagging me. If all these stories are plausible, shouldn't I be doing something to plan for the end of cheap oil? And if so, what?

After our alcohol car was stolen we had to buy a new one. And this one uses petrol. We didn't have much choice, as they'd cut down production of alcohol cars in the late 90s, which is the period we're buying from (second hand).

We can get our new car modified to use alcohol though, and I think we will, very soon. Brazil has a tradition of alcohol-based cars, and should have the right kind of environment to produce it. So there's some hope of the country as a whole making a fairly straight-forward transition to alcohol.

But what about Brasilia?

Brasilia is an absurdity of a city, totally designed around the car. And in the middle of nowhere. OK, it's not as bad as Las Vegas or other desert cities. But there's no railway, or large river. Transport is by (poorly maintained) road. Or air.

Flying will certainly be hit if oil gets too pricey.

Brasilia isn't in the desert, but it's not great agricultural land either. It's "cerrado", sort of savannah-like. Don't see many cows locally. Although a couple of hours (drive) out and there are farms. Near the rivers you can grow fruits, vegetables, bananas and nuts etc.

But the real question about Brasilia is what the hell it's doing out here. Brasilia exists as a challenge to geography rather than an effect of it. It's a city of around 2 million stuck somewhere where few people chose to live. In fact, it was deliberately put here to try to encourage population influx to this part of the country, despite the lack of incentives.

There used to be mining in the region. There is agriculture. But it is a centre of neither. It would be absurd to put industry this far from any transport networks. So there's not much of an economy except for the government.

I'm sure that's been an extraordinary expense over the 40 years since the city was started; simply in terms of transport and energy to get goods here. And for government employees and representatives to shuttle backwards and forwards between the city and all the "real" places where they would rather be.

So as transport costs go up, will it be worth Brazil continuously paying to keep this displaced bubble of bureaucracy on life-support? Or will it become more sensible to cut the losses and move the command centre back to Rio or Sao Paulo?

Hmmm. It's not beyond imagination that this could happen within the next 30 years, unless an alternative cheap energy is found. And that would turn the place into a ghost-town.

Sure a few banks have moved their headquarters and data-centers here. But they're just following the government. They'll be glad to return to Rio if the government reverses policy.

The first thing I noticed was that he was making the common claim about things which were reader-optimized : that the writing would later be optimized by better tools. Because I wanted to respond to that, it was the most natural thing in the world to link to the paragraph where he said it. Using its purple number.

But then a much bigger thought struck me.

Purple numbers totally make sense in the context of a blog.

Because when I'm making links to a blog entry, I'm typically already looking at it on the screen in front of me.

Grabbing the purple number attached to the paragraph is as natural as grabbing the url of a page, or a permalink to an entry. And indeed there's a continuity between the fine-grained permalink of blog posts and the finer-grained purple-numbered paragraph. The action and logic of linking is the same.

My initial discussion with Chris, was, in contrast, about purple-numbers in wiki, which is what E.E.C. and Blue Oxen have always appeared to me to stand for.

Now, the reason I felt a disconnect was that my typical link-making behaviour in my wiki is different from my link-making behaviour in blogs.

In ThoughtStorms, what usually happens is I get or find an idea to write about, I think up a suitable name for it, and create a page about it. As I'm writing this page, all kinds of associations with other ideas pop into my mind.

Some of those ideas I know have pages about them already, and I remember well what the name of that page is. Some of those ideas I know probably have pages but I don't know for sure what the name is. Some, I know don't have a page, but should. In the first case, I put in the name I know. In the second, I guess the most likely name. In the third, I think of the first sensible name that seems to get a handle on the idea.

In other words, I'm essentially doing something rather like tagging. I'm conjecturing how to reach other pages about an idea, based on my own ordinary language assumptions.

After I save the page, I sometimes find out that my guess was wrong. A page I thought existed under a particular name, didn't, in fact, exist.

At this point I might do one of two things.

I might realize the name was just wrong. And re-edit the page to fix it. Or I might decide that the name I conjectured was another worthy way to get at the idea; and so go to that non-existent page and turn it into a #REDIRECT to the real page. (In other words, I define it as a synonym.)

If I'm sure a page exists, but I simply can't remember the name exactly right (a common occurrence with this page) I use UseMod's search facility, either with a term that I'm fairly certain is a substring of the name, or some other words that seem likely to be on the page. If I see the page-name in the search-results list, I copy it, return to the new page I'm making, and fix the link.

In other words, the one thing I don't do when making a link to a page is actually go and look at that page first!

After I've made the link, I do go to the page, of course. I go to remind myself of what I actually said there; and to see what other interesting connections it has. And probably to add a reciprocal link back to the new page I've just created. But that's later.

Now, it seems to me, unlike the blog case, there is no continuity between what I do when I make links in a wiki and what I need to do to use purple numbers. In order to find the purple number I'd need to be looking at the target page first, and then making the link to it afterwards. Now considering it's fairly typical that when I create a new 3 - 4 paragraph page, I'm embedding 2 or 3 links in it, plus another 3 or 4 on the "See Also" list at the end, and it becomes obvious why I'm fairly unlikely to make links to purple numbers within these pages. I don't want to visit all those pages before writing this one.

OK. But let's not make this a negative post. Given all that, what are the positive suggestions that could be made? Remember, my issue isn't with finer-granularity addressing or purpleness. It's with the "number" part of things ie. that paragraphs have arbitrary, purely syntactic markers attached to them.

But we could imagine markers that weren't meaningless. For example, suppose I could make a link which simply included the first X letters of a paragraph to be matched. Then I could link to the previous paragraph in this post by making a link like this :

[[PostName#OK. But]]

Now, there is a continuity with how I already make links with known pages. For very well known pages, it's not beyond the realms of possibility that I remember that a paragraph criticising the main proposition of a page starts with "The counter argument against this is". Or that I start evolving within-page conventions like "summary : ", "results : ".
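Resolving such a prefix-anchor is straightforward to sketch. Here's a minimal version, assuming pages are stored as lists of paragraphs (the page name and contents are invented for illustration):

```python
# Hypothetical wiki storage: page name -> list of paragraphs.
pages = {
    "PurpleNumbers": [
        "Purple numbers totally make sense in the context of a blog.",
        "OK. But let's not make this a negative post.",
    ],
}

def resolve(link):
    """Resolve a [[PageName#prefix]] link to the index of the first
    paragraph on the target page starting with that prefix."""
    name, _, prefix = link.strip("[]").partition("#")
    for i, para in enumerate(pages[name]):
        if para.startswith(prefix):
            return i
    return None  # no paragraph matched the prefix

print(resolve("[[PurpleNumbers#OK. But]]"))  # 1
```

One design wrinkle: unlike a purple number, the anchor breaks if the paragraph's opening words are edited - the price of meaningful markers.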

(Aside : Actually I may try to implement this in SdiDesk if I can figure out how to make the HTML control jump to anchors within a page. :-)

Alternatively, we can imagine an IntelliSense-type tool. For example, when I write the name of an existing page, a little drop-down starts floating above my cursor to offer me a selection of paragraph beginnings on that remote page.

That sounds about right to me. Ward Cunningham also noted that wiki optimizes writing at the cost of reading.

If you think about it, that's true of a lot of the "bottom-up" / "worse is better" / "democratic" technologies which have become prominent. The web itself, with its ease of creating pages and links, but its tolerance of broken links and initial lack of any inbuilt "findability" strategy. Weblogs, which allow anyone to publish, without constraint, but leave the reader lost in a morass of un-fact-checked, "untrustworthy" opinion masquerading as fact. Wiki (and particularly wikipedia), which enable anyone to contribute information, but do little to prevent malicious misinformation.

Optimizing writing rather than reading is a good strategy for a platform (or technology), because if you get a lot of "writers" or builders on it who create some value, this in turn is an incentive for someone else to solve the problems of the readers : search engines solved some of the problems implicit in the web's architecture. Blogrolls and Technorati and tagging solve the problem of finding good and appropriate blog entries. WikiMinion solves the problem of wiki-spam (more or less).

On the other hand, if you optimize for reading, while making writing more difficult, you have a challenge. As Chris notes, there's a hope that eventually technology will rescue you and tools will appear. But in the meantime, the platform is built out more slowly. And there's a danger that it never gets built at all.

I'm not sure what it says about the purple numbers concept that it optimizes reading rather than writing. Or that it's a little enclave of reader optimization embedded in the larger writer-optimized world of wiki. That may be a strength or a weakness. Are purple numbers like tags or search engines : a supplementary technology to improve the readability of a medium that started out optimized for writing? Or are they more like the semantic web : something which would be wonderful for readers if applied, but may never be.

They don't want someone to fork Java and take leadership. After all, without Java, who are Sun now? Just A.N.Other commodity work-station manufacturer / consulting business. It's their identity that's at stake here.

Of course, I think the whole thing is a tragic waste of resources. Re-inventing Java to pass all the compatibility tests is a huge effort, which people would be better spending on some more worthwhile aim.

But if that's the itch they want to scratch ...

Update : Being fair, Gosling may have a point about the tests. A lot of them are there to protect Sun from legal attacks. And Sun could be hurt badly by patent violating code creeping into the standard.

Also, I like this :

But doesn't a language that's taken for granted risk losing its vitality and being overtaken by the next great language? "If there isn't a next great language, it says something really tragic about the evolution of computing technology and human civilization," Gosling said.

Does he see any on the horizon that might fit the bill? "I guess I haven't seen any out there yet that get me all excited," he said. "The world is filled with low-performance scripting languages, which I have a hard time getting excited about."

I don't use del.icio.us. My wiki ''is'' my bookmarks, and they are more or less tagged by the name of the page they're on.

So one interesting thing might be this : produce a list of the outlinks of a personal wiki, taking the pagenames as tags. It should be quite easy to compare and merge the collections of outlinks under pages like "SemanticWeb" and "FolkSonomy".

Could also attach to the bottom of these pages all the links tagged on del.icio.us with the same name.
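A minimal sketch of the first half of the idea, assuming the wiki pages are plain-text files in a single directory (one file per page, filename = pagename) and that external links appear as bare URLs in the text. None of this is SdiDesk's actual storage format; the layout and regex are assumptions for illustration:

```python
import os
import re

URL_RE = re.compile(r"https?://\S+")

def outlinks_as_tags(wiki_dir):
    """Map each external URL found in the wiki to the set of
    pagenames that link to it - treating pagenames as tags."""
    tagged = {}
    for filename in os.listdir(wiki_dir):
        pagename = os.path.splitext(filename)[0]  # e.g. "SemanticWeb"
        with open(os.path.join(wiki_dir, filename)) as f:
            text = f.read()
        for url in URL_RE.findall(text):
            tagged.setdefault(url, set()).add(pagename)
    return tagged
```

Inverting the result (pagename to URLs) would give the per-tag link lists to merge with del.icio.us's lists for the same tag names.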

Mr Galloway said he had met Saddam Hussein on two occasions - the same number of times as US Defence Secretary Donald Rumsfeld.

'The difference is Donald Rumsfeld met him to sell him guns and maps - the better to target those guns. I met him to try to bring about an end to sanctions, suffering and war,' he said.

Update : Having watched the video, Galloway comes up with some nice rhetoric, but relies on way too much US bashing and aggressive smokescreening of his own. Hard to tell if he's innocent, decided to turn a blind eye to wrong-doing by the guy who was co-running his charity, or is as guilty as sin.

If you go there it says "this site is for sale". It's worth nothing to these people. It'd be worth them selling to me for the cost of a normal domain name (eg. 10 quid). I'd probably buy it for that price too ... if they told me the price on the page. Eg. "This domain is for sale : 10 quid".

I am absolutely not going to fill in a form to ask how much my own domain name will cost me.

I suspect what's happened is that it's changed hands, and this is another harvester of old names, or perhaps NameGiant have bought the other people ...

For some strange reason, the people who've stolen my domain name (synaesmedia.com) seem to have turned it into a search page.

Although they're clearly looking for RSS feeds. The Synaesmedia Homepage link seems to be going to ... oh no, on second try it's really going to my page. Why are they being so nice to me? I'm suspicious.

A mixture of good links and intro for the uninitiated, with some slightly tawdry, breathless naiveties. (Bill Gates backing his own "dollars"? Why? With what? Buy the "access to the music" of the opening act of the concert with your mobile, in the headlining band's currency? WTF? You're at the gig, you already have "access to the music"!!!! )

" PayCircle is a non-profit organization that is working towards open APIs for payment systems based on XML, SOAP, Java,"

I've decided to ditch slick production standards in favour of an improvised, mumbling voice : umming, erring, heavy breathing, stumbling over bugs and shrieking in surprise, then buried under the weight of yet more bugs, giving up. The influence of Podcasting-style amateurism much in evidence.

Obviously, I hope people will be charmed and encouraged by the humanity of it all to want to join in. More likely this is where I lose any potential user-base I might have had. :-(

Nevertheless, you get a fair run through of quite a lot of the SdiDesk features. And some little blobs of my philosophy of UI design etc. You'll see what's possible and what needs to be done etc.

What the fuck?!!!

That has now concluded that the gangrene came from an existing but unspecified disease, that other injuries were caused by cleaning fluids used in the maid's work, and that the bruising was self-inflicted or caused by a falling wardrobe.

Wednesday, May 11, 2005

The perennial AI claim gets a makeover : "'It is a short step from robot soccer to other useful domains such as robots that clean the house or work in an office or on the battlefield,' said Professor Balch." :-)

This is a good medium to show off the program, as the user's interaction is an important part of what I'm investigating and that doesn't come across too well in static screen-shots.

The screencast just scratches the surface of what you can do with SdiDesk (nothing about diagramming yet), but it should give a flavour. More screencasts revealing more functionality to follow ...

Oh, and it's quite fun to do the Wineresque rambling bit :-) I haven't jumped on the Podcasting bandwagon, though I think it may be time to dust off the moribund BeatBlog and hitch it to the phenomenon ;-)

"The purple numbers are meant to be entirely meaningless identifiers (not labels or names) and therefore not present any information about the content they identify (no sense of time of creation or of sequence). This makes them portable and stable in the extreme."

Which seems to suggest that the "meaninglessness" is a good thing. That's the natural "computer science" way of thinking, of course : let's give things unique, arbitrary labels, and bind the semantics later.

The problem is that it's pretty expensive to make in-links to chunks of data with arbitrary names. I have to go look at the page and look up the number. Or I need a good UI to help me - one I have yet to see.

Contrast this with what's shockingly radical about wiki : when I want to make a link to another topic I just guess the ID for that topic. I know the page about purple numbers will be called PurpleNumbers. I don't have to look. And I don't need the software to do any work to help me find out. The knowledge is extant in my understanding of my language plus a couple of wiki-rules. The semantics of the label are bound by everyday practice.

As well as making link-authoring dirt-cheap, it also has other surprising effects. It allows page-making and link-making to be decoupled in time in both directions. Not only can I make the link without knowing if the page is there, I can make the link to a page which doesn't exist. And if someone creates it later, the link will work.

In any system where nodes have arbitrary labels, the chances that I can guess the arbitrary nodeID before the node exists, and that a later author of a node will conveniently use my label, are pretty remote.
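The guessability can be made concrete. Here's a sketch of the usual WikiWord convention (an assumption about the exact rule, which varies between wiki engines) : because two writers who independently apply it to the same everyday phrase land on the same page name, links and pages made at different times by different people still meet.

```python
def wiki_name(phrase):
    """Turn an everyday phrase into a guessable CamelCase page name.
    Deterministic, so anyone applying the same convention gets the
    same ID without looking anything up."""
    return "".join(word.capitalize() for word in phrase.split())
```

So `wiki_name("purple numbers")` gives `"PurpleNumbers"` for everyone, whether or not the page exists yet - which is exactly what lets link-making and page-making be decoupled in time.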

The people who are starting to understand this are the Folksonomy people. Tagging is another strategy where semantics are fixed by everyday language use.

So now I'm wondering. The individual tagging of paragraphs is a good thing. The "numbers" are the bad thing. What if the paragraphs could themselves have names (as with Wikipedia's sub-headings) based on standard, easily guessable, conventions. Or could even have several tags, which could be guessed?
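To show what guessable paragraph names might look like, here's a sketch in the spirit of Wikipedia's sub-heading anchors, where the fragment identifier is derived mechanically from the heading text (this is a simplification, not MediaWiki's real escaping rule) :

```python
def section_anchor(heading):
    """Derive a guessable fragment identifier from a section heading,
    roughly as Wikipedia does for sub-headings (simplified)."""
    return heading.strip().replace(" ", "_")
```

A reader who knows the convention can guess that the "Early life" section of a page is addressable as `#Early_life` without ever looking up an opaque number.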

Dent is hostile to this notion, complaining of labels posing (miserably) as identifiers. But I'd say "labels posing as identifiers" is as good a definition of one of the greatest wiki virtues as one can imagine. Not to mention the core of folksonomy.

Monday, May 09, 2005

Thatcher looked impregnable and popular, except to others within her own party.

There was a very loyal core of support, which kind of obscured how unpopular she'd become. Probably she didn't realize it herself.

She got unpopular outside the party because of something she forced on the British people, for ideological reasons and against their will : the poll tax.

She got unpopular inside the party because of a) a dictatorial style of premiership, and b) the fact that a large part of the MPs in her party profoundly disagreed with her on another issue (European integration).

She left a legacy of infighting that wrecked the Tory party. Partly because she didn't promote talent so much as the ideologically faithful.

Well now. Is Iraq Blair's poll tax or his Europe? Ie. is it the thing which finishes his external popularity or his internal? I'm betting on "internal". When he does something that really pisses off the electorate, that'll be the end of him.

How the hell can Labour avoid the subsequent melt-down?

Possibly Gordon Brown is the answer here. As someone who can play both the "continuity with Blair success" card, and the "I'm really different and NOT Blair, and anyway we've always hated each other" card. He has a better chance than John Major (who was seen as the ideological continuation, and then as a traitor).

I feel pretty guilty about this. I tried to get to the consulate to register, earlier this week. When I finally spoke to someone there, I found I needed to have registered back before the beginning of March (when I was in Buenos Aires).

Monday, May 02, 2005

This is a tough read for those of us who are socialists, who are influenced by Marx, or who identify with left-wing movements throughout recent history. However much exaggeration and gleeful spinning there is over at Catallarchy, the basic story is true : the 20th century has been a blood-bath in the name of left-wing causes.