So, about a year ago, I wanted to use XPath 2.0 on a project. Turns out no non-toy, non-alpha versions existed except in Java land (where Saxon is quite good). Has the situation changed at all? Anything on the horizon? Libxml2? Anybody?? -m

The nofollow setting on an outbound link should be a user-editable option, subject to the same community process that all other content on Wikipedia already is. (Site guidelines, dispute resolution, restricted editing on certain articles for unregistered users, etc.) By default, links would get nofollow, but over time, they could be ‘blessed’, perhaps after a certain amount of time or human review. Wasn’t this how nofollow was supposed to work in the first place?
The community process works. Why maneuver around it? -m
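
To make the proposal concrete, here is a minimal sketch of the "nofollow by default, blessed later" rule. Everything in it is an assumption for illustration: the `OutboundLink` record, the `reviewed` flag, and the 30-day `BLESS_AFTER` threshold are hypothetical and do not describe how any wiki software actually works.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from html import escape

# Hypothetical threshold: how long a link must survive in an article
# before it is trusted without explicit review.
BLESS_AFTER = timedelta(days=30)


@dataclass
class OutboundLink:
    url: str
    added: datetime          # when the link was first inserted
    reviewed: bool = False   # set by a human editor through the community process


def is_blessed(link: OutboundLink, now: datetime) -> bool:
    """A link is blessed after human review or after surviving long enough."""
    return link.reviewed or (now - link.added) >= BLESS_AFTER


def render(link: OutboundLink, now: datetime) -> str:
    """Emit the anchor tag, adding rel="nofollow" unless the link is blessed."""
    rel = "" if is_blessed(link, now) else ' rel="nofollow"'
    href = escape(link.url, quote=True)
    return f'<a href="{href}"{rel}>{href}</a>'
```

The point of the sketch is only that the default stays conservative while the override remains an ordinary, editable piece of state.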

(Press release) Starting today, Y! is the exclusive search partner for Opera Mini across more than 100 countries. The release also names “oneSearch”, going live later in Q1–definitely something to keep an eye on. -m

This Wednesday, I’m visiting Berkeley to speak with visiting professor Erik Wilde and his School of Information students. It’s an open-ended discussion, but will almost certainly center on XForms, the intentional web, and related information flow technologies. If you’re in Berkeley this Wednesday, drop me a line. -m

I’ve written before about the xslt2xforms project by Sébastien Cramatte. The project is not only still alive, but expanded into an entire utility kit including a PHP5 framework and forming “a complete xforms/xml toolbox based only on w3c standards”. Check it out on sourceforge. -m

Most of the censorship stories you hear on the news involve public libraries, but right now I’m writing this from a hospital, which has free wi-fi. Someone providing a service like this has latitude to do pretty much as they please, including censorship, but is it a good idea?

The system here evidently consists of a monitor observing every HTTP access, either forwarding it on or bouncing it to another server, one that seems to be down. That second server, referred to only by numeric IP, has yet to respond even once, so trying to load any page on a blocked site requires a lengthy timeout of about two minutes before landing on a browser error page with a URL something like this:

Let’s take a look at what kind of sites this inane system prevents hospital visitors from viewing directly:

flickr.com (“Personal Pages”) — because honestly, who in a maternity ward would ever need to upload pictures of something?

360.yahoo.com (“Dating&Personal”) — because who in a maternity ward would consider posting to a blog?

my.yahoo.com (“Portal Site”) — because who, away from home for a few days, might want to check up on news of the world around them?

thinkbabynames.com (“Personal Pages”) — thankfully, this dangerous and immoral content too has been shielded from the eyes of maternity ward visitors.

At some point, somebody must have pointed out a flaw in their system–that any named site can also be viewed through a numeric IP. Instead of actually thinking about the problem, they also banned all numeric IPs, even for sites that would otherwise work.
The upside to retarded filtering is that it’s easy to get around. Techniques that work here include using a search engine cached page, Coral Cache (.nyud.net:8080), SSH tunneling, VPN, and adding a new entry to hosts to access the same site under a different name. The access is so slow, however (hmm… in a way another form of censorship) that the strain of the additional measures often leads to timeouts and various other errors.
Fortunately, the filtermasters haven’t caught on to dubinko.info yet, thus allowing this post to appear. I hear that site is pretty subversive.
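
For the curious, the behavior described above can be approximated in a few lines. This is a guess at the logic, not the real product: the category labels are the ones quoted earlier, while the subdomain matching and the "Numeric IP" label are my assumptions.

```python
import ipaddress
from typing import Optional

# Category labels as reported by the filter's error pages (from the list above).
BLOCKED = {
    "flickr.com": "Personal Pages",
    "360.yahoo.com": "Dating&Personal",
    "my.yahoo.com": "Portal Site",
    "thinkbabynames.com": "Personal Pages",
}


def is_numeric_ip(host: str) -> bool:
    """True if the host part of the URL is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def block_reason(host: str) -> Optional[str]:
    """Return the category that blocks this host, or None if it passes."""
    host = host.lower().rstrip(".")
    if is_numeric_ip(host):
        # Assumed label: the blanket ban on numeric IPs.
        return "Numeric IP"
    for name, category in BLOCKED.items():
        # Assumption: subdomains of a listed name are blocked too.
        if host == name or host.endswith("." + name):
            return category
    return None
```

Because a filter like this only ever looks at the name in the request, any name it has not heard of sails through, which is why a new hosts entry or a cache domain works as a bypass.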

What’s the net?

It’s obvious their list of sites to filter is woefully generic, not at all adjusted to the environment in which people will actually be using the system. And still, I’d wager they’re paying someone fistfuls of cash to keep updating the generic list.

I can imagine there are a few sites on the internets that wouldn’t be appropriate in this environment. The majority of well-adjusted adults are perfectly capable of choosing not to visit those sites.

In cases where supervision is needed, it is effective on a one-on-one basis, often parent-to-child. Witness how many ways there are to easily bypass the filters: software, particularly bad software, isn’t clever enough to replace human judgement.

Yay for the mobile web, which allowed me to upload my pictures anyway.

I dug into my mail configuration a bit more and made a few changes. In the past, I had been lazy, so when I needed new email addresses like webmaster at xformsinstitute.com and contact at xformsinstitute.com, I just set up a catch-all. I knew catch-alls would collect lots of spam, but I didn’t know (until now) that the particular skew of that spam would be of the kind that tends to get around the filters.
So all the catch-alls are turned off. I set up explicit forwards for used email addresses, and I think I got them all, but if you get a bounce from any email address on any of my sites, let me know. After another 24 hours, I had:

38 spam incorrectly delivered to inbox (manually marked as spam)

120 messages automatically delivered to spam folder

1 of the above incorrectly (manually marked as not-spam)
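
Those three counts are enough to work out the usual filter metrics. The sketch below assumes the three numbers are the complete picture for the 24 hours; the function and parameter names are mine.

```python
from typing import Tuple


def spam_stats(missed_spam: int, flagged: int, false_positives: int) -> Tuple[float, float]:
    """Return (precision, recall) for a spam filter.

    missed_spam      -- spam that reached the inbox
    flagged          -- everything delivered to the spam folder
    false_positives  -- legitimate mail among the flagged messages
    """
    true_positives = flagged - false_positives    # spam correctly caught
    total_spam = true_positives + missed_spam     # all spam received
    precision = true_positives / flagged          # how trustworthy the spam folder is
    recall = true_positives / total_spam          # how much spam was caught
    return precision, recall


precision, recall = spam_stats(missed_spam=38, flagged=120, false_positives=1)
print(f"precision {precision:.1%}, recall {recall:.1%}")
# prints: precision 99.2%, recall 75.8%
```

In other words, the spam folder is almost entirely spam, but roughly one spam message in four still reaches the inbox.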

A significant improvement. I wonder if it’s worth resetting the training data from scratch at this point? -m

Hmm, this seems like a new feature, auto-installed after my last mail client restart. Unfortunately, there’s no “what’s this?” link for further information.

I find it interesting that the scam message wasn’t also labeled as “Junk”. Also, for some reason, the word ‘scam’ feels unexpectedly slangy in this setting. Great feature, I just wish it were a little more transparent. -m

From mnot: the return of the Link: headers, last seen in RFC 2068, and a new header, Link-Template, which has me salivating over the possibilities.

I wonder, will this lead to better libraries for dealing with HTTP headers? Or at least better developer understanding of the benefits of not just taking whatever Apache or Tomcat or whatever yields by default? -m
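
As a taste of what handling such a header involves, here is a rough sketch of parsing a Link: value of the form `<uri>; rel="..."`. It is deliberately simplified and rests on assumptions: it ignores backslash escapes inside quoted strings and any extended parameter syntax, and it does not attempt Link-Template at all. The example URIs are made up.

```python
import re
from typing import Dict, List, Tuple

# One ";name=value" parameter; the value is either quoted or a bare token.
_PARAM = r'\s*;\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))'
# One link-value: <uri> followed by zero or more parameters.
_LINK_RE = re.compile(r'<([^>]*)>((?:' + _PARAM + r')*)')


def parse_link_header(value: str) -> List[Tuple[str, Dict[str, str]]]:
    """Split a Link header value into (uri, params) pairs."""
    links = []
    for match in _LINK_RE.finditer(value):
        uri, raw_params = match.group(1), match.group(2)
        params = {}
        for p in re.finditer(_PARAM, raw_params):
            quoted, bare = p.group(2), p.group(3)
            params[p.group(1).lower()] = quoted if quoted is not None else bare
        links.append((uri, params))
    return links


header = ('<http://example.org/chapter2>; rel="previous", '
          '<http://example.org/chapter4>; rel=next; title="Chapter, four"')
for uri, params in parse_link_header(header):
    print(uri, params)
```

Note that the comma inside the quoted title does not split the second link, which is exactly the kind of detail a naive `split(",")` gets wrong.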

Maybe their influence is starting to rub off on me. Here’s what I want: Dear readers, can you provide comments on any tips to achieve any of these in Emacs?

I keep about 20 files open at a time, in multiple “sessions”. With one dropdown in jEdit, I can switch to a different 20 files in a different session, all open and ready for editing. When I start the editor, I don’t need to individually open files.

I use a plugin to show a bunch of tiny tabs at the bottom, so I can see what’s open at a glance.

Text selection with shift+arrow keys, and copy and paste with Ctrl+C and Ctrl+V. PgUp and PgDn working. (Just like my web browser)

Ctrl+W to close a tab or workspace. Ctrl+T to open a new tab. (Just like my web browser)

Ctrl+S to save (Just like my…you get the picture)

I’m not a heavy mouse user, but when I do use a mouse, I should at least be able to select text with it.

Line numbers showing on each line.

Nice fonts (no small feat on BSD).

Here’s the kicker: I want to attach in from a remote computer (on a different OS) and have the same experience, same files already open, and so on. Here, jEdit isn’t helping (unless I go VNC, but that’s a big hammer…)

I’ve talked about this before, though my environment now is a little different. (For one, I am now making basic use of GNU Screen for my terminal sessions.) Basically, I want an editor that works like all the other software I use all day, instead of making me remember an entirely different set of key bindings. Every extra bit of my limited wetware storage claimed by my tools detracts from the stuff I really need to be thinking about. Comments? -m

A while back, documenting my Windows XP SP 2 horror story, I mused about when Microsoft would have to throw out the code base and start fresh. Now, I see this, with additional commentary from Rick Jelliffe. Hmm. -m

Gray Knowlton, who identified himself as a Senior Product Manager for InfoPath 2007, said the next version of SharePoint will “include InfoPath Forms Services, which will render InfoPath forms to browsers and html-enabled mobile devices, and this will not require InfoPath on the form fillers’ desktop, nor will it require any advance download on the part of the person completing the form.”