Thursday, October 30, 2008

Also on the UCSD music blog I mentioned (which, on closer examination, turns out to be, ahem, the "music" category of the UCSD Arts Libraries blog) is a post on Musopen, a free classical music site that, among other things, will let you pledge toward getting someone to record the public domain work of your choice and place the recording in the public domain. (They call it "bidding", but I'm not sure that's quite the right term to use -- generally a bid is for the whole price and the highest wins.)

The performer sets the price (e.g., $60 for Für Elise or $3500 for Mozart's Requiem). You tell them how much you'd be willing to pay to see it recorded. When the total pledges match the asking price, the transaction goes through, the performer performs and everyone in the world can listen for free, whether they contributed or not. This is effectively one of the business models Stallman mentions in the GNU manifesto.
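For the concretely minded, here is a rough sketch of that pledge-until-funded mechanism in Python. It is only my reading of the description above, not Musopen's actual code; the Campaign class and its method names are invented for illustration.

```python
class Campaign:
    """Hypothetical model: pledges pile up until they cover the asking price."""

    def __init__(self, work, asking_price):
        self.work = work
        self.asking_price = asking_price
        self.pledges = {}  # pledger name -> total amount pledged

    @property
    def total(self):
        return sum(self.pledges.values())

    @property
    def funded(self):
        # The transaction goes through once pledges match the asking price.
        return self.total >= self.asking_price

    def pledge(self, who, amount):
        """Record a pledge and report whether the recording is now funded."""
        if amount <= 0:
            raise ValueError("a pledge must be a positive amount")
        self.pledges[who] = self.pledges.get(who, 0) + amount
        return self.funded


campaign = Campaign("Für Elise", 60)
campaign.pledge("alice", 25)  # not there yet
campaign.pledge("bob", 35)    # total reaches the asking price
```

Note that nothing in the model cares who listens afterwards. Free riders never appear in the bookkeeping at all.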

The economics of this look interesting. I doubt anyone is going to get rich off of it, but there would appear to be some value to people in making a recording happen and free riders be damned.

[Musopen is still around, but without the business model. The business model is still around, without Musopen, under the name of KickStarter, of course --D.H. May 2015]

The IMSLP (International Music Score Library Project) site was basically a carefully-moderated wiki to which people could upload music scores. The Canadian administrator of the Canadian-hosted site had been very careful only to put up scores in the Canadian public domain, and the Canadian Supreme Court had ruled that such sites are entitled to presume that they are being used in a lawful manner.

Austrian music publisher Universal Edition sued on the grounds that it had put up works (it didn't say which particular ones) still under European copyright. At stake, then, is the question of whose copyright law applies: the host or administrator's or (effectively) the longest copyright period on the books anywhere. The situation seems entirely analogous to that of international libel laws.

As a result of the suit, the site was taken down in October 2007. At this writing, the IMSLP is back up, having relaunched in June 2008. The letter accompanying the relaunch is well worth reading. Among other things, it discusses (and endorses) the idea of collaborating with commercial music publishers.

Though the preamble at the top of Google's cached version claims that someones and password "only appear in links pointing to this page" (one day I shall track down what they're driving at with that), both password and the uninflected someone appear in the page, along with how, to, guess, on and worldscape.

In other words, Google is perfectly correct in bringing up this page. Normally, it would have put millions of other articles ahead of such a hit in the list, but in this case there just don't seem to be that many candidates to choose from. Sometimes dumb can only be so smart. Fortunately for our searcher, some of the other hits have to do with the art of guessing passwords, a good thing to read up on next time you have to make one up.

For the curious and/or obsessive, here's the rundown:

guess shows up mostly in the articles on anonymity, with regard to guessing someone's identity, and otherwise where I say I'm guessing (four times in that apparently uncertain month)

password shows up in the post on trusted computing

someone shows up because it's a reasonably common word. It's probably a bit more common than chance in the anonymity articles

how and to are most likely dropped on the floor, but of course they appear in numerous places

And the punchline: worldscape refers to the Worldscape Laptop Orchestra, probably not what the searcher was trying to gain access to. But I do see a note there that I need to find a better link. Don't they have a home page yet?

Tuesday, October 28, 2008

Re-reading, I notice I said in the post on tinyurl that URLs are "specifically designed to be written down, printed, put on billboards, read over the phone and otherwise transmitted non-electronically."

That's not quite right. Reading something over the phone is transmitting it electronically. How about "non-digitally"? As far as I'm concerned, text is digital, whether you put it on the wire, store it on a disk or spell it out in print. OK, how about "not transmitted over the net"? Well, OK, but what about VOIP? A significant portion of phone traffic goes over the net.

How did the RFC I was referencing get around this? It says a URL "may be represented in a variety of ways: e.g., ink on paper, or a sequence of octets in a coded character set." In other words, it talks about representation and not transmission. That's a good call for the RFC, but it leaves me out in the cold.

OK, so what's different about sending a URL in an HTTP request and reading it over the phone to someone over what happens to be a VOIP connection? Clearly the VOIP-ness of the connection is accidental, not essential, whereas HTTP has to run over a TCP connection. It's incidental because the audio stream is opaque to the software. The VOIP infrastructure will carry conversations that contain URLs and ones that don't with equal ease. In the case of an HTTP request, the URL is very much visible and needed.

Is this all just quibbling over definitions? Sort of. But the same problem crops up in other, more practical situations. An unstructured block of text is semi-opaque to the web infrastructure. You can scrape it and parse it and extract nuggets of meaning, but that takes a lot of effort. About the only thing you can really do with it easily is index it using what are largely crude but effective methods. Images and audio are even more recalcitrant.
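To give a flavor of "crude but effective", here is a toy inverted index in Python. It is a sketch of the general technique, not of what any real search engine does; the stopword list and the sample documents are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed stopword list: very common words that get dropped on the floor.
STOPWORDS = {"how", "to", "on", "the", "a", "of", "in"}


def tokenize(text):
    """Lowercase the text, split it into words and drop the stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS]


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every non-stopword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    "anonymity": "Guessing at the identity of someone anonymous",
    "trusted": "A password stored in trusted hardware",
    "both": "How someone might guess a password",
}
index = build_index(docs)
hits = search(index, "how to guess password")
```

There is no stemming here, so "guessing" and "guess" count as different words, and nothing in the index knows what any of the words mean. That is about as opaque as text can be while still being searchable.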

As I understand it, the semantic web and its cousins like microformats are attempts to dig out otherwise obscured information and present it in machine-friendly form. From that point of view, if I were able to, say, press a "mark URL" button on my phone and cause the URL to be extracted via speech recognition, the phone call would cease to be opaque and would become at least that bit more webby. If it were stored somewhere accessible by a URL along with any extracted information, it would be a full-fledged resource.

Not sure there's anything deep or notable in all that, but there it is. I should also note that I previously touched on the metadata-scraping theme from a slightly different angle in this post.

Frankly, I'm not up to speed on this one, but I saw it float by and thought I'd mention it. Cable provider Cox Communications is looking to get into the cell phone business. Since, like most cable companies, they're already in the broadband internet business, and they also provide wireless internet, this seems like a bid to expand their presence into yet another means of delivering bits.

I have no idea whether this will work. If it does, you've got one-stop shopping for pretty much everything people do with bits these days and it's a brilliant coup. If it doesn't you've got a company with more fingers in pies than it has fingers and it's a colossal blunder. Judgment will be delivered in retrospect, as usual ...

[The Wikipedia article on Cox calls it the third largest cable provider and the seventh largest telecoms provider in the country. Looks like it worked out OK. Even if on a later sweep through Cox is defunct, I don't see how you could blame it on the decision to enter the cell phone business. --D.H. May 2015]

The Christian Science Monitor has just announced that, starting in 2009 it will no longer publish a print edition but will shift to "an online publication that is updated continuously each day." Since this would describe most papers' online editions (including the Monitor's I would think), the upshot is that they're dispensing with the print edition. I heard about this on American Public Media's Marketplace, which pointed out that the Monitor is something of an outlier, since it's a non-profit publication funded by a church.

As such, the Monitor is more concerned with maximizing exposure while minimizing cost. That sounds an awful lot like the usual newspaper business model, except for the small detail of advertising revenue. In other words, it's nothing at all like the usual newspaper business model.

Except, maybe it's not so different. It's not the Daily Monitor, it's the Christian Science Monitor. The whole idea is to get the Christian Science "brand" out there. In that sense, the Monitor's model is also advertising based, albeit with a single, fairly deep-pocketed and indulgent advertiser. So ... maybe it's not such an outlier.

Sunday, October 26, 2008

One of my favorite measures of how well something is doing on the web is whether I see it mentioned in the news (excluding, say, the San Jose Mercury News). Today I saw a couple of tinyurl links in the Sunday paper.

Tinyurl is one of those cool ideas that I personally don't find much occasion to use. In fact, I had forgotten they were around, but I'm glad they still are. The whole concept is just so, well, webby. You can explain in a short sentence what it does: Provides nice short URLs that work in place of the monsters you sometimes run across. You can also explain in a short sentence how it does it: Keep a database of short identifiers and long URLs and use HTTP's redirect facility to make one stand in for the other.
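Here is roughly what that second sentence looks like as code. This is a minimal sketch in Python, not tinyurl's actual implementation: the Shortener class, the counter-based base-36 identifiers and the short.example host are all assumptions of mine, and a real service would sit behind an HTTP server and a persistent database.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase  # 36 symbols


def encode(n):
    """Turn a non-negative counter into a short base-36 identifier."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


class Shortener:
    """Hypothetical in-memory stand-in for the database of ids and URLs."""

    def __init__(self, host="http://short.example"):
        self.host = host
        self.by_id = {}   # short identifier -> long URL
        self.by_url = {}  # long URL -> short identifier

    def shorten(self, long_url):
        """Return a short URL, reusing the identifier if we've seen this URL."""
        if long_url not in self.by_url:
            ident = encode(len(self.by_id))
            self.by_id[ident] = long_url
            self.by_url[long_url] = ident
        return f"{self.host}/{self.by_url[long_url]}"

    def resolve(self, path):
        """Return the (status, headers) a server might send for GET <path>."""
        long_url = self.by_id.get(path.lstrip("/"))
        if long_url is None:
            return 404, {}
        # The redirect does the real work: the browser follows Location.
        return 301, {"Location": long_url}


shortener = Shortener()
short = shortener.shorten("http://example.com/some/monster/of/a/url?with=params")
status, headers = shortener.resolve("/0")
```

The browser never needs to know the short URL is special. It asks for it, gets told to look elsewhere, and goes there.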

It's so fundamentally cool that I'm surprised that I don't use it. Most likely it's because I send and receive most email in URL-friendly HTML, and of course Blogger is right at home with any URL-from-the-deep. But as the RFC makes clear, URLs aren't just intended for computers to see. They're specifically designed to be written down, printed, put on billboards, read over the phone and otherwise transmitted non-electronically. That's not something I seem to do much, but it's a good place for tinyurl. My newspaper seems to agree.

Monday, October 20, 2008

It's fund-raising season at NPR again, and some stations are giving out "radio bookmarks" as premiums. What's a radio bookmark? It's basically a USB device that you can push a button on while you're listening to the radio. It will record enough information for you to go to the station's website when you next go online, find out what was on just then, and even play it back.

In other words, it records the time.

Sort of disappointing when you put it that way, but on the other hand there's no rule that says an idea has to be complex for it to be at least moderately useful. Back in the dotcom days, the same idea was supposed to make Sony a lot of money. Dan Bricklin has a review of it (then called "emarker") from Comdex 2000. The original emarker could store (drumroll, please) up to 10 events, which seems ridiculously small even for the times, but what do I know?

Bricklin makes the point in real time that I was going to make eight years after the fact, that the setup "puts the appropriate intelligence at the right places". In other words, the device itself is dumb. The intelligence comes from the underlying database via the web. I agree with Bricklin that this is the right way to do it, if only because you can easily correlate with other databases as well (video springs to mind). In a way it's a good example of "dumb is smarter", though in this case it's a particular piece, not the system as a whole, that's deliberately dumb.
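A small sketch of that division of labor, in Python. The device's entire contribution is a list of timestamps, and the lookup happens against the station's playlist. The playlist format, the segment names and the what_was_on function are hypothetical; I have no idea what the real service looks like on the inside.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical station playlist: (start time, title), sorted by start time.
PLAYLIST = [
    (datetime(2008, 10, 20, 8, 0), "Segment A"),
    (datetime(2008, 10, 20, 8, 20), "Segment B"),
    (datetime(2008, 10, 20, 9, 0), "Segment C"),
]


def what_was_on(playlist, moment):
    """Return the title of the last segment that started at or before moment."""
    starts = [start for start, _ in playlist]
    i = bisect_right(starts, moment) - 1
    return playlist[i][1] if i >= 0 else None


# Everything the dumb device has to hand over: when the button was pushed.
bookmarks = [
    datetime(2008, 10, 20, 8, 5),
    datetime(2008, 10, 20, 8, 45),
]
titles = [what_was_on(PLAYLIST, moment) for moment in bookmarks]
```

Swapping in a different database (a TV schedule, say) changes nothing on the device side, which is the point.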

It's a separate (and interesting) question whether the emarker/radio bookmark "uses the Internet in ways that show the future". Eight years on, I'm thinking not so much, but then the question becomes "why not?". The general idea of connecting a fairly dumb device up to the web to get a smart system overall seems good. Maybe there are other examples hiding in plain sight?

Thursday, October 16, 2008

I just went through and tagged a bunch of posts intellectual property, including all the ones tagged copyrights or DRM, and several tagged copy protection (there is some overlap, naturally). We now have a new number one with a bullet on the list of tags.

I'm not crazy about the term "intellectual property". It hash-collides with "internet protocol", its exact meaning is not particularly obvious, and the usual connotations of "intellectual" feel a bit out of place here. But it's certainly useful to have a general term for, well, what shall we say ... "data having economic value"? I'm not as particular which exact term we use, so long as we more or less agree on what it means and when to use it.

Actually ... one of the great things about the web is it's so easy to look things up. Dictionary.com gives the Random House Dictionary's definition, which includes the key phrase "property that results from original creative thought," and -- this was a pretty big surprise to me -- traces it back to the 1840s. So, while we might object to the term as "marketing speak" or to the whole concept on the basis that "information wants to be free", the term has a legitimate pedigree and is clearly here to stay.

Amidst everything else that's been going on in Washington DC lately, congress recently passed the "PRO-IP" bill (here's the Senate version). One of the many things it does is establish a cabinet-level "Intellectual Property Enforcement Coordinator". Apparently the Justice Department was not best pleased with this, but (as always keeping in mind that I'm not a lawyer), it doesn't seem completely out of line that such a post would exist. Regulation of patents, at least, is on the short list of powers granted to congress in Article I, Section 8 of the constitution, so this is not like establishing a cabinet-level post for, say, design of postage stamps.

On the other hand, we seem to have done fine for a couple of centuries without such a post, and as always the devil is in the details. This being legislation, there are a lot of details. Since I'm still not a lawyer, and legislation is written as deltas against the mammoth U.S. Code, making it essentially impossible to just read through a bill and know what it means, and as far as I can tell the emphasis of the thing is on piracy and not on rationalizing IP law in general, I'm not prepared to say whether we might need this particular realization of the idea.

That said, it's at least worth noting that such a thing has happened. New cabinet-level posts don't get created every day.

Monday, October 13, 2008

The latest Economist has not one but two articles -- from its print edition, mind -- on the "Paperless Office". The first is more about the fact of the increasingly paperless office, the second delves into implications.

Back in the 1980s, when PCs started to take off and people used terms like "Desktop Publishing", it was obvious that before long there would be no need for paper in an office. Why xerox a memo when you can send an email? Why keep a paper ledger when an Accounts Receivable application will do all that for you? Why work up a spreadsheet when you could use, well, a spreadsheet?

By the year 2001, which saw the publication of a book called The Myth of the Paperless Office, it was clear that the hype was not going to pan out. People were using more paper than ever. A lot of places wanted a physical, paper backup of important files. Some people -- myself included -- just wanted to see certain things on paper, things like email (for executives) or code (for geeks like myself). In my case, I had a hard time navigating a 1000-line code listing on a 24-line screen. More on that in a bit.

Except the year 2001 was also when paper consumption in the office peaked. It turns out that people really did start to use less paper. Email really has replaced memos, and so forth. When people do print things out, they seem more interested in nice, high-contrast color prints (e.g., of photographs or brochures) than just printing draft text. So what happened? The prevailing theory seems to be that kids these days are just more comfortable with a paperless environment. I'm sure they are, but I don't think that explains it all.

Like I said, I used to have a hard time navigating code without printing it out and scribbling on it. What changed? For one thing, I got more used to editing on a screen. For another thing, screens got bigger and sharper. There is a world of difference between paging through 24 monochrome 80-character lines at a time and looking at more than twice as many nice wide lines. For another, interfaces got nicer. It's much more convenient to search back and forth, to navigate from one place to another, to flip back and forth between several different documents, and so forth, than it was in the 80s.

Navigability is a big deal. In coding, it means I can easily get, say, from where I use something to where it's defined -- one of the big reasons I'd want to print out a bunch of code and lay it out on the floor. For non-geeks, it's the ability to click on a link on a web page and Just Go There. You can't do that on paper. Until the web came along, you (generally) couldn't do that on your screen either. Now it's routine.

In other words, it's not just (or perhaps not even primarily) a societal change. The technology, and particularly the software, made significant advances between the 1980s and the turn of the century. One change -- the widespread adoption of hyperlinks -- was seismic, but many changes -- higher screen resolutions, nicer UI widgets, nicer text formatting -- were incremental.

From that point of view, I'd agree with the second article that technological change and external shocks can bring technologies back into favor, but I'm not completely sold on the societal angle. In particular, I disagree that "The paperless office shows how a sociological shift can make the difference: although the technology did not change very much, its users did." The technology did change quite a bit, just not all at once or all that visibly. The gradual improvements were too gradual to draw notice, the web too pervasive. Who notices the air?

[Nowadays I often go for days, weeks even, without printing anything out, and whole days from time to time without writing anything down on a physical medium. Along with better displays and better tools, I've gotten more comfortable working without the tactile cue of, say, circling something and drawing an arrow from it to something else. In other words, it's partly technology, and partly gradual shifts in behavior from using the technology --D.H. May 2015]

Sunday, October 5, 2008

In a commentary on recent happenings in the US economy, linguist/evolutionary psychologist Steven Pinker asserts that "The past decade has shown us that unplanned, bottom-up, productive activity can lead to huge advances in social well-being, such as Linux, Wikipedia, YouTube, and the rest of Web 2.0." I generally like Pinker's work and tend to agree with many of his points, but ...

Personally, I wouldn't call any of these "huge" advances in social well-being, though I'll certainly agree they've moved the ball forward in their own ways. I'm also not sure that "unplanned" is quite the right word and I suspect Linus might agree on that (at least regarding Linux). But that's a separate quibble.

What I find interesting here is that he appears to include Linux as part of Web 2.0 (The sentence is slightly ambiguous -- he might conceivably have meant "YouTube, and the rest of Web 2.0" as distinct from "Linux, Wikipedia", but that seems unlikely given the punctuation and that Wikipedia is very much a web thing).

Leaving aside the finer points of parsing the sentences of linguists, what's interesting about including Linux, and not specifically mentioning "social networking" sites and not even coming close to mentioning technologies like AJAX, is the emphasis on "unplanned, bottom-up productive activity". So to Pinker, Web 2.0 appears to be more about a democratic "open source" ethos than about any particular product, and even less about any particular technology.

While it differs fairly sharply from the more familiar "social networking" or "AJAX" or "social networking plus AJAX" versions of Web 2.0 one runs across, the position is certainly worthy of consideration. I have a particular sympathy for it since I consider technology more a reflection of human nature than a shaper of it.

However, I don't think that "Linux, Wikipedia, YouTube, and the rest" is an appropriate definition for Web 2.0. First, I try to defer to current usage, and most people in the biz don't seem to use "Web 2.0" that way. Second, the very real and useful thing Pinker seems to be trying to capture here considerably predates "Web 2.0".

In fact, it predates Web 1.0. According to Wikipedia (see, I told you it was socially useful), the Web debuted in August 1991. It can, of course, trace its technical roots back earlier, but as a living, wide-scale collaboration it didn't really get going until 1992 or so. Meanwhile, the Linux kernel also debuted in August 1991. I cheerfully admit I had no idea that these two major developments were sprung (quietly) on the world in the exact same month.

If we're litigating here over "which came first, the kernel or the web?" we'd have to declare a virtual tie. But that's not my contention. My contention is that the notion of internet-driven "unplanned, bottom-up productive activity" was well-established long before the web was the web. The announcement of Linux just happens to provide a useful case in point. Consider:

Like everything else, Linux did not appear from nowhere. It draws directly on BSD, an open-source UNIX kernel with its origins in the late 1970s. One can quibble over whether BSD, with its institutional sponsorship, would qualify as "unplanned", but the 'B' does stand for "Berkeley".

Linux is also intimately related to, though decidedly not identical to, Richard Stallman's GNU, first announced in September 1983.

And here's the punchline: Linux, GNU and the Web were themselves first announced on Usenet. Usenet seems at least as good an example of what Pinker is driving at as any of the examples he does cite, including being "productive" in roughly the same sense that YouTube is "productive". In retrospect, Usenet could be considered Web 0.1. Usenet dates to 1979, roughly the same vintage as BSD.

Also of that vintage (depending on just how you count): The internet itself.

In short, Web 2.0 has been around for a few years. Linux is older than Web 2.0 and about as old as Web 1.0. The kind of bottom-up social collaboration Pinker seems to be referring to has occurred over the internet for roughly as long as there's been an internet. In keeping with my theme of human nature shaping technology, I'll leave it to the reader to ponder whether such activity might have been occurring off the internet for considerably longer.