Saturday, December 31, 2005

As I previously commented, lists of the best and worst have a broad appeal. Today being New Year's Eve, there's a flurry of such lists (with the suffix "of 2005").

Now, here's a best-and-worst question for my readers: What was the best course you ever took, and why? (Could be high school, university, whatever.) How about the worst course?

I've taken a number of very good courses, but one that stands out for me was a 1st-year course on intellectual history. The subject matter was fascinating, the seminar format allowed for lots of interaction, and the instructor, Duff Crerar, was very good. For me it was such a striking contrast with a course I took in high school on Democracy (can't remember the exact title). Everyone was required to take the course, the textbook was mediocre, and the instructor was autocratic (an irony that was not lost on me).

There are a few other examples that come to mind, but I'd love to hear from other people.

Wednesday, December 21, 2005

This is important. Canada has not yet ratified the Optional Protocol to the United Nations Convention against Torture and other forms of Cruel, Inhuman or Degrading Treatment or Punishment. Click here to read a letter to Prime Minister Paul Martin, then decide if you'd like to sign on. It's been said before (originally by Edmund Burke, apparently), but bears repeating: "All that it takes for evil to triumph is for good people to do nothing."

Monday, December 19, 2005

I certainly don't agree with everything that the Cato Institute puts out, but I do read their Daily Commentary, which magically arrives on my antique Palm m105 every day (thanks to AvantGo), and often find it thought-provoking. Today's piece is about how the city council of Washington, D.C. is giving Major League Baseball a sweet deal on a new stadium. It's written by Dennis Coates, a professor of economics at University of Maryland, Baltimore County, who is co-author of a Cato Institute report on the subject. Apparently their research suggests that the economic benefits of this kind of corporate welfare are ... nonexistent! (The commentary includes a link to a 12-page report you can download in PDF, which in turn has references to two related journal publications.)

Another recent Commentary I found useful was a review of books on Economics. Even if they don't get everything right (IMHO), there are some bright people at the Cato Institute.

Sunday, December 18, 2005

In the last few years, interactive data visualization seems to have really taken off. Some Google counts:

"visualization": about 53,300,000

"interactive visualization": about 243,000

"interactive data visualization": about 28,100

"interactive statistical graphics": about 362

(By the way, "Google metrics" are an interesting topic in themselves, and certainly there are all kinds of methodological questions about their use. See g-metrics.com for some longitudinal data. But they probably give a rough indication. To get a sense of scale, I turned off Google's "safe search" feature and did a search for "sex". Result: about 224,000,000.)

Saturday, December 17, 2005

As the year-end nears, there are any number of "best of 2005" lists (about 2,510,000 according to Google) and more than a few "worst of 2005" (about 38,700). For example, voting on the best blogs of 2005 (in any number of categories) is taking place at weblogawards. Amazon.com has an obvious interest in promoting the best books of 2005. In the worst-of category, check out The Year in Media Errors and Corrections.

Why all this focus on the extremes? When it comes to the best, a straightforward explanation is that we want to make the most of the time we spend reading books, listening to music, etc. But I think that only scratches the surface. A lot of it may be wanting to be "with it", which may stem from insecurity. Instead of choosing what everyone else seems to like, why can't we make up our own minds? Even the term "the best" makes the assumption that some ordering exists for the things under consideration. Maybe that's true for contests that are deliberately set up to have a single quantitative outcome, like the time to run 100 metres. But even there it can get murky: who's the fastest sprinter in the world? With or without performance-enhancing drugs?

As for the worst, it seems to me there's more than a touch of schadenfreude involved. Or at least a feeling of relief: "There, but for the grace of God, go I". The practical use of these lists tends to be limited -- most of the time one doesn't need to be told that a movie was one of the worst of 2005 to recognize that it's better avoided.

Focusing on extremes can be quite important, for example in mechanical engineering where a structure has to be able to withstand extremes of temperature or what have you. Statistical problems of this sort are the focus of extreme value theory.

Focusing exclusively on extremes, however, can be misleading or counterproductive. In the case of healthcare waiting lists, we often hear about extremely long waits without the context necessary to make sense of them. For example, was the person on the waiting list for a long time because their problem was just a minor inconvenience? What was the overall distribution of waiting times like? One way to manage waiting lists is a minimax approach, where you minimize the maximum wait. But this seems to me to be a pretty dubious strategy.

I think we need a word for our love of extremes: how about extremaphilia? (There is a word extremophile but it has quite a different meaning, which is why I've spelled my word with an a instead of an o.)

Any thoughts on all of this? (Any best-of/worst-of lists to share? I confess I still enjoy them!)

Update 18Dec2005: Here's a very useful list of The Best Web 2.0 Software of 2005. (Web 2.0 is the buzzword-du-jour, meaning, as far as I can tell, 2nd-generation web-based software. Notably, it's pretty much *free* at the moment!)

Wednesday, December 14, 2005

Air travel has its frustrations: delayed flights, luggage that doesn't arrive until the next day, ... But what if the motorized wheelchair you depend on arrived at your destination in a dozen pieces? My friend Joe Dawson has experienced that and more. He complained to the Canadian Transportation Agency, who ruled that Air Canada had to change their ways (way to go, Joe!). Check out today's CBC news item about it, particularly the audio of an interview with Joe.

Tuesday, December 13, 2005

I'm not a big fan of the bar chart, although it has its uses (for more on this topic see the discussion on the blog Junkcharts and a previous post of mine). But if the ordinary bar chart is less than inspiring, what to make of this graphic from today's issue of an Ottawa newspaper called "Dose":

In discussing "self-promoting graphics", Edward Tufte writes:

"When a graphic is taken over by decorative forms or computer debris, when the data measures and structures become Design Elements, when the overall design purveys Graphical Style rather than quantitative information, then that graphic may be called a duck ..." (The Visual Display of Quantitative Information, 2nd ed., p.116)

Why a duck? In his book, Tufte showed a rather remarkable building shaped like a duck where "the whole structure is itself decoration". I think Tufte's words are particularly applicable to the figure above. A simple table gives the same information in a much more straightforward way:

Hmmm ... I wonder if the kind of graphical nonsense the press routinely produce has something to do with their credibility problem?

Monday, December 12, 2005

In the comments of my previous post, I noted that Elections Canada has voter turnout figures going back to 1867. There's lots more interesting data on their website, which is quite easy to navigate (kudos to those folks). So here's what the turnout was like in each election (I've excluded referenda) since 1867:

A few notes:

I used the adjusted percentage for the 1993 election ("adjusted to account for electors who had moved or died between the enumeration for the 1992 referendum and the election of 1993, for which a separate enumeration was not carried out except in Quebec, as the 1992 electoral lists were reused")

I also used the adjusted percentage for the 2000 election (similar reasons; see the footnotes of the Elections Canada table previously cited).

Unlike the scanned graph in my previous post, the graph above uses a linear time scale.

I leave my gentle readers to interpret the graph (it's bedtime for me). One final note: I made the graph using my own custom R program (hooray for R!). If you have any suggestions on how to improve the graph, I'd be interested in hearing them.

Update (17Dec2005): I've redrawn the figure with the elections numbered and the dates and turnouts listed below. To see the graph properly, click on it to see the higher resolution version. (You'll need to click again to convince your browser to show the full-size image.)

Sunday, December 11, 2005

So how well is Canada's electoral system working? One gauge is voter turnout:

Looks like a trend to me. (But note that the election dates are evenly spaced along the horizontal axis even though they are not evenly spaced in time.)

In a proportional representation system, there's no danger of wasting one's vote, which I think would encourage a greater turnout.

Thursday, December 08, 2005

A few people (i.e. non-math-geeks) have asked me what the title of this blog, Log Base 2, means. In mathematics, log base 2 is a logarithm function. It's actually not too hard to understand, and I have a nice example below that I hope will clarify it.

This kind of growth is often described as "exponential". Exponential growth means that the rate of growth is proportional to the current size, so it starts out slowly but soon grows very quickly. The figure notes a "doubling in size approx. every 5 months". If we let N0 be the number of blogs in March of 2003 (roughly 100,000), and N(m) be the number of blogs m months later, then we can write this as

N(m) = 2^(m/5) × N0

So, after 5 months, we have

N(5) = 2^1 × N0 = 2 × N0

and after 10 months, we have

N(10) = 2^2 × N0 = 4 × N0

and after 15 months, we have

N(15) = 2^3 × N0 = 8 × N0

and so on. Which brings us to log base 2, or log2 as it's usually written.

The log2 of a number means the power of 2 that gives that number. So

log2(2) = 1 because 2^1 = 2

and

log2(4) = 2 because 2^2 = 4

and

log2(8) = 3 because 2^3 = 8
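For anyone who'd like to check the arithmetic, here's a minimal Python sketch of the doubling formula and of log base 2. It assumes the rough figures quoted above (about 100,000 blogs to start with, doubling every 5 months); the function name `blogs_after` is just something I made up for illustration.

```python
import math

N0 = 100_000          # assumed starting count (roughly March 2003)
DOUBLING_MONTHS = 5   # assumed doubling time from the Technorati figure

def blogs_after(months, n0=N0, doubling_months=DOUBLING_MONTHS):
    """Exponential growth: N(m) = 2**(m / doubling_months) * n0."""
    return 2 ** (months / doubling_months) * n0

for m in (0, 5, 10, 15):
    n = blogs_after(m)
    # log2 of the growth factor recovers the number of doublings so far
    print(m, round(n), math.log2(n / N0))
```

The last column is just m/5: taking log base 2 undoes the "2 to the power of" step, which is the whole point of the function.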

Let's try redrawing the figure using log2 of the number of blogs (expressed in hundreds of thousands):

An increase of 1 unit corresponds to a doubling of the number of blogs. If you look at the start of 2004, the curve is at roughly 4, and by the end of 2004, it's at 6. This is an increase of 2 units, meaning a quadrupling of the number of blogs in 12 months (that is, a doubling every 6 months, not 5). At the start of 2003, the curve is steeper, meaning an even more rapid growth. So the Technorati figure wasn't quite right in saying there was "consistent doubling over the last 36 months."
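The doubling-time calculation can be written out as a tiny Python sketch. The values 4 and 6 are my rough readings from the graph, not exact data, and `doubling_time` is a hypothetical helper name.

```python
def doubling_time(log2_start, log2_end, months):
    """Months per doubling, given log2 values at the start and end of a period."""
    doublings = log2_end - log2_start  # each log2 unit is one doubling
    return months / doublings

# Rough readings for 2004: about 4 at the start, about 6 at the end
print(doubling_time(4, 6, 12))  # 6.0 months per doubling
```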

Although the log base 2 units are very convenient for working out the doubling time, they hide the true numbers. So it's convenient to re-label the graph:

This is called a logarithmic scale. It takes a little getting used to, but it's very good at showing relative changes for quantities that vary over several orders of magnitude.

There! Clear as mud? Log base 2 is particularly useful in computer science because computers use binary (base 2). And in addition to being a bit of a math geek, I'm also a bit of a computer geek, and, well, you get the point ... (and the name was a pun too, of course).

Sunday, December 04, 2005

A national election has been called here in Canada. To me democracy means that ordinary people are engaged in the decision-making process. To what extent does our system measure up to this ideal?

Our system is a riding-based representative democracy (with various additional peculiarities like an appointed Senate, which are not my focus here). Every so often (but at least every 5 years) we have a national election in which voters in each of 308 local districts ("ridings") select a representative to sit in parliament. Each representative belongs to a political party, and the party with the largest number of elected representatives tries to form a government (perhaps with the conditional support of other parties).

This system has several consequences. First, political engagement by ordinary people tends to be limited. In between elections, we don't generally have much say. A related phenomenon is that at election time, political parties (and candidates) make all kinds of promises about what they'll do if elected. Once in power, things don't always play out this way. That's one reason for the prevailing cynicism about politics and politicians in this country.

Second, because of the riding system, the overall national pattern of votes is somewhat weakly correlated with the number of seats each party wins in parliament. As I have pointed out previously, this makes prediction of election results quite difficult. To me, this kind of uncertainty is regrettable because it heightens the "horse-race" atmosphere around an election, distracting the focus from the real issues. (Unlike in the U.S., the timing of Canadian elections is also unpredictable, adding to the uncertainty.)

The riding system also means that the proportion of seats a party wins can be very different from the popular vote at the national level. For example, take a look at the results of the last election. The Bloc Québécois got 12.4% of the national vote. Their proportional share of the 308 seats in parliament would have been 38, but in fact they got 54 seats. The Green Party got 4.3% of the national vote, but 0 seats! If seats had been assigned in proportion to their share of the national vote, they would have received 13 seats. Clearly, viewpoints that are spread relatively thinly across the country are under-represented in parliament -- or even totally excluded.
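Here's a small Python sketch of the proportional-share arithmetic, using only the vote shares and seat counts quoted above. Simple rounding is my own simplification; real proportional systems use more careful apportionment rules, and `proportional_seats` is a made-up name.

```python
TOTAL_SEATS = 308

def proportional_seats(vote_share_percent, total_seats=TOTAL_SEATS):
    """Seats a party would get if seats simply tracked national vote share.

    Plain rounding is an illustrative assumption, not an actual
    apportionment method.
    """
    return round(vote_share_percent / 100 * total_seats)

# (national vote share in %, seats actually won), as quoted in the post
results = {
    "Bloc Québécois": (12.4, 54),
    "Green Party": (4.3, 0),
}

for party, (share, actual) in results.items():
    print(party, "proportional:", proportional_seats(share), "actual:", actual)
```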

A related phenomenon is so-called "strategic voting", in which you vote not for the party you'd like to win, but for the party most likely to defeat the party you don't want to win. Another side of this is "vote splitting". For example, two parties with similar views may be defeated by a third party, even though their combined share of the votes is larger. Of course strategic voting and vote splitting can happen in any multi-party system, but the winner-take-all riding system magnifies the problem. And it is a problem. A supporter of the New Democratic Party (NDP, left) may vote Liberal (middle) to keep a Conservative (right) from winning their riding. This is made more complex by the virtues of particular candidates in a riding. (Incidentally, I'm reminded of an election in Louisiana a few years back where the choice was between a former KKK member and a notoriously corrupt candidate. The slogan was "Better a lizard than a wizard!") In the riding where I live, the Liberals have won by a wide margin in each election since 1935, when the riding was formed. I think the only chance to beat them this time would be for everyone on the left to vote Conservative! This seems perverse.
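To make vote splitting concrete, here's a toy Python sketch of a single winner-take-all riding. The parties and percentages are entirely hypothetical (they are not real election data): A and B are imagined to hold similar views, and C is the party their supporters would both like to keep out.

```python
def plurality_winner(votes):
    """Return the party with the most votes (winner-take-all)."""
    return max(votes, key=votes.get)

# Hypothetical vote shares (%) in one riding
riding = {"Party A": 35, "Party B": 25, "Party C": 40}
print(plurality_winner(riding))  # Party C wins, though A + B = 60

# If B's supporters vote strategically for A, the outcome flips
strategic = {"Party A": 35 + 25, "Party B": 0, "Party C": 40}
print(plurality_winner(strategic))  # Party A
```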

Naturally, political-party strategists are well aware of these aspects of our system. They realize that elections are often decided by what happens in "key ridings", where a few votes can make the difference between winning and losing a seat. Don't expect any of the parties to invest much effort in my riding, where the Liberals have it in the bag. Where do the party leaders visit? Where is political polling the most intense? Where are pre-election funding announcements made? Just where you'd expect. I believe that all this happens at the expense of a genuine focus on the issues and to the detriment of democracy.

Thursday, December 01, 2005

My brother pointed me to this New Yorker review of a book titled “Expert Political Judgment: How Good Is It? How Can We Know?” by Philip Tetlock. One part I found particularly interesting concerns a common error:

" ... like most of us, experts violate a fundamental rule of probabilities by tending to find scenarios with more variables more likely. If a prediction needs two independent things to happen in order for it to be true, its probability is the product of the probability of each of the things it depends on. If there is a one-in-three chance of x and a one-in-four chance of y, the probability of both x and y occurring is one in twelve. But we often feel instinctively that if the two events “fit together” in some scenario the chance of both is greater, not less. The classic “Linda problem” is an analogous case. In this experiment, subjects are told, “Linda is thirty-one years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice and also participated in antinuclear demonstrations.” They are then asked to rank the probability of several possible descriptions of Linda today. Two of them are “bank teller” and “bank teller and active in the feminist movement.” People rank the second description higher than the first, even though, logically, its likelihood is smaller, because it requires two things to be true—that Linda is a bank teller and that Linda is an active feminist—rather than one."

Apparently, Tetlock's research (he's a psychologist who teaches at Berkeley) shows that experts tend not to be all that good at making predictions. It reminds me of a book I read years ago (Futurehype by Max Dublin) which made a similar point. Prediction is a tricky business.
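The product rule in the quoted passage is easy to check with a few lines of Python. This sketch assumes, as the passage does, that x and y are independent; the variable names are mine.

```python
from fractions import Fraction

p_x = Fraction(1, 3)  # one-in-three chance of x
p_y = Fraction(1, 4)  # one-in-four chance of y

# For independent events, the chance of both is the product
p_both = p_x * p_y
print(p_both)  # 1/12

# The conjunction can never be more likely than either event alone
print(p_both <= min(p_x, p_y))  # True
```

The same inequality holds for the Linda problem even without independence: "bank teller and active feminist" can't be more probable than "bank teller" alone.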