Archive for September, 2008

When I learned about General Relativity at university I sometimes used to wonder if there was a metric in which this object could exist:

And be what it appears to be, i.e. the straight lines are (lightlike) geodesics and the corners are (locally) right-angles.

Initially, I imagined that such a metric might be possible without any sort of topological defect. In particular, while the triangle would look like the picture above when viewed "face on" it would appear to curve as you moved around it and examined it from different angles. While lightlike geodesics always look straight when you look along them from a point on the geodesic, they can still look curved when viewed externally. The photon sphere around a black hole is an example of a set of such geodesics thought to occur in our universe.

Thinking about it some more, I suspect something strange is going to have to happen in the middle of the triangle - if you try to shrink the triangle to a point, what happens?

Imagine being in this universe and travelling all the way around the triangle. I think that upon doing so, one would find that one had actually only turned through 270 degrees instead of the full 360 (and would also have rotated about the axis of travel). I suspect that this means that such a metric would have to have a topological defect (a cosmic string) passing through the center of the triangle. This would cause a discontinuity when viewing the triangle, so one would not be able to see the entire illusion in all its glory as it is shown above.

My post yesterday kind of devolved a bit into ranting about how Microsoft's processes have the unintended consequence of causing the code to increase in complexity in the long term. This raises the question "how could Microsoft change to avoid this?"

Back in the 90s, Microsoft was generally considered to be an excellent place to work (much as Google is now). They still are to some extent but today that "excellent" means more "sensible" than "exciting". In the 90s, Microsoft was often described as having a culture that resembled a collection of many small startup companies rather than one large corporation. I think that was pretty much gone by the time I started working there in 2001 - by then it was very corporate (sometimes embarrassingly so). Meanwhile, startup companies are doing the kinds of development that Microsoft can only dream about, on tiny budgets.

Reading Paul Graham's essays about startups has given me some hints about how these startups can be so effective in the way that Microsoft can't, as well as why the ones that fail do.

There is a whole spectrum of corporate cultures, from two or three guys in an apartment at one end (let's call it the left end for the sake of argument) to Microsoft at the other (the right end). The left (when successful) is great at motivating their employees to do great things, rapidly prototyping and turning over problems quickly. The right is great at (eventually) producing very large, complicated, highly polished pieces of software, being consistent and avoiding risk.

To really be successful you need to work at both ends of the spectrum (and everywhere in between). The ecosystem of people founding startups, writing some great software and then getting bought by Microsoft (or Yahoo, Google, whoever) has done some pretty great things. But does Microsoft need all those startups to exist externally to itself in order to do the startup-y things? Perhaps not, if it can return to the culture it had in the 90s.

Startups need only a couple of things to exist - talented people and money. Both of these Microsoft has in spades. But they also need these people to be very motivated. Microsoft isn't a great motivator of people anymore - the difference in rewards between those who are putting in a reasonable minimum effort and those who are going all out is (in the grand scheme of things) small and it takes a long time of working very consistently hard to rise above the crowd. The average employee knows that it is very unlikely that they will ever be able to make a big difference to anything. Certain philosophical positions adopted by Microsoft mean that they are no longer highly regarded by the geeks that they need the most.

One thing Microsoft can do to revitalize itself would be to incubate startups inside itself. Allow employees to form small teams to solve some specific problem. If they succeed, they are rewarded with lots of money. Even if this approaches the amount that they might get from a successful startup this would probably be much cheaper for Microsoft than buying the equivalent startup, assuming a reasonably high success rate. These internal startups could even compete (especially if there are multiple approaches that seem promising).

These startups should be autonomous for the most part - they should be able to choose the software and programming languages that make the most sense for them. They should not have to coordinate with other teams who might be doing similar things, or worry about treading on the toes of others. They should be allowed to concentrate on writing great software. Unlike startups in the real world, they won't have to worry about getting funding (just getting a first version working in the agreed upon time). They can have office space on campus or work in somebody's apartment if they prefer. If Paul Graham is to be believed, these factors will cause the success rates of these internal startups to be much greater than those of startups in the real world.

While the resulting software might not meet Microsoft's standards for security, localization etc., these things are probably better done at the more corporate end of the spectrum, by those employees who prefer the stability and "sensible"ness that you get with that corporate culture. The left will quickly produce imaginative, highly competitive software and the right will polish it to corporate standards - the best of both worlds.

The downside for Microsoft of this approach would be the drain of talent from the "classic" product units, and possibly also from the company altogether (if successful "startup" employees were rewarded with what Neal Stephenson describes as something rhyming with "luck you money"). But if that drain is happening already, maybe there's nothing to lose.

There are several ways in which computer programming and physics are very similar. Possibly the most important is that both disciplines are, fundamentally, a search for simplicity.

In physics, we have a big pile of experimental results and we want to find the simplest theory that satisfies them all. Just listing all the experiments and their results gives you a theory of physics, but not a particularly useful one since it's not very simple and doesn't predict the results of future experiments (only the past ones). Rather than just listing the results, we would like to find a general theory, an equation, a straight line through the points of data which allows for interpolation and extrapolation. This is a much more difficult thing to do as it requires insight and imagination.

In computer programming, we generally have a big pile of specifications about what a program should do - maybe a list of possible interactions with the user (what they input and what they should expect to see as output). These might be encapsulated as testcases. To write a program that satisfies all the testcases, we could just go through them all one by one, write code to detect that particular testcase and hard-code the output for that particular input. That wouldn't be very useful though, as the program would fail as soon as the user tried to do something that wasn't exactly one of the scenarios that the designers had anticipated. Instead we want to write programs for the general case - programs that do the right thing no matter what the input is. When the "right thing" isn't precisely specified, we get to choose the output that makes the most sense according to our internal model of how the program should act.
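As a toy illustration of the difference (the absolute-value "specification" and all the function names here are made up for the example, not taken from any real product), both of the functions below pass the same three testcases, but only one of them does the right thing for an input that nobody anticipated:

```python
# Hypothetical specification: three testcases for an absolute-value routine.
TEST_CASES = {-3: 3, 0: 0, 7: 7}


def abs_special_cased(x):
    """Passes every testcase by detecting each one individually."""
    if x == -3:
        return 3
    if x == 0:
        return 0
    if x == 7:
        return 7
    # Anything the designers did not anticipate falls through to here.
    raise ValueError("scenario not anticipated by the designers")


def abs_general(x):
    """Handles the general case with one simple rule."""
    return -x if x < 0 else x


def passes_all(fn):
    """True if fn gives the expected output for every testcase."""
    return all(fn(given) == expected for given, expected in TEST_CASES.items())
```

Here `passes_all` is satisfied by both versions, yet `abs_special_cased(-12)` fails while `abs_general(-12)` gives 12. The testcases alone cannot tell the two apart, which is rather the point.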

I think a number of software companies in recent years (Microsoft in particular but others as well) have started to fall into the trap of writing software that concentrates too much on what the behavior of the software should be for particular (sometimes quite specific) scenarios, at the expense of doing the right thing in the most general case. Windows is chock full of "special case" code ("epicycles" if you will) to work around particular problems when the right thing to do would have been to fix the general problem, or sometimes even to explain that this is how we should expect it to work. Here is one example of this kind of band-aiding. I discovered another the other day - I was running some older Windows software in Vista and accessed the "Help" functionality, which was implemented as an old-style .hlp file. Vista told me that it no longer includes the .hlp viewer by default (I guess it was a piece of the OS that doesn't get a lot of use these days, and they had just dropped it from the default distribution to avoid having to bring it up to the latest coding standards). I was pointed to the download location (where I had to install an ActiveX control to verify that my copy of Windows was genuine before I was allowed to download the viewer).

Part of the problem is that (at Microsoft at least) it's very difficult to make big changes. Rewriting some core piece of functionality, even if the programming itself is easy, would involve months of planning, scheduling, designing, specification writing, testcase writing, test-plan reviewing, management sign off meetings, threat modelling, localization planning, documentation planning, API reviewing, performance testing, static analysis, political correctness checking, code reviewing and integrating. And of course everyone whose code might possibly be affected by the change needs to sign off on it and put in their two cents about the correct design. And it must comply with the coding standards du jour, which change every year or two (so delay too long and you'll probably have to start all over again.) When you come to understand all this, the long gap between XP and Vista becomes less surprising (in fact, it's quite a surprise to me that it only took a little over 5 years, considering how many pieces were completely rewritten). All this process exists for a reason (mostly the politician's fallacy) but is rigorously justified and widely accepted.

Because it's difficult to make big changes, people tend to make little changes instead ("hey, we can work around this by just doing x in case y - it's just one extra line of code") - these don't require much process (usually just a code review - most of the rest of the process for such small changes is automated). All these small changes add up to a great deal of extra code complexity which makes it very difficult for newcomers to understand the code, and even more difficult to rewrite it in the future because people will have come to depend on these edge cases.

Following on from this post, a natural generalization is to non-Euclidean spaces. This is important for simulating gravity, for example rendering a scientifically accurate trip through a wormhole (something I have long wanted to do but never got to work). The main difference is that the rays are curved in general, which makes the equations much more difficult (really they need to be numerically integrated, making it orders of magnitude slower than normal ray-tracing). One complication of this is that generally the rays will also curve between the eye point and the screen. But the rays between your screen and your eye in real life do not curve, so it would look wrong!

I think the way out of this is to make the virtual screen very small and close to the eye. This doesn't affect the rendering in flat space (since only the directions of the rays matter) and effectively eliminates the need to take into account curvature between the screen and the eye (essentially it makes the observer into a locally Euclidean reference frame).
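The flat-space claim is easy to check numerically. The sketch below is only an illustration: it assumes an eye at the origin looking along +z, and the function names and screen dimensions are invented for the example. Shrinking the screen and moving it proportionally closer leaves every ray direction unchanged.

```python
import math


def ray_direction(px, py, screen_distance, half_width, half_height):
    """Unit direction from the eye (at the origin, looking along +z)
    through a point on the virtual screen.

    px and py are normalised screen coordinates in the range [-1, 1].
    """
    x = px * half_width
    y = py * half_height
    z = screen_distance
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, z / length)


def shrink_screen(scale, screen_distance, half_width, half_height):
    """Scale the screen's size and its distance from the eye by the same factor."""
    return (screen_distance * scale, half_width * scale, half_height * scale)


def same_direction(a, b):
    """Compare two direction vectors component by component."""
    return all(math.isclose(p, q, rel_tol=1e-9, abs_tol=1e-12) for p, q in zip(a, b))
```

For example, `ray_direction(0.3, -0.8, 1.0, 0.5, 0.375)` and `ray_direction(0.3, -0.8, *shrink_screen(1e-6, 1.0, 0.5, 0.375))` agree to within rounding error, so in flat space the tiny screen renders the same image. In curved space the tiny screen simply leaves the rays no room to bend before they reach it.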

Another complication of simulated relativity is the inability to simulate time dilation. Well, you can simulate it perfectly well if you're the only observer in the simulated universe but this would be a big problem for anyone who wanted to make a relativistically-accurate multiplayer game - as soon as the players are moving fast enough with respect to each other to have different reference frames, they will disagree about their expected relative time dilations.

I am having a wonderful holiday here on the Olympic peninsula. I will write more about it another time but I wanted to put this post up now for reasons that will become clear.

One thing has marred this holiday a little, though - we somehow managed to lose my camera on Wednesday. I think I left it on the table at the restaurant in Neah Bay, but when we went back to look for it a few minutes later there was no sign of it and the waitresses hadn't seen it. It's possible it was stolen (either from there or from our car) or that we left it somewhere else (maybe at the trailhead for Cape Flattery). I'm not too bothered about the camera itself (it was more than 6.5 years old and quite obsolete, its battery-low sensor was becoming confused and I was going to replace it after this trip anyway). We bought a disposable camera to document the rest of our trip but the most annoying part is the 2 days' worth of photos (about 144 of them I think) that we've lost.

I'm posting this on the remote chance that someone finds the camera, looks through the photos and decides to try to locate the owner (i.e. me) by Googling keywords from the photos. The camera was a 3.3 megapixel Olympus C3000Z with a 128MB SmartMedia memory card, a lenscap attached by a cord and a battery compartment with bits of tin foil and electrical tape to replace corroded contacts. On the memory card were pictures of a 1000 year old giant Spruce tree, another big tree (a dead Cedar with other trees growing out of its remains), various pretty pieces of scenery taken from rural Washington roads, vampire merchandise and vampire-related signs in Forks, driftwood at Ruby Beach and the beach at La Push, and lots taken at Cape Flattery (a woodland trail and some impressive seascapes and islands). The three of us (an adorable toddler, a man with dark hair and glasses and a woman with long dark hair) are visible in some of the photos - the toddler is riding in a green backpack carrier in the Cape Flattery ones.

Most software projects with more than one programmer seem to enforce some kind of formatting style for the code - brace positions, indent width, use of tabs - that sort of thing.

At Microsoft, we didn't spend a lot of time talking about the style but we did have one rule - you should try to make your code consistent with the code around it. (If you were in the fortunate position of starting a brand new project from scratch, you got to choose the style yourself.) At least until the tyrannical StyleCop showed up. I left before using it for very long but I hated having to placate it (especially when I disagreed with its rules - for example, it wouldn't let me insert extra blank lines to group related functions, or arrange my functions in a more logical order than the standard one).

The GNU coding standards are similarly strict. I haven't disagreed with them very much (though I dislike the convention of having two spaces after a full stop).

I suppose having style guidelines (provided they are good ones) does make the source code look prettier and more consistent. However, I'm not convinced that it is worth the effort, especially since any programmer will have to be able to read code written in any style anyway (lest they start making assumptions and get fooled by a malevolent patch). In fact, there may be certain benefits to allowing every programmer to adopt their own personal favorite style. For one thing you'd be able to tell at a glance just who wrote any particular piece of code in your program (assuming that there were a small number of contributors and you're familiar with all their work, which I don't think are particularly bad assumptions in many cases).

Programmers' personal style also changes with time, so this can also be a good gauge for how old a particular piece of code is.

However one chooses to format their code, it is important that it is readable - not having indentation at all, or having inconsistent indentation in a given class or file, or having indentation that misleads you about which "if" an "else" is paired with should not be acceptable.

One might get the impression from reading the above that my own preferred coding style is not particularly consistent. Nothing could be further from the truth - I've spent many hours (probably far too many) reformatting code to my own personal taste (K&R style with the exception of putting opening braces of global functions and classes on column 1) to make it look prettier. I used to prefer indenting by 2 spaces but now I prefer 4 (a habit I picked up at Microsoft). As I like to avoid very deeply nested constructs (and like to be able to see how deeply nested I am easily), I may even increase that further in the future.

I have an unfortunate habit of getting addicted to computer games - this is one reason why I don't play them as much as I used to, and why when I do now play games, I usually pick one with a definite ending so that there's a natural place to stop.

But occasionally I do slip into excess playing of Tet4, Freecell or Spider Solitaire. I think what makes these games particularly addictive is that restarting them is a move which seems to bring you closer to winning. In all these games, the state of the game starts out as quite simple and the moves are obvious, but as you play things get more complexified and constrained until either you lose or (if possible) winning becomes inevitable. Restarting decomplexifies so when you lose your brain (which is still thinking in terms of the game) naturally reaches for the "restart" key.

Many years ago, I decided to write a Tetris game, just to see if I could. I succeeded, but the play area was very wide. So I added controls to make it adjustable. I figured the minimum sensible width was 4 (otherwise not all orientations of all pieces can be used). Playing Tetris on a board of width 4 (or Tet4 for short), I discovered, is very different to normal Tetris. Things change much more quickly, every piece has a profound effect on the state of the board and the normal Tetris strategies don't apply because it's impossible to play without leaving some gaps.

After playing for a while, I discovered that Tet4 was even more addictive than normal Tetris, and that it was much easier to get to the state of mental concentration known as "the zone" - where the entire rest of the world seems to melt away and there is nothing left except you and the falling blocks. When you notice this (and can do so without leaving the zone), it becomes almost an "out of body" experience - your conscious mind almost seems to observe from outside as your unconscious mind plays the game as if on some kind of automatic pilot.

Once I experienced this, I wanted to intensify it. I realized that a gradual increase in speed is intrinsic to the zone experience (otherwise it's just repetitive). But at a particular speed, the limiting factor becomes how fast your fingers can move to manoeuvre the piece into place - this can take as many as 4 or 5 keystrokes with the normal Tetris controls. I realized that Tet4 only had at most 10 possible combinations of orientation and position for each block. Most people have 10 fingers so let's just assign one combination to each of 10 keys and have a finger corresponding to each key. This is the control mechanism that Tet4 uses, and it worked exactly as I had hoped.
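The "at most 10" figure can be checked by brute force. The following is a minimal sketch, not the game's actual code: the piece shapes are the standard seven tetrominoes, and the helper names are made up for the illustration. It counts the distinct (orientation, column) combinations that fit on a board of width 4.

```python
BOARD_WIDTH = 4

# Each tetromino as a list of (x, y) cells in one orientation.
PIECES = {
    "I": [(0, 0), (1, 0), (2, 0), (3, 0)],
    "O": [(0, 0), (1, 0), (0, 1), (1, 1)],
    "T": [(0, 0), (1, 0), (2, 0), (1, 1)],
    "S": [(1, 0), (2, 0), (0, 1), (1, 1)],
    "Z": [(0, 0), (1, 0), (1, 1), (2, 1)],
    "L": [(0, 0), (1, 0), (2, 0), (0, 1)],
    "J": [(0, 0), (1, 0), (2, 0), (2, 1)],
}


def normalise(cells):
    """Shift a shape so that its bounding box starts at (0, 0)."""
    cells = list(cells)
    min_x = min(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    return frozenset((x - min_x, y - min_y) for x, y in cells)


def rotate(cells):
    """Rotate a shape by a quarter turn."""
    return normalise((y, -x) for x, y in cells)


def orientations(cells):
    """All distinct orientations of a shape (1, 2 or 4 of them)."""
    seen = set()
    shape = normalise(cells)
    for _ in range(4):
        seen.add(shape)
        shape = rotate(shape)
    return seen


def placements(cells, board_width=BOARD_WIDTH):
    """Every (orientation, left-hand column) pair that fits on the board."""
    result = []
    for shape in orientations(cells):
        width = max(x for x, _ in shape) + 1
        for column in range(board_width - width + 1):
            result.append((shape, column))
    return result


counts = {name: len(placements(cells)) for name, cells in PIECES.items()}
```

With these definitions `counts` comes out as 5 for I, S and Z, 3 for O, and 10 for T, L and J, so ten keys are enough for every piece and some pieces leave keys unused.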

This version is a rewrite in JavaScript (because I wanted to learn the language, and because by making it run in a web browser more people would be able to play it). I've tested it on IE7, Firefox 3 and Chrome and it seems to work fine but there may be bugs with other browsers. Let me know if you find any. I've made some fairly substantial changes with this rewrite, which make this a rather unusual and minimalist version of the game:

A display which shows the position and orientation for each key (on my original version I learned the combinations by trial and error). This display is really just to reduce the game's learning curve a bit - to truly get into the zone you'll need to commit all the combinations to muscle memory.

In my original version, the 10 keys just set the position and orientation of the tetroid - you had to press the spacebar (or wait for gravity) to drop the piece to the bottom and get the next piece. This meant a decrease in zone at the point where it gets too fast to use the space bar and you switch from 11-key mode to 10-key mode. In this version, the 10 keys drop the piece as well, so you always press 1 key per piece.

In my original version, the game ends rather suddenly when your tower gets so high that there isn't time to think before your active tetroid locks. In this version, there is a "curtain" which falls behind the playfield, and you always have until it gets all the way to the bottom before your tetroid drops. This means that the amount of time you have isn't dependent on the height of your tower. Also, the piece doesn't enter the playfield until it is dropped, so you can take care even with the endgame (you lose when you drop a piece and it would protrude from the top of the screen).

The single key control mechanism is rather unforgiving of misdrops, so I added an undo feature - at any time you can press Q to undo the last drop. This helps in learning the keys, but for getting the best scores it is best avoided as your score is also undone but the time speedup is not. This also doesn't allow you to "look into the future" (further than you can anyway with the next piece indicator) - a new tetroid is chosen at random for the new next piece whenever a piece is placed.

I added a persistent, global high score table which includes a facility for replaying the top 10 games.

I modified the colours and key to position correspondences to better take advantage of the symmetries. This puts me at a bit of a disadvantage (since I'm used to the original keys) but should be easier to learn.

One other thing I'd like to do in the future is make an ActionScript/Flash version so that it can be embedded into other pages. I started this but it turns out that without the official Adobe development kit, Flash is really hard to learn.

Syntax highlighting is an indispensable feature of a programmer's editor - I don't know how I managed without it and always miss it if for some reason it doesn't activate.

One aspect of syntax highlighting that editors don't seem to do very well, however, is multiple levels of meaning. For example, if I have some code that's commented out, the "top level" of meaning of this text is that it's a comment. The next level underneath that is that it is code. Syntax highlighting just treats it as a comment, however, and makes it a single colour. It would be neat if, within a comment (or other section that can be interpreted on multiple levels) the editor would try to parse the text and tweak the colour based on that. Commented out code would then be syntax highlighted, but would also be tinted green to show that it was within a comment. Similar techniques can be used for "#if 0"ed out code, code within macros and code within literal strings. This would make it much easier to work with this sort of "multi-levelled" code.
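As a rough sketch of what such a highlighter might compute (a toy tokenizer for a C-like language, with a keyword list and function names invented for the illustration, not how any real editor works), each token can carry a tuple of "layers" instead of a single class, outermost meaning first:

```python
import re

# A deliberately tiny keyword list, just for the illustration.
KEYWORDS = {"if", "else", "while", "return", "int"}

# Whitespace, identifiers/keywords, integers, or any other single character.
TOKEN = re.compile(r"\s+|[A-Za-z_]\w*|\d+|.")


def classify(token):
    """Give the innermost meaning of a single token."""
    if token in KEYWORDS:
        return "keyword"
    if token[0].isalpha() or token[0] == "_":
        return "identifier"
    if token.isdigit():
        return "number"
    return "punctuation"


def highlight_code(text, outer=()):
    """Return (token, layers) pairs, where layers lists every level of
    meaning for the token, outermost first."""
    return [
        (tok, outer + (classify(tok),))
        for tok in TOKEN.findall(text)
        if not tok.isspace()
    ]


def highlight_line(line):
    """Highlight one line, treating the text of a // comment as code too."""
    stripped = line.lstrip()
    if stripped.startswith("//"):
        marker = [("//", ("comment",))]
        return marker + highlight_code(stripped[2:], outer=("comment",))
    return highlight_code(line)
```

An editor could then pick the base colour from the last layer and apply a tint for each enclosing layer, so the `return` in `// return x + 1;` would get the keyword colour tinted green. This sketch treats every comment as code; a real implementation would need some way of deciding when the comment text actually parses as code and when it is just prose.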

Some people like to use GUIDs for everything - every interface, every class, every type library (I'm looking at you COM) - sometimes even every record in a database. This is ridiculous. GUIDs are overly verbose and difficult to work with, and far more unique than they need to be. We should be using hierarchical identifiers instead. There is exactly one circumstance I can think of in which a GUID would be necessary - you want to establish your own sub-namespace in some global registry and want to avoid colliding with anyone else who might happen to choose the same name as you. In the absence of some universal arbitrator who can dole out names, a GUID is an acceptable alternative. Once you have your GUID you can use it in any such situation - it only needs to be unique within that particular registry - it doesn't need to be universally unique.

No person or organization needs more than one GUID because a person or organization can make up their own namespace under that GUID.

If you follow the Java path and use domain names as the top level of your hierarchy, you don't even need a GUID. That's not a perfect solution, though, because sometimes domain names expire and fall into the wrong hands.
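The "one GUID per organization" idea could be sketched like this. It is only an illustration: the `Namespace` class, the "/" separator and the GUID value are all made up for the example. Everything below the root is an ordinary readable name, and only the root needs to be globally unique.

```python
import uuid


class Namespace:
    """A hierarchical namespace rooted at a single GUID."""

    def __init__(self, root, path=()):
        self.root = root          # the organization's one and only GUID
        self.path = tuple(path)   # human-readable names beneath it

    def child(self, name):
        """Return the sub-namespace (or identifier) called name."""
        if not name or "/" in name:
            raise ValueError("name must be non-empty and must not contain '/'")
        return Namespace(self.root, self.path + (name,))

    def __str__(self):
        return "/".join((str(self.root),) + self.path)


# A made-up GUID standing in for the one an organization would generate once.
ORG_GUID = uuid.UUID("12345678-1234-5678-1234-567812345678")

renderer_id = Namespace(ORG_GUID).child("graphics").child("IRenderer")
```

Here `str(renderer_id)` is `12345678-1234-5678-1234-567812345678/graphics/IRenderer`. Following the Java path instead would just mean replacing the GUID at the root with a reversed domain name.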