Archive for December, 2010

I’ve been thinking about Ted Ts’o’s recentposts about whether it’s possible to do engineering or work on technology at startups. I’m not going to argue that you can’t work on technology at Google or another big company (although articles likethese do point out the difficulties). It would be easy to pick on Google’s failures and point out how many of their successes were actually acquired by buying a startup, but what I really wanted to talk about is how (IMHO) Ted is misunderstanding startups.

Ted’s central point seems to be:

But if your primary interest is to doing great engineering work, then you want go to company that has a proven business model.

Phrased so broadly, that’s bad advice. The reasoning that leads Ted to that bad advice starts with two contradictory misunderstandings of startups:

These days, the founder or founders will have a core idea, which they will hopefully patent, to prevent competitors from replicating their work, just as before. […] most of the technology developed in a typical startup will tend to be focused on supporting the core idea that was developed by the founder.

and

Because if you talk to any venture capitalist, a startup has one and only one reason to exist: to prove that it has a scalable, viable business model.

In my experience, startups typically start with the founders deciding they’ve found a problem they can solve better, cheaper or faster — but it’s rare for founders to have an idea that’s developed enough to patent the whole thing. Ted I think implies that at a startup, the founders have figured everything out and everyone else is just filling in the details of the idea. To me, that seems completely backwards: if you go to a big company with an established business model, then almost certainly you’ll be working within the outline of that model (Innovator’s Dilemma and all that); at a startup, you’ll have to help the founders figure out just what the hell your company is supposed to be doing. And that gets to the second quote: a startup is an exercise in adapting the technology you’re building until you find the right business model. In other words, nearly every startup will get it wrong to start with and have to change plans repeatedly; the hope is that the technology you build along the way is valuable enough that you can survive until you find the right way to make money.

To give one example from personal experience, when I was at Topspin working on InfiniBand products, early in the InfiniBand hype cycle (around 2001 or so), we thought that every OS would soon ship with InfiniBand drivers, so we focused on building switches and other networking gear, without worrying about the hosts that would be connected to the network. It turned out that the first open source project for a Linux InfiniBand stack fizzled, and Windows also gave up on InfiniBand, so we ended up having to build an InfiniBand host stack — fortunately the embedded software from our switches already had most of the ingredients, and so we were able to pull it off by reusing our embedded work. (That Topspin host stack ended up getting released as free software, and it became one of the ingredients that went into the current Linux InfiniBand stack — and I ended up as the InfiniBand maintainer for the Linux kernel, while working for a startup)

So as I said before, I think it’s bad advice to suggest to someone that “real” engineering can only be done at a large company. Certainly there are huge differences between working at a big company and a small company, and I do believe that there are “big company people” and “small company people.” If your goal is to spend nearly all your time making incremental improvements in ext4, sure, it’s probably easier to do that at a company that is a big enough ext4 user for that work to pay off; on the other hand if you’d rather work on something that you’re making up as you go along and where your decisions shape the whole future of the company, then a startup is probably a better place for you. Similarly, Ted’s assertion

For most startups, though, open source software is something that they will use, but not necessarily develop except in fairly small ways.

misses the real distinction. There are plenty of startups where open source is the main focus (Cloudera, Riptano and Strobe are just a few that spring to mind; and I don’t mean to dis all of the others that I’m not namechecking here), and there are gazillions of big technology companies that are actively hostile to open source. So really, if you want to get paid to work on open source, make sure you go to an open source company; the size of the company is a completely orthogonal issue.

To summarize my advice: if you think you might be a small company person, don’t let Ted scare you away from startups. Oh, and happy holidays!

I recently moved the VPS that hosts this blog from Slicehost to Linode. Both are very nice hosting providers that give you full control over a Xen virtual machine, including root access to the distribution of your choice and a slick web control panel, but right now at least, Linode gives you roughly twice the RAM as well as substantially more storage and bandwidth for the same price as Slicehost.

The main point of this post is really just to include my Linode referral link — if you’re going to sign up for Linode anyway, why not use my link and save me a few bucks on hosting?

I want to mention two things about IBoE. (I’m using the term InfiniBand-over-Ethernet, or IBoE for short, for what the IBTA calls RoCE for reasons already discussed)

First, we merged IBoE support on mlx4 devices into the upstream kernel in 2.6.37-rc1, so IBoE will be in upstrea kernel for the 2.6.37 release — one fewer reason to use OFED. (And by the way, we used the term IBoE in the kernel) The requisite libibverbs and libmlx4 patches are not merged yet, but I hope to get to that soon and release new versions of the userspace libraries with IBoE support.

Second, a while ago I promised to detail some of my specific critiques of the IBoE spec (more formally, “Annex A16: RDMA over Converged Ethernet (RoCE)” to the “InfiniBand Architecture Specification Volume 1 Release 1.2.1”; if you want to follow along at home, you can download a copy from the IBTA). So here are two places where I think it’s really obvious that the spec is a half-assed rush job, to the detriment of trying to create interoperable implementations. (Fortunately everyone will just copy what the Linux stack does if they don’t actually just reuse the code, but still it would have been nice if the people writing the standards had thought things through instead of letting us just make something up and hope it there are no corner cases that will bite us later)

The annex has this to say about address resolution in A16.5.1, “ADDRESS ASSIGNMENT AND RESOLUTION”:

The means for resolving a GID to a local port address (i.e. SMAC or DMAC) are outside the scope of this annex. It is assumed that standard Ethernet mechanisms, such as ARP or Neighbor Discovery are used to maintain an appropriate address cache for RoCE ports.

It’s easy to say that something is “outside the scope” but, uh, who else is going to specify how to turn an IB GID into an Ethernet address, if not the spec about how to run IB over Ethernet packets? And how could ARP conceivably be used, given that GIDs are 128-bit IPv6 addresses? If we’re supposed to use neighbor discovery, a little more guidance about how to coordinate the IPv6 stack and the IB stack might be helpful. In the current Linux code, we finesse all this by assuming that (unicast) GIDs are always local-scope IPv6 addresses with the Ethernet address encoded in them, so converting a GID to a MAC is trivial (cf rdma_get_ll_mac()).

This leads to the second glaring omission from the spec: nowhere are we told how to send multicast packets. The spec explicitly says that multicast should work in IBoE, but nowhere does it say how to map a multicast GID to the Ethernet address to use when sending to that MGID. In Linux we just used the standard mapping from multicast IPv6 addresses to multicast Ethernet addresses, but this is a completely arbitrary choice not supported by the spec at all.

You may hear people defending these omissions from the IBoE spec by saying that these things should be specified elsewhere or are out of scope for the IBTA. This is nonsense: who else is going to specify these things? In my opinion, what happened is simply that (for non-technical reasons) some members of the IBTA wanted to get a spec out very quickly, and this led to a process that was too short to produce a complete spec.