While working on SpanDeX.io I would often need to deploy code to our production servers. Each deploy caused anywhere from 6 seconds (at best) to 30 seconds (at worst) of downtime, during which users were disconnected from their documents and the server appeared dead to the outside world.

I'm sure I wasn't the first person to think of this, but I had a simple intuition: just don't take down the old server process until the new server process is ready to listen on an HTTP port. For both deploys and server restarts, this results in close to 0 milliseconds of downtime.
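
To make the idea concrete, here is a minimal sketch in Python. It is not the code SpanDeX.io actually ran; the names `wait_until_listening` and `swap_servers`, the use of a separate port for the new process, and the polling interval are all assumptions made for illustration. How traffic gets pointed at the new process (a reverse proxy, a shared socket, and so on) is left out, since that depends on your setup.

```python
import socket
import subprocess
import time


def wait_until_listening(host, port, timeout=30.0, interval=0.1):
    """Return True once something accepts TCP connections on host:port.

    Returns False if nothing is listening before `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the new server is ready.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (refused or timed out); try again shortly.
            time.sleep(interval)
    return False


def swap_servers(old_proc, new_cmd, host, new_port, timeout=30.0):
    """Start the new server and stop the old one only after the new one is up.

    `old_proc` is a subprocess.Popen for the running server; `new_cmd` is the
    (hypothetical) command line that starts the new server on `new_port`.
    """
    new_proc = subprocess.Popen(new_cmd)
    if not wait_until_listening(host, new_port, timeout):
        # The new server never came up: kill it and keep the old one serving.
        new_proc.terminate()
        raise RuntimeError("new server never started listening")
    # Assumption: traffic is redirected to new_port here, before shutdown.
    old_proc.terminate()
    old_proc.wait()
    return new_proc
```

The important property is the ordering: if the new process fails to start, the old one is never touched, so a bad deploy does not take the site down.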