What I'm Thinking About

Monthly Archives: September 2009

Ganeti is open source software from Google for managing a cluster of Linux hosts running virtual machines under the Xen or KVM hypervisors. I was attracted to it because it handles everything from creating new instances to managing network disk mirroring (via DRBD) to help with instance availability. It even supports migrating a running instance from one host to another with one short command line. Now that I have it working, it’s possible to have a new Ubuntu VM running in about 5 minutes, probably less if I don’t need disk mirroring.

That’s the good part. The bad part is that it is developed to run Debian instances on Debian hosts with the Xen hypervisor. OK, that’s not the bad part; the bad part is that even though it works with other Linux guests and hosts and with the KVM hypervisor, there are a number of hitches and gaps, even when using the Debian-derived Ubuntu Linux distribution. I still need to update an earlier post I made on trying to get it to work on Ubuntu to reflect what I finally did to get things working, but I wanted to share a big piece of the work I did.

As many online commentators have pointed out, a good college experience has always involved much more interaction with both the experts (professors) and peers than reading a newspaper. What most of them miss is that the very fact that they are discussing the article online is a reminder that there may already be reasonable substitutes for those interactions.

I like low-powered computers. It’s nice to use dedicated hardware for certain functions like file servers, routers, media players, etc. Dedicated hardware means you don’t have to disrupt other functions to do an upgrade. It also means that the hardware can be sized appropriately for the task, which helps avoid wasting energy on systems that are running around the clock.

For these reasons, I’m always on the lookout for hardware options.

VIA’s mini-ITX products have long provided a combination of small size and relatively low power consumption. I’ve had a second hand motherboard/CPU combo powering my home file server for the last 5 years or so. And their Pico-ITX boards shrink things down even further.

In the past year or so, Intel has gotten into the act with their low-powered Atom CPUs. For under $100 you can get a barebones system with a compact case, a low-wattage power supply, a motherboard and an Atom CPU from NewEgg. Unfortunately, these cheaper Atom systems pair their efficient CPUs with relatively inefficient supporting chipsets that can consume 4x the power of the CPU. Better options are starting to emerge. Boards with the NVIDIA ION chipset draw less power, and the included GPU can help with playing back 1080p video, but these carry a big price premium. In addition, some time this year, Intel is supposed to be shipping more integrated revisions that draw less power.

Right now, though, there are some interesting options based on embedded Atom chips. Fit2-PC sells tiny barebones systems (4″ x 4.5″ x 1.05″) that pair the 1.1GHz Atom Z530 with a chipset that includes hardware-assisted HD video decoding and uses a max of 8W. The basic version, with 1GB of memory, room for a laptop hard drive and gigabit Ethernet, is $245. Not exactly cheap, but these things are TINY. For another $10 you can add WiFi. All of them have HDMI for video and 6 USB ports.

Predictably, this progress has just turned up the volume on the nay-sayers. Some of their criticisms are reasonable, but once again the Internet reminds me how willing people are to speak with authority that is exceeded only by their ignorance (I leave it to the reader to decide what my arrogance to ignorance ratio is). I should know better than to wade in and engage with such people, but I did it anyway after seeing some of the comments on one critical post. Having made the effort, I thought I might as well repost it on my own blog. The comment that put me over the edge was from someone calling himself “Roger,” flinging a criticism at those (Dave Winer, presumably) who’d failed to learn the lessons of the past:

Roger (not Rogers), as our self-anointed historian, could you please recount the casualties of the first great RSS aggregator invasion that you think fell in vain?

I remember the fear. I don’t remember much in the way of casualties, though. People adjusted the default retry intervals their aggregators shipped with and implemented the client side of ETags and proper HTTP HEAD requests, and server-side software followed suit. The rough edges weren’t fixed overnight, but that was fine, since RSS didn’t take off overnight, and even with Automattic’s support, RSSCloud isn’t going to either.
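For anyone who missed that era, here is a rough sketch of the ETag handshake that took the sting out of polling. It is an in-memory simulation only, with no real network traffic: `serve_feed` and `PollingClient` are hypothetical names, and hashing the body with SHA-1 is just one assumed way a server might derive its ETag.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the feed bytes (an assumed scheme, servers vary)."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def serve_feed(body: bytes, request_headers: dict) -> tuple:
    """Hypothetical server handler: reply 304 if the client's copy is current."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""  # nothing to resend
    return 200, {"ETag": etag}, body


class PollingClient:
    """Hypothetical aggregator that remembers the last ETag it saw."""

    def __init__(self):
        self.etag = None
        self.body = b""
        self.bytes_downloaded = 0

    def poll(self, current_feed: bytes) -> int:
        # Send the remembered ETag, if any, so the server can skip the body.
        headers = {"If-None-Match": self.etag} if self.etag else {}
        status, response_headers, body = serve_feed(current_feed, headers)
        if status == 200:
            self.etag = response_headers["ETag"]
            self.body = body
            self.bytes_downloaded += len(body)
        return status
```

Polling an unchanged feed a second time costs a handful of header bytes rather than the whole document, which is most of why the feared meltdown never arrived.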

As for those worried about all the poor webservers just getting hammered every time an update notification goes out to the cloud, is that really an issue? I mean, event-driven webservers like nginx or lighttpd can retire something like 10K requests a second on relatively modest hardware and support thousands of concurrent connections at a time out of a few MB of memory. Yes, that throughput is for static files, but just how often is that RSS feed changing? Even if your RSS feed is served dynamically, you can put nginx in front of Apache as a reverse proxy or whatever and set up a rule to cache requests to whatever your RSS feed URL is for 1s.
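To show why a one-second cache is enough, here is a minimal Python stand-in for that caching rule. This is not nginx configuration; `MicroCache` and its `render` callback are hypothetical names, and the injectable `clock` exists only so the behavior can be checked without sleeping.

```python
import time


class MicroCache:
    """Serve one cached copy of the feed for `ttl` seconds (illustrative only)."""

    def __init__(self, render, ttl=1.0, clock=time.monotonic):
        self.render = render      # the expensive call that builds the feed
        self.ttl = ttl
        self.clock = clock
        self._body = None
        self._expires = 0.0
        self.backend_hits = 0     # how often the slow path actually ran

    def get(self):
        now = self.clock()
        if self._body is None or now >= self._expires:
            # Cache is empty or stale: regenerate once, then reuse.
            self._body = self.render()
            self._expires = now + self.ttl
            self.backend_hits += 1
        return self._body
```

However many aggregators show up inside the same second, the dynamic backend renders the feed once; everyone else gets the cached copy.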

As for the strain caused by delivering the notifications themselves, the same techniques that have made it possible to serve thousands of requests a second from a modest server are applicable for sending notifications, though. At this point, someone just has to write a cloud server that uses them. They can probably start with the software script-kiddies use to send spam 🙂

The critiques of the firewall issues, etc., are the only ones that make sense to me. It seems like that needs to be turned around to use one of the “comet” techniques.

I do hope Dave will reconsider the aggregator-cloud interface. Inspiration might be found in the ideas behind “comet,” which allows clients to open a connection to a server and then allows the server to push notifications back to the client. Most of the focus in Comet is on clients running as JavaScript in web browsers, but the techniques for pushing information back from the server seem applicable. The overall approach avoids a lot of the firewall and security issues that come with running a webserver on the client, as the current RSSCloud proposal requires, and simplifies the cloud server’s need to age out clients (it still has to deal with sockets that go dead).
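A toy version of that turned-around interface might look like the sketch below. It runs entirely in one process with asyncio and no real sockets; `Cloud`, `wait_for_update`, and `publish` are hypothetical names, not part of the RSSCloud proposal, and the feed URL is a placeholder.

```python
import asyncio


class Cloud:
    """Hypothetical hub: clients connect out to it and wait to be told of updates."""

    def __init__(self):
        self._waiters = set()

    async def wait_for_update(self, timeout):
        """Stand-in for one client's open connection; returns a feed URL or None."""
        future = asyncio.get_running_loop().create_future()
        self._waiters.add(future)
        try:
            return await asyncio.wait_for(future, timeout)
        except asyncio.TimeoutError:
            return None  # nothing happened; the client would simply reconnect
        finally:
            # The client is forgotten as soon as its connection ends.
            self._waiters.discard(future)

    def publish(self, feed_url):
        """Push a notification down every connection that is currently open."""
        for future in list(self._waiters):
            if not future.done():
                future.set_result(feed_url)


async def demo():
    cloud = Cloud()
    clients = [asyncio.create_task(cloud.wait_for_update(timeout=1.0))
               for _ in range(3)]
    await asyncio.sleep(0)  # give the clients a chance to register
    cloud.publish("http://example.com/feed.xml")  # placeholder URL
    return await asyncio.gather(*clients)
```

Since the client opens the connection, nothing has to listen behind its firewall, and the `finally` clause is the whole age-out story: a client that goes away just drops out of the set.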

If this makes sense to you, HELP, otherwise, nothing to see here, move along.

I’m trying to simplify the task of configuring and maintaining Linux servers at work, and I want to build on some existing configuration management system to do so. We are using the Ubuntu Linux distribution, and I was thinking of just building on the APT package management tools they’ve borrowed (among other things) from Debian, but I decided to look for something distro-agnostic.

I’ve spent a lot of time and frustration the last day trying to get the server working for Chef, a new system written in Ruby. I spent time scanning their bugtracker and asking for help in their IRC group to no avail. It still doesn’t work, and I have no more idea why than when I started.

I’m really doubting my decision:

Chef has only been packaged for the bleeding edge version of Ubuntu. Um yeah, great, I really want to use beta software on my SERVERS.

The installation documentation advises that I download and install Ruby Gems from a tarball because the version in the Ubuntu repositories isn’t to their liking. Great, I have to install extra shit by hand before I can use the software I want to use so I don’t have to install shit by hand. That’s efficient, right?

Chef relies on OpenID for authentication. Sweet! I can use my MySpace account to manage my servers! Well, I could, if only I could figure out the appropriate URL for the MySpace authentication endpoint (and I was batshit insane). As for how I integrate OpenID authentication with anything else I’m using, I’m sure it will be easy and obvious what to do, in a year or two.

Oh yeah, I forgot the most important thing: It doesn’t work. At least it doesn’t work for me. I’ve installed all the prerequisites, I’ve run their “installer,” and I can even get to the login page of “chef-server,” but when I actually try to log in, it falls down and goes BOOM. I get a generic error page warning me about a socket error. I tried to diagnose it myself to no avail; there wasn’t anything in the log files because…

Chef server truncates its log files willy nilly. It actually writes a fair amount of info to its log file, but you’d never know by looking at it after the fact, because after every request, it ends up as a zero-length file. Useful, huh? The trick is to ‘tail -F’ the file before restarting chef-server. This prints the output as it is written to the file, and reopens the file each time it gets truncated, which happens multiple times during the request.
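For the curious, the truncation handling that makes this trick work can be imitated in a few lines of Python. This is a rough sketch of the idea, not how `tail` is actually implemented; `Follower` and `read_new` are hypothetical names, and it only notices a truncation when the file ends up shorter than the last position read.

```python
import os


class Follower:
    """Rough imitation of following a log file that keeps getting truncated."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte position we have read up to

    def read_new(self) -> str:
        """Return whatever has been written since the previous call."""
        try:
            size = os.path.getsize(self.path)
        except FileNotFoundError:
            return ""  # the file may not exist yet; try again later
        if size < self.offset:
            # The file shrank, so it was truncated: start again from the top.
            self.offset = 0
        with open(self.path, "rb") as log:
            log.seek(self.offset)
            data = log.read()
        self.offset += len(data)
        return data.decode("utf-8", errors="replace")
```

Calling `read_new()` in a loop with a short sleep prints each burst of output, provided the poll happens before the next truncation wipes it.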

For what it is worth, I figured out what was wrong here. For some bizarre reason, the hosts file on the machine was only readable by root, which caused lookups for localhost to fail when chef-server was trying to connect to the CouchDB server.
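A check like the following would have saved me most of a day. It is a minimal sketch that only inspects the permission bits; `world_readable` is a hypothetical helper, and `/etc/hosts` is the usual location on Linux rather than anything specific to Chef.

```python
import os
import stat


def world_readable(path: str) -> bool:
    """True if users other than the owner and group may read the file."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


def describe(path: str = "/etc/hosts") -> str:
    """Summarize whether an unprivileged service could read the file."""
    if world_readable(path):
        return path + " is readable by everyone"
    return path + " is not world-readable; non-root lookups may fail"
```

On a healthy system the hosts file is normally mode 644, so `describe()` should report that it is readable by everyone.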

Now, to be fair, the Chef site makes it clear in a nice green sidebar that Chef is young and a work in progress. I knew that when I started with it. I didn’t expect it to be production ready, but I thought it was far enough along to start working with. Clearly, I’m reconsidering that.

I’m also reconsidering the assumption that sent me to Chef in the first place: that it was desirable, at this point, not to take a dependency on a specific Linux distribution by building on APT, the package distribution and management system at the heart of Debian and Ubuntu. The truth is APT is awesome. One of the reasons given for creating Chef was that Puppet, an earlier Ruby-based configuration management system, choked on dependency management. I haven’t seen that complaint about APT, not lately. In fact, it’s one of the things people love most about Debian and Ubuntu; they love it so much that they say things like “I want apt to bear my children,” or words to that effect.

So, my thought is that I create my own APT repository. I’ll create derivatives of the Ubuntu packages I need custom versions of, and I’ll create configuration packages derived from their configuration packages whenever possible. Machine- and role-specific packages can be used to manage rollouts, and/or I can use different repository tiers for different classes of servers, in much the same way that Debian and Ubuntu have different tiers for testing, stable, unstable, etc. I’m sure I’ll run into headaches on the way, but at least they will be headaches that other people have suffered, and I can learn from their experience.