An Introduction to openQRM

Imagine managing virtual machines and physical machines from the same console and creating pools of machines booted from identical images, one taking over from the other when needed. Imagine booting virtual nodes from the same remote iSCSI disk as physical nodes. Imagine having those tools integrated with Nagios and Webmin.

Remember the nightmare you ran into when having to build and deploy new kernels, or redeploy an image on different hardware? Stop worrying. Stop imagining. openQRM can do all of this.

openQRM, which just reached version 3.1, is an open source cluster resource management platform for physical and virtual data centers. In a previous life it was a proprietary product; now it is open source and successfully integrates several leading open source projects into one console. With its pluggable architecture, there is more to come. I've called it "cluster resource management," but it's really a platform for managing your infrastructure.

Whether you are deploying Xen, Qemu, VMware, or even just physical machines, openQRM can help you manage your environment.

This article explains the key concepts of openQRM and demonstrates how it can provide an easy-to-manage test platform environment.

Concepts

openQRM consists mainly of four components:

A storage server, such as an iSCSI or NFS server, which can export volumes to your clients.

A filesystem image, either captured by openQRM from a running system or created yourself.

A boot image, from which the node boots, consisting of a kernel, its initrd, and a small filesystem containing openQRM tools.

A virtual environment, which is the combination of a boot image and a filesystem image.

Of course, you also need plenty of network-bootable machines. openQRM allows you to take any given boot image suitable for your specific hardware and combine it with a filesystem image.

The fact that you can mix and match boot images and filesystems makes openQRM extremely interesting. The most persistent problem I face these days is that each time a new hardware class arrives (every 18 months), we have to make minor tweaks to our bootstrap environment. Deploying old platforms on new hardware is often tricky. The problem isn't really technical; it's the human aspect of realizing you are about to use older hardware, and mapping the right initrd to the right machine. Of course, you can build and test and get new machines to work on older distros, but it takes time.

With openQRM, you can build a booting environment that works for all your hardware, independent of the image that will actually boot, whether on an old or a new platform. You don't need to worry about the hardware dependencies.

Unless you are using the Local Deploy plugin, deploying a node does not mean that you are installing an OS on a server; it just means you are booting a platform and putting it in production. Within this kind of deployment, there are still different types:

Single deployment, with one image running on one machine.

Shared deployment of the same filesystem on multiple machines. You can define pools where you need a number of resources per filesystem type, and load balance between those instances. (Add openMosix or some other SSI technology for your own pleasure.)

Partitioned deployment; rather than using full machines, you can partition a machine into different virtual machines, again with the same filesystem images. There are plugins for Xen and VMware, and Virtuozzo and Qemu support are almost ready.

Once you boot your nodes from the network, they arrive in an idle state. When you tell openQRM you need a new virtual platform, it promotes one of the idle nodes to a production node, based on available resources and on the requirements in the virtual environment's metadata.

Obviously, you need a working PXE and DHCP environment, but openQRM packages that for you, so you don't need to worry about setting it up yourself, though you are free to use an existing setup.
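If you do run DHCP yourself, a minimal PXE-oriented dhcpd.conf stanza looks something like the following sketch. All addresses, the subnet, and the boot loader file name are illustrative assumptions; openQRM's dhcpd plugin ships its own example configuration.

```conf
# Illustrative dhcpd.conf fragment for PXE booting; all values are examples.
subnet 10.0.11.0 netmask 255.255.255.0 {
    range 10.0.11.100 10.0.11.150;   # pool for network-booting nodes
    option routers 10.0.11.1;
    next-server 10.0.11.10;          # the TFTP server (here, the openQRM host)
    filename "pxelinux.0";           # PXE boot loader fetched over TFTP
}
```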

Installing openQRM

openQRM comes as a set of prebuilt RPMs or as source code. The RPMs are the quickest way to get up and running on a minimal CentOS 4.2 with MySQL already installed. Note that you need a working MySQL database and plenty of memory to get started, because openQRM starts its own Java application server and creates its own tables in a MySQL database.

The installer will first guide you through several options, then create its databases and start configuring its services. It will also create a boot image from your running kernel. Finally, it will try to start the QRM server.

Would it surprise you if I told you that the majority of logfiles openQRM creates will be in /var/log/qrm/? Typical problems are running out of memory or failing database connections.
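A quick shell health check covers both of those typical problems. The log directory comes from the article; the MySQL credentials are an assumption, so adjust them for your setup.

```shell
# A quick health check for a fresh openQRM install.
# /var/log/qrm is the log directory named above; the MySQL
# credentials below are an assumption; adjust for your setup.
QRM_LOG_DIR=/var/log/qrm

# Recent log messages, if any logs exist yet:
tail -n 50 "$QRM_LOG_DIR"/* 2>/dev/null || echo "no logs found in $QRM_LOG_DIR"

# Enough free memory for the Java application server?
head -n 2 /proc/meminfo

# Can we reach the database at all?
mysqladmin -u qrm -pqrm ping 2>/dev/null || echo "cannot reach MySQL"
```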

Your openQRM installation is almost ready. Point your browser to http://yourserver:8080/ to see what's happening there. (The username is qrm, and so is the password.) If everything went well, you now have a working openQRM setup with no resources and no virtual environments, but at least one boot image.

Using openQRM

The first example shows how to create an NFS storage server, though for production environments I recommend iSCSI.

When you log on to your openQRM web dashboard, you'll see a menu on the left with the entry Management Tools. From there, choose Storage/Server and click on the top-right Tools pull-down. Select Add New Storage Server. Give your storage server a name, choose NFS and give it an IP address, and finally click Save.
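Before registering the storage server, it's worth confirming from any client that it actually exports something. This is a sketch: the address is an example, and showmount comes with the standard NFS utilities.

```shell
# Verify an NFS storage server before registering it in openQRM.
# The address is an example; replace it with your storage server's IP.
STORAGE=10.0.11.172

if showmount -e "$STORAGE" 2>/dev/null; then
    STATUS=ok
else
    STATUS=unreachable
    echo "showmount failed; is nfsd running on $STORAGE?"
fi
echo "storage check: $STATUS"

# On the storage server itself, an export for image filesystems
# might look like this line in /etc/exports (an assumption):
#   /vhosts  10.0.11.0/24(rw,no_root_squash,sync)
```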

Creating Filesystem Images

Filesystem images come in different flavors, and you can create them in different ways, either manually or automatically. Manual creation works from either the command line or the web interface. With the command-line version, you need an NFS server ready: the client logs on to a remote machine via SSH (with the root account) and then rsyncs the image to the NFS server.

OPENQRM:/opt/qrm/bin # ./qrm-filesystem-image create -u qrm -p qrm -s \
FC6INSTAL -l 10.0.11.172:/ -t /vhosts/FC6INSTALL
The next step will create a QRM image from the system 10.0.11.172
(you will be prompted for the password of root@10.0.11.172)
Press ENTER to continue
Creating filesystem image FC6INSTAL from 10.0.11.172
Transfering the image content from 10.0.11.172:///
(this procedure may take time)
root@10.0.11.172's password:

With automatic creation, you go to the web interface and create a new image. Make sure it has an NFS export ready. When you first boot that server (after creating a virtual environment mapped to that image), openQRM will create the export.

You can also work with images you download from sites such as Jailtime.org.

Creating Virtual Environments

It's easy to create a virtual environment from the web interface.

Once again from the left menu, select Virtual Environments/New Virtual Environment. Don't use partitioning yet. Choose a kernel and a filesystem from the lists, then save your new environment. The freshly created virtual environment will show up in the list of virtual environments.

Getting Ready to Boot

The concept of openQRM is that your systems boot their images over the network. This means you need additional plugins, such as dhcpd and tftpd, or you can use existing services on your network. You can configure and install these plugins via ./qrm-configurator in /opt/qrm; by selecting the plugin option, you can enable plugins. To start, enable dhcpd and tftpd. (The plugin has an example dhcpd.conf.)
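Once tftpd is enabled, it's worth checking that the TFTP server actually answers before rebooting any nodes. A minimal sketch, assuming the HPA tftp client; the boot loader file name is an example, and note that some tftp clients exit 0 even on a failed transfer, so check the output too.

```shell
# Sanity check: can we fetch a boot file from the local TFTP server?
# Assumes the HPA tftp client; pxelinux.0 is an example file name.
if tftp 127.0.0.1 -c get pxelinux.0 /tmp/pxelinux.0 2>/dev/null; then
    STATUS=ok
else
    STATUS=failed
    echo "tftp fetch failed; check that tftpd is enabled and running"
fi
echo "tftp check: $STATUS"
```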

Now that you have openQRM up and running, and you have configured the DHCP server and have an operational tftp server, you still need to define a boot image.

A virtual environment is the combination of a boot image and a filesystem image. You can mix and match them to make images boot on different platforms, physical and virtual. Installing openQRM produces a default boot image, but you can build other boot images from kernels you prefer. The TFTP server expects a file called vmlinuz-qrm in /opt/qrm/tftpboot/boot; if it doesn't exist, regenerate it by running qrm-admin init-system from /opt/qrm/sbin.
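That check can be scripted in a few lines, using only the paths named in the article:

```shell
# Check for the default boot image the TFTP server expects.
BOOT=/opt/qrm/tftpboot/boot/vmlinuz-qrm

if [ -f "$BOOT" ]; then
    echo "boot image present: $BOOT"
else
    echo "boot image missing; run ./qrm-admin init-system from /opt/qrm/sbin"
fi
```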

The client will come up with an interface eth0:qrm bound to an IP address obtained through DHCP. It then sends a starting_signal message to the QRM server's IP address (via eth0:qrm) and gets information about itself from the QRM server.

It prepares /opt/qrm, runs QRM services, and runs node-agents. Then it finishes its idle node configuration.

Now look at your web console. I have one resource available on mine.

The next step is to boot an actual virtual environment, since you already created one via the web interface. Pull down the Actions menu on the virtual environment you want to boot and start it.

The idle node will reboot and come up again with its intended image. You now have a working openQRM environment ready to play with. Go ahead and define more virtual environments. Add more nodes, and shut down nodes while new nodes take over the services of the virtual environment.

Conclusion

This article hasn't even touched on the more advanced openQRM features, such as partitioning, local deploy, the plugin system that makes it easy to add tools you already know to the same dashboard (Webmin and Nagios plugins are already available), or the High Availability and Workload Migration offerings from Qlusters.

However, the next time your development team wants to test something risky on its platform, just boot an idle node into the team's virtual environment, and have the members play around, test, and screw things up. Don't worry; it will take only moments to give them back what they started with. And after they are done, you can have your machines back, ready to use for another service.