The ABCs of virtual private servers, Part 2: Getting started

In Part 2 of our series on virtual private servers, Ars looks at the nuts and bolts of getting up and running.

In Part 1 of this series on virtual private servers (VPS), we looked at the rationale behind going virtual. In this installment, we take you through some of the details involved in getting up and running.

As you might imagine, your VPS experience all starts with an account. Whichever service you choose, you first establish an account, and some hosts may require a separate confirmation stage even after a credit card number is validated. This is clearly meant to prevent spammers, phishers, and crackers from setting up VPSes using a stolen (but not yet reported or discovered) credit card.

For example, Rackspace says it will call you to confirm within 15 minutes of setting up an account. In setting up two accounts, however, I wasn't called in either case. In the first, an emergency, I called after an hour or so and spent tens of minutes on hold to get activated. In the second, a test setup for this article, I was never called (I let the account remain dormant).

With an account in place, you next choose an instance size. Most hosts have a menu of standard setups to choose from; a few let you pick more à la carte. Because of how virtualized host servers are provisioned, adding more memory or hard disk storage often comes at what seems like a ridiculously high price. That's because the change may prevent the provider from fitting a full additional instance on the same physical machine.

As part of setting up an instance, you nearly always have to choose a Linux distribution to work with. Some VPS firms also offer Windows Server options, often at about 5 to 10 percent higher cost than a comparable Linux option. While every host varies in which Linux distros are supported, most include CentOS, Debian, Fedora, Red Hat, and Ubuntu. Arch Linux, Slackware, and others are available at particular hosts, and some are available only at certain data centers run by a given company. In some cases, you can choose between 32-bit and 64-bit virtual machines and operating systems, too. (I'm a CentOS convert after years of Red Hat, and it was an easy transition. Yes, I know [your flavor of Linux/BSD here] is much better than CentOS, but I can use that distro without pain to do everything I want.)

The standard distribution image is loaded into the virtual machine, and your instance's installation is immediately persistent. The flip side is that if you shut down an instance on any service but Amazon, you continue to pay for it at the hourly rate; you have to delete the instance to stop the hourly clock. (Amazon doesn't charge for halted instances that were booted from persistent volumes.) With some services, you can take a running or halted instance, write its image to a storage area (for which you pay a per GB/per month fee), and later restore from that image. (Amazon, as the constant exception, lets you boot both from non-persistent machine images that you can customize, but which are erased when you shut them down, and from persistent volumes that retain all changes. You can also boot from stock, non-persistent images and specify a script that mounts separate persistent volumes.)

An instance includes a single publicly reachable IP address, but you can add more for a monthly fee. You usually only need an extra public IP if you're running multiple SSL/TLS websites without SNI support (as with older versions of Apache), since each certificate then requires its own address. Private IPs can also be set up, and are generally free. These private IPs (as noted earlier) allow intra-instance communication within a single data center operated by a host at no additional cost. That's extremely useful when you're moving data back and forth quite a bit among your own operations.

In all the cases I've checked, instances launch with a firewall enabled, sometimes allowing just SSH remote access. Even SSH access might require key-based authentication, as with Amazon, rather than just a username and password. The most deeply rooted task you'll carry out is configuring the firewall at the command line to open up the services you need to reach remotely. (Amazon and some other firms have Web-based firewall configuration wizards, however.)
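
On a CentOS-style instance, that work is mostly iptables rules. Here's a minimal sketch, assuming the distro's standard iptables service and that you want HTTP open alongside SSH; ports and policies will vary with your setup:

    # Keep established connections (including your current SSH session) alive.
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

    # Open SSH (22) and HTTP (80) to the world.
    iptables -A INPUT -p tcp --dport 22 -j ACCEPT
    iptables -A INPUT -p tcp --dport 80 -j ACCEPT

    # Drop all other inbound traffic by default.
    iptables -P INPUT DROP

    # Persist the rules across reboots (CentOS/RHEL convention).
    service iptables save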

Most Linux distributions come with associated update services, such as yum with Fedora and CentOS, and apt with Debian and Ubuntu. Since some of my hardware servers dated back several years and required tons of customization to get working, I had rarely had the pleasure of automated updates. You can get used to it.
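
The day-to-day commands couldn't be simpler. Assuming stock repositories:

    # CentOS, Fedora, Red Hat
    yum update          # fetch and apply all pending package updates

    # Debian, Ubuntu
    apt-get update      # refresh the package index
    apt-get upgrade     # apply pending updates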

The only place I ran into trouble with my distribution setup was with SSL/TLS and Apache. The CentOS 5.5 distribution of Apache doesn't include SNI (Server Name Indication) support, which allows shared use of the same IP address by multiple SSL/TLS Web servers. I had to roll up my sleeves and compile newer versions of OpenSSL and Apache. That solved the problem, but it does take a bit away from the joy of not needing to compile one's own software. I'm currently in a holding pattern on MySQL 5.5, which is generally available but not yet part of the CentOS 5.5 updated distributions.
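
For the curious, the build went roughly like this; the version numbers, prefixes, and flags below are illustrative of this kind of setup rather than a recipe:

    # Build a newer OpenSSL into its own prefix...
    cd openssl-1.0.0c
    ./config --prefix=/usr/local/openssl shared
    make && make install

    # ...then build Apache against it, with SSL (and thus SNI) enabled.
    cd ../httpd-2.2.17
    ./configure --prefix=/usr/local/apache2 \
        --enable-ssl --with-ssl=/usr/local/openssl
    make && make install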

You can go so far as to install upgraded kernels or even swap out kernels altogether. That gets risky, however. VPS hosts typically only provide limited support for the logical machine you're running, and even then only for the standard distributions the provider offers. If you perform a kernel upgrade or OS sidegrade, and things go pear-shaped, you may have to revert to a previously saved image.

Once you've set a machine up to your liking, you almost always have the option to save an image: this is sometimes tied into a separately priced or on-demand backup system. A saved image is a precise clone of your server, and can be used at most hosts to launch new instances. The image can also usually be moved among data centers run by the same firm if you have a reason to switch, or to create redundancy.

With an instance up and running, what could possibly go wrong? Lots. But a well-run VPS host can do quite a bit right to offset potential problems.

Remote Hands through a Dashboard

The point of using a VPS is to be less concerned about hardware, and that has been the case for me and the colleagues I've surveyed. But such anxiety reduction doesn't mean that all the hardware hosting your VPS will work perfectly. Sometimes hardware fails. A good host will keep a reasonably large amount of redundant hardware on-site to replace hosts and drives as they inevitably fail. (For details on how providers set up their servers' drives, see the backup section, next.)

If a virtualization host fails, which happened to me just a few weeks after moving entirely to VPS hosting, the service provider moves or repoints the drive array, assuming it's undamaged; or migrates images by copying them to unreserved and unused space on other servers. In my case, my server was copied over and relaunched on a new host with a short interruption in service. The IP address and other characteristics naturally remained the same.

But if the hardware's fine, and you're having a virtual machine issue, you turn to the dashboard. Service providers offer a variety of dashboards, including widely supported open-source front-ends, in-house developed Web apps, and commercially licensed software. The basic dashboard shows you the health of a server, lets you control parameters (including upgrades), provides charts and other statistics about use, and offers access to recover, restore, and back up an instance.

The dashboard's remote access feature is key in a couple of circumstances. First, if you can't reach your instance, the ability to fire up a Web-based Java or AJAX terminal session to gain access directly through the host hardware is a powerful tool. This has let me figure out on a couple of occasions that a routing issue was the problem, rather than the instance having gone bad. You can also use this Web-based access to fix network interface problems, if you've foolishly disabled an interface or set an adapter to an unreachable IP address. (Let me be the first to raise my hand here and say, "Hi, I'm Glenn, and I'm a foolish remote network adapter misconfigurer." "Hi, Glenn!")
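
From that out-of-band console, the fix itself is usually trivial. A sketch, assuming a CentOS-style setup where eth0 is the misconfigured adapter:

    # See what the kernel thinks your interfaces look like.
    ifconfig -a

    # Bring a downed interface back up.
    ifconfig eth0 up

    # Or, after correcting /etc/sysconfig/network-scripts/ifcfg-eth0,
    # restart networking wholesale (CentOS/RHEL style).
    service network restart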

You can also use remote access to figure out how bad a shape your instance is in before taking the next step. Among the services I've used directly and those surveyed in the chart, recovery comes in one of four forms. Some services combine these, or offer even finer divisions among them.

Soft restart. Press a Web button to push a reboot to the virtual image; this halts all processes and acts just like a soft restart on a physical machine. It's sometimes necessary when a machine becomes so bogged down or otherwise unreachable that you can't trigger a reboot via an SSH session. I've had to use this while tuning Apache in recent weeks: we simply didn't have enough RAM allocated to a VPS, and Apache kept clogging. Being able to soft reset (or sometimes halt and restart Apache through the Web-based terminal) kept us sane.

Hard restart. This option is used to simulate power cycling. The current image's memory is dropped, and the virtual machine reloads from its stored image. This can be fatal at times, depending on what was wrong, and journaled or other disk recovery may be necessary.

Recovery. Some hosts, such as Linode, will let you boot an unmodified distribution identical to the one you're using and mount your damaged instance as a drive on the boot disk. You can then attempt to repair whatever went wrong, or transfer data off if the instance seems unrecoverable as a bootable system.
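
Once you're booted into the recovery distribution, the repair is ordinary command-line work. A sketch, assuming the damaged instance appears as /dev/xvdb (device names vary by host) and a hypothetical backup-host to receive rescued data:

    # Check and repair the damaged filesystem before mounting it.
    fsck -y /dev/xvdb

    # Mount it read-only first to survey the damage.
    mkdir -p /mnt/rescue
    mount -o ro /dev/xvdb /mnt/rescue

    # If it's unrecoverable as a bootable system, copy the data off.
    rsync -a /mnt/rescue/var/www/ user@backup-host:/srv/rescue/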

Restore. If all else fails, you can select a disk image backup you've created (see the backup section, next) to restore from, wiping out whatever changes were made in the interim in the interest of returning to a usable instance.

Consider what each of these operations would involve if you had to work with your own hardware stored at a colocation data center, or even in your own office. Each would require phone calls or the use of remote power-cycling devices, and a recovery or restore from an image could take hours and a lot of hassle, including a need for separate hardware to help with the recovery.

Our last stop in the VPS tour is with backups, which underlie how robust a VPS instance can be, and help you deal with other problems that may arise.

22 Reader Comments

One major issue with VPS is how over-subscribed the servers are. That is, how many times have they sold the same limited RAM/CPU and disk to a bunch of customers? If your application needs lots of resources for a short period, will it be able to get them? Or will that other user running SETI in their instance impose a cap on how well your application performs?

If you're looking for a really cheap VPS for non-critical personal use, I've found HostRail to be far and away the cheapest (I pay ~$1/mo after discounts for quite reasonable specs). However, I do get random reboots and a laggy connection, so it's definitely not suitable for commercial use (at least the cheap package).

The smaller Windows market penetration for VPS is indeed a problem. This is mostly because a VPS is deeply tied to business requirements. The same way I wouldn't think of installing a Linux server to manage a SQL Server instance or source control a .NET project, I won't be choosing a Linux VPS for those purposes.

Windows offerings do exist. But the saturation of Linux solutions means there's a big gap between the available options for one operating system and the other. With cloud services, I suspect hosting businesses will have even less incentive to adopt Windows, and the current offerings will only stagnate or even shrink. Another issue is the nature of Windows-based VPS when seen from the perspective of their potential market: mostly already-established businesses, for whom the thought of outsourcing IT infrastructure, depending on third parties and, worse, on net availability, is simply too risky.

That leaves very little breathing room for people like me and my small business, which only wants a local Windows-based VPS hosting service to provide a server development platform for two developers some 20 miles from each other. And naturally we simply can't take Linux for our solution.

I recently discovered Arcuscloudbrokers who, when I asked about a Windows-based offering, said they were looking to have one available soon. Might be worth checking out. I ended up with a Linux-based VPS in the end because I needed it just for personal use.

> One could see an additional service level offering here for VPS hosts in automating non-local data center image copying, even if only a daily and weekly backup were kept at another location, coupled with some automated way to restore.

I would go even further: it seems natural for a VPS host to offer "continuous" replication between colocation facilities. That would prevent the several-hour outage described on the author's blog (link is in the first part), at least for those with the money to pay for such a 'high availability' service.

One thing I am wary about before making a move is how these servers would interact with resources that generally can't be outsourced, such as a local AD controller or file server. Are most of these hosted servers able to connect via VPN without breaking their default remote control connections (RDP, SSH, etc.)? There is always port forwarding, but a VPN would be much preferred, as long as it didn't break other functionality.

One major issue with VPS is how over-subscribed the servers are. That is, how many times have they sold the same limited RAM/CPU and disk to a bunch of customers? If your application needs lots of resources for a short period, will it be able to get them? Or will that other user running SETI in their instance impose a cap on how well your application performs?

That is specifically not an issue with VPS hosts, or at least the ones I surveyed. In shared hosting, that's a possibility. With virtualized hosting, providers can strictly allocate resources among customers.

One can probably safely assume that the cheapest plans use thin provisioning for hardware resources, while more expensive plans use thick provisioning.

One can probably safely assume that the cheapest plans use thin provisioning for hardware resources, while more expensive plans use thick provisioning.

That's possible, but there's no good way to test that. It's possible that the cheapest plans have acquired powerful and inexpensive commodity hardware (from failed dotcoms and companies), and have low capital costs, too. The only way to know is to find benchmarks (I found none that didn't appear to be sponsored by hosting firms who paid for placement), or perform them yourself.

One major issue with VPS is how over-subscribed the servers are. That is, how many times have they sold the same limited RAM/CPU and disk to a bunch of customers? If your application needs lots of resources for a short period, will it be able to get them? Or will that other user running SETI in their instance impose a cap on how well your application performs?

That is specifically not an issue with VPS hosts, or at least the ones I surveyed. In shared hosting, that's a possibility. With virtualized hosting, providers can strictly allocate resources among customers.

They can, but will they? Especially those who provide OpenVZ-based VPS service.

This article series seems to be a marketing piece dedicated to RackSpace and outlining their feature set.

Amazon's EC2 offering has all of these features and is a much larger service with an established track record. The author consistently misrepresents the EC2 feature-set and downplays this fact.

EC2 instances have "persistent" storage through EBS (more durable than RAID10), which is now the default way of providing the root filesystem for an instance. The "ephemeral" disks seem to just be a way to access local disk spindles and store data that is local to a single host. Even through instance reboots, I've never lost any "ephemeral" data. The only catch is that when you boot a new instance on a new host, that new host doesn't have the data from the previous host's disks (not surprising). On the other hand, EBS volumes can be detached and reattached to new hosts in seconds. EBS allows S3 backups (12+ 9's of durability) to be taken on any cadence the customer chooses, and the customer is only charged for compressed deltas, not the full image size. A 100 GiB volume would be $10/mo, and my backup usage has been a small fraction of this.

If the physical host fails, simply detach the root volume and boot another instance, allowing a 5min recovery from even the worst hardware failures. Restoring from a snapshot backup is similarly quick (or cloning a production server).
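
With the ec2-api-tools, the whole dance is a few commands; the volume and instance IDs below are placeholders:

    # Detach the root volume from the dead instance.
    ec2-detach-volume vol-12345678

    # Attach it as the root device of a freshly launched replacement.
    ec2-attach-volume vol-12345678 -i i-87654321 -d /dev/sda1

    # Or snapshot it to S3 first as a point-in-time backup.
    ec2-create-snapshot vol-12345678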

While EC2 is targeted at serious usage, hosting top-tier websites like Netflix (http://techblog.netflix.com/2010/12/5-l ... g-aws.html), there is also a free tier which covers basically all of the usage outlined in this article (http://aws.amazon.com/free/). Amazon's free tier gives you one 2core/600MiB instance, a 10 GiB EBS volume with backups, 5 GiB of S3 storage, an ELB load balancer, plus a bunch more for $0 / mo. For trying out a VPS, it's not a bad place to start.

This article series seems to be a marketing piece dedicated to RackSpace and outlining their feature set.

This is part 2 of a two-part piece. Rackspace is mentioned on the first page of this article in terms of how the firm failed to call to confirm an account. Most of the article is about general features. Both Rackspace and Linode, the two VPS hosts I have sites hosted at (as noted in the first article), are discussed. The first part also includes a chart comparing feature offerings from several other hosts.

Quote:

Amazon's EC2 offering has all of these features and is a much larger service with an established track record. The author consistently misrepresents the EC2 feature-set and downplays this fact.

As noted in the previous part and in the comments for that article, I found Amazon EC2's EBS boot volumes did not work correctly for me during my testing. In subsequent testing, the problems I experienced over several weeks and multiple tries seem to have disappeared. I can't report that things work when they don't work for me. Substantial discussion in the comments for the first piece contributes reader expertise on the use of EC2's persistent boot volumes, and that's how readers help expand the vision that a single author can contribute.

That's an opinion, not a fact. I know of no side-by-side testing of reliability (or performance) of EBS against RAID10 VPS systems. (Does this sound like an advertisement for Amazon EC2?)

Quote:

For trying out a VPS, it's not a bad place to start.

I agree. However, for people without significant IT infrastructure background, I don't recommend Amazon EC2 because of the learning curve and the complexity of managing the console or using a CLI. I have kept Unix boxes in near continuous operation since 1994 (switching to standard Linux distros over a decade ago). Despite that, and despite being able to run and recover my own hardware, etc., Amazon EC2 was quite opaque and it took me quite a while to get precisely what I needed out of it.

That contrasts with my experience with Rackspace and Linode (and the putative experience at other non-Amazon VPS hosts) where I selected a size, a distro, and clicked go, logging in with a standard account-based SSH transaction.

I moved over to Linode a few months ago; best decision I ever made. I am on the Linode 512 plan and it is plenty fast. Anyone who says that VPSes are slow is full of it, or being cheap and going with a low-quality DC.

Even Linode's control panel for managing your account is pretty amazing, plus you can manage your DNS all in one place. I was on a dedi for a few years before, and sure, having my own hardware was nice, but it is expensive and you have little protection against hardware failure. If it dies you are pretty much stuck for a while.

I also have Linode's backup service, $5/mo for peace of mind is great.

At work we use Rackspace for the many hundreds of sites that we host, along with JungleDisk. Personally I don't care for it too much, but it works and doesn't cost too much. For storing hundreds of GB worth of sites that you can't risk having down, either Rackspace or Amazon is a good solution.

Amazon has non-persistent instance storage, but you can launch instances with scripts that mount persistent, separately managed storage. You can also customize and store your boot images for EC2.

When you keep saying this, it makes me think you are having trouble with EBS because you are doing it wrong.

There's no customization required to simply launch an EBS-backed instance on EC2. You launch it in exactly the same way as an ephemeral instance, but the AMI you use is EBS-based.

Using Ubuntu as an example (sourced here: http://uec-images.ubuntu.com/releases/maverick/release/ ) launching ami-e59ca991 would result in an EBS-backed instance that persists when you issue a "stop" command. Issuing a "terminate" command causes the instance to go away completely, including the storage.

The only thing that isn't ideal in the basic launch: the size of the EBS volume, which defaults to 8GB. You can expand this after the fact.
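
Expanding is a snapshot-and-swap exercise, since an EBS volume can't be grown in place; the IDs and zone below are placeholders:

    # Snapshot the existing 8GB root volume.
    ec2-create-snapshot vol-12345678

    # Create a larger volume from that snapshot, in the instance's zone.
    ec2-create-volume --snapshot snap-abcd1234 --size 20 -z us-east-1a

    # Stop the instance, swap the volumes, then grow the filesystem
    # from inside the instance once it's running again.
    ec2-detach-volume vol-12345678
    ec2-attach-volume vol-87654321 -i i-11223344 -d /dev/sda1
    resize2fs /dev/sda1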

Amazon has non-persistent instance storage, but you can launch instances with scripts that mount persistent, separately managed storage. You can also customize and store your boot images for EC2.

When you keep saying this, it makes me think you are having trouble with EBS because you are doing it wrong.

This is an error. The articles were written weeks ago, edited a while back, and posted a week apart -- a correction made in last week's part wasn't folded into this one. I'll fix this. Part 1 had a similar error, which was corrected.

I agree. However, for people without significant IT infrastructure background, I don't recommend Amazon EC2 because of the learning curve and the complexity of managing the console or using a CLI.

If a user has the background to manage his own server (Windows or Linux), he should have no problem also managing Amazon's concepts and terminology. And if he doesn't, he shouldn't be using Amazon in the first place. Otherwise, I can already imagine those poorly configured web servers... how do you set the NT permissions? Everyone Full Control.

One can probably safely assume that the cheapest plans use thin provisioning for hardware resources, while more expensive plans use thick provisioning.

That's possible, but there's no good way to test that. It's possible that the cheapest plans have acquired powerful and inexpensive commodity hardware (from failed dotcoms and companies), and have low capital costs, too. The only way to know is to find benchmarks (I found none that didn't appear to be sponsored by hosting firms who paid for placement), or perform them yourself.

A batch script running a quick CPU and RAM test every hour would help. Copying a small compressed file into RAM and decompressing it (recording both times) would give you a fairly good metric for both RAM and CPU. I have a Perl script like this installed on many of my VMs.
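
A minimal shell version of that probe might look like this, run hourly from cron (paths and file sizes are arbitrary):

    #!/bin/sh
    # Crude CPU/RAM probe: time copying a compressed file into a
    # RAM-backed tmpfs, then time decompressing it there.
    LOG=/var/log/vps-probe.log
    echo "--- $(date)" >> $LOG

    # Copy into RAM: roughly a memory/disk test.
    ( time cp /root/probe.gz /dev/shm/probe.gz ) 2>> $LOG

    # Decompress: roughly a CPU test.
    ( time gunzip -c /dev/shm/probe.gz > /dev/null ) 2>> $LOG

    rm -f /dev/shm/probe.gz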

Another guy simply used Apache's mod_deflate, hitting a certain URL on the server and then checking the site statistics for that URL to see how long it took to process and load.

This article series seems to be a marketing piece dedicated to RackSpace and outlining their feature set.

I may have come off a bit harsh. It was still an interesting read, but I do feel that your familiarity with a subset of the available vendors shows through. And while no one seems to publish numbers, I believe Amazon is the biggest game in town. (Someone, please, show me the data that proves I'm wrong. Or right. Preferably right.)

I'll grant you that for setting up a single server, many of the smaller VPS vendors have an "all-in-one" turnkey approach. This works well for the $30/mo small-website crowd, but starts to fall apart when you don't fit inside the lines: when your bandwidth usage exceeds the quota, you pay overage charges. Amazon targets a more... (for lack of a better word) sophisticated market, with utility pricing for everything: disk, compute hours, network, etc. This confuses a lot of people at first (e.g., "how many million IOs will I do per month?"), but in the long run gives the most predictable, surprise-free pricing model.

Even so, setting up a basic EC2 instance isn't so bad (a rough command-line equivalent is sketched after this list):

1) Set up an account and open the management console.
2) Create a public-private key pair (gives you the private half for download and storage).
2b) Add port 22 (SSH) to your default firewall rules (aka "security group").
3) Click "launch new instance..." and follow the wizard. There are a couple of vanilla Linux distros that work fine, or use Google to find an AMI someone else has cooked up.
4) ssh -i <<your private key file>> <<the server's public DNS name>>
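
The same steps with the ec2-api-tools, roughly (the AMI is the Ubuntu image mentioned above; the key name is a placeholder):

    # Create a key pair; save the private-key block it prints to my-key.pem.
    ec2-add-keypair my-key
    chmod 600 my-key.pem

    # Open SSH (port 22) in the default security group.
    ec2-authorize default -p 22

    # Launch a free-tier-sized instance from a stock AMI.
    ec2-run-instances ami-e59ca991 -k my-key -t t1.micro

    # Connect once it's running (Ubuntu images log in as "ubuntu").
    ssh -i my-key.pem ubuntu@<the server's public DNS name>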

AWS CloudFormation gives developers and systems administrators an easy way to create a collection of related AWS resources and provision them in an orderly and predictable fashion.

Developers can use AWS CloudFormation’s sample templates or create their own templates to describe the AWS resources, and any associated dependencies or runtime parameters, required to run their application. You don’t need to figure out the order in which AWS services need to be provisioned or the subtleties of how to make those dependencies work. CloudFormation takes care of this for you.

Customers can create a template and deploy its associated collection of resources (called a stack) via the AWS Management Console, CloudFormation command line tools or APIs. CloudFormation is available at no additional charge, and customers pay only for the AWS resources needed to run their applications.

So now I can create a template for a 3-tier setup (load balancer, web server, database server, and all the supplemental EBS volumes I want) and then roll out multiple copies of that setup by pushing a button. My job today is about 20% easier than it was yesterday.
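
For anyone who hasn't looked at it yet, a stripped-down template has this shape; the resource names, AMI, and key are placeholders, and a real stack would add the load balancer and database tiers:

    {
      "AWSTemplateFormatVersion" : "2010-09-09",
      "Description" : "Minimal sketch: one web server plus an EBS volume.",
      "Resources" : {
        "WebServer" : {
          "Type" : "AWS::EC2::Instance",
          "Properties" : {
            "ImageId" : "ami-e59ca991",
            "InstanceType" : "m1.small",
            "KeyName" : "my-key"
          }
        },
        "DataVolume" : {
          "Type" : "AWS::EC2::Volume",
          "Properties" : {
            "Size" : "20",
            "AvailabilityZone" : { "Fn::GetAtt" : [ "WebServer", "AvailabilityZone" ] }
          }
        }
      }
    }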

The server's public DNS name lasts only as long as it's running. Turn it off and you lose it, along with the IP.

Amazon provides ~30 AMIs, including SUSE, Amazon Linux, Windows, Ubuntu, and various LAMP/Rails/Perl/etc. stacks. There are also hundreds of more specialized AMIs provided by name-brand companies such as Oracle, IBM, Novell, and Sun. And yes, users can create their own customized AMIs and even share them: http://aws.amazon.com/amis/AWS?browse=1

Maintaining a long-term public IP address is pretty easy. Amazon calls it an "Elastic IP," and it's free as long as you have an instance running. With no instances running, holding on to a public IP address costs $0.01/hr.
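
With the command-line tools, grabbing and binding one takes two commands (the address and instance ID below are placeholders):

    # Allocate a public Elastic IP to your account.
    ec2-allocate-address

    # Point it at a running instance.
    ec2-associate-address 203.0.113.10 -i i-12345678

    # Release it when done, to avoid the idle-address charge.
    ec2-release-address 203.0.113.10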