Sunday, November 29, 2009

Why use Amazon's EC2 or Google's cloud computing services when you can set up your own private cloud with a few open source tools?

Conventional wisdom has it that if you want to make use of "the cloud," you've got to use someone else's service -- Amazon's EC2, Google's clouds, and so on.

Canonical, through its new edition of Ubuntu Server, has set out to change all that. Instead of using someone else's cloud, it's now possible to set up your own cloud -- to create your own elastic computing environment, run your own applications on it, and even connect it to Amazon EC2 and migrate it outwards if need be.

Ubuntu Enterprise Cloud (UEC) is an implementation of the Eucalyptus cloud-computing architecture, which is interface-compatible with Amazon's own cloud system but could, in theory, support interfaces for any number of cloud providers.

Since Amazon's APIs and cloud systems are broadly used and familiar to most people who've done work with the cloud, it makes sense to start by offering what people already know.

A UEC setup consists of a front-end computer -- a "controller" -- and one or more "node" systems. The nodes use either KVM or Xen virtualization technology, your choice, to run one or more system images.

Xen was the original virtualization technology, with KVM a more recent addition, but that doesn't mean one is being deprecated in favor of the other.

If you're a developer for either environment, or you simply have more proficiency in KVM vs. Xen (or vice versa), your skills will come in handy either way.

Keep in mind you can't just use any old OS image, or any old Linux image for that matter. It has to be specially prepared for use in UEC.

As of this writing, Canonical has provided a few basic system images that ought to cover the most common usage and deployment scenarios.

Hardware Requirements

Note also that the hardware you use needs to meet certain standards. Each of the node computers needs to be able to perform hardware-accelerated virtualization via the Intel VT spec. (If you're not sure, ZDNet columnist Ed Bott has compiled a helpful shirt-pocket list of recent Intel CPUs that support VT.)

The front-end controller does not need to be VT-enabled, but it helps. In both cases, a 64-bit system is strongly recommended. Both the nodes and the controller should be dedicated systems: they should not be used for anything else.

The computers in question also need to meet certain memory and performance standards. Canonical's recommendations are 512MB to 2GB of memory and 40GB to 200GB of storage for the controller, and 1GB to 4GB of RAM and 40GB to 100GB of storage for each node.

It probably goes without saying that you'll want the fastest possible network connection between all elements: gigabit Ethernet whenever possible, since you'll be moving hundreds of megabytes of data among the nodes, the controller, and the outside.

Finally, you'll need at least one machine that can talk to the server over an HTTP connection to gain access to Eucalyptus's admin console.

This can be any machine that has a Web browser, but for the initial stage of the setup it's probably easiest to use an Ubuntu desktop client system.

The scripts and other client-end technology provided all run on Ubuntu anyway, so that may be the best way to at least get the cloud's basic functionality up and running.

Implementation

Setting up a UEC instance is a bit more involved than just setting up a cluster, although some of the same procedures apply.

It's also not a good idea to dive into this without some existing understanding of Linux (obviously), Ubuntu specifically (naturally), and cloud-computing concepts in general.

The first thing to establish before touching any hardware is the network topology. All the machines in question -- controller, nodes, and client-access machines -- should be able to see each other on the same network segment.

Canonical advises against letting another DHCP server assign these machines addresses, since the cloud controller can handle that duty itself.

I've found you can get away with having another DHCP server, as long as the IP assignments are consistent (e.g., the same MAC is consistently given the same IP address).

The actual OS installation is extremely simple. Download the Ubuntu Server 9.10 installation media, burn it to disc or place it on a flash drive (the latter is markedly faster), boot it, and at the installation menu select "Install Ubuntu Enterprise Cloud." You'll be prompted during the setup for a "Cloud installation mode"; select "Cluster" for the cloud controller. (You can optionally enter a list of IP addresses to be allocated for node instances.)

After the cluster controller is up and running, set up the nodes in the same way. At the "Cloud installation mode" menu, the installer should autodetect the presence of the cloud controller and select "Node."

If it doesn't do this, that's the first sign something went wrong -- odds are for whatever reason the machines can't see each other on the network, and you should do some troubleshooting in that regard before moving on.

Once all the nodes are in place, you'll need to run the euca_conf command on the front-end controller. This ensures that the controller can see each node and allows each one to be added to the list of available nodes.

The controller doesn't automatically discover and add nodes; this is a manual process for the sake of security. (If you have more than one cluster on a single network segment, you can quickly see why this is a smart idea.)
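As a sketch, the registration step looks something like the commands below. The node IP addresses are placeholders for your own, and the exact euca_conf flags can vary between Eucalyptus releases; the commands are echoed here so you can inspect them before dropping the echoes and running them on a real controller.

```shell
# Build the discovery and registration commands for the front-end controller.
# The IP addresses are placeholders for your own nodes.
NODES="192.168.1.101 192.168.1.102"
CMD="sudo euca_conf --no-rsync --discover-nodes"
REGISTER="sudo euca_conf --register-nodes \"$NODES\""

# Print the commands; drop the echoes to actually run them.
echo "$CMD"
echo "$REGISTER"
```

Because registration is explicit, nothing joins the cluster until you say so, which is exactly the security property described above.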

A UEC cluster doesn't do anything by itself. It's just a container, inside which you place any number of virtual machine images.

Virtual machine images for UEC have a specific format, too. You can't simply copy a disk image or .ISO and boot that into a UEC cluster, as you might be able to with a virtualization product like VMware or VirtualBox.

Instead, you have to "package" the kernel and a few other components in a certain way, and then feed that package to the cluster.

Canonical has several of these images already available. They're representative of the most common Linux distributions out there, so most everyone should be able to find something that matches what they need or already use.

The process for uploading a kernel bundle is a multi-step affair that you should work through slowly and carefully.

In fact, you may be best off taking the instructions described on the page, copying them out, and turning them into a shell script.

That way, you won't be at the mercy of typing the wrong filename or passing the wrong flags -- and not just once, but potentially over and over.

The same goes for the scripts used to build custom images (read on for more on that).

Finally, once an image has been uploaded and prepared, the administrator can start the instance through one of Eucalyptus's command-line scripts.

Note that an image might take some time to boot depending on the hardware configuration (and whatever else might also be running on the cluster), so you might see the image show up as "pending" when you use the euca-describe-instances command to list running instances.
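If you're scripting around that wait, a tiny helper makes the polling loop easy to write. The grep-based check below is a sketch (the sample instance line is simplified), with the real loop left as a comment since it needs a live controller.

```shell
# Check whether instance-listing output still reports "pending".
is_pending () {
    grep -q 'pending'
}

# On a live controller you would poll like this:
#   while euca-describe-instances | is_pending; do sleep 10; done
```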

Walrus And Amazon S3

Those who've used Amazon's EC2 will be familiar with Amazon S3, the storage service for EC2, which lets you preserve data in a persistent fashion for use in the cloud.

Eucalyptus has a similar technology, Walrus, which is interface-compatible with S3.

If you're familiar with S3 and have already written software that makes use of it, retooling said software for Walrus shouldn't be too difficult.

They share many of the same command-line tools -- e.g., curl-based utilities like s3curl -- and you can also use Amazon's own EC2 API/AMI toolset to talk to Walrus as if it were Amazon's own repositories.

Note that Walrus and S3 have a few functional differences. You can't, for instance, yet perform virtual hosting of buckets -- addressing a bucket as its own hostname, a feature typically used for serving multiple sites from a single service.

This is something that's probably more useful on Amazon's services than in your own cloud, so it's not terribly surprising that Walrus doesn't support it (yet).

Making Your Own Image

I mentioned before that you can't just use any old OS with UEC; you have to supply it with a specially-prepared operating system image.

Canonical has a few, but if you want to create your own image, you can.

As you can guess, this isn't a trivial process. You need to provide a kernel, an optional ramdisk (for the system to boot to), and a virtual-machine image generated using the vmbuilder tool.

It's also possible to use RightScale, a cloud management service that works with Amazon, Rackspace, and GoGrid as well as Eucalyptus-style clouds.

Obviously you'll need a RightScale account to take advantage of this feature, but the basic single-account version of RightScale is free, and has enough of the feature set to give you a feel for what it's all about.

The Future

So what's next for Eucalyptus on Ubuntu? One possibility that presents itself is using Eucalyptus as an interface between multiple cloud architectures.

The folks at Canonical have not planned anything like this, but the potential is there: Eucalyptus can, in theory, talk to any number of cloud architectures, and could serve as an intermediary between them -- a possible escape route for clouds that turn out to be a little too proprietary for their own good.

Another, more practical possibility is more in the realm of a feature request -- that the existing process for creating, packaging, uploading and booting system images could be automated that much more.

Perhaps the various tools could be pulled together and commanded from a central console, so the whole thing could be done in an interactive, stepwise manner.

What Eucalyptus and UEC promise most immediately, though, is a way to take existing commodity hardware and make it elastic without sacrificing outward expansion.

What you create with UEC doesn't have to stay put, and that's a big portion of its appeal.

Friday, November 27, 2009

Solid state drives (SSDs), as compared to their spinning counterparts, have no moving parts, require less power, have a smaller footprint, produce a fraction of the heat, enjoy a longer life span and perform better in some systems.

That first sentence should have sold you — what else do you need to know? Oh, right, the downside. You're right; it's the price tag.

They currently range from two or three times the price of a conventional drive for smaller models (about 30GB) to more than 10 times for drives in the 120GB to 250GB range.

Don't let the prices scare you away from SSDs. As the technology matures, the prices will drop significantly.

When deciding on your next move in storage technology, keep in mind you don't need a huge amount of disk space to install an operating system.

Hypervisors use about 4GB, and a full installation of Windows Server 2008 requires about the same.

A $90 32GB SSD provides more than enough space for the operating system and any future patches, service packs and related operating system support files.

Green Technology
At first glance at the prices, you might think that the "green" in this technology is the price, but it isn't. It's the technology behind the high price.

Lowering the amount of heat produced by hundreds of disk drives adds up fast.

Data centers will run at near-normal office temperatures instead of the frosty temperatures at which they now hover.

Drawing less power from your utility company means this technology saves money in practice, not just in theory (see the table below).

SATA vs. SSD (Watts)

Drive Type    Idle    Seek    Start Up
SATA          8       10      20
SSD           0.08    0.15    ND*

The table shows the average power consumption from a variety of SATA drives and SSDs. The SSDs are Intel high-performance SSDs.

Performance

If you've heard of SSDs, you've also heard about their increased performance over conventional disk technology.

Since SSDs don't have moving parts, their seek times fall in the range of 75 microseconds to one millisecond.

The case for SSD adoption is strong, indeed. SSDs transcend the hype that's often associated with new technologies.

Independent case studies show that SSDs create a new storage playing field and manufacturers suggest that conventional spinning disk technology is near its final breath.

I predict that within five years, SSDs will populate more than 90 percent of all server systems and NAS devices. By then, the technology will have caught up to the point where any application will feel right at home on SSDs -- even the write-intensive ones.

Have you adopted SSDs in your server systems yet? Write back and let us know.

The elevator scheduler reorders reads and writes so they reach the disk in a roughly sequential pattern, which suits spinning platters. Since an SSD is not a conventional hard disk, disabling the elevator scheduler can significantly improve its read and write performance.
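A minimal sketch, assuming the SSD is /dev/sda and a sysfs layout typical of kernels from this era -- the command is built and echoed here so you can inspect it, then run the printed line as root to apply it:

```shell
# Build the command that switches /dev/sda's I/O scheduler to noop,
# bypassing the elevator logic meant for spinning disks.
DEV=sda
CMD="echo noop > /sys/block/$DEV/queue/scheduler"
echo "$CMD"
# To make the change survive reboots, add elevator=noop to the
# kernel boot parameters instead.
```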

Finally, you might want to set the file system mount option to noatime. To do this, edit the /etc/fstab file, so it looks something like this:

/dev/sda1 / ext4 noatime,errors=remount-ro 0 1

Adding the noatime option eliminates the need for the system to write to the file system for files that are simply being read -- in other words, faster file access and less disk wear.

That's it. Now reboot your machine, and you should notice faster boot and better performance.

Thursday, November 26, 2009

Passwords, user accounts, email verification. I have never liked requiring my website's visitors to register before they can leave a comment.

There is a large segment of people who like to submit quality comments online but don't want to be required to leave personal information behind.

So from the beginning, I have always allowed anonymous commenting by unregistered visitors, and for the most part, the quality of the comments hasn't suffered.

However, allowing for anonymous comments also invited my site into a war against comment spam. My latest weapon to do the fighting for me in this war is Mollom.

I was first introduced to Mollom in the Fall of 2007 as a beta tester. Prior to Mollom, I had been using a number of techniques, modules, and services with limited success in blocking unwanted spam.

While some of these filtering methods did help me filter out unwanted content, I was still spending quite a bit of my time moderating the comments for potential spam.

Worse, in long absences from the site I had to disable anonymous commenting for fear that I would come back to a site riddled with ads for the latest popular pharmaceutical drugs or some girl that wanted to be seen for a price.

That's when Mollom entered the picture and helped stop most of the spam from entering my site.

In the two years since I've used Mollom, the service probably has blocked more than 100,000 pieces of spam from being posted at my site.

Since the statistics Mollom provides only date back to early 2008, the official count of blocked spam stands at around 77,000.

In other words, I receive an average of 120 spam attempts a day that require no moderation on my part.

Mollom Statistics for CMS

Report: Trend of spam/ham from Mollom for CMS Report, February 13, 2008 to November 20, 2009

This isn't to say that Mollom is perfect.

Once in a while, I will see a half dozen spam comments suddenly appear on my site. In part, this is likely because the spam-filtering service hasn't yet learned that this particular style of spam comment is unwanted content.

I also think it is because of my choice to not pay one dime for this valuable service.

I'm using the free version of Mollom, which does not include access to the high-availability back-end infrastructure that Mollom's paying customers on Mollom Plus enjoy.

However, from what I have observed, even the downtime for Mollom Free is rare.

Luckily, your CMS module or plugin can provide a fallback strategy for when access to the Mollom servers is unavailable.

This fallback strategy gives you one additional safeguard in making sure your site doesn't go unprotected when Mollom is down.

For my Drupal site, I usually leave the forms unprotected if Mollom's servers fail, for fear of blocking a quality comment.

However, when I'm going on that week-long family vacation, I will toggle the fallback strategy to block submissions when there are issues with Mollom's servers.

If you're not using Drupal, you'll have to check your CMS's own module for Mollom to see what fallback strategies have been made available to you.

CMS modules using Mollom are available for Wordpress, Joomla!, Radiant, SilverStripe, StatusNet, and of course Drupal.

Spam filtering services such as Mollom are well aware that there are smart people out there trying to find new ways around the spam filters.

In a recent online interview with Dries Buytaert, we instant messaged back and forth on what is being planned for Mollom.

That interview started out with a simple question by Dries, "Will it be favorable or not?".

Dries would have given the interview no matter my opinions on Mollom, but I think he wanted to know how much I understood the environment that the spam filtering industry must work in.

The service's nature is such that it isn't perfect. Being in the spam protection business is hard because there is no such thing as a perfect service. So different people have different expectations.

I'm no stranger to the amount of work required of IT folks to make the difficult happen in ways that convince the user such tasks are easy to deliver.

So the interview with Dries quickly turned away from concerns of any failings Mollom may currently have toward what we can look forward to in Mollom's future.

According to Dries, there are some welcome improvements for Mollom coming down the road. For example, Mollom is planning to introduce an interface for URL reputations.

Currently, Mollom will use CAPTCHA when it doubts the legitimacy of a comment. CAPTCHA is great for preventing automated spam being served to your site, but it doesn't do so great with human beings submitting spam.

Mollom also doesn't ask for a CAPTCHA when it believes a comment to be legitimate. Some spambots that embed good content with bad links can trick spam filters into thinking the content is good.

That's why some of Mollom's users have asked to be able to identify and block legitimate-looking comments that contain known spam links (a URL limiter based on URL reputations).

Mollom is hoping to address this request soon.

Below I've compiled a list of features for Mollom that Dries says are currently in alpha or beta testing.

Although many of these features have not yet been implemented, it is a good bet that we'll likely see them within the next several months.

Possible new features to be in included in Mollom

Improved classifier on the back-end: The improved classifier is expected to help Mollom better determine whether comments and articles are spam, not spam, off-topic, etc.

New Site Design for Mollom.com: Mollom is revamping the website with a new design. The design is ready but still needs to be implemented.

A new language-detection API to identify what language a post is written in.

APIs to expose URL reputations, allowing Mollom to tell you whether a given URL is likely spam or not.

Support for SSL websites.

So for now, I'm sticking with Mollom. I know a few of you leaving comments will complain about Mollom, but I honestly don't know of a better alternative.

If you are a business or busy site owner that can't afford to let comments go unchecked, I think Mollom is your answer, and I would encourage you to look into the Mollom Plus or Mollom Premium packages.

I've been on the hunt for a good spam filter for so long that I know a good thing when I see it. Mollom was the right solution for me and likely the solution to your spam problems as well.

Although I've made up my mind, I'm sure people would appreciate hearing the experiences of others using Mollom.

I'd especially like to hear impressions from those who have subscribed to one of the paid packages.

Now that this post is finished, I suppose we'll also get to see how hard the spammers work at getting a comment submitted to it.

In the past, I've noticed they have put extra effort in placing spam on posts such as this one.

Xsupplicant has been around since 2003; it is developed by Open1X and backed by the OpenSEA Alliance.

The wpa_supplicant has been around since 2004 and is developed by Jouni Malinen and other contributors.

Both clients run on Linux and Windows and have a GUI application in addition to text-based configuration.

The wpa_supplicant project also supports BSD and Mac OS X.

Not only is Ubuntu 9.10 already loaded with the wpa_supplicant, its own networking GUI communicates directly with the supplicant.

Configuring 802.1X authentication and connecting to WPA or WPA2 Enterprise networks in Ubuntu is pretty straightforward.

When you're ready to connect, simply click the network icon on the top of the screen and select the network from the list.

If you're using a password-based EAP protocol, like the popular PEAPv0/EAP-MSCHAPv2, you'll be prompted to enter the authentication settings, as seen in Figure 1.

This also assumes the wireless card and driver support WPA/WPA2.

First, verify Wireless Security is set to WPA & WPA2 Enterprise. Then choose the Authentication protocol that's supported by the authentication server, such as the popular PEAP protocol.

Unless your authentication server is set to accept anonymous connections, ignore that setting.

Next you should choose a CA Certificate file, so the client can verify it's connecting to a legitimate authentication server before completing its authentication.

Though you can skip this setting, it's recommended to validate the server's certificate for full security.

If the authentication server is loaded with an SSL certificate purchased from a certificate authority like VeriSign or GoDaddy, you'll have to download the CA's public root certificate from its site, since Ubuntu isn't already loaded with these the way Windows is.

If you created your own self-signed certificates (with openssl, for example), you need to select the root CA certificate that was created.
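If you haven't generated that CA yet, a one-liner along these lines creates a self-signed root certificate with openssl. The subject name, key size, and validity period are arbitrary examples, and a real deployment would also issue a server certificate signed by this CA for the authentication server itself.

```shell
# Create a self-signed root CA certificate and private key.
# ca.key stays on the server; ca.crt is the file the Ubuntu
# client selects as the CA Certificate.
openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout ca.key -out ca.crt -days 365 \
    -subj "/CN=Example Wireless CA"
```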

Now you can set the other settings for the EAP type you selected. If you selected PEAP, for example, you can leave the PEAP Version as Automatic and the Inner Authentication as MSCHAPv2.

Finally, input a Username and Password that's setup in the authentication server or backend database.

When you're done, click Connect. Give it a couple of seconds to complete the 802.1X process and it should successfully connect to the network.

If not, double-check the settings and check the debug or logs on the authentication server.

Stay tuned--in the next part, we'll see how to manually configure the 802.1X supplicants.

The steady rise in people using IP telephony to communicate -- for personal and business reasons -- has led to the development of a number of different VoIP “softphones” that can be used on a PC or notebook.

Softphones offer the flexibility of making a call without the need for a dedicated device. If you’re a Skype user you’re probably used to the benefits of free and cheap international calls while you’re on Facebook.

In this edition of "5 Open Source things to Watch" we take a look at VoIP softphones. Unlike their proprietary counterparts, open source softphones can be deployed on as many devices as required throughout the enterprise -- without additional licence fees.

1. QuteCom

QuteCom began life as OpenWengo developed by French VoIP provider Wengo as a free softphone for its telephony service.

2. SFLphone

SFLphone is a SIP and IAX2 (Asterisk) compatible softphone for Linux developed by Canadian Linux consulting company Savoir-Faire Linux.

The SFLphone project's goal is to create a "robust enterprise-class desktop phone" that caters to home users as well as the "hundred-calls-a-day receptionist."

Its main features include support for an unlimited number of calls, multiple accounts, call transfer, and call hold. Call recording is another useful feature.

SFLphone has clients for GNOME (integrated options), KDE and Python and it now supports the PulseAudio sound server, so users can experience additional functionality like sound mixing and per-application volume control.

The softphone is designed to connect to the Asterisk open source PABX.

Even though the Linux operating system is very stable and rarely needs a reboot, there are times when an update (such as a kernel update) will make this a requirement.

At least, that used to be the case. With the help of a newly developed technology (dubbed Ksplice), even a kernel update will not require a reboot.

This is fantastic news to administrators who depend upon constant uptime for their servers and production desktops/machines.

Of course, one might think such a technology would be difficult at best to use. Not so. The developers of Ksplice have created an easy-to-use system that lets the administrator handle critical updates -- those that would normally require a reboot -- as easily as those that do not.

Getting such a system working does require the installation of third-party software. This tutorial will walk you through installing Ksplice as well as how to update a currently running kernel with the new system.

Installing Ksplice

Figure 1

To install Ksplice navigate your browser to the Ksplice Uptrack page and click on the link for your particular distribution.

If you are using Ubuntu, the GDebi installer will be an option to select (see Figure 1). Select "Open with" and then make sure GDebi is selected. Click OK and the installation will commence.

During the installation a new window will open specific to Ksplice. In this window you will have to agree to a license and then click Forward. Once you have done this, the installation will complete.

Using Ksplice

Figure 2

After the install is finished, Ksplice will automatically open the update window (see Figure 2) and reveal whether there are any updates for your currently running kernel. This might well remind you of the average Linux package-management front-end.

To install the update(s), click the Install All Updates button to take care of any pending updates.

Figure 3

You will also notice a new icon added to your Notification Area (see Figure 3). This icon will not only allow you to launch the Ksplice tool, it will also keep you informed if there are any updates available. Figure 3 shows the Ksplice icon with a pending update.

When your system is up to date the “!” will disappear and leave you with a clean “K” icon.

Command line
What Linux tool is complete without a command line component? Ksplice includes four command line tools for your terminal pleasure:

uptrack-upgrade: Downloads and installs the latest kernel updates available for your system.

uptrack-install PACKAGE: Installs a specific update (where PACKAGE is the package name).

uptrack-remove PACKAGE: Removes a specific update.

uptrack-show PACKAGE: Shows more detail about a specific update.

Final thoughts
I have been using Linux (and computers) for quite some time. I never thought I would see the day when such a major update to the underlying subsystems could be pulled off without a reboot -- and, not only that, done as simply as using a GUI.

But now we are looking at something special. Ksplice is beginning to make serious inroads toward the goal of 100% uptime. And now, without having to reboot after a major upgrade, that 100% figure looks closer every day.

Occasionally, usually due to an earlier typo, you end up with files with peculiar names. Usually these are easily removable, but if you have a file with a name starting with - (e.g., -file, or even -f), the command line:

# rm -file

will not work. rm will treat this as indicating the use of the four options -f, -i, -l, and -e, and will die on -l, which isn't a valid option.

You might try using a shell escape:

# rm \-file

However, if you think about what this actually does, it still won't work.

The shell will see the escape, remove it, and pass plain -file into rm; so you run straight into the same problem.

What you need is an escape sequence for rm itself, not for the shell.

There are two ways around this. The first one is to use --, which is used by many commands to indicate the end of the options.

If you'd created a directory called -test and wanted to remove that directory and everything in it, this command line would work:

# rm -rf -- -test

The -rf sets actual options; the -- signals that we're done with options, and that everything after this should be treated as something to be passed in to the command.

In this case, that's the name of the directory to get rid of. Test it out like this:

# mkdir -- -test
# ls -l -- -test
# rm -rf -- -test

The other option is to specify the directory of the file. To remove the file -file in your current directory, use:

# rm ./-file

This will work with other commands as well. To test it out, try:

# touch ./-file
# ls -l ./-file
# rm ./-file

Now, when you discover strange and peculiar files lurking in your directories, you can clear them up without any difficulty.

$ bash badfix.sh
mysql -e UPDATE atable SET name='name 'with' quotes and more' WHERE id='1'
ERROR 1064 (42000) at line 1:
You have an error in your SQL syntax;
check the manual that corresponds to your MySQL server version
for the right syntax to use near 'with' quotes and more' WHERE id='1''
at line 1

Note, the function at the top (remove_header) removes the header line from the mysql output so that we don't get the name of the field included in the data.
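The helper itself isn't shown in this excerpt, but a plausible minimal version simply drops the first line of whatever is piped into it:

```shell
# Strip the column-name header line from mysql's tabular output
# by printing everything from the second line onward.
remove_header () {
    tail -n +2
}

# Usage: mysql -e "SELECT name FROM atable" | remove_header
```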

We all know the solution here: we need to escape the quotes in the value so that both bash and mysql are happy.

However, this turns out to be easier said than done. Perhaps I missed the obvious, but after numerous attempts (on more than one occasion), the following finally did the trick:
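Here's a sketch of one approach that keeps both bash and mysql happy: escape each embedded single quote as \' before splicing the value into the statement. The table and column names are taken from the error output above, and the mysql call itself is left commented out since it needs a real database.

```shell
# The problem value, embedded single quotes and all.
value="name 'with' quotes and more"

# Escape each single quote for MySQL: ' becomes \'
escaped=$(printf '%s' "$value" | sed "s/'/\\\\'/g")

# Build the statement with the escaped value.
sql="UPDATE atable SET name='$escaped' WHERE id='1'"
echo "$sql"
# mysql -e "$sql"   # uncomment on a system with the real database
```

MySQL accepts backslash-escaped quotes inside a single-quoted string, so the statement parses cleanly even though the value itself contains quotes.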

As you would expect, we don't need to escape double quotes inside single quotes for mysql. However, if we wanted to use a literal double quote in our SQL command, we would need to escape it, since our SQL command is contained inside double quotes: