serverchallenge

More than 210,000 users have watched a YouTube video of our data center operations team cabling a row of server racks in San Jose. More than 95 percent of the ratings left on the video are positive, and more than 160 comments have been posted in response. To some, those numbers probably seem unbelievable, but to anyone who has ever cabled a data center rack or dealt with a poorly cabled data center rack, the time-lapse video is enthralling, and it seems to have catalyzed a healthy debate: At least a dozen comments on the video question/criticize how we organize and secure the cables on each of our server racks. It's high time we addressed this "zip ties v. hook & loop (Velcro®)" cable bundling controversy.

The most widely recognized standards for network cabling have been published by the Telecommunications Industry Association and Electronics Industries Alliance (TIA/EIA). Unfortunately, those standards don't specify the physical method to secure cables, but it's generally understood that if you tie cables too tight, the cable's geometry will be affected, possibly deforming the copper, modifying the twisted pairs or otherwise physically causing performance degradation. This understanding begs the question of whether zip ties are inherently inferior to hook & loop ties for network cabling applications.

As you might have observed in the "Cabling a Data Center Rack" video, SoftLayer uses nylon zip ties when we bundle and secure the network cables on our data center server racks. The decision to use zip ties rather than hook & loop ties was made during SoftLayer's infancy. Our team had a vision for an automated data center that wouldn't require much server/cable movement after a rack is installed, and zip ties were much stronger and more "permanent" than hook & loop ties. Zip ties allow us to tighten our cable bundles easily so those bundles are more structurally solid (and prettier). In short, zip ties were better for SoftLayer data centers than hook & loop ties.

That conclusion is contrary to the prevailing opinion in the world of networking that zip ties are evil and that hook & loop ties are among only a few acceptable materials for "good" network cabling. We hear audible gasps from some network engineers when they see those little strips of nylon bundling our Ethernet cables. We know exactly what they're thinking: Zip ties negatively impact network performance because they're easily over-tightened, and cables in zip-tied bundles are more difficult to replace. After they pick their jaws up off the floor, we debunk those myths.

The first myth (that zip ties can negatively impact network performance) is entirely valid, but its significance is much greater in theory than it is in practice. While I couldn't track down any scientific experiments that demonstrate the maximum tension a cable tie can exert on a bundle of cables before the traffic through those cables is affected, I have a good amount of empirical evidence to fall back on from SoftLayer data centers. Since 2006, SoftLayer has installed more than 400,000 patch cables in data centers around the world (using zip ties), and we've *never* encountered a fault in a network cable that was the result of a zip tie being over-tightened ... And we're not shy about tightening those ties.

The fact that nylon zip ties are cheaper than most (all?) of the other more "acceptable" options is a fringe benefit. By securing our cable bundles tightly, we keep our server racks clean and uniform:

The second myth (that cables in zip-tied bundles are more difficult to replace) is also somewhat flawed when it comes to SoftLayer's use case. Every rack is pre-wired to deliver five Ethernet cables — two public, two private and one out-of-band management — to each "rack U," which provides enough connections to support a full rack of 1U servers. If larger servers are installed in a rack, we won't need all of the network cables wired to the rack, but if those servers are ever replaced with smaller servers, we don't have to re-run network cabling. Network cables aren't exposed to the tension, pressure or environmental changes of being moved around (even when servers are moved), so external forces don't cause much wear. The most common physical "failures" of network cables are typically associated with RJ45 jack crimp issues, and those RJ45 ends are easily replaced.

Let's say a cable does need to be replaced, though. Servers in SoftLayer data centers have redundant public and private network connections, but in this theoretical example, we'll assume network traffic can only travel over one network connection and a data center technician has to physically replace the cable connecting the server to the network switch. With all of those zip ties around those cable bundles, how long do you think it would take to bring that connection back online? (Hint: That's kind of a trick question.) See for yourself:

The answer in practice is "less than one minute" ... The "trick" in that trick question is that the zip ties around the cable bundles are irrelevant when it comes to physically replacing a network connection. Data center technicians use temporary cables to make a direct server-to-switch connection, and they schedule an appropriate time to perform a permanent replacement (which actually involves removing and replacing zip ties). In the video above, we show a temporary cable being installed in about 45 seconds, and we also demonstrate the process of creating, installing and bundling a permanent network cable replacement. Even with all of those villainous zip ties, everything is done in less than 18 minutes.

Many of the comments on YouTube bemoan the idea of having to replace a single cable in one of these zip-tied bundles, but as you can see, the process isn't very laborious, and it doesn't vary significantly from the amount of time it would take to perform the same maintenance with a Velcro®-secured cable bundle.

Social network games have revolutionized the gaming industry and created an impressive footprint on the Web as a whole. 235 million people play games on Facebook every month, and some estimates say that by 2014, more than one third of Internet population will be playing social games. Given that market, it's no wonder that the vast majority of game studios, small or big, have prioritized games to be played on Facebook, Orkut, StudiVZ, VK and other social networks.

Developing and launching a game in general is not an easy task. It takes a lot of time, a lot of people, a lot of planning and a lot of assumptions. On top of those operational challenges, the social gaming market is a jungle where "survival of the fittest" is a very, VERY visible reality: One day everyone is growing tomatoes, the next they are bad guys taking over a city, and the next they are crushing candies. An army of genius developers with the most stunning designs and super-engaging game ideas can find it difficult to navigate the fickle social waters, but in the midst of all of that uncertainty, the most successful gaming studios have all avoided five of the most common mortal sins gaming companies commit when launching a social game.

SoftLayer isn't gaming studio, and we don't have any blockbuster games of our own, but we support some of the most creative and successful gaming companies in the world, so we have a ton of indirect experience and perspective on the market. In fact, leading up to GDC Europe, I was speaking with a few of the brilliant people from KUULUU — an interactive entertainment company that creates social games for leading artists, celebrities and communities — about a new Facebook game they've been working on called LINKIN PARK RECHARGE:

After learning a more about how Kuuluu streamlines the process of developing and launching a new title, I started thinking about the market in general and the common mistakes most game developers make when they release a social game. So without further ado...

The 5 Mortal Sins of Launching a Social Game

1. Infinite Focus

Treat focus as limited resource. If it helps, look at your team's cumulative capacity to focus as though it's a single cube. To dedicate focus to different parts of the game or application, you'll need to slice the cube. The more pieces you create, the thinner the slices will be, and you'll be devoting less focus to the most important pieces (which often results in worse quality). If you're diverting a significant amount of attention from building out the game's story line to perfecting the textures of a character's hair or the grass on the ground, you'll wind up with an aesthetically beautiful game that no one wants to play. Of course that example is an extreme, but it's not uncommon for game developers to fall into a less blatant trap like spending time building and managing hosting infrastructure that could better be spent tweaking and improving in-game performance.

2. Eeny, Meeny, Miny, Moe – Geographic Targeting

Don't underestimate the power of the Internet and its social and viral drivers. You might believe your game will take off in Germany, but when you're publishing to a global social network, you need to be able to respond if your game becomes hugely popular in Seoul. A few enthusiastic Tweets or wall post from the alpha-players in Korea might be the catalyst that takes your user base in the region from 1000 to 80,000 overnight to 2,000,000 in a week. With that boom in demand, you need to have the flexibility to supply that new market with the best quality service ... And having your entire infrastructure in a single facility in Europe won't make for the best user experience in Asia. Keep an eye on the traction your game has in various regions and geolocate your content closer to the markets where you're seeing the most success.

3. They Love Us, so They'll Forgive Us.

Often, a game's success can lure gaming companies into a false sense of security. Think about it in terms of the point above: 2,000,000 Koreans are trying to play your game a week after a great article is published about you, but you don't make any changes to serve that unexpected audience. What happens? Players time out, latency drags the performance of your game to a crawl, and 2,000,000 users are clicking away to play one of the other 10,000 games on Facebook or 160,000 games in a mobile appstore. Gamers are fickle, and they demand high performance. If they experience anything less than a seamless experience, they're likely to spend their time and money elsewhere. Obviously, there's a unique balance for every game: A handful of players will be understanding to the fact that you underestimated the amount of incoming requests, that you need time to add extra infrastructure or move it elsewhere to decrease latency, but even those players will get impatient when they experience lag and downtime.

KUULUU took on this challenge in an innovative, automated way. They monitor the performance of all of their games and immediately ramp up infrastructure resources to accommodate growth in demand in specific areas. When demand shifts from one of their games to another, they're able to balance their infrastructure accordingly to deliver the best end-user experience at all times.

4. We Will Be Thiiiiiiiiiiis Successful.

Don't count your chickens before the eggs hatch. You never really, REALLY know how a social game will perform when the viral factor influences a game's popularity so dramatically. Your finite plans and expectations wind up being a list of guestimations and wishes. It's great to be optimistic and have faith in your game, but you should never have to over-commit resources "just in case." If your game takes two months to get the significant traction you expect, the infrastructure you built to meet those expectations will be underutilized for two months. On the other hand, if your game attracts four times as many players as you expected, you risk overburdening your resources as you scramble to build out servers. This uncertainty is one of the biggest drivers to cloud computing, and it leads us to the last mortal sin of launching a social game ...

5. Public Cloud Is the Answer to Everything.

To all those bravados who feel they are the master of cloud and see it as an answer to all their problems please, for your fans sake, remember the cloud has more than one flavor. Virtual instances in a public cloud environment can be provisioned within minutes are awesome for your webservers, but they may not perform well for your databases or processor-intensive requirements. KUULUU chose to incorporate bare metal cloud into a hybrid environment where a combination of virtual and dedicated resources work together to provide incredible results:

Avoiding these five mortal sins doesn't guarantee success for your social game, but at the very least, you'll sidestep a few common landmines. For more information on KUULUU's success with SoftLayer, check out this case study.

As a technical support specialist at SoftLayer, I work with new customers regularly, and after fielding a lot of the same kinds of questions about setting up a new server, I thought I'd put together a quick guide that could be a resource for other new customers who are interested in implementing a few best practices when it comes to setting up and securing a new server. This documentation is based on my personal server setup experience and on the experience I've had helping customers with their new servers, so it shouldn't be considered an exhaustive, authoritative guide ... just a helpful and informative one.

Protect Your Data

First and foremost, configure backups for your server. The server is worthless without your data. Data is your business. An old adage says, "It's better to have and not need, than to need and not have." Imagine what would happen to your business if you lost just some of your data. There's no excuse for neglecting backup when configuring your new server. SoftLayer does not backup your server, but SoftLayer offers several options for data protection and backup to fit any of your needs.

Control panels like cPanel and Plesk include backup functionality and can be configured to automatically backup regularly an FTP/NAS account. Configure backups now, before doing anything else. Before migrating or copying your data to the server. This first (nearly empty) backup will be quick. Test the backup by restoring the data. If your server has RAID, it important to remember that RAID is not backup!

Use Strong Passwords

I've seen some very week and vulnerable password on customers' servers. SoftLayer sets a random, complex password on every new server that is provisioned. Don't change it to a weak password using names, birthdays and other trivia that can be found or guessed easily. Remember, a strong password doesn't have to be a complicated one: xkcd: Password Strength

Write down your passwords: "If I write them down and then protect the piece of paper — or whatever it is I wrote them down on — there is nothing wrong with that. That allows us to remember more passwords and better passwords." "We're all good at securing small pieces of paper. I recommend that people write their passwords down on a small piece of paper, and keep it with their other valuable small pieces of paper: in their wallet." Just don't use any of these passwords.
I've gone electronic and use 1Password and discovered just how many passwords I deal with. With such strong, random passwords, you don't have to change your password frequently, but if you have to, you don't have to worry about remembering the new one or updating all of your notes. If passwords are too much of a hassle ...

Or Don't Use Passwords

One of the wonderful things of SSH/SFTP on Linux/FreeBSD is that SSH-keys obviate the problem of passwords. Mac and Linux/FreeBSD have an SSH-client installed by default! There are a lot of great SSH clients available for every medium you'll use to access your server. For Windows, I recommend PuTTY, and for iOS, Panic Prompt.

Firewall

Firewalls block network connections. Configuring a firewall manually can get very complicated, especially when involving protocols like FTP which opens random ports on either the client or the server. A quick way to deal with this is to use the system-config-securitylevel-tui tool. Or better, use a firewall front end such as APF or CSF. These tools also simplify blocking or unblocking IPs.

Firewall

Allow

Block

Unblock

APF

apf -a <IP>

apf -d <IP>

apf -u <IP>

CSF*

csf -a <IP>

csf -d <IP>

csf -dr <IP>

*CSF has a handy search command: csf -g <IP>.

SoftLayer customers should be sure to allow SoftLayer IP ranges through the firewall so we can better support you when you have questions or need help. Beyond blocking and allowing IP addresses, it's also important to lock down the ports on your server. The only open ports on your system should be the ones you want to use. Here's a quick list of some of the most common ports:

cPanel ports

2078 - webDisk

2083 - cPanel control panel

2087 - WHM control panel

2096 - webmail

Other

22 - SSH (secure shell - Linux)

53 - DNS name servers

3389 - RDP (Remote Desktop Protocol - Windows)

8443 - Plesk control panel

Mail Ports

25 - SMTP

110 - POP3

143 - IMAP

465 - SMTPS

993 - IMAPS

995 - POP3S

Web Server Ports

80 - HTTP

443 - HTTPS

DNS

DNS is a naming system for computers and services on the Internet. Domain names like "softlayer.com" and "manage.softlayer.com" are easier to remember than IP address like 66.228.118.53 or even 2607:f0d0:1000:11:1::4 in IPv6. DNS looks up a domain's A record (AAAA record for IPv6), to retrieve its IP address. The opposite of an A record is a PTR record: PTR records resolve an IP address to a domain name.

Hostname
A hostname is the human-readable label you assign of your server to help you differentiate it from your other devices. A hostname should resolve to its server's main IP address, and the IP should resolve back to the hostname via a PTR record. This configuration is extremely important for email ... assuming you don't want all of your emails rejected as spam.

Avoid using "www" at the beginning of a hostname because it may conflict with a website on your server. The hostnames you choose don't have to be dry or boring. I've seen some pretty awesome hostname naming conventions over the years (Simpsons characters, Greek gods, superheros), so if you aren't going to go with a traditional naming convention, you can get creative and have some fun with it. A server's hostname can be changed in the customer portal and in the server's control panel. In cPanel, the hostname can be easily set in "Networking Setup". In Plesk, the hostname is set in "Server Preferences". Without a control panel, you can update the hostname from your operating system (ex. RedHat, Debian)

A Records
If you buy your domain name from SoftLayer, it is automatically added to our nameservers, but if your domain was registered externally, you'll need to go through a few additional steps to ensure your domain resolves correctly on our servers. To include your externally-registered domain on our DNS, you should first point it at our nameservers (ns1.softlayer.com, ns2.softlayer.com). Next, Add a DNS Zone, then add an A record corresponding to the hostname you chose earlier.

PTR Records
Many ISPs configure their servers that receive email to lookup the IP address of the domain in a sender's email address (a reverse DNS check) to see that the domain name matches the email server's host name. You can look up the PTR record for your IP address. In Terminal.app (Mac) or Command Prompt (Windows), type "nslookup" command followed by the IP. If the PTR doesn't match up, you can change the PTR easily.

SSL Certificates

Getting an SSL certificate for your site is optional, but it has many benefits. The certificates will assure your customers that they are looking at your site securely. SSL encrypts passwords and data sent over the network. Any website using SSL Certificates should be assigned its own IP address. For more information, we have a great KnowledgeLayer article about planning ahead for an SSL, and there's plenty of documentation on how to manage SSL certificates in cPanel and Plesk.

Move In!

Now that you've prepared your server and protected your data, you are ready to migrate your content to its new home. Be proactive about monitoring and managing your server once it's in production. These tips aren't meant to be a one-size-fits-all, "set it and forget it" solution; they're simply important aspects to consider when you get started with a new server. You probably noticed that I alluded to control panels quite a few times in this post, and that's for good reason: If you don't feel comfortable with all of the ins and outs of server administration, control panels are extremely valuable resources that do a lot of the heavy lifting for you.

If you have any questions about setting up your new server or you need any help with your SoftLayer account, remember that we're only a phone call away!

This guest blog was contributed by William Rocca of OS NEXUS. OS NEXUS makes the Quantastor Software Defined Storage platform designed to tackle the storage challenges facing cloud computing, Big Data and high performance applications.

Over the last decade, the creation and popularization of SAN/NAS systems simplified the management of storage into a single appliance so businesses could efficiently share, secure and manage data centrally. Fast forward about 10 years in storage innovation, and we're now rapidly changing from a world of proprietary hardware sold by big-iron vendors to open-source, scale-out storage technologies from software-only vendors that make use of commodity off-the-shelf hardware. Some of the new technologies are derivatives of traditional SAN/NAS with better scalability while others are completely new. Object storage technologies such as OpenStack SWIFT have created a foundation for whole new types of applications, and big data technologies like MongoDB, Riak and Hadoop go even further to blur the lines between storage and compute. These innovations provide a means for developing next-generation applications that can collect and analyze mountains of data. This is the exciting frontier of open storage today.

This frontier looks a lot like the "Wild West." With ad-hoc solutions that have great utility but are complex to setup and maintain, many users are effectively solving one-off problems, but these solutions are often narrowly defined and specifically designed for a particular application. The question everyone starts asking is, "Can't we just evolve to having one protocol ... one technology that unites them all?"

If each of these data storing technologies have unique advantages for specific use cases or applications, the answer isn't to eliminate protocols. To borrow a well-known concept from Physics, the solution lies in a "Unified Field Theory of Storage" — weaving them together into a cohesive software platform that makes them simple to deploy, maintain and operate.

When you look at the latest generation of storage technologies, you'll notice a common thread: They're all highly-available, scale-out, open-source and serve as a platform for next-generation applications. While SAN/NAS storage is still the bread-and-butter enterprise storage platform today (and will be for some time to come) these older protocols often don't measure up to the needs of applications being developed today. They run into problems storing, processing and gleaning value out of the mountains of data we're all producing.

Thinking about these challenges, how do we make these next-generation open storage technologies easy to manage and turn-key to deploy? What kind of platform could bring them all together? In short, "What does the 'Unified Field Theory of Storage' look like?"

These are the questions we've been trying to answer for the last few years at OS NEXUS, and the result of our efforts is the QuantaStor Software Defined Storage platform. In its first versions, we focused on building a flexible foundation supporting the traditional SAN/NAS protocols but with the launch of QuantaStor v3 this year, we introduced the first scale-out version of QuantaStor and integrated the first next-gen open storage technology, Gluster, into the platform. In June, we launched support of ZFS on Linux (ZoL), and enhanced the platform with a number of advanced enterprise features, such as snapshots, compression, deduplication and end-to-end checksums.

This is just the start, though. In our quest to solve the "Unified Field Theory of Storage," we're turning our eyes to integrating platforms like OpenStack SWIFT and Hadoop in QuantaStor v4 later this year, and as these high-power technologies are streamlined under a single platform, end users will have the ability to select the type(s) of storage that best fit a given application without having to learn (or unlearn) specific technologies.

The "Unified Field Theory of Storage" is emerging, and we hope to make it downloadable. Visit OSNEXUS.com to keep an eye on our progress. If you want to incorporate QuantaStor into your environment, check out SoftLayer's preconfigured QuantaStor Mass Storage Server solution.

Believe it or not, "cloud computing" concepts date back to the 1950s when large-scale mainframes were made available to schools and corporations. The mainframe's colossal hardware infrastructure was installed in what could literally be called a "server room" (since the room would generally only be able to hold a single mainframe), and multiple users were able to access the mainframe via "dumb terminals" – stations whose sole function was to facilitate access to the mainframes. Due to the cost of buying and maintaining mainframes, an organization wouldn't be able to afford a mainframe for each user, so it became practice to allow multiple users to share access to the same data storage layer and CPU power from any station. By enabling shared mainframe access, an organization would get a better return on its investment in this sophisticated piece of technology.

A couple decades later in the 1970s, IBM released an operating system called VM that allowed admins on their System/370 mainframe systems to have multiple virtual systems, or "Virtual Machines" (VMs) on a single physical node. The VM operating system took the 1950s application of shared access of a mainframe to the next level by allowing multiple distinct compute environments to live in the same physical environment. Most of the basic functions of any virtualization software that you see nowadays can be traced back to this early VM OS: Every VM could run custom operating systems or guest operating systems that had their "own" memory, CPU, and hard drives along with CD-ROMs, keyboards and networking, despite the fact that all of those resources would be shared. "Virtualization" became a technology driver, and it became a huge catalyst for some of the biggest evolutions in communications and computing.

In the 1990s, telecommunications companies that had historically only offered single dedicated point–to-point data connections started offering virtualized private network connections with the same service quality as their dedicated services at a reduced cost. Rather than building out physical infrastructure to allow for more users to have their own connections, telco companies were able to provide users with shared access to the same physical infrastructure. This change allowed the telcos to shift traffic as necessary to allow for better network balance and more control over bandwidth usage. Meanwhile, virtualization for PC-based systems started in earnest, and as the Internet became more accessible, the next logical step was to take virtualization online.

If you were in the market to buy servers ten or twenty years ago, you know that the costs of physical hardware, while not at the same level as the mainframes of the 1950s, were pretty outrageous. As more and more people expressed demand to get online, the costs had to come out of the stratosphere, and one of the ways that was made possible was by ... you guessed it ... virtualization. Servers were virtualized into shared hosting environments, Virtual Private Servers, and Virtual Dedicated Servers using the same types of functionality provided by the VM OS in the 1950s. As an example of what that looked like in practice, let's say your company required 13 physical systems to run your sites and applications. With virtualization, you can take those 13 distinct systems and split them up between two physical nodes. Obviously, this kind of environment saves on infrastructure costs and minimizes the amount of actual hardware you would need to meet your company's needs.

As the costs of server hardware slowly came down, more users were able to purchase their own dedicated servers, and they started running into a different kind of problem: One server isn't enough to provide the resources I need. The market shifted from a belief that "these servers are expensive, let's split them up" to "these servers are cheap, let's figure out how to combine them." Because of that shift, the most basic understanding of "cloud computing" was born online. By installing and configuring a piece of software called a hypervisor across multiple physical nodes, a system would present all of the environment's resources as though those resources were in a single physical node. To help visualize that environment, technologists used terms like "utility computing" and "cloud computing" since the sum of the parts seemed to become a nebulous blob of computing resources that you could then segment out as needed (like telcos did in the 90s). In these cloud computing environments, it became easy add resources to the "cloud": Just add another server to the rack and configure it to become part of the bigger system.

As technologies and hypervisors got better at reliably sharing and delivering resources, many enterprising companies decided to start carving up the bigger environment to make the cloud's benefits to users who don't happen to have an abundance of physical servers available to create their own cloud computing infrastructure. Those users could order "cloud computing instances" (also known as "cloud servers") by ordering the resources they need from the larger pool of available cloud resources, and because the servers are already online, the process of "powering up" a new instance or server is almost instantaneous. Because little overhead is involved for the owner of the cloud computing environment when a new instance is ordered or cancelled (since it's all handled by the cloud's software), management of the environment is much easier. Most companies today operate with this idea of "the cloud" as the current definition, but SoftLayer isn't "most companies."

SoftLayer took the idea of a cloud computing environment and pulled it back one more step: Instead of installing software on a cluster of machines to allow for users to grab pieces, we built a platform that could automate all of the manual aspects of bringing a server online without a hypervisor on the server. We call this platform "IMS." What hypervisors and virtualization do for a group of servers, IMS does for an entire data center. As a result, you can order a bare metal server with all of the resources you need and without any unnecessary software installed, and that server will be delivered to you in a matter of hours. Without a hypervisor layer between your operating system and the bare metal hardware, your servers perform better. Because we automate almost everything in our data centers, you're able to spin up load balancers and firewalls and storage devices on demand and turn them off when you're done with them. Other providers have cloud-enabled servers. We have cloud-enabled data centers.

IBM and SoftLayer are leading the drive toward wider adoption of innovative cloud services, and we have ambitious goals for the future. If you think we've come a long way from the mainframes of the 1950s, you ain't seen nothin' yet.

When Sun Microsystems VP John Gage coined the phrase, "The network is the computer," the idea was more wishful thinking than it was profound. At the time, personal computers were just starting to show up in homes around the country, and most users were getting used to the notion that "The computer is the computer." In the '80s, the only people talking about networks were the ones selling network-related gear, and the idea of "the network" was a little nebulous and vaguely understood. Fast-forward a few decades, and Gage's assertion has proven to be prophetic ... and it happens to explain one of SoftLayer's biggest differentiators.

SoftLayer's hosting platform features an innovative, three-tier network architecture: Every server in a SoftLayer data center is physically connected to public, private and out-of-band management networks. This "network within a network" topology provides customers the ability to build out and manage their own global infrastructure without overly complex configurations or significant costs, but the benefits of this setup are often overlooked. To best understand why this network architecture is such a game-changer, let's examine each of the network layers individually.

Public Network

When someone visits your website, they are accessing content from your server over the public network. This network connection is standard issue from every hosting provider since your content needs to be accessed by your users. When SoftLayer was founded in 2005, we were the first hosting provider to provide multiple network connections by default. At the time, some of our competitors offered one-off private network connections between servers in a rack or a single data center phase, but those competitors built their legacy infrastructures with an all-purpose public network connection. SoftLayer offers public network connection speeds up to 10Gbps, and every bare metal server you order from us includes free inbound bandwidth and 5TB of outbound bandwidth on the public network.

Private Network

When you want to move data from one server to another in any of SoftLayer's data centers, you can do so quickly and easily over the private network. Bandwidth between servers on the private network is unmetered and free, so you don't incur any costs when you transfer files from one server to another. Having a dedicated private network allows you to move content between servers and facilities without fighting against or getting in the way of the users accessing your server over the public network.

It should come as no surprise to learn that all private network traffic stays on SoftLayer's network exclusively when it travels between our facilities. The blue lines in this image show how the private network connects all of our data centers and points of presence:

To fully replicate the functionality provided by the SoftLayer private network, competitors with legacy single-network architecture would have to essentially double their networking gear installation and establish safeguards to guarantee that customers can only access information from their own servers via the private network. Because that process is pretty daunting (and expensive), many of our competitors have opted for "virtual" segmentation that logically links servers to each other. The traffic between servers in those "virtual" private networks still travels over the public network, so they usually charge you for "private network" bandwidth at the public bandwidth rate.

Out-of-Band Management Network

When it comes to managing your server, you want an unencumbered network connection that will give you direct, secure access when you need it. Splitting out the public and private networks into distinct physical layers provides significant flexibility when it comes to delivering content where it needs to go, but we saw a need for one more unique network layer. If your server is targeted for a denial of service attack or a particular ISP fails to route traffic to your server correctly, you're effectively locked out of your server if you don't have another way to access it. Our management-specific network layer uses bandwidth providers that aren't included in our public/private bandwidth mix, so you're taking a different route to your server, and you're accessing the server through a dedicated port.

If you've seen pictures or video from a SoftLayer data center (or if you've competed in the Server Challenge), you probably noticed the three different colors of Ethernet cables connected at the back of every server rack, and each of those colors carries one of these types of network traffic exclusively. The pink/red cables carry public network traffic, the blue cables carry private network traffic, and the green cables carry out-of-band management network traffic. All thirteen of our data centers have the same colored cables in the same configuration doing the same jobs, so we're able to train our operations staff consistently between all thirteen of our data centers. That consistency enables us to provide quicker service when you need it, and it lessens the chance of human error on the data center floor.

The most powerful server on the market can be sidelined by a poorly designed, inefficient network. If "the network is the computer," the network should be a primary concern when you select your next hosting provider.

At Cloud World Forum in London, I did an interview with Rachel Downie of CloudMovesTV, and she asked some fantastic questions (full interview embedded a the bottom of this post). One that particularly jumped out to me was, "Does North America have a technology and talent advantage over Europe?" I've posted some thoughts on this topic on the SoftLayer Blog in the past, but I thought I'd reflect on the topic again after six months of traveling across Europe and the Middle East talking with customers, partners and prospects.

I was born just north of Silicon Valley in a little bohemian village called San Francisco. I earned a couple of trophies (and even more battle scars) during the original dot-com boom, so much of my early career was spent in an environment bursting at the seams with entrepreneurs and big ideas. The Valley tends to get most of the press (and all of the movie contracts), so it's easy to assume that the majority of the world's innovation is happening around there. I have first-hand experience that proves that assumption wrong. The talent level, motivation, innovation, technology and desire to make a difference is just as strong, if not stronger, in Europe and the Middle East as it is in the high-profile startup scenes in New York City or San Francisco. And given the level of complexity due to the cultural and language differences, I would argue the innovation that happens in the Middle East and Europe tends to incorporate more flexibility and global scalability earlier than its North American counterparts.

A perfect example of this type of innovation is the ad personalization platform that London-based Struq created. Earlier this year, I presented with Struq CTO Aaron McKee during the TFM&A (Technology For Marketing and Advertising) show in London about how cloud computing helps their product improve online customer dialogue, and I was stunned by how uniquely and efficiently they were able to leverage the cloud to deliver meaningful, accurate results to their customers. Their technology profiles customers, matches them to desired brands, checks media relevance and submits an ad unit target price to auction. If there is a match, Struq then serves a hyper-relevant message to that customer. And all of that in about 25 milliseconds and is happening at scale (over two billion transactions per day). Add in the fact that they serve several different cultures and languages, and you start to understand the work that went into creating this kind of platform. Watch out Valley Boyz and Girlz, they're expanding into the US.

One data point of innovation and success doesn't mean a whole lot, but Struq's success isn't unique. I just got back from Istanbul where I spent some time with Peak Games to learn more about how they became the 3rd largest social gaming company in the world and what SoftLayer could to to help support their growth moving forward. Peak Games, headquartered in Turkey, is on an enviable growth trajectory, and much of their success has come from their lean, focused operations model and clear goals. With more than 30 million customers, it's clear that the team at Peak Games built a phenomenal platform (and some really fun games). Ten years ago, a development team from Turkey may have had to move into a cramped, expensive house in Palo Alto to get the resources and exposure they needed to reach a broader audience, but with the global nature of cloud computing, the need to relocate to succeed is antiquated.

I met a wild-eyed entrepreneur at another meeting in Istanbul who sees exactly what I saw. The region is full of brilliant developers and creative entrepreneurs, so he's on a mission to build out a more robust startup ecosystem to help foster the innovation potential of the region. I've met several people in different countries doing the same thing, but one thing that struck me as unusual about this vision was that he did not say anything about being like Silicon Valley. He almost laughed at me when I asked him about that, and he explained that he wanted his region to be better than Silicon Valley and that his market has unique needs and challenges that being "like Silicon Valley" wouldn't answer. North America is a big market, but it's one of many!

The startups and gaming companies I mentioned get a lot of the attention because they're fun and visible, but the unsung heroes of innovation, the intraprenuers (people who behave like entrepreneurs within large organizations), are the clear and powerful heartbeat of the talent in markets outside of North America. These people are not driven by fame and fortune ... They just want to build innovative products because they can. A mad scientist from one of the largest consumer products firms in the world, based in the EU, just deployed a couple of servers to build an imaging ecosystem that is pushing the limits of technology to improve human health. Another entrepreneur at a large global media company is taking a Mobile First methodology to develop a new way to distribute and consume media in the emerging cross-platform marketplace. These intrapreneurs might not live in Palo Alto or Santa Clara, but they're just as capable to change the world.

Silicon Valley still produces inspiring products and groundbreaking technology, but the skills and expertise that went into those developments aren't confined by borders. To all you innovators across the globe building the future, respect. Working with you is my favorite part of the job.

In December, I posted a MongoDB performance analysis that showed the quantitative benefits of using bare metal servers for MongoDB workloads. It should come as no surprise that in the wake of SoftLayer's Riak launch, we've got some similar data to share about running Riak on bare metal.

To run this test, we started by creating five-node clusters with Riak 1.3.1 on SoftLayer bare metal servers and on a popular competitor's public cloud instances. For the SoftLayer environment, we created these clusters using the Riak Solution Designer, so the nodes were all provisioned, configured and clustered for us automatically when we ordered them. For the public cloud virtual instance Riak cluster, each node was provisioned indvidually using a Riak image template and manually configured into a cluster after all had come online. To optimize for Riak performance, I made a few tweaks at the OS level of our servers (running CentOS 64-bit):

Noatime
Nodiratime
barrier=0
data=writeback
ulimit -n 65536

The common Noatime and Nodiratime settings eliminate the need for writes during reads to help performance and disk wear. The barrier and writeback settings are a little less common and may not be what you'd normally set. Although those settings present a very slight risk for loss of data on disk failure, remember that the Riak solution is deployed in five-node rings with data redundantly available across multiple nodes in the ring. With that in mind and considering each node also being deployed with a RAID10 storage array, you can see that the minor risk for data loss on the failure of a single disk in the entire solution would have no impact on the entire data set (as there are plenty of redundant copies for that data available). Given the minor risk involved, the performance increases of those two settings justify their use.

With all of the nodes tweaked and configured into clusters, we set up Basho's test harness — Basho Bench — to remotely simulate load on the deployments. Basho Bench allows you to create a configurable test plan for a Riak cluster by configuring a number of workers to utilize a driver type to generate load. It comes packaged as an Erlang application with a config file example that you can alter to create the specifics for the concurrency, data set size, and duration of your tests. The results can be viewed as CSV data, and there is an optional graphics package that allows you to generate the graphs that I am posting in this blog. A simplified graphic of our test environment would look like this:

You may notice that in the test cases that use SoftLayer "Medium" Servers, the virtual provider nodes are running 26 virtual compute units against our dual proc hex-core servers (12 cores total). In testing with Riak, memory is important to the operations than CPU resources, so we provisioned the virtual instances to align with the 36GB of memory in each of the "Medium" SoftLayer servers. In the public cloud environment, the higher level of RAM was restricted to packages with higher CPU, so while the CPU counts differ, the RAM amounts are as close to even as we could make them.

One final "housekeeping" note before we dive into the results: The graphs below are pulled directly from the optional graphics package that displays Basho Bench results. You'll notice that the scale on the left-hand side of graphs differs dramatically between the two environments, so a cursory look at the results might not tell the whole story. Click any of the graphs below for a larger version. At the end of each test case, we'll share a few observations about the operations per second and latency results from each test. When we talk about latency in the "key observation" sections, we'll talk about the 99th percentile line — 99% of the results had latency below this line. More simply you could say, "This is the highest latency we saw on this platform in this test." The primary reason we're focusing on this line is because it's much easier to read on the graphs than the mean/median lines in the bottom graphs.

The SoftLayer environment showed much more consistency in operations per second with an average throughput around 450 Op/sec. The virtual environment throughput varied significantly between about 50 operations per second to more than 600 operations per second with the trend line fluctuating slightly between about 220 Op/sec and 350 Op/sec.

Comparing the latency of get and put requests, the 99th percentile of results in the SoftLayer environment stayed around 50ms for gets and under 200ms for puts while the same metric for the virtual environment hovered around 800ms in gets and 4000ms in puts. The scale of the graphs is drastically different, so if you aren't looking closely, you don't see how significantly the performance varies between the two.

Similar to the results of Test 1, the throughput numbers from the bare metal environment are more consistent (and are consistently higher) than the throughput results from the virtual instance environment. The SoftLayer environment performed between 1500 and 1750 operations per second on average while the virtual provider environment averaged around 1200 operations per second throughout the test.

The latency of get and put requests in Test 2 also paints a similar picture to Test 1. The 99th percentile of results in the SoftLayer environment stayed below 50ms and under 400ms for puts while the same metric for the virtual environment averaged about 250ms in gets and over 1000ms in puts. Latency in a big data application can be a killer, so the results from the virtual provider might be setting off alarm bells in your head.

In Test 3, we're using the same specs in our virtual provider nodes, so the results for the virtual node environment are the same in Test 3 as they are in Test 2. In this Test, the SoftLayer environment substitutes SSD hard drives for the 15K SAS drives used in Test 2, and the throughput numbers show the impact of that improved I/O. The average throughput of the bare metal environment with SSDs is between 1750 and 2000 operations per second. Those numbers are slightly higher than the SoftLayer environment in Test 2, further distancing the bare metal results from the virtual provider results.

The latency of gets for the SoftLayer environment is very difficult to see in this graph because the latency was so low throughout the test. The 99th percentile of puts in the SoftLayer environment settled between 500ms and 625ms, which was a little higher than the bare metal results from Test 2 but still well below the latency from the virtual environment.

Summary

The results show that — similar to the majority of data-centric applications that we have tested — Riak has more consistent, better performing, and lower latency results when deployed onto bare metal instead of a cluster of public cloud instances. The stark differences in consistency of the results and the latency are noteworthy for developers looking to host their big data applications. We compared the 99th percentile of latency, but the mean/median results are worth checking out as well. Look at the mean and median results from the SoftLayer SSD Node environment: For gets, the mean latency was 2.5ms and the median was somewhere around 1ms. For puts, the mean was between 7.5ms and 11ms and the median was around 5ms. Those kinds of results are almost unbelievable (and that's why I've shared everything involved in completing this test so that you can try it yourself and see that there's no funny business going on).

It's commonly understood that local single-tenant resources that bare metal will always perform better than network storage resources, but by putting some concrete numbers on paper, the difference in performance is pretty amazing. Virtualizing on multi-tenant solutions with network attached storage often introduces latency issues, and performance will vary significantly depending on host load. These results may seem obvious, but sometimes the promise of quick and easy deployments on public cloud environments can lure even the sanest and most rational developer. Some applications are suited for public cloud, but big data isn't one of them. But when you have data-centric apps that require extreme I/O traffic to your storage medium, nothing can beat local high performance resources.

With the global economy in its current state, it's more important than ever to help inspired value-creators acquire the tools needed to realize their ideas, effect change in the world, and create impact — now. I've had the privilege of working with hundreds of young, innovative companies through Catalyst and our relationships with startup accelerators, incubators and competitions, and I've noticed that the best way for entrepreneurs to create change is to simply let them play! Stick them in a sandbox with a wide variety of free products and services that they can use however they want so that they may find the best method of transitioning from idea to action.

Any attention that entrepreneurs divert from their core business ideas is wasted attention, so the most successful startup accelerators build a bridge for entrepreneurs to the resources they need — from access to hosting service, investors, mentors, and corporate partners to recommendations about summer interns and patent attorneys. That all sounds good in theory, and while it's extremely difficult to bring to reality, startup-focused organizations like MassChallenge make it look easy.

During a recent trip to Boston, I was chatting with Kara Shurmantine and Jibran Malek about what goes on behind the scenes to truly empower startups and entrepreneurs, and they gave me some insight. Startups' needs are constantly shifting, changing and evolving, so MassChallenge prioritizes providing a sandbox chock-full of the best tools and toys to help make life easier for their participants ... and that's where SoftLayer helps. With Kara and Jibran, I got in touch with a few MassChallenge winners to get some insight into their experience from the startup side.

Tish Scolnik, the CEO of Global Research Innovation & Technology (GRIT), described the MassChallenge experience perfectly: "You walk in and you have all these amazing opportunities in front of you, and then in a pretty low pressure environment you can decide what you need at a specific moment." Tish calls it a "buffet table" — an array of delectable opportunities, some combination of which will be the building blocks of a startup's growth curve. Getting SoftLayer products and services for free (along with a plethora of other valuable resources) has helped GRIT create a cutting-edge wheelchair for disabled people in developing countries.

The team from Neumitra, a Silver Winner of MassChallenge 2012, chose to use SoftLayer as an infrastructure partner, and we asked co-founder Rob Goldberg about his experience. He explained that his team valued the ability to choose tools that fit their ever-changing and evolving needs. Neumitra set out to battle stress — the stress you feel every day — and they've garnered significant attention while doing so. With a wearable watch, Neumitra's app tells you when your stress levels are too high and you need to take a break.

Jordan Fliegal, the founder and CEO of CoachUp, another MassChallenge winner, also benefited from playing around in the sandbox. This environment, he says, is constantly "giving to you and giving to you and giving to you without asking for anything in return other than that you work hard and create a company that makes a difference." The result? CoachUp employs 20 people, has recruited thousands of judges, and has raised millions in funding — and is growing at breakneck speed.

If you give inspired individuals a chance and then give them not only the resources that they need, but also a diverse range of resources that they could need, you are guaranteed to help create global impact.

In my Breaking Down 'Big Data' – Database Models, I briefly covered the most common database models, their strengths, and how they handle the CAP theorem — how a distributed storage system balances demands of consistency and availability while maintaining partition tolerance. Here's what I said about Dynamo-inspired databases:

What They Do: Distributed key/value stores inspired by Amazon's Dynamo paper. A key written to a dynamo ring is persisted in several nodes at once before a successful write is reported. Riak also provides a native MapReduce implementation.Horizontal Scaling: Dynamo-inspired databases usually provide for the best scale and extremely strong data durability.CAP Balance: Prefer availability over consistencyWhen to Use: When the system must always be available for writes and effectively cannot lose data.Example Products: Cassandra, Riak, BigCouch

This type of key/value store architecture is very unique from the document-oriented MongoDB solutions we launched at the end of last year, so we worked with Basho to prioritize development of high-performance Riak solutions on our global platform. Since you already know about MongoDB, let's take a few minutes to meet the new kid on the block.

Riak is a distributed database architected for availability, fault tolerance, operational simplicity and scalability. Riak is masterless, so each node in a Riak cluster is the same and contains a complete, independent copy of the Riak package. This design makes the Riak environment highly fault tolerant and scalable, and it also aids in replication — if a node goes down, you can still read, write and update data.

As you approach the daunting prospect of choosing a big data architecture, there are a few simple questions you need to answer:

How much data do/will I have?

In what format am I storing my data?

How important is my data?

Riak may be the choice for you if [1] you're working with more than three terabytes of data, [2] your data is stored in multiple data formats, and [3] your data must always be available. What does that kind of need look like in real life, though? Luckily, we've had a number of customers kick Riak's tires on SoftLayer bare metal servers, so I can share a few of the use cases we've seen that have benefited significantly from Riak's unique architecture.

Use Case 1 – Digital Media
An advertising company that serves over 10 billion ads per month must be able to quickly deliver its content to millions of end users around the world. Meeting that demand with relational databases would require a complex configuration of expensive, vertically scaled hardware, but it can be scaled out horizontally much easier with Riak. In a matter of only a few hours, the company is up and running with an ad-serving infrastructure that includes a back-end Riak cluster in Dallas with a replication cluster in Singapore along with an application tier on the front end with Web servers, load balancers and CDN.

Use Case 2 – E-commerce
An e-commerce company needs 100-percent availability. If any part of a customer's experience fails, whether it be on the website or in the shopping cart, sales are lost. Riak's fault tolerance is a big draw for this kind of use case: Even if one node or component fails, the company's data is still accessible, and the customer's user experience is uninterrupted. The shopping cart structure is critical, and Riak is built to be available ... It's a perfect match.

As an additional safeguard, the company can take advantage of simple multi-datacenter replication in their Riak Enterprise environment to geographically disperse content closer to its customers (while also serving as an important tool for disaster recovery and backup).

Use Case 3 – Gaming
With customers like Broken Bulb and Peak Games, SoftLayer is no stranger to the gaming industry, so it should come as no surprise that we've seen interesting use cases for Riak from some of our gaming customers. When a game developer incorporated Riak into a new game to store player data like user profiles, statistics and rankings, the performance of the bare metal infrastructure blew him away. As a result, the game's infrastructure was redesigned to also pull gaming content like images, videos and sounds from the Riak database cluster. Since the environment is so easy to scale horizontally, the process on the infrastructure side took no time at all, and the multimedia content in the game is getting served as quickly as the player data.

Databases are common bottlenecks for many applications, but they don't have to be. Making the transition from scaling vertically (upgrading hardware, adding RAM, etc.) to scaling horizontally (spreading the work intelligently across multiple nodes) alleviates many of the pain points for a quickly growing database environment. Have you made that transition? If not, what's holding you back? Have you considered implementing Riak?