
Fourth of July – Independence Day is more than just a day for us to hang out with friends and family across the United States, gather around the BBQ, and watch the fireworks light up the sky. It is a day that we celebrate our founding fathers’ courage and bravery in the pursuit of liberty and freedom.

If it weren’t for these men and their dreams, I would not be sitting here at SoftLayer writing this blog for a company that encourages us to share our words and views with others. I have been amazed at how, over the last few weeks, Twitter and other sites have helped the people of Iran make their voices heard and let the world know what is going on over there. We would never have known otherwise, as their government would not allow it to be voiced on state-run television.

So, as I am camping this Fourth of July in the San Juan Islands, fishing on the lake and watching the skies over Friday Harbor light up, I will be thankful for what our founding fathers accomplished on that day in 1776.

I recently bought a new computer for my wife. Being a developer and a former hardware engineering student, I opted to buy the parts and assemble the machine ourselves. Actually assembling a computer these days doesn't take too long; it's the software that really gets you. Windows security updates, driver packs, incompatibilities, inconsistencies, broken websites, and just plain bad code plagued me for most of the night. The video card, in particular, has a “known issue” where it just “uh-oh” turns off the monitor when Windows starts. The issue was first reported in March of 2006 and has yet to be fixed.

This is why SoftLayer always tests and verifies the configurations we offer. We don't make the end user discover on their own that Debian doesn't work on Nehalems; we install it first to be sure. This is also why our order forms prevent customers from ordering pre-installed software that is incompatible with the rest of the order. We want to make sure that customers avoid the frustration of ordering things only to find out later that they don't work together.

The problem with desktop computers, especially for people who are particular about their configurations, is that you cannot buy a pre-configured machine where all the parts are exactly what you want. We attempted to get a computer from Dell and HP, but neither company would even display all the specifications we were interested in, never mind actually having the parts we desired. Pre-built systems usually skimp on important things like the motherboard or the power supply, giving you very little room to upgrade.

At SoftLayer, we don't cut corners on our systems, and we ensure that each customer can upgrade as high as they possibly can. Each machine type can support more RAM and hard drives than the default level, and we normally have spare machines handy at all levels so that once you outgrow the expansion capabilities of your current box, you can move to a new system type. If you're thinking of getting a dedicated server, but you're worried about the cost, visit the SoftLayer Outlet Store and start small. We have single-core Pentium Ds in the outlet store, and you can upgrade from there until you're running a 24-core Xeon system.

What is normal for a server? In support we get that question from time to time. The problem is that normal varies from server to server. A load average of 200 is probably not normal but a load of 5 to 10 very well could be normal, depending on the server's application. What to do?

Baselining to the rescue. The idea behind baselining is to get performance numbers on your application when things are "normal" so that you have solid math to indicate when things are not "normal".

What makes a good baseline? Things like RAM use (overall, per process, rate of change), number and types of processes running, processor usage, disk usage (total, per app), disk speed and network utilization are all good OS metrics. You can also get metrics from your application. E-mails per hour, web page generation time, and number of users logged in are good to know.
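As a sketch, a few of those OS metrics can be grabbed straight from the Python standard library on a Unix box (a minimal illustration only; a real baseline would also record per-process RAM, rate of change, disk I/O, network utilization, and application metrics such as e-mails per hour):

```python
import os
import shutil
import time

def capture_baseline(path="/"):
    """Snapshot a few OS metrics (Unix only).

    A minimal sketch: real baselines should cover far more than
    load and disk usage, as described above.
    """
    load_1m, load_5m, load_15m = os.getloadavg()   # run-queue load averages
    disk = shutil.disk_usage(path)                 # total/used/free bytes
    return {
        "timestamp": time.time(),
        "load_1m": load_1m,
        "load_5m": load_5m,
        "load_15m": load_15m,
        "disk_used_pct": 100.0 * disk.used / disk.total,
    }
```

Run something like this on a schedule (cron, say) and keep the snapshots; that history is what turns a single reading into a baseline.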

You can capture OS metrics using tools like top, free, ps and iostat on Linux. In fact, if you have iostat you probably have 'sar', which is great for performance history. Sar runs a process every few minutes and records various OS counters, including processor info, RAM use, disk I/O and the like.

For the Windows people, you have Task Manager and Performance Monitor. Task Manager is pretty simple and gives mostly an overview. PerfMon is really where it's at on Windows. Using PerfMon you can track dozens of performance counters on disk, processor, memory and the network, and even application-specific metrics if you are running apps like MS Exchange that support them.

As with most tasks related to being the lord and master of a server, performance monitoring isn't a one time thing. As you make changes to the system you have to run new baselines. Between changes you should run your performance routines periodically to see how things are changing. It is much easier to look into an issue if you spot it earlier rather than later.
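Comparing a fresh set of numbers against a stored baseline can be as simple as flagging anything that drifts beyond a tolerance (a sketch; the 50% default threshold is an arbitrary assumption you would tune per metric):

```python
def deviations(baseline, current, tolerance=0.5):
    """Return the metrics that drifted more than `tolerance`
    (as a fraction of the baseline value).

    The 0.5 default is an illustrative assumption, not a rule."""
    flagged = {}
    for name, base in baseline.items():
        cur = current.get(name)
        if cur is None or base == 0:
            continue  # nothing sensible to compare against
        drift = abs(cur - base) / base
        if drift > tolerance:
            flagged[name] = drift
    return flagged
```

A load average that was 5 at baseline and is 20 now comes back flagged with a drift of 3.0, while normal wobble stays quiet.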

Go forth and make sure all your baselines are belong to you!

*bonus cool points for those who knew the title of this blog was also the title of a "Roswell" episode.

So there I was after work today, sitting in my favorite watering hole drinking my Jagerbomb, when Caira, my bartender, asked what was on my mind. I told her that I had been working with clouds and elephants all day at work, and neither of those things is little. She laughed and asked if I had stopped anywhere for a drink prior to her bar. I replied no, I'm serious: I had to make some large clouds and a stampede of elephants work together. I then explained to her what Hadoop was. Hadoop is a popular open source implementation of Google's MapReduce. It allows transformation and extensive analysis of large data sets across thousands of nodes while processing petabytes of data, and it is used by websites such as Yahoo!, Facebook, and China's top search engine, Baidu. I explained to her what cloud computing was (multiple computing nodes working together), hence my reference to the clouds, and how Hadoop was named after a stuffed elephant belonging to the child of one of its founders, Doug Cutting. Now she doesn't think I'm so crazy.
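The MapReduce model behind Hadoop is easy to sketch as a toy, single-machine word count (an illustration of the idea only; real Hadoop jobs use its Java API and spread these phases across many nodes):

```python
from itertools import groupby
from operator import itemgetter

def map_phase(doc):
    # map: emit a (word, 1) pair for every word in the document
    for word in doc.lower().split():
        yield word, 1

def reduce_phase(word, counts):
    # reduce: sum all the counts for one key
    return word, sum(counts)

def mapreduce_wordcount(docs):
    # shuffle: sort the intermediate pairs so each key's values are
    # grouped together, as Hadoop does between the map and reduce phases
    pairs = sorted(pair for doc in docs for pair in map_phase(doc))
    return dict(
        reduce_phase(key, (count for _, count in group))
        for key, group in groupby(pairs, key=itemgetter(0))
    )
```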

In catching up on some of my blog reading, I ran across this blog by Jill Eckhaus of AFCOM (a professional organization for data center managers). Yes, I realize that article is four months old, but like I said – I’m catching up.

One of the things that really concerns me with articles and blogs such as this one is the repetitive concern about “data security” and “loss of control” of your infrastructure. Both of those points are easy to state because they prey on the natural fear of any system administrator or data center manager.

System administrators have long ago come to realize that, in the proper environment, there is no real downside to not being able to physically place their hands upon their servers. In the proper environment the system administrator can power on or off the server, can get instant KVM access to the server, can boot the server into a rescue kernel to try to salvage a corrupt file system, can control network port speeds and connectivity, can reload the operating system, can instantly add and manage services such as load balancers and firewalls, can manage software licenses and naturally, can control full access to the server with root or administrator level privileges. In other words, there is no “loss of control” and “data security” is still up to the system administrator.

The data center managers are understandably concerned about outsourcing because it can potentially impact their jobs. But let’s face it – in today’s economy, the capital outlay required to acquire new data center space or additional data center equipment is extremely difficult to justify. In those cases, sometimes the only two options are to do nothing or to outsource to an available facility. Of course, another option is to jeopardize your existing facility by trying to cram even more services into an already overloaded data center. If a data center manager is trying to build a fiefdom of facilities and personnel, outsourcing is certainly going to be a concern. One interesting aspect of outsourcing is that data center management jobs are still there; they are just at consolidated and oftentimes more efficient facilities.

In reality, “data security” and “loss of control” should be of no more or less concern when you use your own data center than when you do the proper research and select a viable outsourcing provider that can prove it has the processes, procedures and tools in place to handle the job for you.

(In the spirit of full disclosure: I am both a local and national AFCOM member and find the organization and the information they make available to be quite useful.)

A customer called up concerned the other day after getting a dire-looking warning in Firefox 3 regarding a self-signed SSL certificate.

"The certificate is not trusted because it is self signed."

In that case, she was connecting to her Plesk Control Panel and she wondered if it was safe. I figured the explanation might make for a worthwhile blog entry, so here goes.

When you connect to an HTTPS website your browser and the server exchange certificate information which allows them to encrypt the communication session. The certificates can be signed in two ways: by a certificate authority or what is known as self-signed. Either case is just as good from an encryption point of view. Keys are exchanged and data gets encrypted.

So if they are equally good from an encryption point of view why would someone pay for a CA signed certificate? The answer to that comes from the second function of an SSL cert: identity.

A CA-signed cert is considered superior because someone (the CA) has said, "Yes, the people to whom we've sold this cert have convinced us they are who they say they are." This convincing is sometimes little more than presenting some money to the CA. What makes the browser trust a given CA? That would be its configured store of trusted root certificates. For example, in Firefox 3, if you go to Options > Advanced > Encryption and select View Certificates, you can see the pre-installed trusted certificates under the Authorities tab. Provided a certificate has a chain of signatures leading back to one of these Authorities, Firefox will accept that it is legitimately signed.

To make the browser completely happy a certificate has to pass the following tests:

1) Valid signature
2) The Common Name needs to match the hostname you're trying to hit
3) The certificate has to be within its valid time period

A self-signed cert can match all of those criteria, provided you configure the browser to accept it as an Authority certificate.
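The second and third of those tests can be sketched in Python against the certificate dictionary the standard `ssl` module hands back (a simplified illustration: it skips wildcard name matching, and test 1, signature validation, is the TLS library's job):

```python
import ssl
import time

def cert_problems(cert, hostname, now=None):
    """Check a parsed certificate dict (the format returned by
    ssl.SSLSocket.getpeercert()) for hostname and validity problems.

    A sketch only: no wildcard handling, no chain verification."""
    now = time.time() if now is None else now
    problems = []
    # 2) the Common Name (or a DNS subjectAltName) must match the hostname
    names = {value
             for rdn in cert.get("subject", ())
             for key, value in rdn
             if key == "commonName"}
    names.update(value for kind, value in cert.get("subjectAltName", ())
                 if kind == "DNS")
    if hostname not in names:
        problems.append("hostname mismatch")
    # 3) the certificate must be inside its validity window
    if now < ssl.cert_time_to_seconds(cert["notBefore"]):
        problems.append("not yet valid")
    if now > ssl.cert_time_to_seconds(cert["notAfter"]):
        problems.append("expired")
    return problems
```

A self-signed cert for your Plesk panel can pass both of these checks; the only thing it fails is the browser's trust-store lookup, which is exactly what the warning is about.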

Back to the original question: is it safe to work with a certificate which your browser has flagged as problematic? The answer is yes, if the problem is expected, such as hitting the self-signed cert on a new Plesk installation. Where you should be concerned is when a certificate that SHOULD be good, such as your bank's, causes the browser to complain. In that case further investigation is definitely warranted. It could be just a glitch or misconfiguration. It could also be someone trying to impersonate the target site.

Quite often my friends who are not really that internet savvy ask me what I do at work. I think back to the time in the first grade when my teacher, Mrs. Hyde, told me: “Bill, you’re going to be a great problem solver when you get older; your problem-solving skills are already at a fourth-grade level.” Now you’re probably wondering how problem solving in the first grade has anything to do with my job. It is, as she told me, all about how you think. She told me I was an outside-the-box thinker.

My co-workers and I deal with a network of 20,000+ servers and 5,500+ customers in over 110 different countries, and we support over 15 different operating systems. That leads to an almost infinite combination of language, hardware, and software options. When our customers submit an issue for us to work on, it is always different from the time before – whether that is a ticket from the same customer or a ticket on a similar topic. We have a very diverse range of customers using our servers for any number of things, so not every server in here is doing the same thing. In order to be good at supporting our customers, SoftLayer’s management, in my opinion, has hired some of the best problem solvers from around the world to address all of our customer issues. So that is what I am: I am a problem solver! Otherwise known as a Customer Systems Administrator. We’re required to know a broad range of technologies and have the passion to learn new ones as they come along. I think that is why I chose to work in this field: it is always changing. I tried moving over to telecommunications engineering a few years ago, but got bored with it, as it was the same issues day in and day out on the equipment. Working here at SoftLayer is wonderful, as there is never a dull moment.

Working the System Admin queue in the middle of the night, I see lots of different kinds of tickets. One thing that has become clear over the months is that a well-formed ticket is a happy ticket and a quickly resolved one. What makes a well-formed ticket? Mostly it is all about information; attention to these few suggestions can do a great deal to speed your ticket toward a conclusion.

Category
When you create a ticket you're asked to choose a category for it, such as "Portal Information Question" or "Reboots and Remote Access". Selecting the proper category helps us to triage the tickets. If you're locked out of your server, say due to a firewall configuration, you'd use "Reboots and Remote Access". We have certain guys who are better at CDNLayer tickets, for example, and they will seek out those kinds, so if you have a CDN question, you'd be best served by using that category. Avoid using Sales and Accounting tickets for technical issues, as those end up first in their respective departments and not in support.

Login Information
This one is a bit controversial. I'm going to state straight out... I get that some people don't want us knowing the login information for the server. My personal server at SoftLayer doesn't have up-to-date login information in the portal. I do this knowing that this could slow things down if I ever had to have one of the guys take a look at it while I'm not at work.

If necessary, we can ask for it in the ticket but that can cost you time that we could otherwise be addressing your issue. If you would like us to log into your server for assistance, please provide us with valid login information in the ticket form. Providing up-to-date login credentials will greatly expedite the troubleshooting process and mitigate any potential downtime, but is not a requirement for us to help with issues you may be facing.

Server Identification
If you have multiple servers with us, please make sure to clearly identify the system involved in the issue. If we have a doubt, we're going to stop and ask you, which again can cost you time.

Problem Description
This is really the big one. When typing up the problem description in the ticket, please provide as much detail as you can. Each sentence of information about the issue can cut out multiple troubleshooting steps, which is going to lead to a faster resolution for you.

Example:

Not-so-good: I cannot access my server!

Good: I was making adjustments to the Windows 2008 firewall on my server and I denied my home IP of 1.2.3.4 instead of allowing it. Please fix.

The tickets describe the same symptom. I can guarantee, though, that we're going to have the second customer back into his server quicker because we have good information about the situation and can go straight to the source of the problem.
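As an illustration of the checklist above, a ticket could be sanity-checked before submission roughly like this (the field names and the 40-character cutoff are hypothetical, not the actual portal schema):

```python
def ticket_completeness(ticket):
    """Return the pieces of information still missing from a ticket.

    Field names and the length threshold are illustrative assumptions."""
    missing = []
    if not ticket.get("category"):
        missing.append("category")
    if not ticket.get("server_id"):
        missing.append("server identification")
    # a one-liner like "I cannot access my server!" is rarely enough detail
    if len(ticket.get("description", "")) < 40:
        missing.append("detailed problem description")
    return missing
```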

Recently I had the chance to attend the annual Beyond Budgeting Round Table (BBRT) conference to help me keep up on my CPE credits. Those darn accounting licenses have to be maintained, ya know.

I was pleasantly surprised at the conference that SoftLayer was already doing the crux of what this group preaches – namely, that assembling an annual budget and trying to live by it is a colossal waste of time!

One speaker pointed out that budgeting originated back in medieval times long before the Industrial Revolution. During those days, the feudal system was the order of the day. Landowners allowed people to live on their land and raise crops. Once per year, when the harvest came in, the landowners received payment from the people living on the land in the form of a share of the crops or a share of the gold for which the crops were sold. Since the landowners were paid once per year, they had to plan how to make their annual payday last for a whole year. You guessed it – this plan was called “the budget.”

Unfortunately, most companies and organizations today use this horribly outdated financial management technique to run their business in the fast-paced information age economy of today. In most cases, this just flat doesn’t work.

For example, one of the speakers was the CFO of a very large healthcare organization. He said that back in the days when they produced an annual budget, 240 budget managers each spent 90 days of full-time effort to produce it. That equates to roughly 60 man-years of total time. If you assume that each of those managers averages $50K per year in compensation, the cost of producing that budget is about $3 million. What’s worse, the CFO said it was worthless before the final version was printed because it was built on stale fundamental assumptions that were several months old.

Once these obsolete documents are produced, they become static financial contracts. They limit spending for each department, and this isn’t always a good thing. Some departments may see some fantastic market opportunities develop halfway through the year, but they can do nothing to take advantage of them because they would exceed their budget. On the other hand, some departments can be allotted too much money, so they go on wasteful spending sprees at year end to be sure and use up their budget or else lose that funding next year. People often ask for permission to exceed budget, but usually no one gives back any unused budget dollars. Even worse, management compensation is often tied to these obsolete financial contracts. Business schools are awash with case studies of bad business decisions that were made to maximize bonus compensation in relation to the budget.

From the beginning, SoftLayer realized the futility of producing an annual budget. In the rapidly developing business of web hosting, the landscape can dramatically change much more quickly than an annual cycle. So we implemented the policy of maintaining a rolling forecast that is updated to the best of our current knowledge each and every month. This practice has served us well, and is one of the “best practices” adopted by the BBRT.

Another best practice recommended by BBRT is to maintain multiple forecast scenarios that factor in macroeconomic possibilities. Then as reality develops, you have a better handle on the tactics to implement because you now know what most of these decisions should be in advance. At SL, we will be implementing the multiple scenario practice over this summer.
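One way to picture the multiple-scenario practice: keep a forecast per macro case and, as actuals come in, see which scenario reality is tracking (all the numbers below are invented for illustration, not SoftLayer figures):

```python
# hypothetical 12-month revenue forecasts (in $K) for three macro scenarios
scenarios = {
    "downturn": [800 + 5 * m for m in range(12)],
    "baseline": [900 + 15 * m for m in range(12)],
    "boom":     [1000 + 30 * m for m in range(12)],
}

def closest_scenario(actual_revenue, month):
    # pick the scenario whose forecast for this month is nearest to reality,
    # so the pre-planned tactics for that scenario can be pulled off the shelf
    return min(scenarios, key=lambda s: abs(scenarios[s][month] - actual_revenue))
```

The point of maintaining the scenarios in advance is that the tactical decisions are already thought through before the month's numbers arrive.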

How many readers remember being your Dad’s remote control for the TV? Heating a bit of oil in the bottom of a pan till it sizzled to make popcorn? Percolating coffee pots? Wondering how long it would take for enough hot water to take a shower after your primping older brother hogged it all? What about “fast” forwarding cassette and VCR tapes, or thawing a chicken breast for hours on the counter? The list goes on and on.

My absolute favorite was sitting around at the babysitter’s with my brother on a Friday night at about age 10, listening to the radio and just hoping that “Shake Your Booty” would come on so we could record it instead of having to go buy it.

The amount of time we used to sit around waiting for things to happen was huge! Today, it’s all in an instant!

We have five remote controls – or at the very least one really smart one that can do it all. Microwave popcorn takes minutes with no cleanup; instant coffee – just add water; instant hot water heaters that never go cold; mp3 players that let you click from song to song with no waiting; DVD/DVRs that jump from scene to scene or skip those boring commercials… and you can use that same microwave to thaw your chicken in no time at all.

Today you can be listening to the radio in your car and click a button and it will tell iTunes what song it was and queue it up for your next download, you just have to love technology and the speed at which it happens.

I also remember the days when we had a rotary phone with an 82.5-foot cord that you could string across the house to the bathroom or in front of the TV and keep talking. Then it became the wall phone with the 84-foot stretchy cord and the number keys on the handset – how cool was that? It never failed, though: no matter how long the cord, you always needed more!

Today, you can Facebook, Tweet, chirp, yell, chat, and instant message from just about anywhere, even from a Jet Blue jet flying through the air. That is just pretty cool stuff.

In my previous life, before I became a booth babe and a bloghogger, I was known for being fairly technical in the world of Microsoft Windows Server and Citrix MetaFrame. They actually worked pretty well for a few of the company apps I had to deal with along my career path. The hardest part was setting up the application server to be just perfect and getting it on the wire so the employees could do their jobs.

The real challenge was adding more servers to the pool in a timely fashion at month end for accounting, or at the rush times of the year for the sales group. It takes time to blast an OS no matter what method you are using, then get the app installed and functioning, and then add the server to the pool. Sure, I came up with a few tricks for imaging Citrix, and they worked, but it was still a waiting game trying to procure the hardware, install the image, get the server racked and cabled, etc. It never failed: a week before I had them ready, the sales and/or accounting group managers were all over me because it was MY fault that they had slow applications. A few times, just about the time I had the servers ready, they didn’t need them anymore – I missed the rush.

Welcome to Instant Servification! CloudLayer, oh CloudLayer – I would have paid out of my own pocket back then to have this technology. With the release of hourly billing, you can use servers only when you need them, even if your peak load lasts just a fraction of one day. You create your golden image, save it, push it out to as many instances as you need for as long as you need, and then, when your peak usage is over, cancel them like high-interest credit cards!
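The appeal of hourly billing for bursty loads is simple arithmetic (the rates below are made up for illustration, not actual CloudLayer pricing):

```python
def cheaper_option(hours_needed, hourly_rate, monthly_rate):
    """Compare hourly vs monthly billing for a burst workload.

    The rates passed in are hypothetical examples, not real pricing."""
    hourly_cost = hours_needed * hourly_rate
    if hourly_cost < monthly_rate:
        return "hourly", hourly_cost
    return "monthly", float(monthly_rate)
```

A month-end accounting crunch that needs extra capacity for six hours is dramatically cheaper on the hourly plan; a server that runs around the clock is not.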

That is instant gratification at its best! Welcome to SoftLayer – how can we help you?