Kyle Brandt

Nearly every time we talk about our infrastructure, people ask us why we own and operate our servers rather than host Stack Overflow and the Stack Exchange network in the cloud. Usually when people ask us this, they seem to want to convince us that we should be in the cloud. The debate usually then centers around cost.

Cloud vs Self Hosting Cost?

More or fewer sysadmins required? (People say you need fewer system administrators with the cloud, though we’ve never been convinced of this.)

Licensing Costs

Owned vs Rented Assets

How many cloud “servers” or instances you would need vs real hardware

Cost differences when you consider high availability

To get this analysis right you have to invest a lot of time, and even then the result is only an estimate. We have looked at cloud computing costs and think they would actually be higher for us. When it comes down to it, though, the cost debate misses the point.

We Love Computers

and every aspect of them. We don’t just love programming and our web applications. We get excited learning about computer hardware, operating systems, history, computer games, and new innovations. Loving computers is an essential part of our company culture. Many of us have assembled our own workstations, and our CTO even blogs about it when he does. Most of us have grown up with computers as part of our identity. We all have a shared nostalgia for our first computers — if we haven’t taken our pilgrimage to The Computer History Museum yet, we dream about it. We like to think about the past, present, and future of computing. Owning and operating our own servers is part of how we get to live out our love of computers.

This culture means that when we hire technical staff, we hire people who share this passion. I believe that this passion translates into a better product. Whenever someone does a cost analysis of cloud vs self hosting, there is no row in the spreadsheet for “Work Productivity Increase due to Passion.” We are performance and control freaks and love to tweak everything, including our hardware. If we outsourced our hosting to cloud computing, we would be outsourcing part of our passion. If you just want to use someone else’s computers, it means you don’t love computers — at least not every aspect of them. Sometimes cloud computing may be the best fit (for example, if you have 20x the traffic around the holidays or tax season), but if you truly love computing, giving up control of your computers to someone else will hurt.

We don’t just like computers, we love them. We have an emotional connection to them, and suggesting that we let someone else own, manage, and tweak them is like suggesting we get rid of what we love — just the thought of it offends.

Passion cannot be dismissed (though it may be discounted, depending on who is doing the numbers). Going for those extra few percentage points of optimization to get even more out of that fixed asset takes passion, and it works. It’s the bottom layer of the engineering that still matters. If you don’t have control over that, you just have to take what you’re given on faith and work around any issues. But y’all do own the bottom layer, so you can fix problems when they come up, like dodgy NIC drivers.

Another thing I’ve noticed is that while the responsibility of keeping the application live is shifted to the cloud, in the eyes of an average user the responsibility is still yours. What do you say to a customer when your product goes down because of an AWS outage? That there’s nothing you can do and you hope to resume service soon? That’d be a joke.

When you have absolute responsibility for the server, there is a lot you can do, there are things you can say, and you can actually plan ahead. More than that, you will also be more motivated to get things running ASAP rather than remaining powerless, as you would be if your whole service were in the cloud.

Besides, even if everything I’ve said above were invalidated, there’s nothing that can replace the joy of controlling the whole stack from top to bottom. I’m not a sysadmin, but I still love computers and I especially love making them do stuff.

Gerardo Armendariz

More reason to admire what stackexchange is doing.

Dude

IMHO, the cloud is more for new risky startups and CPU intensive one-time jobs.

http://twitter.com/melatonedeaf CURMUDGEON

This article is way better than I thought it was going to be, thanks Kyle!

I have successfully managed to convince our Board to stick with letting us host all of our own servers. It looks expensive from a basic cost analysis, but you’re right. There’s a whole other level of passion there. We can’t keep great programmers / admins around by showing them our balance sheet. But by stoking their passions! Yes!

Stephen Ryan

One thing to note is that Stack Exchange has some pretty amazing admins (and developers) compared to a lot of companies. Even if you were offered the “exact” same hardware in the cloud, you’d need more of it. Think back to things like the Intel network fault with micro pauses (http://blog.serverfault.com/post/performance-tuning-intel-nics/) and how that kind of fix just isn’t possible in the cloud.

That all said, for most companies the cloud is ideal. It abstracts away the problems they don’t need to deal with, letting them focus on the problems that are important. For Stack Exchange, the market is primarily technical people who will notice things “normal” people won’t, so those few microseconds here and there do make a difference.

http://tomakefast.com PJ Brunet

Sure, if you live in NYC that’s an acceptable solution. But MOST PEOPLE can’t run a server in their closet because: a) too many hops; b) if your connection dies, you’re out of business (a data center has multiple backbone carriers); c) if the power goes out (bad weather), you’re out of business.

Matthew Walster

There are so many good reasons not to host servers in your own closet, and yet you chose the three worst ones:

a) too many hops – so you’d never host a website more than 5 “hops” away? Good luck with that. With TTL hiding MPLS tunnels, you’ll never know how far away you are.
b) multiple carriers – many of the server companies I see that aren’t massive only have one carrier. Quite often, they’ll also be working out of someone else’s AS number even if they do have their own PI space.
c) power goes out – power can go out anywhere. Your home is more likely to be up longer if you’ve planned it better, as you control the resiliency, not the data center.

The biggest reason not to host servers in your own closet is simple: it’s not scalable. If you’re just running an office server, that’s fine, but if you’re Stack Exchange and expecting growth, you want at the very least to build a room and get in a couple of leased lines, and at that point you might as well start considering a data center.

Anonymous

I disagree and the ability to scale at large is amongst the least of my reasons. Cloud or managed hosting (e.g., Rackspace) simply makes far more sense when you’re only running a few servers and put a high priority on availability (if you choose wisely).

Sure, you can build a half-assed data center in your home or office closet, but outsourcing this function makes far more sense because providers have much better economies of scale and can simply do it better for much less money from a cash flow and TCO point of view.

If you actually do the math on what it takes to actually pay for a truly robust system (including payroll, real estate, etc), it should quickly become apparent that you’re not really saving money by doing it in-house at this scale. This math is even worse with a rapidly growing company since much of your data center investment will be lost as you move into larger facilities (not to mention the loss of expansion flexibility from owning the hardware).

Now if you don’t really care about availability or backup systems, and you’re willing to run your equipment with absolutely zero redundancy and don’t really need fast internet either, you arguably “save” money by doing this in-house, but let’s not pretend there aren’t real trade-offs there.

To the contrary, a company like SE can make a far better case for doing this in house, since they likely have a legitimate need for several dozen servers, the need and the know-how to run customized hardware, and enough scale in their operations team to make this work economically (especially when combined with their marketing “passion” to customers and employees). I would say, though, that most IT departments in small to medium sized companies, even ones with 100+ servers, that are starting from scratch are probably being irresponsible if they’re still trying to run mini data centers on-site.

Anonymous

Under most scenarios, I would think people are nuts if they run customer-facing stuff in their closet. We rent racks from colocation providers (Peer 1 in NY and Peak Internet in OR).

I have trouble seeing how getting your own facility starts to make sense until you have something like ~400 servers (10 racks), and I wouldn’t be surprised if the number is a lot higher.

Jean-Lou Dupont

I learned a valuable lesson ~20 years ago: don’t get attached to technology, because being successful in business takes so much more than that.

James Loope

I love computers too. I’ve loved them since I was an eleven-year-old with a 5.25" floppy disk and no sound card. I love the shell, I love the internet, and I love hacking with technology. You know what I also love? I love being able to throw away an instance and make a new one in 3 minutes. I love not having to manage spares. I love not having 4 vendor relationships. I love not having to predict capacity 6 months ahead of time. I love automating my “hardware” with a REST API rather than dealing with some god-awful hardware vendor’s mediation layer written in Visual Fortran 2.9. Is it a panacea? No, it’s not. But it does have plenty of merit, and my passion is simply not in babysitting buggy firmware and failing backplanes at 2:30 AM on a Saturday.

Anonymous

The amount of time spent debugging backplanes, replacing drives, updating firmware, talking to vendors etc is not that large. The hardware layer of computing is a major part of the stack. The networking infrastructure is what the Internet is built on. If someone would rather give up the entire hardware experience of computing just to avoid some annoyances — I don’t see how they can really love computing as a whole.

If someone just wants a blank slate to run programs on, it sounds like what they really love is programming. There is nothing wrong with that. But it is programming that this person loves, not computing as a whole.

If practicality requires that you use the cloud (perhaps you are bootstrapping, you have a bursty processing model so it saves a lot of money, or it’s just what the company you want to work for uses), then of course that is what you would go with. But with that setup you don’t get to enjoy the full gamut of computing.

http://profiles.google.com/donnyv Donny V

The fact that you’re arguing that he doesn’t really “love computing as a whole” means you’re at the point where you’re arguing about ideology. It’s cheaper and easier to host in the cloud, and it frees you up to do other, more important things. There’s a reason no one programs in assembly anymore and everyone uses higher-level languages. Yes, higher-level languages aren’t as fast as assembly, but they remove all the annoyances involved with assembly and save time. That’s what cloud services are: a higher-level abstraction over hosting.

http://profiles.yahoo.com/u/XE4JL6FVDKO3P4H22AUPULKFQA Stilgar

What’s wrong with not loving the administration part of computers? It leaves more passion for the other aspects.

Anonymous

There is nothing wrong with that at all. Even in an extreme example: if you just love programming and hate everything else about computing, that is okay. To find happiness in your job, you should look for a position that avoids administration as much as possible and abstracts your code away from physical resources.

The cloud doesn’t eliminate administration, though (it just changes the way some of it is done and eliminates the physical aspect), and if you enjoy the hardware side of computing, the cloud takes that away.

I am just describing how I personally view a major aspect of our choice to own and operate our own servers. It wasn’t my choice, and I am sure Jeff and Joel have a whole lot of other reasons. If you are looking for any advice in this article, I really only want to get two things across:

1) Don’t only use cost to make your choices
2) If everyone on your team loves the full stack of computing, you should consider that before you go with the cloud.

Raffraffraff

This makes me want to work for you guys…

http://twitter.com/HartMichael Michael Hart

“We are performance and control freaks …”

Is this why your website frequently takes a long time to load, and has background assets loading for up to 30 seconds after the page should have finished loading?

Nick Craver

If you’re seeing this, we certainly want to hear about it on meta; it sounds like you’re not being properly routed to the closest CDN node. Please post on meta with a timeline snapshot (e.g. from Chrome or Firebug), where you’re physically located (also, are you on a VPN?), and which CDN node you’re hitting (http://debug-02.netdna-cdn.com/), and we’ll help track down the issue.

Ryan3

you guys are using Windows servers, so that explains it all…

Nick Craver

No offense, but that’s a fairly ignorant statement (and one I’ve heard far too many times). There are plenty of farms running on LAMP stacks, etc., serving far fewer requests with far more hardware and without much to spare. stackoverflow.com runs on 1 database server and 6 web servers; the DB hovers around 20% CPU, the web tier hovers around 10%, and all other resources are even more abundantly over-provisioned. We don’t consider Windows a problem.

Stephen

What version of Windows do you use?

Nick Craver

We’re on Windows 2k8 R2 for all servers, all up to date (we automatically patch and rotate servers out for updates; no human effort or clicking required). The database servers are the exception to the updates: they’re patched less often, since serving SO off a cold DB is a killer, and coming up with 0 of the normal 92GB cached makes it really slow to get going. We do take an outage for maintenance there, but it is incredibly rare.

guest

How do you automatically patch the servers?

Jason

I would imagine through WSUS or some other vendor tool. If you have multiple boxes (like on the web tier), you just rotate them out and patch them. Heck, even manually patching would be fine, since the server is out of live rotation.
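The rotate-and-patch loop described above can be sketched in a few lines. This is a minimal illustration, not Stack Exchange's actual tooling: the `drain`, `patch`, `health_check`, and `enable` helpers are hypothetical stand-ins for whatever the load balancer and update mechanism (e.g. WSUS) expose.

```python
def rolling_patch(servers, drain, patch, health_check, enable):
    """Patch servers one at a time so the pool never loses more than one node."""
    for server in servers:
        drain(server)          # stop routing live traffic to this box
        patch(server)          # apply updates / reboot while out of rotation
        if not health_check(server):
            # halt the rollout rather than re-enable a broken node
            raise RuntimeError(f"{server} failed post-patch check")
        enable(server)         # put it back into live rotation

# Toy demonstration with in-memory state instead of real infrastructure.
if __name__ == "__main__":
    pool = {f"web-{i}": "in_rotation" for i in range(1, 7)}
    patched = set()

    rolling_patch(
        list(pool),
        drain=lambda s: pool.__setitem__(s, "draining"),
        patch=lambda s: patched.add(s),
        health_check=lambda s: True,
        enable=lambda s: pool.__setitem__(s, "in_rotation"),
    )

    assert patched == set(pool)
    assert all(state == "in_rotation" for state in pool.values())
```

The key property is that only one server is ever drained at a time, so capacity dips by at most one node and a failed patch stops the rollout before it can take down a second box.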

Weddingcoo

Well said, the cloud has no soul.

Garet

What about your own cloud?

havexz

Rightly said… it gets tricky to work with love and business together… but in the end, passion wins…

Thanks for putting a passion I share into words. But while I arrive at the same place, I have a slightly different reason: I love data. I love my data, specifically. I (painfully) run my own cyrus-imap email server, wiki, blogs, etc. I want it all in one place under my control. I would include Stack Exchange in that list, for tracking issues with my software, if it were open source. wink wink nudge nudge