What does it take to run a business cloud?

There are consistent truths that cannot be ignored. The speed of light is one. Others are below.

Cloud means distributed systems connected by networks. The vast majority of cloud implementations are really rebranded platform services with their infrastructure hidden behind a fluffy icon. So I’m going to replace cloud with platform for the rest of this post.

Platform services exist to support your front end. Period. End of story. That means platform, in most cases, is the L in your P&L (Profit and Loss). The L must be smaller than the P. I’ll write a follow-up post on economics for those who disagree with me.

What makes money leads the direction of the services required. So a solid, well-documented, consistent feedback loop with the platform customers is a sign of a healthy organization. There are some cases where platform plans for what your front end services need before they ask, like CI/CD or adding new features to existing products, but the majority of what platform does is respond to the front end services.

All your services need to be deployable by code rather than by hand. There are many options for accomplishing this, and no silver bullets yet. Your organization will have its own flavor. Those skills and that knowledge must be automated for the maximum rate of success. Even those few organizations that rely solely on public infrastructure still have many processes around managing all their platform services.
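To make "deployable by code" a little more concrete, here is a minimal sketch of the idea most deployment tools share: describe the state you want, compare it with what is running, and let code work out the steps. The function name `plan`, the service names, and the version strings are all hypothetical, and a real tool would go on to execute the actions rather than just list them.

```python
from typing import Dict, List, Tuple

# (action, service name, version) - a hypothetical shape for one deployment step.
Action = Tuple[str, str, str]


def plan(desired: Dict[str, str], actual: Dict[str, str]) -> List[Action]:
    """Return the steps needed to move `actual` to `desired`.

    Both arguments map a service name to the version that should be,
    or currently is, deployed.
    """
    actions: List[Action] = []
    # Walk every known service in a stable order so the plan is repeatable.
    for name in sorted(set(desired) | set(actual)):
        if name not in actual:
            actions.append(("create", name, desired[name]))
        elif name not in desired:
            actions.append(("delete", name, actual[name]))
        elif desired[name] != actual[name]:
            actions.append(("update", name, desired[name]))
        # Same version on both sides: nothing to do.
    return actions


if __name__ == "__main__":
    wanted = {"api": "v2", "db": "v1"}
    running = {"api": "v1", "cache": "v1"}
    for step in plan(wanted, running):
        print(step)
```

The useful property is that running the plan against a platform that already matches the desired state produces no steps at all, which is what makes the process safe to repeat and to hand to a machine.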

Your engineering culture must serve your DevOps needs. That means doing everything within reason to keep your engineering staff happy, healthy, and productive. Listen to their needs. Snacks, activities, and nice chairs are a start. Root access to their laptops, IT kiosks with friendly, available staff, and a culture of mentoring new skills are much, much better.

Your physical compute infrastructure must be close to your data in network terms. Latency and throughput can be mitigated, but not ignored.
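A rough back-of-the-envelope sketch shows why distance cannot be ignored. It computes only the floor that the speed of light puts on a round trip. The 0.67 velocity factor is a common approximation for signals in optical fiber and is an assumption here, the function names and example distances are hypothetical, and real networks add routing, queuing, and processing time on top.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in a vacuum


def min_round_trip_ms(distance_km: float, velocity_factor: float = 0.67) -> float:
    """Lower bound on round-trip time, in milliseconds, over a path of this length.

    velocity_factor is the signal speed as a fraction of c. The default of
    0.67 is an assumed approximation for optical fiber.
    """
    if distance_km < 0:
        raise ValueError("distance_km must not be negative")
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity_factor must be in (0, 1]")
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_PER_S * velocity_factor)
    return 2 * one_way_seconds * 1000.0


def min_wait_ms(sequential_round_trips: int, distance_km: float) -> float:
    """Lower bound on total wait when each request must finish before the next starts."""
    return sequential_round_trips * min_round_trip_ms(distance_km)


if __name__ == "__main__":
    # Hypothetical path lengths, chosen only for illustration.
    for label, km in [("same metro", 50), ("cross-continent", 4000)]:
        print(f"{label}: 1 trip >= {min_round_trip_ms(km):.2f} ms, "
              f"100 sequential trips >= {min_wait_ms(100, km):.0f} ms")
```

Caching, batching, and parallel requests reduce how many round trips you pay for. None of them lower the cost of a single trip, which is why the compute has to live near the data.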

Public infrastructure does not solve all platform problems. It only makes someone else responsible for them. Do you trust another company exclusively with your future? I didn’t think so. So even if you truly believe in public infrastructure all the way, you still need an alternative option. The complexity just gets moved around.

So for those of us who still need physical infrastructure, it is always going to be your largest and longest-term cost. Retain the best and the brightest people to decide where, when, and how to build your physical infrastructure. For example, what are the tax implications of building in Singapore versus Hong Kong, what are the data privacy laws in Switzerland, or what’s the electricity availability in San Francisco? The right people can help you avoid making long-term, expensive mistakes.

The time and people required for acquiring, installing, configuring, and testing hardware and software are a constant. Public infrastructure and containers don’t make these things disappear; they just move the problem to another system. Constants must be dealt with by a robust capacity planning team. These are the people who sit in the middle of your customer feedback loop and know your operations and physical infrastructure developers very well.
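The simplest version of the arithmetic a capacity planning team does looks something like the sketch below. It assumes linear growth, which real workloads rarely follow exactly, and every name and number in it is hypothetical. The point is only that the lead time for acquiring, installing, configuring, and testing gets subtracted from the time you have left.

```python
import math


def days_until_full(capacity: float, used: float, growth_per_day: float) -> float:
    """Days until usage reaches capacity, assuming constant (linear) growth."""
    if growth_per_day <= 0:
        return math.inf  # flat or shrinking usage never fills the capacity
    return max(capacity - used, 0.0) / growth_per_day


def days_until_order(capacity: float, used: float, growth_per_day: float,
                     lead_time_days: float) -> float:
    """Days left before new capacity must be ordered.

    lead_time_days covers everything between placing the order and serving
    traffic. A negative result means the order is already overdue.
    """
    return days_until_full(capacity, used, growth_per_day) - lead_time_days


if __name__ == "__main__":
    # Hypothetical numbers: 1000 units of capacity, 700 in use,
    # growing by 5 units a day, with a 45 day lead time.
    remaining = days_until_order(1000, 700, 5, 45)
    print(f"Order within {remaining:.0f} days")
```

The same calculation applies on public infrastructure. The lead time is shorter, but quota increases, budget approval, and testing still take time that someone has to plan for.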

An operations team that doesn’t thoroughly understand its workloads will suffer major outages. It’s not if, but when. The best way to understand your workloads is by exhaustive research and development. Everything breaks. Figure out how your hardware and software break so you can mitigate the pending failures. Software pipelines, Performance Engineering, Quality Assurance, and a culture of R&D as a practice for all of your engineers will get you most of the way there.