Kong Yang is a Head Geek at SolarWinds® with over 20 years of IT experience specializing in virtualization and cloud management. He is a VMware vExpert, Cisco Champion, and active contributing practice leader within the virtualization and cloud communities.

Yang’s industry expertise includes application performance management, virtualization sizing and capacity planning best practices, community engagement, and technology evangelism. Yang is passionate about understanding the behavior of the application lifecycle and ecosystem – the analytics of the interdependencies as well as qualifying and quantifying the results to empower the organization’s bottom line.

He is also the owner of United States Patent 8,176,497 for an intelligent method to auto-scale VMs to fulfill peak database workloads. Yang’s past roles include a Cloud Practice Leader at Gravitant and various roles at Dell Technologies.

While the RackN team and I have been heads down radically simplifying physical data center automation, I’ve still been tracking some key cloud infrastructure areas. One of the more interesting ones to me is Edge Infrastructure.

This once-obscure topic has come front and center because of the coming computing stress from home video, retail machine and distributed IoT. It’s clear that these are not problems that can be solved from centralized data centers.

While I’m posting primarily on the RackN.com blog, I like to take time to bring critical items back to my personal blog as a collection. WARNING: Some of these statements run counter to other industry views. Please let me know what you think!

By far the largest issue in the Edge discussion was actually agreeing on what “edge” meant. It seemed as if every session had a mandatory 50% overhead for definitions. Putting my usual operations spin on the problem, I chose to define edge infrastructure in data center management terms. Edge infrastructure has very distinct challenges compared to hyperscale data centers. Read article for the list...

Running each site as a mini-cloud is clearly not the right answer. There are multiple challenges here. First, any infrastructure problem at scale must be solved at the physical layer first. Second, we must have tooling that brings repeatable, automated processes to that layer. It’s not sufficient to have deep control of a single site: we must be able to reliably distribute automation over thousands of sites with limited operational support and bandwidth. These requirements are outside the scope of cloud-focused tools.

If “cloudification” is not the solution then where should we look for management patterns? We believe that software development CI/CD and immutable infrastructure patterns are well suited to edge infrastructure use cases. We discussed this at a session at the OpenStack OpenDev Edge summit.

What do YOU think? This is an evolving topic and it’s time to engage in a healthy discussion.

In our last post, we pretty much tore apart the idea of running mini-clouds on the edge: without deep hardware automation, they are not designed to be managed at scale in resource-constrained environments. While I’m a huge advocate of API-driven infrastructure, I don’t believe in a one-size-fits-all API because a good API provides purpose-driven abstractions.

The logical extension is that having deep hardware automation means there’s no need for cloud (aka virtual infrastructure) APIs. This is exactly what container-focused customers have been telling us at RackN in regular data centers so we’d expect the same to apply for edge infrastructure.

If “cloudification” is not the solution then where should we look for management patterns?

Continuous Integration / Continuous Delivery (CI/CD) software pipelines help to manage environments where the risk of making changes is significant by breaking the changes into small, verifiable units. This is essential for edge because lack of physical access makes it very hard to mitigate problems. Using CI/CD, especially with A/B testing, allows for controlled rolling distribution of new software.

For example, in a 10,000 site deployment, the CI/CD infrastructure would continuously roll out updates and patches over the entire system. Small incremental changes reduce the risk of a major flaw being introduced. The effect is enhanced when changes are rolled out slowly across the fleet (an approach related to A/B or blue/green testing) instead of being pushed to all sites simultaneously. In the rolling deployment scenario, breaking changes can be detected and stopped before they have significant impacts.
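To make the rolling scenario concrete, here is a minimal Python sketch of a wave-based rollout that halts when too many sites in a wave fail. Everything in it is an assumption for illustration: the function name `rolling_rollout`, the wave sizes, the 2% failure threshold, and the `deploy` callback (which stands in for whatever actually updates a site) are hypothetical and not taken from any specific CI/CD product.

```python
from typing import Callable, Dict, List, Sequence


def rolling_rollout(
    sites: List[str],
    deploy: Callable[[str], bool],
    wave_sizes: Sequence[int] = (10, 100, 1000),  # hypothetical wave plan
    max_failure_rate: float = 0.02,               # hypothetical halt threshold
) -> Dict[str, object]:
    """Deploy to sites in growing waves; halt if a wave fails too often."""
    updated: List[str] = []
    failed: List[str] = []
    start = 0
    wave = 0
    while start < len(sites):
        # Use the listed wave sizes in order, then keep reusing the last one.
        size = wave_sizes[min(wave, len(wave_sizes) - 1)]
        batch = sites[start:start + size]

        # deploy() returns True on success and False on failure.
        batch_failed = [site for site in batch if not deploy(site)]
        failed_set = set(batch_failed)
        updated.extend(site for site in batch if site not in failed_set)
        failed.extend(batch_failed)
        start += len(batch)

        # Stop before the change reaches the rest of the fleet.
        if len(batch_failed) / len(batch) > max_failure_rate:
            return {"halted": True, "updated": updated,
                    "failed": failed, "untouched": len(sites) - start}
        wave += 1
    return {"halted": False, "updated": updated,
            "failed": failed, "untouched": 0}


if __name__ == "__main__":
    fleet = [f"site-{i:05d}" for i in range(10_000)]
    # Simulated breaking change that only shows up on later sites.
    result = rolling_rollout(fleet, lambda s: int(s.split("-")[1]) < 110)
    print(result["halted"], len(result["updated"]), result["untouched"])
```

The design choice worth noting is that the small early waves act as the safety margin: a breaking change is caught while most of the fleet is still untouched.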

These processes and the supporting software systems are already in place for large-scale cloud software deployments. There are likely gaps around physical proximity and heterogeneity; however, the process is there and the initial use-case fit seems to be very good.

Immutable Infrastructure is a catch-all term for deployments based on images instead of configuration. This concept is popular in cloud deployments where teams produce “golden” VM or container images that contain the exact version of software needed and are then provisioned with minimal secondary configuration. In most cases, the images only need a small file injected (for example, via cloud-init) to complete the process.

In this immutable pattern, images are never updated post deployment; instead, instances are destroyed and recreated. It’s a deploy, destroy, repeat process. At RackN, we’ve been able to adapt Digital Rebar Provisioning to support this even at the hardware layer where images are delivered directly to disk and re-provisioning happens on a constant basis just like a cloud managing VMs.

The advantage of the immutable pattern is that we create a very repeatable and controlled environment. Instead of trying to maintain elaborate configurations and bi-directional systems of record, we can simply reset whole environments. In a CI/CD system, we constantly generate fresh images that are incrementally distributed through the environment.
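As a rough illustration of "deploy, destroy, repeat", the Python sketch below models re-provisioning as building a fresh machine record from an image rather than editing the old one. The `Image`, `Machine`, and `reprovision` names are hypothetical and are not part of Digital Rebar or any other tool; the point is only that hand-applied local state does not survive the reset.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Image:
    """A golden image; frozen so it cannot be modified after it is built."""
    name: str
    version: str


@dataclass
class Machine:
    hostname: str
    image: Optional[Image] = None
    # Anything changed by hand after deployment (configuration drift).
    local_state: Dict[str, str] = field(default_factory=dict)


def reprovision(machine: Machine, image: Image) -> Machine:
    """Destroy and recreate: the new machine is built only from the image."""
    return Machine(hostname=machine.hostname, image=image)


if __name__ == "__main__":
    old = Machine("edge-001", Image("base", "1.0"), {"hotfix": "applied by hand"})
    new = reprovision(old, Image("base", "1.1"))
    print(new.image.version, new.local_state)
```

Because nothing is carried over except the hostname, there is no per-machine record of changes to reconcile; the image version is the whole system of record.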

Immutable Edge Infrastructure would mean building and deploying complete system images for our distributed environment. Clearly, this requires moving around larger images than just pushing patches; however, these uploads can easily be staged and they provide critical repeatability in management. The alternative is trying to keep track of which patches have been applied successfully to distributed systems. Based on personal experience, having an atomic deliverable sounds very attractive.
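One way to picture the "atomic deliverable" is to identify each complete image by a content digest. A site's state then collapses to a single value that either matches the desired image or does not, instead of a list of applied patches. The Python sketch below assumes SHA-256 digests and uses hypothetical helper names (`image_digest`, `verify_staged`, `sites_to_reimage`); it does not describe how any particular provisioning tool works.

```python
import hashlib
from typing import Dict, List


def image_digest(image_bytes: bytes) -> str:
    """Identify a complete system image by its SHA-256 content digest."""
    return hashlib.sha256(image_bytes).hexdigest()


def verify_staged(image_bytes: bytes, expected_digest: str) -> bool:
    """Check a staged upload before it is activated at a site."""
    return image_digest(image_bytes) == expected_digest


def sites_to_reimage(desired_digest: str, reported: Dict[str, str]) -> List[str]:
    """Return the sites whose running image differs from the desired one."""
    return sorted(site for site, digest in reported.items()
                  if digest != desired_digest)


if __name__ == "__main__":
    golden = b"contents of the new golden image"
    desired = image_digest(golden)
    reported = {"site-a": desired, "site-b": image_digest(b"older image")}
    print(verify_staged(golden, desired), sites_to_reimage(desired, reported))
```

Staging fits naturally here: an image can be uploaded slowly over limited bandwidth, verified against its digest, and only then switched in.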

CI/CD and Immutable patterns are deep and complex subjects that go beyond the scope of a single post; however, they also offer a concrete basis for building manageable data centers.

The takeaway is that we need to look first to patterns for managing distributed software at scale to help build robust edge infrastructure platforms. Picking a cloud platform before we’ve figured out these concerns is a waste of time.