Saturday, December 29, 2012

Over the last few days, a few of the engineers at Transcend Computing have been discussing what we could have done to help Netflix avoid their Christmas outage. For those of you who aren’t aware, AWS suffered an outage in the Elastic Load Balancer (ELB) service in the East Region.

In the middle of our discussions on creating massively scalable, highly available, clustered load balancers with feature parity to ELB, I caught a post by Diane Mueller at ActiveState. The gist of her post is that Netflix went down because of AWS, but her personal app (which leveraged FeedHenry and Stackato) was revived after 10 minutes. The post seems to imply that if you use a PaaS (like Stackato), you can switch clouds easily, like she did when she moved her application to the HP Cloud.

I’ll avoid the overly dramatic retort, but let’s just say that I disagree with Diane’s implication. Here’s my position: if core Netflix
applications were negatively affected by any core service (such as ELB), it
would be extremely difficult to quickly switch to another cloud. Here are some
specifics:

No disrespect to my friends on the HP Cloud team, but I honestly believe that if Netflix had done a sudden switch from AWS to HP, it would have brought HP Cloud to its knees. ELBs (if they had them) would have been crushed and Internet gateways would have been overloaded. Finding a very large number of idle servers may also have been a challenge.

In this imaginary scenario, I guess we’ll assume that Netflix had decided to keep their movie library and all application services running on multiple clouds. Sure, this would be expensive, but a just-in-time copy of the data from one location to the other wouldn’t have been realistic.

Netflix has done a great job of publishing their technical architecture: EMR, ELB, EIP, VPC, SQS, Auto Scaling, etc. None of these are available in the solution Diane prescribed (Stackato), nor does HP Cloud offer them natively. There is a complete mismatch of services between the clouds. Cloud Foundry offers some things that are ‘similar’, but I’m concerned that they wouldn’t have offered performance at scale.
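
To make the mismatch concrete, here’s a minimal sketch of how quickly application code couples itself to AWS-specific semantics. It uses today’s boto3 library purely for illustration (the queue name and processing stub are hypothetical); the point is that SQS features like long polling and receipt handles have no drop-in equivalent on a cloud that doesn’t offer the service.

```python
import boto3

def process(body):
    print("processing:", body)  # stand-in for real application work

# Hypothetical worker loop, bound to SQS-specific semantics.
sqs = boto3.client("sqs", region_name="us-east-1")
queue_url = sqs.get_queue_url(QueueName="encode-jobs")["QueueUrl"]

while True:
    # WaitTimeSeconds enables SQS long polling.
    resp = sqs.receive_message(QueueUrl=queue_url, WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        process(msg["Body"])
        # Acknowledgement is by receipt handle, an SQS-specific contract.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```

Port this worker to a cloud without an SQS equivalent and you’re rewriting the messaging layer, not just re-pointing an endpoint.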

Netflix has also created tools specific to the AWS cloud (Asgard, Astyanax, etc.) and has tuned off-the-shelf tools like Cassandra for AWS. These would have to be refined to work on each target cloud.

In summary, there’s little-to-no chance that Netflix could have quickly moved to ANY other cloud provider (including Rackspace or Azure), and there’s not a thing that Stackato could have done to alleviate the problem. All medium and large customers have real needs that are service-dependent. I’ve joked that Cloud Foundry is a toy. It is, but it’s a toy that is maturing and may eventually help with ‘real’ problems – but let’s be clear – that day isn’t today. Any suggestion that it is ready for a ‘Netflix-like’ outage is either naïve or intentionally misleading.

I’ve spent the last three years working on solving the AWS portability problem – and it’s a bitch. If, like Diane, you have a simple app, my solution, TopStack, will work. It replicates core AWS services for workload portability. As proud as I am of what the team at Transcend Computing has done, I’m also quick to note that cloning any of the AWS services at massive scale, with minimal downtime, across heterogeneous cloud platforms and providers is an incredibly tough problem.

Here’s my belief: running the Transcend Computing ELB service on HP Cloud would not have worked for Netflix in their time of need. Our software would have been crushed. HP’s cloud would have been crushed. Netflix’s homegrown software wouldn’t have been ‘practically portable’. It would not have worked.

I’m happy to acknowledge where we suck. We’ll continue to learn from the unfortunate incidents that AWS, Netflix and others encounter. My 2013 prediction for Transcend Computing is this: we’ll suck less. Acknowledging reality is the first step.

Saturday, November 24, 2012

As an advisor to some of the world’s largest companies, it’s
my job to keep up with advances in technology.
I’m paid to answer questions like, “what’s after cloud?” I’ve thought a
lot about this very question and I’ve formed my answer: “More Cloud”. I believe that many new innovations will be packaged as ‘cloud’, and the combined ecosystem of innovation will outweigh other non-server-side contenders.

Clouds promote automation, computing efficiency and higher service levels. Public clouds add the outsourcing model, while private clouds leverage existing infrastructure. Despite the value clouds offer, investments made in cloud computing by both vendors and buyers have been insignificant relative to the size of the opportunity. I believe that the next several decades will be dominated by a single computing paradigm: cloud.

From Structured Programming to Cloud Elasticity

The magic of cloud is the ability of a service to provision additional computing capacity to solve the problem without the user being aware. Cloud offerings are divided into sub-systems that perform a specific function and can be called over a network via a well-defined interface. For the uninitiated, we call this a service-oriented architecture (SOA). Cloud offers a variety of services such as compute-as-a-service and database-as-a-service. The service-oriented approach allows an implementer to swap out the internals of a service without impacting the users. This concept is borrowed from prior art (structured programming, OOD, CBD, etc.). While SOA extends prior paradigms to embrace distributed computing, cloud extends SOA to solve issues related to quality attributes, or non-functional concerns, such as scalability and availability. Cloud services respond to requests from various users/consumers, where each request varies in complexity to the point where the amount of computational power needed to satisfy the requests will vary over time.
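
As a minimal sketch of that swap-out property (the interface and implementations below are hypothetical, not any particular vendor’s API), consider a storage service defined purely by its contract:

```python
from abc import ABC, abstractmethod

class BlobStore(ABC):
    """The service contract; consumers depend only on this interface."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryStore(BlobStore):
    """One implementation; the operator could swap in an SSD-backed or
    replicated store without consumers noticing."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]

def archive(store: BlobStore) -> None:
    # Written against the interface, not the implementation.
    store.put("report", b"quarterly numbers")
    print(store.get("report"))

archive(InMemoryStore())
```

Swap InMemoryStore for any other BlobStore and archive() is untouched; cloud services apply the same property at data-center scale.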

Encapsulated Innovation

The as-a-service model encapsulates (or hides) new
innovations behind the service interface. For example, when Solid State Drives
began delivering fast IO access at competitive prices, cloud storage services
began using them under the covers. When new patterns and algorithms are invented, we see them turned into as-a-Service offerings:

MapReduce becomes the AWS Elastic MapReduce (EMR) service

Dynamo and eventual consistency become AWS DynamoDB / MongoDB-as-a-Service

Dremel becomes Google BigQuery

Significant innovations will continue to unfold, but the vehicle for delivering those innovations will be as-a-Service (SOA) with elastic infrastructure (cloud). Said another way, cloud will be awarded the credit for innovation because it is the delivery vehicle of the innovation. This might seem like an inappropriate assignment of credit, but in many cases the cloud model may be the only practical means of delivering highly complex, infrastructure-intensive solutions. For example, setting up a large Hadoop farm is impractical for many users, but using one that is already in place (e.g., AWS EMR) brings the innovation to the masses. In this sense, the cloud isn’t the innovation, but it is the agent that ignites its viability.
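
To illustrate how little of the underlying Hadoop farm the user touches, here’s a hedged sketch of requesting a managed cluster from EMR. It uses today’s boto3 API for illustration; the job name, instance types, S3 paths and jar are all hypothetical.

```python
import boto3

emr = boto3.client("emr", region_name="us-east-1")

# One API call stands in for racking, installing and tuning a Hadoop farm.
emr.run_job_flow(
    Name="wordcount-demo",                     # hypothetical job name
    ReleaseLabel="emr-6.15.0",
    Applications=[{"Name": "Hadoop"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,  # tear down when the step finishes
    },
    Steps=[{
        "Name": "wordcount",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "s3://my-bucket/wordcount.jar",        # hypothetical jar
            "Args": ["s3://my-bucket/in", "s3://my-bucket/out"],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```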

Metcalfe’s Law

A cloud is a collection of nodes that interact across multiple layers (e.g., security, recovery, etc.). As the collection of nodes grows, so does the value of the cloud. If this sounds familiar, it’s rooted in network theory (Metcalfe’s Law, Reed’s Law, etc.). To liberally paraphrase, these laws state that the value of a network increases as more nodes, users and content are added to the network. I’d argue that the same model holds true for cloud: as the size of a cloud grows (machines, users, as-a-Service offerings), the value of the cloud grows super-linearly. Any solution that is able to accumulate value in a non-linear fashion becomes very difficult to replace. The traditional killer of network value propositions is when a new innovation kills the original, or when the network gets dirty (too costly, too complicated, etc.). In theory, SOA and the cloud delivery model exhibit inherent properties that counter these concerns.
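
A quick back-of-the-envelope sketch of those laws (the node counts are arbitrary) shows why the accumulated value is non-linear:

```python
# Rough value models from network theory, for illustration only.
def metcalfe(n: int) -> int:
    return n * (n - 1) // 2   # value ~ number of possible pairs

def reed(n: int) -> int:
    return 2 ** n - n - 1     # value ~ number of possible sub-groups

for n in (10, 20, 40):
    print(f"n={n}: Metcalfe={metcalfe(n)}, Reed={reed(n)}")
# Doubling the nodes roughly quadruples the Metcalfe value and
# explodes the Reed value, which is why such value is hard to replace.
```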

Incremental Funding

A significant attribute of cloud is that it grows ‘horizontally’: a cloud operator can add another server or storage system incrementally. Unlike the mainframe, you can grow a cloud using small, inexpensive units. This characteristic encourages long-term growth. Anyone who has had to fight for I.T. budget will recognize the importance of being able to leverage agile funding models. It’s more than a nicety; it’s a Darwinian survival method during depressed times. Cloud, like a cockroach, will be able to survive the harshest of environments.

Data Gravity (Before and After)

Dave McCrory suggested the concept of Data Gravity: “Data
Gravity is a theory around which data has mass.
As data (mass) accumulates, it begins to have gravity. This Data Gravity pulls services and
applications closer to the data. This attraction (gravitational force) is
caused by the need for services and applications to have higher bandwidth
and/or lower latency access to the data.” McCrory’s concept suggests an
initial barrier to cloud adoption (moving data to the cloud), but also suggests
that once data has been moved, more data will accumulate, increasing the difficulty of moving off of the cloud. This model jibes with the modern engineering belief that it’s better to move application logic to the data rather than the reverse. As clouds accumulate data, Data Gravity suggests that even more data (and logic) will accumulate.

The Centralization-Decentralization Debate

One of my first managers told me that I.T. goes through
cycles of centralization and decentralization. At the time he mentioned it, we
were moving from mainframes to client/server. He noted that when control moves too far away from the people doing the work, there is a natural reaction to take power back from the central authority in order to regain enough control to solve your own problems. Of course, cloud attempts to balance this concern. The cloud is usually considered a centralized model due to the homogeneous nature of the data centers, servers, etc. However, the self-service aspect of cloud attempts to push power to the end user. Cloud is designed to be the happy medium between centralized and decentralized; only time will tell if it resolves the tension.

In summary, I believe that multiple large innovations are
coming but many, if not most, will be buried behind an as-a-Service interface
and we’ll call them cloud. When I watch TV, I’m rarely aware of the innovations
in the cameras, editing machines, satellites or other key elements of the
ecosystem. From my perspective, TV just keeps getting better (it’s magic). The
cloud encapsulates innovation in a similar manner. In some ways, it is
unfortunate that new innovations will be buried by the delivery model, but fundamentally, it’s this very abstraction that will ensure cloud’s survival and
growth.

Monday, November 19, 2012

Most people would agree: Amazon Web Services is crushing
their competition. Their innovation is leading edge, their rate of introducing
new products is furious and their pricing is bargain-basement low.

This is a tough combination to beat! How do they do it?

The Power of Permutations

Amazon’s offering takes
a layered approach. New solutions are introduced at any of the Five Layers and
are then combined with the other layers. By creating solutions with interchangeable
parts, they’ve harvested the power of permutations via configurable systems.

Platform

Take an example starting with a new platform. Let’s imagine
that Amazon were to offer a new Data Analytics service. They’d likely consider
the offering from two angles: 1) How do we support current analytics platforms
(legacy)? and 2) How do we reinvent the platform to take advantage of
scale-out, commodity architectures? Amazon typically releases new platforms in
a way that supports current customer needs (e.g., support for MySQL, Oracle,
etc.) and then rolls out a second way that is proprietary (e.g., SimpleDB,
DynamoDB) but arguably a better solution for a cloud-based architecture.

Data Center: When Amazon releases a new offering, they rarely release it to all of their data
centers at the same time. We’d expect them to deliver it in their largest center:
the AWS East Region where it would be delivered across multiple availability
zones. After some stabilization period, the offering would likely be delivered
in all US regions, or even globally. Later, it would be added to restricted
centers like GovCloud. Amazon is careful to release a new offering in a limited
geography for evaluation purposes. Over time, the service is expanded
geographically.

Virtualized Infrastructure: The new service would likely use hardware and
storage devices best suited for the job (large memory, high CPU, fast network).
It’s common to see Amazon introduce new compute configurations that were driven
by the needs of their platform offerings. Over time, the offerings are extended
to use additional support services. This might include things like ways to back
up the data or patch the service. Naturally, we’d expect that as even newer
infrastructure offerings became available, we’d be able to insert them into our
platform configuration.

Cross-Cutting Services: For every service introduced, there are a number of “crosscutting services” that intersect all of the offerings. Amazon’s first priority is usually to update their UI console, which enables convenient administration of the service. Later, we’d expect the service to be added to their monitoring system (CloudWatch) and their orchestration service (CloudFormation), and to ensure that it could be secured via their permissions system (IAM). These three crosscutting services are key enablers of the automation story that Amazon offers.
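
As a sketch of what “added to their monitoring system” looks like from the outside, wiring a new service’s metric into CloudWatch is a single call. This uses today’s boto3 API for illustration; the alarm name, dimension and threshold are hypothetical.

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Hypothetical alarm on a load balancer's latency metric.
cloudwatch.put_metric_alarm(
    AlarmName="demo-service-high-latency",
    Namespace="AWS/ELB",
    MetricName="Latency",
    Dimensions=[{"Name": "LoadBalancerName", "Value": "demo-elb"}],
    Statistic="Average",
    Period=60,                  # seconds per datapoint
    EvaluationPeriods=5,        # must hold for five consecutive periods
    Threshold=1.0,              # average latency in seconds
    ComparisonOperator="GreaterThanThreshold",
)
```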

Economics: Perhaps the only thing Amazon enjoys more than creating new cloud services is
finding interesting ways to price them. For any new offering, we would expect Amazon to have multiple ways to price the
offerings. If it was for a legacy platform, we’d expect to be billed by the
size of the machines and the number of hours that they ran, and the disk and
network that they used. If it was a next-generation platform, we’d expect to be
billed on some new concept – perhaps the number of rows analyzed, or rows
returned on a query. Either way, we’d expect that the price of the offering
will come down over time due to Amazon’s economies of scale and efficiency.
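
A toy comparison of the two billing models makes the difference concrete; every rate and quantity below is made up purely for illustration.

```python
# Legacy model: pay for machine-hours.
hourly_rate = 0.12                          # hypothetical $/instance-hour
machine_bill = 4 * hourly_rate * 24 * 30    # four instances for a month

# Next-generation model: pay per unit of work.
rate_per_million_rows = 0.05                # hypothetical $/million rows scanned
rows = 2_000_000_000
query_bill = (rows / 1_000_000) * rate_per_million_rows

print(f"machine-hours model: ${machine_bill:.2f}")  # $345.60
print(f"per-query model:     ${query_bill:.2f}")    # $100.00
```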

The Amazon advantage isn’t about any one service or offering. It’s a combinatorial solution. They have found a formula for decoupling their offerings in a way that enables rapid new product introduction and, perhaps more importantly, the ability to upgrade their offerings in a predictable and leveraged manner over time. Their ability to combine two or more products to create a new offering gives them ‘economies of scope’. This is a fundamental enabler of product diversification and leads to a lower average cost per unit across the portfolio. Amazon’s ability to independently control the Five Layers has given them a repeatable formula for success. Next time you read about Amazon introducing the XYZ platform, in the East Region, using Cluster Compute instances, hooked into CloudWatch, CloudFormation and IAM, with Reserved Instance and Spot Instance pricing – just remember, it’s no accident. Service providers who aren’t able to pivot at the Five Layers may find themselves obsolete.

Saturday, June 02, 2012

Jason Bloomberg wrote a thought-provoking article, "Why You Really, Truly Don't Want a Private Cloud". The article reviews the benefits of public cloud and then challenges the ability of a private cloud to deliver the same benefits. Unfortunately, I think Jason's conclusions are wrong. I want to be clear about two things: my day-to-day experience and my potential conflicts of interest.

Conflicts of Interest: MomentumSI consults with organizations on how to select private clouds, install/configure them, monitor, manage, govern and secure them. Transcend Computing provides software that makes the private cloud run more like Amazon. Each company does a significant amount of work in public cloud. We love both public and private clouds.

My Day-to-Day Experience: I have teams of consultants and engineers who use private cloud and public cloud on every assignment. They have done so for years. Cloud is their default deployment model. Several of my younger team members have never worked with physical servers/disks/network devices - they only know IaaS/PaaS. Team members switch between public and private clouds like they're switching from a pen to a pencil. They don't think twice about it - they just do it. The reasons why they select one over the other are common sense to them (and anyone who has access to both):

They run elastic/bursty jobs in the public cloud

Most new production applications are run in the public cloud because there is built-in disaster recovery, elastic scaling and a global footprint: Availability Zones, Regions, etc.

Pre-production staging environments are done in the public cloud because we want them to mirror the production architecture.

Most of our legacy COTS applications have been moved to the private cloud. We watch them closely, optimize their environments when needed and avoid violating the license agreements which often prohibit their execution in a public cloud.

Most dev/test is done in our private clouds. We run Eucalyptus, OpenStack and CloudStack. Most companies wouldn't do this, but we do given the nature of our consulting. Developers prefer private clouds when they want:

Low latency access to their cloud (for themselves or other applications)

Low-level probing that filters out multi-tenant noisy neighbors

Constant booting of a machine (fast and cheap)

More choice in the cloud hardware configuration (Amazon is getting there, but still has a long way to go...)

We see more experimentation being done on the private cloud (fixed/sunk costs). Most team members are keenly aware of the large public cloud bills that they've generated.

The reasons why a person might use one or the other are in some ways irrelevant. The fact is, they do. I'm proposing the following:

"When I.T. staff are given access to a public and a private cloud, they will use both. Either way, they will get their work done faster and ultimately save their employer money in labor and asset costs."

I had a really hard time swallowing Jason's analysis that clouds are only good when you rent them from a third party like Amazon or Rackspace. Using the vehicle analogy, I believe that it's OK to own your car and to rent other vehicles when needed (e.g., boats, RVs, taxis, vacation bikes, jet skis, limos, and so on). It's not an all-or-nothing proposition.

The one thing I want to leave you with is this: I.T. staff will use both - and they'll figure out when to use each. They're not dumb. They don't blindly listen to bloggers, authors, analysts or tweeters. Give them access and let them do what you pay them to do. Empower them.

Wednesday, April 04, 2012

I’m excited to announce that today, April 4th, our new company Transcend Computing is emerging from stealth mode. In short, we are launching:

StackStudio is a visual, drag-and-drop online
development environment for assembling multi-tier application topologies using
the Amazon CloudFormation format. Application stacks assembled with StackStudio
are ready to run on Amazon Web Services (AWS) and on other public and private
ACE platforms.

These stacks can then be shared with other developers in StackPlace,
which was also launched today as an open social architecture community
sponsored by Transcend Computing. StackPlace allows developers to create,
contribute, consume and collaborate on ACE-compatible application topologies.
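
For readers unfamiliar with the CloudFormation format mentioned above, here's a minimal sketch of a CloudFormation-style template and a stack launch. It uses today's boto3 API for illustration; the AMI ID, instance type and stack name are placeholders, not StackStudio output.

```python
import json
import boto3

# A deliberately tiny single-resource topology in CloudFormation format.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-00000000",   # placeholder AMI ID
                "InstanceType": "t2.micro",
            },
        }
    },
}

cfn = boto3.client("cloudformation", region_name="us-east-1")
cfn.create_stack(StackName="demo-stack", TemplateBody=json.dumps(template))
```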

This is an exciting time for the Momentum family. As most of
you know, we’ve been incubating this program for the last couple of years. The
initial offering is a SaaS solution used to create multi-part applications on
AWS. In the coming months, we'll be introducing additional on-premises services.

It's been fun to watch the transition from SOA to 'as-a-Service'. There's little doubt that cloud (IaaS/PaaS) is the new model for application development and deployment. This is one of those few areas where both engineers and executives can agree on a new paradigm. This recipe for success will unfold over the coming years - and we're excited to be leaders in this new movement.