The key takeaways are that portability is hard and we’re still working out the impact of container architecture.

The benefit of the longer interview is that we really dig into the reasons why portability is hard and discuss ways to improve it. My personal SRE posts and those on the RackN blog describe operational processes that improve portability. These are real concerns for all IT organizations because mixed and hybrid models are a fact of life.

If you are not actively making automation that works against multiple infrastructures then you are building technical debt.

Of course, if you just want the snark, then jump forward to 24:00 minutes in where we talk future of Kubernetes, OpenStack and the inverted intersection of the projects.

Software development technology is so frothy that we’re developing collective immunity to constant churn and hype cycles. Lately, every time someone tells me that they have hot “picked technology Foo” they also explain how they are also planning contingencies for when Foo fails. Not if, when.

Required contingency? That’s why I believe 2017 is the year of the IT Escape Clause, or, more colorfully, the IT Crawfish.

When I lived in New Orleans, I learned that crawfish are anxious creatures (basically tiny lobsters) with powerful (and delicious) tails that propel them backward at any hint of any danger. Their ability to instantly back out of any situation has turned their name into a common use verb: crawfish means to back out or quickly retreat.

In IT terms, it means that your go-forward plans always include a quick escape hatch if there’s some problem. I like Subbu Allamaraju’s description of this as Change Agility. I’ve also seen this called lock-in prevention or contingency planning. Both are important; however, we’re reaching new levels for 2017 because we can’t predict which technology stacks are robust and complete.

The fact is the none of them are robust or complete compared to historical platforms. So we go forward with an eye on alternatives.

How did we get to this state? I blame the 2016 Infrastructure Revolt.

Way, way, way back in 2010 (that’s about bronze age in the Cloud era), we started talking about developers helping automate infrastructure as part of deploying their code. We created some great tools for this and co-opted the term DevOps to describe provisioning automation. Compared to the part, it was glorious with glittering self-service rebellions and API-driven enlightenment.

In reality, DevOps was really painful because most developers felt that time fixing infrastructure was a distraction from coding features.

In 2016, we finally reached a sufficient platform capability set in tools like CI/CD pipelines, Docker Containers, Kubernetes, Serverless/Lambda and others that Developers had real alternatives to dealing with infrastructure directly. Once we reached this tipping point, the idea of coding against infrastructure directly become unattractive. In fact, the world’s largest infrastructure company, Amazon, is actively repositioning as a platform services company. Their re:Invent message was very clear: if you want to get the most from AWS, use our services instead of the servers.

For most users, using platform services instead of infrastructure is excellent advice to save cost and time.

The dilemma is that platforms are still evolving rapidly. So rapidly that adopters cannot count of the services to exist in their current form for multiple generations. However, the real benefits drive aggressive adoption. They also drive the rise of Crawfish IT.

This post is a collaboration between three Dell Cloud activists: Rob Hirschfeld (@zehicle), Joseph B George (@jbgeorge) and Stephen Spector (@SpectoratDell).

We’re not making predictions for the “whole” Cloud market, this is a relatively narrow perspective based on technologies that on our daily radar. These views are strictly our own and based on publicly available data. They do not reflect plans, commitments, or internal data from our employer (Dell).

The major 2012 theme is cloud coalescence. However, Rob worries that we’ll see slower adoption due to lack of engineers and confusing names/concepts.

Here are our twelve items for 2012:

Open source continues to be a disruptive technology delivery model. It’s not “free” software – there’s an emerging IT culture that is doing business differently, including a number of large enterprises. The stable of sleeping giant vendors are waking up to this in 2012 but full engagement will take time.

Linux. It is the cloud operating system and had a great 2012. It seems silly pointing this out since it seems obvious, but it’s the foundation for open source acceleration.

Tight market for engineering and product development talent will get tighter. The catch-22 of this is that potential mentors are busy breaking new ground and writing code, making it hard for new experts to be developed.

On track, OpenStack moves into its awkward adolescence. It is still gangly and rebelling against authority, but coming into its own. Expect to see a groundswell of installations and an expected wave of issues and challenges that will drive the community. By the “F” release, expect to see OpenStack cement itself as a serious, stable contender with notable public deployments and a significant international private deployment foot print.

We’ll start seeing OpenStack Quantum (networking) in near-production pilots by year end. OpenStack Quantum is the glue that holds the big players in OpenStack Nova together. The potential for next generation cloud networking based on open standards is huge, but it will emerge without a killer app (OpenStack Nova in this case) pushing it forward. The OpenStack community will pull together to keep Quantum on track.

Hadoop will cross into mainstream awareness as the need for big data analysis grows exponentially along with the data. Hadoop is on fire in select circles and completely obscure in others. The challenge for Hadoop is there are not enough engineers who know how to operate it. We suspect that lack of expertise will throttle demand until we get more proprietary tools to simplify analysis. We also predict a lot of very rich entrepreneurs and VCs emerging from this market segment.

DevOps will enter mainstream IT discussions. Marketers from major IT brands will struggle and fail to find a better name for the movement. Our prediction is that by 2015, it will just be the way that “IT” is done and the name won’t matter.

KVM continues to gain believers as the open source hypervisor. In 2011, I would not have believed this prediction but KVM making great strides and getting a lot of love from the OpenStack community, though Xen is also a key open source technology as well. I believe that Libvirt compatibility between LXE & KVM will further accelerate both virtualization approaches.

Big Data and NoSQL will continue to converge. While NoSQL enthusiasm as a universal replacement for structured databases appears to be deflating, real applications will win.

Java will continue to encounter turbulence as a software platform under Oracle’s overly heady handed management.

PaaS continues to be a confusing term. Cloud players will struggle with a definition but I don’t think a common definition will surface in 2012. I think the big news will be convergence between DevOps and PaaS; however, that will be under the radar since most of the market is still getting educated on both of those concepts.

Hybrid cloud will continue to make strides but will not truly emerge in 2012 – we’ll try to develop this technology, and expose gaps that will get us there ultimately (see PaaS and Quantum above)

Sometimes a single sprint can deliver magic: when I signed up to document how to create a Crowbar module (aka a barclamp) two weeks ago, I had no idea that it would add a new flavor to Crowbar .

I’m proud to announce that the first public non-Dell Crowbar module will be supporting the VMware Cloud Foundry Open PaaS project.

Development is still in progress (on the Crowbar “CF” branch) and you’ll be able to watch us (even help!) collaborate on this project. Initially, the deployment will be to single server but we’re hoping to quickly expand to a distributed install that fully leverages the capabilities of both projects.

By creating a Crowbar module, Cloud Foundry™ is able to leverage the cloud deployment capabilities that allow it to be setup on any physical or virtualized data center. This is core to the Crowbar message: the value of a cloud solution can best be realized when it’s coupled with open practices for deploying it.

There are many significant aspects of this collaboration:

Cloud Foundry is taking the right approach to PaaS. Their team’s perspective on PaaS mirrors my own: A PaaS is a collection of application services. That approach makes it extensible and flexible. Plus, they are also multi-language and multi-platform.

Crowbar is proving our breadth of support. Last week we announced coming RHEL support and now adding Cloud Foundry is a natural extension. We did not design Crowbar to be a one-trick pony. It’s modular design makes it easy to extend while leveraging the existing body of work.

Big companies are acting like start-ups. Both Crowbar and Cloud Foundry are projects that focus on putting the core functionality out quickly to prove their value proposition, get feedback, and change the game. This collaboration is positive proof of these companies being Agile and starting a project Lean.

Big companies are acting in the open. Both Dell via Crowbar and VMware via Cloud Foundry are contributing their source and working on it in the open.

Stay tuned for that “how to create a barclamp” post (or check out the barclamp rake task).

In addition to attending the great sessions at the OpenStack Design Conference, our Dell team realized that we’ve been making Platform as a Service (PaaS) much more complex. Stripping away the detritus is important because it looks like “What is a PaaS” is changing on a daily basis so boiling it down to the must fundamental is essential.

At its core, a PaaS is an application that changes its architecture based on the load. That’s it no further definition is required.

I’ve been playing with this definition since April and am finding that it’s a much more productive definition of PaaS than any that I’ve used so far. The reason is that it’s

application focused,

not language or services bound and

captures the business use cases

Of course, I’m going to have to provide more backup in future posts. I want to invite discussion about this perspective on PaaS. I’m especially interesting in seeing how recent offerings from VMware (OpenPaaS/CloudFoundry) or Amazon (Elastic Beanstalk) measure against this concept.

We’ve assembled a top 20 list of things to know about programming for Azure (and really any PaaS leaning cloud):

If you want performance, optimize to reduce fees. Azure (and any cloud) is architected to penalize you if you use their resources poorly. The challenge is to fix this before your boss get the tab for your unenlightened design decisions.

Coding .NET on Azure easy, architecting for Azure requires learning. Clouds put things in different places than you are used to and the rules are different. Expect a learning curve.

Partitioning = parallelism. Learn to love partitions in all their forms, because your app will be throttled if you throw everything into a single partition! On the upside, each partition operates in parallel and even better, they usually don’t cost extra (SQL is the exception).

Roles are flexible. You can run web servers (Apache, etc) on a worker and worker tasks on a web role. This is a good way to save some change since you pay per role instance. It’s counter to separation of concerns, but financially you should also combine workers into a single role where possible.

Understand walking deployments. You can (and should) have simultaneous versions of the code operating against the same data so that you can roll upgrades (ala Timothy Fitz/Eric Ries) to reduce risk and without reducing performance. You should expect your data schema to simultaneously span mutiple code versions.

Learn about Update Domains (UDs). Deployment domains allow rolling upgrades and changes to Applications and Services. They are part of how you partition your overall application. If you’re planning a VIP swap deployment, then you won’t care.

Each service = ONE external IP. You can have many VMs backing each service (and multiple roles in a service) and Azure will load balance between them so you can scale out each service. Think of each service as a clonable entity: there will be at least 1 and more can be added if you want to scale.

Understand between VIP and DIP. VIPs stand for Virtual IPs and are external, public, and metered. DIPs are internal, private, and load balanced. Azure provides an API to discover your DIPs – do not assume you know them because they are DYNAMIC IPs. Azure won’t let you see other DIPs inside the system.

Azure has rich diagnostics, but beware. Azure leverages the existing diagnostics built into their system, but has to get the data off box since instances are volitile. This means that problems can be hard to isolate while excessive logging can impact performance and generate fees. Microsoft lets you target individual systems for elevated levels. You can also Terminal Server to a VM for troubleshooting (with caution).

The new Azure admin console rocks. Take your pick between Silverlight or MMC Snap-in.

Queues are essential, but tricky. Learn the meaning of idempotent because using queues requires you to handle failures and timeouts. The scary part is that it will work nicely until you exceed some limits and then you’ll experience cascading failure. Whee! Oh yea, and queues require polling (which stinks as a notification model).

SQL Azure is just mostly like MS SQL. Microsoft did a smart thing in keeping Cloud SQL so it was highly compatible with Local SQL. The biggest note is that limited in size of partition. If you embrace the size limits you will get better performance. So stop pushing BLOBs into databases and start sharding.

Duplicating data in tables will improve performance. This has to do with how partitions and keys operate but is an interesting architecture for NoSQL – stage data for use. Don’t be afraid to stage the same data in multiple ways. It may be faster/cheaper to write data twice if it becomes easier to find when you search it 1000s of times.

Table data can be “warmed up.” Storage has logic that makes frequently accessed items faster (sort of like a cache ;). If you can anticipate load spikes then you should warm the data just before the spike.

Storage billing is both amount and transactions. You can get burned on a small, but busy set of data. Note: you will pay even if you 404 a request.

Azure has a CDN. Leveraging Microsoft’s Content Delivery Network (CDN) will improve performance for your users with small, low latency, high request items. You need to change your URLs for those assets. Best practice is to use some versioning in the URI so that you can force changes. Remember, CDN is SLOWER for the first hit when the data is not in cache so avoid CDN for low volume assets!

Provisioning time is not instant. Azure needs anywhere from 1-3 minutes to spin a new instance of a role. Build this lag into your architecture and dynamic scale plans. New databases and partitions are fast.

The VM Role is maintained by YOU. Using the VM role is a handy shortcut, but has a long list of gotcha’s. Some of note: 1) the VM can be “reset” to the last VM image state that you uploaded, 2) you are responsble for VM OS upgrades and patches, 3) VMs must be clonable because they will operate in parallel.

Azure supports more than .NET. You can setup anything in a worker (and now VM) role, but there are nuances to doing this effectively. You really need to understand how Azure works and had better be ready to crack open Visual Studio for some things even if you’re writing in Java.

We hope this list helps you navigate Azure deployments. No matter what cloud you use, understanding Azure’s architecture will help you write better cloud scale applications.

While I was in Seattle for Azure training preparing for Dell’s Azure Appliance , Dave @McCrory suggested that we also attend the Seattle Cloud Camp (SCC Tweets). This event was very well attended (200 people!). With heavy attendance by Amazon (at their HQ), Microsoft (in the ‘hood), and Google, there was a substantial cloud vendor presence (>25% from those vendors alone). Notable omission: VMware.

My reflection about the event by segment.

Opening Sessions:

Most of the opening sessions were too light for the audience. I thought we were past the “what is cloud” level, sigh.

Of note, the Amazon security presentation by Steve Rileywas fun and entertaining.

Picking on a Dell competitor specifically: calling your cloud solution “WAS” is a branding #fail (not that DCSWA much is better).

Unpanel of self-appointed cloud extroverts experts:

The unpanel covered some decent topics (@adronbh captured them on twitter), unfortunately none of the answers really stood out to me. Except for NoSQL.

The unpanel discussion about NoSQL drew 2 answers. 1) It’s not NoSQL, it’s eventually consistent instead of strictly consistent. (note: I’ve been calling it “Storage++”) 2) We’ll see more and more choices in this area as we tune the models for utility then we’ll see some consolidation. The suggestion was that NoSQL would follow the same explosion/contraction pattern of SQL databases.

Session on Cloud APIs (my suggested topic)

The Cloud API topic was well attended (30+). The vast overwhelming majority or the attendees were using Amazon.

There was some interest in having “standard” APIs for cloud functions was not well received because it was felt to stifle innovation. We are still to early.

It was postulated but not generally agreed that cloud aggregation (DeltaCloud, RightScale, etc) is workable. This was considered a reason to not require standard clouds.

CloudCamp sponsor, Skytap, has their own API. These APIs are value added and provide extra abstraction levels.

It was said that there are a LOT (50 now, 500 soon) smaller hosts that want to enter the cloud space. These hosts will need an API – some are inventing their own.

I brought up the concept discussed at OpenStack that the logical abstraction for cloud network APIs is a “vlan.” This created confusion because some thought that I meant actual 802.1q tags. NO! I just meant that is was the ABSTRACTION of a VLAN connecting VMs together.

There was agreement from the clouderati in the room that cloud networking was f’ed up, but most people were not ready to discuss.

Cloud APIs have some basics that are working (semantics around VMs) but still have lots of wholes. Notably: networking, application, services, and identity)

Session on Google App Engine (GAE)

GAE is got a lot going on, especially in the social/mobile space.

Do not think a lack of news about GAE means that they are going slow, it’s just the opposite. It looks like they are totally kicking ass with a very focused strategy. I suspect that they are just waiting for the market to catch-up.

GAE understands what a “platform” really is. They talk about their platform as the SERVICES that they are offering. The code is just code. The services are impressive and include identity, mail, analysis, SQL (business only), map (as in Map-Reduce), prediction (yes, prediction!), storage, etc. The total list was nearly 20 distinct services.