
Over 20 years ago, I started my professional career by joining GE, working as a Systems Engineer for a large bank and doing several public sector assignments. After a few years, I worked for Sprint. Several years later, I founded my first startup, which was sold to Quest Software (which in turn was bought by Dell). After doing 6 months of consulting for a Grid computing company (remember those?), I started another company. This company would ultimately be bought by Solarwinds. I moved on to Dell’s DCS (hyperscale compute) group for about 9 months. It was then that I wrote about Data Gravity for the first time. I also discovered a project VMware was working on called Project Maple.

Project Maple was later renamed Cloud Foundry. Blogging about this discovery led to my recruitment by Jerry Chen to join the Cloud Foundry team. Working with the Cloud Foundry team in the early days was surreal, to say the least. Eventually I was recruited away to Warner Music Group, where I became SVP of Engineering, working for Jonathan Murray. At WMG, we built a first-of-its-kind software factory by leveraging Cloud Foundry OSS, which increased application delivery speed by an order of magnitude.

Just after the first version of the factory shipped, I was contacted by Adam Wray, asking if I was interested in joining him at Basho as part of a new funding round with a new investor. This sounded like a great opportunity to experience joining a NoSQL startup. After leaving Basho at the end of June, I found myself at a periodic point. Much like Nick Weaver announcing recently that he had returned to EMC, I have returned to GE.

I have joined the GE Digital business as VP of Software Engineering, working as part of the Wise.io team on Machine Learning for IIOT.

I see something that is beyond the other opportunities I considered. The most compelling reason is being able to have a profound effect on an incredibly large and diverse number of businesses, and therefore to affect a disproportionately large number of people’s lives in very positive ways. GE Digital’s Predix Platform directly supports all of the different GE Business Units’ IoT efforts, including the Oil & Gas, Energy, Aviation, Health, Power, and Transportation divisions, to name a few. The Wise.io team is amplifying the benefits and discoveries made by looking at all of this IoT data and applying machine learning to it. The work that GE Digital is doing with Predix and the Industrial Internet of Things is truly game-changing and life-changing.

The Wise.io team itself is comprised of some of the smartest people I have ever worked with, Josh Bloom being a prime example. All of them are humble and kind, yet wickedly smart. They have created a unique culture of diversity, happiness, positivity, humility, respect, and openness, all in a highly professional, productive environment that avoids unnecessary meetings. The Wise.io team is truly an incredible team, and I look forward to learning and growing with them.

Why me?

The unique challenges that the Wise.io team has are familiar to me: How do you grow/build a “startup” inside a large company, and how do you grow that team to scale? What are the processes needed to achieve this? What does the reporting structure look like? Where do you find talent? How do you bridge the large company with the startup inside?

Some of the technical challenges they face also line up with my experience. How do you run a PaaS at scale? How do you run a PaaS on a PaaS? How do you go about building and operating a software factory? Those are some of the examples of why this unique challenge and team were so attractive and such a great fit.

If you are looking for a great engineering or data science position, either live within commuting distance of San Francisco or would consider relocating, and believe that Machine Learning and IoT are the future, please send me a DM on Twitter (@mccrory).

This is Part Four in the series “The Future of Networks”. To start at the beginning, go here.

There are many possibilities as to where the previously mentioned trends drive networking. As I continue to interact with the community and evolve my thoughts about the Future of Networks, I am going to cover more of this. I am also going to continue this series past the 5th planned post, although the original 5th and final post will still go out tomorrow as planned.

Networks are trending toward commodity-based switching hardware, and this could be embraced further by following a more programmable paradigm. The hardware could be open sourced along with the network operating system (some switches are already running Linux or derivatives). If things go in this direction, there is an opportunity to leverage a large number of OSS projects and capabilities to rapidly evolve both the network and network devices. By leveraging open technology and placing some capabilities/services/workloads inside the switch itself, networks get access to new communities. These communities could be leveraged to further advance the move from running everything server side to moving the right components to the network, and everyone would benefit.

Imagine a commodity network device that has virtual interfaces that work with containers and VMs and can also support straight code execution. The containers/VMs/code run on CPU/memory in the network device. The network device itself is programmable for traditional Layer 2-4 tasks and can also route and manipulate data at Layer 7. This then allows the direct attached scenarios and traditional switching scenarios described in yesterday’s post. The Layer 7 capabilities could also be run in virtual switches or proxies, allowing ubiquitous services running across any environment.

Speaking of proxies and where the Future of Networking is headed, I’m currently at Gluecon, where yesterday the Istio project was announced. For those that missed it, Istio incorporates the work that Lyft did with Envoy along with several new control plane components, operationalizing microservices. What is meant by operationalizing microservices? Specifically, enabling things like Kubernetes to have selective traffic management and request routing, allowing canary deployments and A/B testing. It also includes service discovery and registration, along with authentication, policies, and more. The Envoy component acts as a sidecar/transparent proxy to each of the running Kubernetes instances to provide all of the network coordination and capabilities. The Istio project plans to support several other technologies besides Kubernetes, including Cloud Foundry and Mesos. It is also worth noting that Istio has been running inside Google for quite a while now and Envoy runs in production at Lyft servicing over 2M requests per second.
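To make the canary idea concrete, here is a minimal sketch in plain Python of percentage-based request routing. This is not Istio's or Envoy's API; the function name `pick_version` and the hashing scheme are my own assumptions, used only to illustrate sending a small, stable slice of traffic to a new version.

```python
import hashlib


def pick_version(request_id: str, canary_percent: int) -> str:
    """Return 'canary' or 'stable' for a request (hypothetical helper).

    The request ID is hashed into a bucket from 0 to 99, so the same
    request ID always lands on the same version.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


if __name__ == "__main__":
    counts = {"stable": 0, "canary": 0}
    for i in range(10_000):
        counts[pick_version(f"req-{i}", canary_percent=5)] += 1
    # Roughly 5% of requests should be routed to the canary.
    print(counts)
```

Hashing rather than picking at random keeps a given caller on one version, which makes canary results easier to reason about.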

How would our future network device work with something like Istio? Istio is a service mesh where the Envoy sidecar provides ingress and egress traffic management. Envoy is controlled by the Istio-Manager, which would be ideal to run in our new network switch device. The Mixer would also run in our new network switch device. Dependent services that are either critical or heavily used would also run on our new network switch, potentially being moved from a container on the server side to a container on the network switch side. The service registry and policies seem like great candidates to run here as well. The new network switch is critical to the system functioning, so why not put the critical services there? If the network is down, none of the services will function anyway.

This may all seem like it is far away, but there are many indicators that this could be done fairly quickly. The next post lists out some of the things that already exist that support the scenarios and thoughts from the previous posts.

For the past few months, I have been considering how Networking technology will (and in some cases should) evolve over the next 3–7 years. Networking technology has stagnated from the breakneck pace at which it once moved. During the 90’s we saw amazing boosts in capabilities, evolving protocols, performance, and overall incredible levels of innovation. Fast forward to 2017 and there has been no real evolution in the network space overall. Sure, network virtualization, SD-WAN, and NFV are useful technologies. These technologies give a lot of flexibility and capability to virtual machines and their hypervisors, along with container technologies such as Docker and Kubernetes. However, the capabilities of the network remain largely untapped in my opinion.

My initial thoughts around this began with a conversation I had with @jamesurqhart a few months ago, when he offhandedly said that all network switches/routers were really just computers. I had never looked at a network switch or router as a computer; I always viewed it as another specialized piece of hardware. The reality is that routers and switches are computers, purpose built with a number of ASICs in them. I considered this further and concluded that modern switching gear is becoming more like a commoditized component. This is partially due to large cloud providers such as Amazon, Microsoft, Facebook, and Google all creating their own switches and in some cases using their own ASICs.

[Above is a diagram of the Packet Flow on a Juniper Networks T Series Core Router] Notice all the ASICs...

Switches have become a commodity today and routers (I mean big routers) are really more like HPC devices. Big iron routers are jam-packed with lots and lots of capabilities, and many have large numbers of configurations available depending on the demands of the particular network they are supporting. Much like mainframes with specialized components, these systems are NOT becoming commoditized anytime soon; they are simply too specialized and complex. We have elite cloud companies that do their own ASIC design and can invest in switches in large quantities, resulting in large capex savings. These same companies aren’t developing any HPC/core routers, as their deployments of those are too few to justify the development expense, at least not currently.

As we move into the future, the trends of networking infrastructure point to continued commoditization. What will and needs to change is the view that switches are purely around to push packets. As you will see in this series of blog posts, there is a huge opportunity for network vendors and others to capitalize on several industry trends to create something beyond a traditional network.

Read part 2 of the series tomorrow to find out the additional trends that are affecting the Future of Networks.

It has been a while since I posted anything referring to Data Gravity. While Data Gravity is interesting and can explain many motivations of Cloud Companies and their Data Services, there are other influential forces at work.

Service Energy

What am I referring to as a Service in this case? Any code or logic that has been deployed by a provider to expose a resource.

Examples include:

APIs

Message Queues and Buses

Automation, Scripting, and Provisioning Interfaces

Web Services

Many more…

When resources are externalized, this is what enhances the value of Data and helps increase Data Mass and Data Gravity. As a Service is used more frequently, the amount of energy it is emitting increases in our analogy. The emitted energy has effects just as it would in Physics. Service Energy has the ability to assist in Escape Velocity as well as increase Data Gravity, all depending on what the Service Energy is doing.
Service Energy shows motivations in Clouds for specific behaviors such as:

Why Salesforce acquired Heroku (Heroku is indeed a Ruby PaaS, but it was beginning to bring in SERVICES from outside, which increased its Service Energy). Salesforce needs this in the Ecosystem, just like it needed to create Database.com to help increase its Data Mass and therefore its Data Gravity.

Why Amazon created SQS and SES (these are services that encourage additional consumption of Compute but, more so, amplify the amount of data, i.e. Data Mass)

It should be noted in the picture above that the Data is made accessible through a service which is why it has Service Energy around it, which should be distinguished from Data Gravity. Remember, Service Energy does NOT attract, but can amplify.

Service Energy also can be used for Escape Velocity. By properly architecting applications and even Service Oriented Platforms, the Data Mass can be spread across many providers (and even sources inside of those providers). This provides looser coupling between the App and a specific Cloud, which gives more flexibility. The trade-off is that this design is more prone to service interruptions, latency, and bandwidth constraints.

There is much more to be said about Service Energy in the future including exploring other effects it has with more IaaS centric solutions.

We’ve assembled a top 20 list of things to know about programming for Azure (and really any PaaS leaning cloud):

If you want performance, optimize to reduce fees. Azure (and any cloud) is architected to penalize you if you use their resources poorly. The challenge is to fix this before your boss gets the tab for your unenlightened design decisions.

Coding .NET on Azure is easy, architecting for Azure requires learning. Clouds put things in different places than you are used to and the rules are different. Expect a learning curve.

Partitioning = parallelism. Learn to love partitions in all their forms, because your app will be throttled if you throw everything into a single partition! On the upside, each partition operates in parallel and even better, they usually don’t cost extra (SQL is the exception).
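As a rough illustration of spreading load across partitions, the sketch below derives a partition key from a hash of the entity ID. It is plain Python and does not call any Azure API; `partition_key` and the count of 16 partitions are hypothetical choices for the example.

```python
import hashlib
from collections import Counter


def partition_key(entity_id: str, partition_count: int = 16) -> str:
    """Map an entity ID to one of `partition_count` keys (hypothetical helper)."""
    digest = hashlib.md5(entity_id.encode("utf-8")).hexdigest()
    return f"p{int(digest, 16) % partition_count:02d}"


if __name__ == "__main__":
    # Count how many of 1,000 sample IDs land in each partition.
    spread = Counter(partition_key(f"user-{i}") for i in range(1000))
    for key in sorted(spread):
        print(key, spread[key])
```

The trade-off is that a hashed key gives up range queries across partitions, so this fits workloads that look entities up by ID.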

Roles are flexible. You can run web servers (Apache, etc) on a worker and worker tasks on a web role. This is a good way to save some change since you pay per role instance. It’s counter to separation of concerns, but financially you should also combine workers into a single role where possible.

Understand walking deployments. You can (and should) have simultaneous versions of the code operating against the same data so that you can roll upgrades (ala Timothy Fitz/Eric Ries) to reduce risk without reducing performance. You should expect your data schema to simultaneously span multiple code versions.
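One way to picture a schema that spans code versions is a reader that accepts both the old and new record shapes. This is only a sketch: the `schema_version` field, the record layouts, and `read_customer` are all made up for illustration.

```python
def read_customer(record: dict) -> dict:
    """Normalize a stored record, whichever code version wrote it (hypothetical)."""
    # Assumption: records written before versioning have no schema_version field.
    version = record.get("schema_version", 1)
    if version == 1:
        # Version 1 stored the full name in a single field.
        first, _, last = record["name"].partition(" ")
    elif version == 2:
        # Version 2 split the name into two fields.
        first, last = record["first_name"], record["last_name"]
    else:
        raise ValueError(f"unknown schema_version: {version}")
    return {"first_name": first, "last_name": last}


if __name__ == "__main__":
    old = {"name": "Ada Lovelace"}
    new = {"schema_version": 2, "first_name": "Ada", "last_name": "Lovelace"}
    print(read_customer(old) == read_customer(new))  # True
```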

Learn about Update Domains (UDs). Deployment domains allow rolling upgrades and changes to Applications and Services. They are part of how you partition your overall application. If you’re planning a VIP swap deployment, then you won’t care.

Each role = ONE external IP. You can have many VMs backing each role and Azure will load balance between them so you can scale out each role. Think of each role as a clonable entity: there will be at least 1 and more can be added if you want to scale.

Understand the difference between VIP and DIP. VIPs are Virtual IPs and are external, public, and metered. DIPs are internal, private, and load balanced. Azure provides an API to discover your DIPs – do not assume you know them because they are DYNAMIC IPs. Azure won’t let you see other DIPs inside the system.

Azure has rich diagnostics, but beware. Azure leverages the existing diagnostics built into their system, but has to get the data off box since instances are volatile. This means that problems can be hard to isolate, while excessive logging can impact performance and generate fees. Microsoft lets you target individual systems for elevated levels. You can also Terminal Server to a VM for troubleshooting (with caution).

The new Azure admin console rocks. Take your pick between Silverlight or MMC Snap-in.

Queues are essential, but tricky. Learn the meaning of idempotent because using queues requires you to handle failures and timeouts. The scary part is that it will work nicely until you exceed some limits and then you’ll experience cascading failure. Whee! Oh yea, and queues require polling (which stinks as a notification model).
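Here is a minimal sketch of what idempotent handling can look like when a queue redelivers a message after a timeout. It is plain Python, not the Azure queue API; `IdempotentConsumer` is a hypothetical class, and the in-memory set stands in for whatever durable store you would really use.

```python
class IdempotentConsumer:
    """Wrap a handler so a redelivered message is applied only once."""

    def __init__(self, handler):
        self._handler = handler
        # Assumption: an in-memory set stands in for durable storage.
        self._processed_ids = set()

    def handle(self, message_id: str, payload) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self._processed_ids:
            return False
        # If the handler raises, the ID is not recorded, so a retry runs it again.
        self._handler(payload)
        self._processed_ids.add(message_id)
        return True


if __name__ == "__main__":
    balance = {"total": 0}

    def apply_deposit(amount):
        balance["total"] += amount

    consumer = IdempotentConsumer(apply_deposit)
    # Message "m1" arrives twice, as it would after a visibility timeout.
    for message_id, amount in [("m1", 50), ("m2", 25), ("m1", 50)]:
        consumer.handle(message_id, amount)
    print(balance["total"])  # 75
```

Recording the ID only after the handler succeeds means a crash mid-handler leads to a retry instead of a lost message, which is why the handler itself should tolerate being re-run.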

SQL Azure is mostly just like MS SQL. Microsoft did a smart thing in keeping Cloud SQL highly compatible with Local SQL. The biggest note is that it is limited in partition size. If you embrace the size limits you will get better performance. So stop pushing BLOBs into databases and start sharding.

Duplicating data in tables will improve performance. This has to do with how partitions and keys operate but is an interesting architecture for NoSQL – stage data for use. Don’t be afraid to stage the same data in multiple ways. It may be faster/cheaper to write data twice if it becomes easier to find when you search it 1000s of times.
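A small sketch of staging the same data two ways: every order is written under both a customer partition and a date partition, so each query becomes a direct partition lookup. The dictionaries below only mimic a partition key / row key layout in memory; the table names and fields are hypothetical.

```python
from collections import defaultdict

# Two "tables" holding the same orders under different partition keys.
orders_by_customer = defaultdict(dict)  # partition: customer, row: order_id
orders_by_date = defaultdict(dict)      # partition: date, row: order_id


def save_order(order: dict) -> None:
    """Write the order twice so both lookups avoid a full scan."""
    orders_by_customer[order["customer"]][order["order_id"]] = order
    orders_by_date[order["date"]][order["order_id"]] = order


def orders_for_customer(customer: str) -> list:
    return list(orders_by_customer.get(customer, {}).values())


def orders_on_date(date: str) -> list:
    return list(orders_by_date.get(date, {}).values())


if __name__ == "__main__":
    save_order({"order_id": "o1", "customer": "acme", "date": "2010-11-01"})
    save_order({"order_id": "o2", "customer": "acme", "date": "2010-11-02"})
    save_order({"order_id": "o3", "customer": "zenith", "date": "2010-11-02"})
    print(len(orders_for_customer("acme")), len(orders_on_date("2010-11-02")))
```

The cost is that every write, update, and delete has to touch both copies, so this pays off when reads greatly outnumber writes.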

Table data can be “warmed up.” Storage has logic that makes frequently accessed items faster (sort of like a cache ;). If you can anticipate load spikes then you should warm the data just before the spike.

Storage billing is both amount and transactions. You can get burned on a small, but busy set of data. Note: you will pay even if you 404 a request.
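To see how a small but busy data set can dominate the bill, here is a back-of-the-envelope calculation. The prices are placeholders I picked for the example, not quoted Azure rates; plug in the current numbers from your provider.

```python
def monthly_storage_cost(stored_gb: float, transactions: int,
                         price_per_gb: float, price_per_10k_tx: float):
    """Return (capacity_cost, transaction_cost) for one month."""
    capacity_cost = stored_gb * price_per_gb
    transaction_cost = (transactions / 10_000) * price_per_10k_tx
    return capacity_cost, transaction_cost


if __name__ == "__main__":
    # Hypothetical rates: 0.15 per GB per month, 0.01 per 10,000 transactions.
    capacity, tx = monthly_storage_cost(
        stored_gb=1, transactions=500_000_000,
        price_per_gb=0.15, price_per_10k_tx=0.01,
    )
    print(f"capacity: {capacity:.2f}  transactions: {tx:.2f}")
```

With these placeholder rates, one gigabyte that is hit 500 million times a month costs thousands of times more in transactions than in capacity.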

Azure has a CDN. Leveraging Microsoft’s Content Delivery Network (CDN) will improve performance for your users with small, low latency, high request items. You need to change your URLs for those assets. Best practice is to use some versioning in the URI so that you can force changes. Remember, CDN is SLOWER for the first hit when the data is not in cache so avoid CDN for low volume assets!
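The versioning tip can be as simple as putting a version segment in the asset path, so that a new release produces a new URL and bypasses the cached copy. A minimal sketch follows; `cdn_url` and the example hostname are hypothetical.

```python
def cdn_url(cdn_base: str, asset_path: str, version: str) -> str:
    """Build a CDN URL with the version embedded in the path (hypothetical helper)."""
    return f"{cdn_base.rstrip('/')}/{version}/{asset_path.lstrip('/')}"


if __name__ == "__main__":
    # Bumping the version yields a new URL, so edge caches fetch the new file.
    print(cdn_url("https://cdn.example.com/", "/css/site.css", "v42"))
    # https://cdn.example.com/v42/css/site.css
```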

Provisioning time is not instant. Azure needs anywhere from 1-3 minutes to spin up a new instance of a role. Build this lag into your architecture and dynamic scale plans. New databases and partitions are fast.

The VM Role is maintained by YOU. Using the VM role is a handy shortcut, but has a long list of gotchas. Some of note: 1) the VM can be “reset” to the last VM image state that you uploaded, 2) you are responsible for VM OS upgrades and patches, 3) VMs must be clonable because they will operate in parallel.

Azure supports more than .NET. You can set up anything in a worker (and now VM) role, but there are nuances to doing this effectively. You really need to understand how Azure works and had better be ready to crack open Visual Studio for some things even if you’re writing in Java.

We hope this list helps you navigate Azure deployments. No matter what cloud you use, understanding Azure’s architecture will help you write better cloud scale applications.

I decided to do this based on today’s PDC 2010 announcement of the Microsoft Extra Small Instance. I wanted to compare and contrast Amazon’s EC2 Instances with Microsoft Azure’s Instances. I came across what I believe to be an error on Microsoft’s website (both Medium and Large Instances currently show as $0.48/CPU hour).

Below is a picture of the spreadsheet I put together. I will put a link to the XLS up later. I’m looking for corrections and input on improving this.

I decided to look at the Rails Rumble projects today and have found several that look very promising. Some could be applied to the cloud and some just solve some interesting problems. Below are some of my top picks:

StillAlive – Focuses on website monitoring, but at a new, deeper level, not just simple pinging

Monitaur – Looks at server statistics and tracks details of processes, disk, swap and more

Miss Monitor – (this is their summary, the project isn’t up yet) > Collect, aggregate and visualize static and dynamic data about your application and libraries. This includes for instance performance data collected from services like New Relic or code statistics like flog score and lines of code.
The goal is to provide an overview how your application develops over time and helping to identify connections between different kinds of data.

SaaS:

SaaS Modeller – (this is their summary, the project isn’t up yet) > Our app will help SaaS startups forecast the impact of pricing and conversion rates on their monthly recurring revenue.
It will automatically create a spreadsheet that projects their revenue for the next 2 years. Based on Ryan Carson’s SaaS model google spreadsheet.
It will factor in churn rates, lifetime profits from customers and other useful metrics to help founders find the right model.

Social Sofa – (this is their summary, the project isn’t up yet) > Social media collection, monitoring and discovery tool. Using PubSubHubbub, CouchDB and WebSockets/XMPP

StreamR – (this is their summary, the project isn’t up yet) > Streamr is a streaming web application that group most famous services of virtual life of companies and their employees (Github, Twitter, Facebook, Linkedin, etc.)

Presentations:

SPRKLR – Realtime feedback from your audience on how well you are presenting

SlideHub – (this is their summary, the project isn’t up yet) > A tool and a simple markdown-based language to create web presentations. No flash involved

Online Utility:

The Unpack-App – (this is their summary, the project isn’t up yet) > Aren’t you tired of all these zip and rar files polluting your inbox?
We have decided to save the planet removing all these packagings for you!