
Over 20 years ago, I started my professional career at GE, working as a Systems Engineer for a large bank and on several public sector assignments. After a few years, I moved to Sprint. Several years later, I founded my first startup, which was sold to Quest Software (which in turn was bought by Dell). After six months of consulting for a grid computing company (remember those?), I started another company, which would ultimately be bought by SolarWinds. I then moved on to Dell's DCS (hyperscale compute) group for about nine months. It was there that I first wrote about Data Gravity. I also discovered a project VMware was working on called Project Maple.

Project Maple was later renamed Cloud Foundry. Blogging about this discovery led to my recruitment by Jerry Chen to join the Cloud Foundry team. Working with the Cloud Foundry team in the early days was surreal, to say the least. Eventually I was recruited away to Warner Music Group, where I became SVP of Engineering, working for Jonathan Murray. At WMG, we built a first-of-its-kind software factory by leveraging Cloud Foundry OSS, which increased application delivery speed by an order of magnitude.

Just after the first version of the factory shipped, I was contacted by Adam Wray, asking if I was interested in joining him at Basho as part of a new funding round with a new investor. This sounded like a great opportunity to experience joining a NoSQL startup. After leaving Basho at the end of June, I found myself at a crossroads. Much like Nick Weaver, who recently announced he had returned to EMC, I have returned to GE.

I have joined the GE Digital business as VP of Software Engineering, working as part of the Wise.io team on machine learning for IIoT.

I see something here that goes beyond the other opportunities I considered. The most compelling reason is being able to have a profound effect on an incredibly large and diverse number of businesses, and therefore to affect a disproportionately large number of people's lives in very positive ways. GE Digital's Predix Platform directly supports the IoT efforts of all of the GE business units, including the Oil & Gas, Energy, Aviation, Health, Power, and Transportation divisions, to name a few. The Wise.io team amplifies the benefits and discoveries made by looking at all of this IoT data and applying machine learning to it. The work that GE Digital is doing with Predix and the Industrial Internet of Things is truly game- and life-changing.

The Wise.io team itself is made up of some of the smartest people I have ever worked with, Josh Bloom being a prime example. All of them are humble and kind, yet wickedly smart. They have created a unique culture of diversity, happiness, positivity, humility, respect, and openness, all in a highly professional, productive environment that avoids unnecessary meetings. The Wise.io team is truly incredible, and I look forward to learning and growing with them.

Why me?

The unique challenges the Wise.io team faces are familiar to me: How do you grow/build a “startup” inside a large company, and how do you grow that team to scale? What are the processes needed to achieve this? What does the reporting structure look like? Where do you find talent? How do you bridge the large company with the startup inside?

Some of the technical challenges they face also line up with my experience. How do you run a PaaS at scale? How do you run a PaaS on a PaaS? How do you go about building and operating a software factory? These are some of the reasons this unique challenge and team were so attractive and such a great fit.

If you are looking for a great engineering or data science position, either live within commuting distance of San Francisco or would consider relocating, and believe that machine learning and IoT are the future, please send me a DM on Twitter (@mccrory).

This is Part Four in the series “The Future of Networks”, to start at the beginning go here.

There are many possibilities as to where the previously mentioned trends drive networking. As I continue to interact with the community and evolve my thoughts about the Future of Networks, I am going to cover more of this. I am also going to continue this series past the fifth planned post, although the original fifth and final post will still go out tomorrow as planned.

Networks are trending toward commodity switching hardware; this could be further embraced to follow a more programmable paradigm. The hardware could be open sourced along with the network operating system (some switches already run Linux or derivatives). If things go in this direction, there is an opportunity to leverage a large number of OSS projects and capabilities to rapidly evolve both the network and network devices. By building on open technology and placing some capabilities/services/workloads inside the switch itself, networks gain access to new communities. These communities could be leveraged to further advance the move from running everything server side to moving the right components into the network, and everyone would benefit.

Imagine a commodity network device with virtual interfaces that work with containers and VMs, and that can also support straight code execution. The containers/VMs/code run on CPU and memory in the network device itself. The device is programmable for traditional Layer 2-4 tasks and can also route and manipulate data at Layer 7. This enables both the direct-attached scenarios and the traditional switching scenarios described in yesterday’s post. The Layer 7 capabilities could also run in virtual switches or proxies, allowing ubiquitous services across any environment.
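To make the idea concrete, here is a toy sketch (not a real switch API; all names are made up for illustration) of a single rule table that mixes a traditional Layer 4 port match with a Layer 7 HTTP path match, the way such a device might route traffic:

```python
# Illustrative sketch of a combined L4/L7 rule table on a hypothetical
# programmable switch. First matching rule wins; unmatched traffic goes
# to a default uplink.

def route(packet, rules):
    """Return the target for the first rule the packet matches."""
    for rule in rules:
        if rule.get("dst_port") not in (None, packet["dst_port"]):
            continue  # L4 check: destination port must match if specified
        if rule.get("path_prefix") and not packet.get("path", "").startswith(rule["path_prefix"]):
            continue  # L7 check: HTTP path prefix must match if specified
        return rule["target"]
    return "default-uplink"

rules = [
    {"dst_port": 80, "path_prefix": "/api/", "target": "api-container"},  # L7 rule
    {"dst_port": 80, "target": "web-vm"},                                 # L4 rule
]

print(route({"dst_port": 80, "path": "/api/users"}, rules))   # api-container
print(route({"dst_port": 80, "path": "/index.html"}, rules))  # web-vm
print(route({"dst_port": 22}, rules))                         # default-uplink
```

The point is simply that one table, evaluated in the device, can steer traffic to a container running on the switch or to a downstream VM.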

Speaking of proxies and where the future of networking is headed, I’m currently at Gluecon, where the Istio project was announced yesterday. For those that missed it, Istio incorporates the work Lyft did on Envoy along with several new control plane components, operationalizing microservices. What is meant by operationalizing microservices? Specifically, enabling systems like Kubernetes to have selective traffic management and request routing, allowing canary deployments and A/B testing. It also includes service discovery and registration, along with authentication, policies, and more. The Envoy component acts as a sidecar/transparent proxy to each running Kubernetes instance, providing all of the network coordination and capabilities. The Istio project plans to support several other technologies besides Kubernetes, including Cloud Foundry and Mesos. It is also worth noting that Istio has been running inside Google for quite a while now, and Envoy runs in production at Lyft, serving over 2M requests per second.
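The canary-style traffic splitting described above boils down to weighted routing across service versions. Here is a minimal, deterministic sketch of that idea in plain Python (the real mechanism is Envoy's weighted routing as configured by Istio; the function names here are invented for the sketch):

```python
# Toy weighted traffic splitter: send 90% of requests to v1 and 10% to the
# canary v2, the essence of a canary deployment.
import itertools

def make_splitter(weights):
    """Spread requests across versions according to integer percentages.

    weights: dict of version -> percentage (must sum to 100).
    """
    # Expand {"v1": 90, "v2": 10} into a repeating 100-slot schedule.
    schedule = [v for v, w in weights.items() for _ in range(w)]
    counter = itertools.cycle(schedule)
    return lambda request: next(counter)

split = make_splitter({"v1": 90, "v2": 10})
sent = [split(f"req-{i}") for i in range(100)]
print(sent.count("v1"), sent.count("v2"))  # 90 10
```

A real mesh makes this decision in the sidecar proxy on every request, so the application code never changes when the weights do.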

How would our future network device work with something like Istio? Istio is a service mesh in which the Envoy sidecar provides ingress and egress traffic management. Envoy is controlled by the Istio-Manager, which would be ideal to run in our new network switch device, as would the Mixer. Dependent services that are either critical or heavily used could also run on the switch, potentially moving from a container on the server side to a container on the network switch side. The service registry and policies seem like great candidates to run here as well. The new network switch is critical to the system functioning, so why not put the critical services there? If the network is down, none of the services will function anyway.

This may all seem far away, but there are many indicators that it could be done fairly quickly. The next post lists some of the things that already exist to support the scenarios and thoughts from the previous posts.

Imagine for a moment that instead of treating each server (or container, or VM) as if it were one or more network hops away, the server was directly attached to the services and other servers it needs to interact with most. This would increase efficiency and would solve quite a few consistency and consensus issues that exist today when dealing with modern infrastructure. If each network switch and router is already a specialized computer, why not further empower these devices to solve many of the problems we struggle with in infrastructure today? Quite a few specialized network appliances, such as load balancers, firewalls, and storage appliances, emerged in the late 90s and early 00s. These followed a path of mixing the network with servers providing some specialized function or functions.

The demand for ever faster evolution of infrastructure capabilities and scale, and the fact that the network is treated as a dumb pipe or simply plumbing between nodes, all contribute to the friction of trying to have a dynamic infrastructure. This dumb-pipe approach forces all complexity into the software on server nodes, which makes distributed systems and solutions vastly more complex than necessary. Trying to create overlays and abstractions just increases complexity and impairs performance as a tradeoff for flexibility.

If you accept that switches and routers are computers purpose built for pushing packets, why not consider adding additional capabilities inside these computers? I know there will be naysayers who think this is a bad idea, and there will be cases where some of these things may very well be a bad idea. However, there is definitely a class of problems where this would be a far more elegant solution than what we have now. Simplifying away some complexity while vastly improving performance and scalability seems pretty attractive. A message bus/queue comes to mind as a great example that could deliver incredibly low latencies by being tied into the network device itself.

This approach would also leverage the power of persistent connections: if a server is attached to two “direct attached switches/network devices” (for redundancy) and either side goes down, the other knows. This is an over-simplification that I ask you to overlook for now, as I realize there are exceptions and edge cases. Generally, when a physical server goes down, its network interface(s) go down, which would immediately alert any service running on the switch that the node is down. There could also be a socket or other specialized service running in the switch watching heartbeats, packet flow, and buffers, and communicating with a server-side service, perhaps via an optimized coordination protocol. This would work well for local clusters, messaging, and similar scenarios. Your cluster needs consensus? No problem if you are only dealing with the consensus service running on the switch or router. Even doing cluster consensus across switches and routers would be superior.
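The failure-detection advantage above can be put in rough numbers. A timeout-based heartbeat detector has to wait out several missed beats before declaring a peer dead, while a directly attached switch learns of the failure from the link state almost instantly. The sketch below is illustrative only; the intervals and latencies are made-up placeholders:

```python
# Illustrative comparison of failure-detection time: server-side heartbeat
# timeouts versus a switch observing the physical link drop.

def heartbeat_detection_time(interval_s, missed_beats_allowed):
    """Worst-case time for a timeout-based detector to declare a peer dead."""
    return interval_s * (missed_beats_allowed + 1)

def link_down_detection_time(interrupt_latency_s=0.001):
    """A directly attached switch is signaled by the link state itself."""
    return interrupt_latency_s

print(heartbeat_detection_time(1.0, 3))  # 4.0 s: a typical heartbeat scheme
print(link_down_detection_time())        # 0.001 s: link-state signal
```

Three orders of magnitude is the kind of gap that makes in-switch coordination services attractive for consensus and cluster membership.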

Now think about vendors providing prewritten services, available to run on switches and routers, that can be connected together with serverless functions. This provides a much more flexible and powerful model of “wiring up” infrastructure. Have a function that needs to be more performant, where adding additional instances won’t help? Maybe it should be deployed on an FPGA, or perhaps a special ASIC is added to the switch. I realize that ASICs aren’t as portable or flexible, but perhaps there are common functions where it still makes sense to add them (serialization/deserialization of a protocol, for example). Perhaps these FPGAs and ASICs could even be admin-insertable via cards or another modular interface. Or perhaps the switch itself is simply more programmable, but adds special-function services running on general processors.

There is also an opportunity to enable more sophisticated systems that combine FPGAs, ASICs (TPUs, anyone?), GPUs, and CPUs to handle a greater and broader range of problems. These systems would act and be closer to current big-iron routers, but they would offer a much wider array of capabilities than what is available today. This might be an appropriate place to run a container holding a critical piece of OSS infrastructure code, for example.

[Above is a Mockup of a Switch with FPGAs, Pluggable ASICs, and CPUs]

I have only scratched the surface of this concept in this post; come back tomorrow to read part 4, where we will explore other possibilities.

This post is the second in a series: To read them in order, start here.

There are three trends that I have seen happening over the past few years that are directly or indirectly influencing the future of networks. The first is the use of FPGAs and ASICs to solve problems around distributed applications and network loads (specifically, I’m thinking of Bing/Azure, and of several AI projects, including one that chose to go down the ASIC path: Google). The other path that is intriguing is PISA from Barefoot Networks, which seems to be on a positive trajectory.

This has been driven by the need for greater performance and better efficiency in the data center at extreme scale. These efficiencies also translate into very large amounts of capex saved. Using custom hardware in conjunction with software brings much better returns because of its purpose-built nature (which increases efficiency and/or performance). This could even be seen as the equivalent of VMware shipping VMware Workstation: that technology provided the foundation for the virtualization transformation of the last decade, all based on making more efficient use of the server hardware already sitting in datacenters. Yes, virtualization brought many, many other benefits, but the story that initially took hold was better utilization of hardware.

Another trend has been the explosion of infrastructure technologies and projects, most of which are OSS. These projects include a wide array of technological building blocks contributed by many of the same large cloud companies mentioned earlier. In addition, many other large software companies and some fast-growing small companies, like HashiCorp, have also developed and released infrastructure technologies (Infrastructure as Code). Technologies such as:

Network gear has stagnated so much that pure software infrastructure has taken over a large swath of functionality that should live in software on network equipment. This isn’t a knock against any of the projects above; they are all quite innovative and are answering an obvious unmet need. But it is another key indicator that we need to change the nature of the network and how it integrates with what runs on it.

The final trend is that of microservices and serverless (aka Lambda/Azure Functions/Cloud Functions), which is growing in popularity very quickly, and with good reason. What is intriguing about the technology is that you can write very small amounts of logic/code to interconnect the equivalent of Lego blocks of services and infrastructure functionality. It carries the same risk as microservices, in that there is some complexity to be managed (especially if the implementation is not well thought out upfront). However, from what we have seen so far, the benefits in speed of development, scalability, and delivery outweigh the costs in most cases.
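The “Lego blocks” idea can be shown in miniature: tiny single-purpose functions, wired together with a few lines of glue, the way serverless handlers chain managed services. Everything below is a stand-in written for illustration, not a real cloud API:

```python
# Sketch of serverless-style composition: small functions chained into a
# pipeline, each doing one job. The "services" are local stand-ins.

def validate(event):
    if "user" not in event:
        raise ValueError("missing user")
    return event

def enrich(event):
    return {**event, "region": "us-west"}  # stand-in for a lookup service

def store(event):
    return f"stored record for {event['user']} in {event['region']}"

def pipeline(event, steps):
    for step in steps:
        event = step(event)
    return event

print(pipeline({"user": "alice"}, [validate, enrich, store]))
# stored record for alice in us-west
```

In a real serverless platform the glue is the event routing between functions rather than a loop, but the development model, small blocks plus minimal wiring, is the same.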

So how does all this involve the future of networks? Read Part 3 of this series tomorrow to find out what the future of networks is pointing to.

For the past few months, I have been considering how networking technology will (and in some cases should) evolve over the next 3–7 years. Networking technology has stagnated from the breakneck pace it once moved at. During the 90s we saw amazing boosts in capabilities, evolving protocols, performance, and overall incredible levels of innovation. Fast forward to 2017, and there has been no real evolution in the network space overall. Sure, there are useful technologies like network virtualization, SD-WAN, and NFV, which give a lot of flexibility and capability to virtual machines and their hypervisors, along with container technologies such as Docker and Kubernetes. However, the capabilities of the network remain largely untapped, in my opinion.

My initial thoughts on this began with a conversation I had with @jamesurqhart a few months ago, when he offhandedly said that all network switches/routers are really just computers. I had never looked at a network switch or router as a computer; I had always viewed it as another specialized piece of hardware. The reality is that routers and switches are computers, in fact purpose-built computers with a number of ASICs in them. Considering this further, I concluded that modern switching gear is becoming more like a commoditized component, partially because large cloud providers such as Amazon, Microsoft, Facebook, and Google are all creating their own switches and, in some cases, using their own ASICs.

[Above is a diagram of the Packet Flow on a Juniper Networks T Series Core Router] Notice all the ASICs...

Switches have become a commodity today, and routers (I mean big routers) are really more like HPC devices. Big-iron routers are jam-packed with lots and lots of capabilities, and many have large numbers of configurations available depending on the demands of the particular network they support. Much like mainframes with specialized components, these systems are NOT becoming commoditized anytime soon; they are simply too specialized and complex. We have elite cloud companies that do their own ASIC design and can invest in switches in large quantities, resulting in large capex savings. These same companies aren’t developing any HPC/core routers, as their deployments of those are too few to justify the development expense, at least not currently.

As we move into the future, the trends of networking infrastructure point to continued commoditization. What will and needs to change is the view that switches exist purely to push packets. As you will see in this series of blog posts, there is a huge opportunity for network vendors and others to capitalize on several industry trends to create something beyond a traditional network.

Read part 2 of the series tomorrow to find out the additional trends that are affecting the Future of Networks.

As I continue my journey through Cloud Computing, Big Data, Networking, DevOps and more, I have decided to make another change. Today is my first day at Warner Music Group, where I will be assuming the role of SVP of DSP Engineering (aka SVP of Platform Engineering).

There may be speculation as to why I left VMware and Cloud Foundry; I would like to put that to rest here. I believe in both VMware and Cloud Foundry and think they have bright futures. For me personally, I needed a change from a vendor role to a customer role (which will also keep me in the Cloud Foundry ecosystem). The timing has absolutely nothing to do with Paul Maritz’s departure (which was coincidental).

I will be at VMworld presenting and participating (so feel free to connect with me there – Twitter is best for this).
I expect great things from VMware and especially the Cloud Foundry team over the next couple of years.
Meanwhile, at WMG, I expect to learn and do a great many new and exciting things that I will talk about more over time. I’m looking for a few very talented senior developers/leads to work with me; if you know someone or are interested, feel free to send me a resume or ping me on Twitter.

Having covered Data Gravity several times on this blog, I thought that it would be time to cover a derivative topic: Artificial Data Gravity.

Recall that Data Gravity is the attractive force created by Data amassing and the needs of Apps and Services to leverage low latency and high bandwidth.

Artificial Data Gravity is the creation of attractive forces through indirect or outside influence. This could take forms such as costs, throttling, specialization, legislation, or usage. Below I will walk through examples of public clouds creating, exerting, and leveraging Artificial Data Gravity.

Costs: The fact that AWS S3 and Windows Azure offer free, unlimited inbound transfer is a great example of artificially encouraging data to amass internally. By making it free to put more data inside S3 or Azure, they encourage Data Gravity patterns through artificial means.

Throttling: The Twitter API, with its well-known limit of 350 calls per hour, is a great example. This makes it nearly impossible to replicate the traffic on Twitter without special (and very expensive) agreements in place.

Specialization: Specialized services such as DynamoDB not only encourage Data Gravity through transfer pricing, but also encourage a low-write, high-read pattern based on a 1:5 write-to-read price ratio. Not only are you unlikely to ever leave DynamoDB, you are also encouraged to write code that is as write-efficient as possible due to costs.

Legislative: There are many laws that restrict the location of data and govern its security and use. These are not technical or physics-related, but artificial means of influencing Data Gravity, as mentioned in this GigaOM piece covering the law dictating Data Gravity.

Usage: Dropbox charges each individual user for the use of shared data (artificial usage). This means each person pays for the data consuming their storage quota, even though Dropbox stores only a single copy and points all authorized users to it.
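The Specialization example above is worth putting into rough numbers. With writes priced at five times reads, shaving writes dominates the bill, which is exactly the behavioral pull described. The unit prices below are made-up placeholders, not actual DynamoDB rates:

```python
# Illustrative effect of a 1:5 write/read price ratio on workload design.

READ_UNIT_PRICE = 0.0001                 # hypothetical price per read unit
WRITE_UNIT_PRICE = READ_UNIT_PRICE * 5   # writes cost 5x reads

def hourly_cost(reads, writes):
    return reads * READ_UNIT_PRICE + writes * WRITE_UNIT_PRICE

chatty = hourly_cost(reads=10_000, writes=10_000)  # write-heavy workload
lean = hourly_cost(reads=10_000, writes=2_000)     # same reads, batched writes

print(round(chatty, 2), round(lean, 2))  # 6.0 2.0
```

Cutting writes by 80% cuts the bill by two-thirds here, while cutting reads by the same fraction would save far less; that asymmetry is the artificial pull toward write-efficient code.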

There are certainly other forms of Artificial Data Gravity not listed in the examples above; if you can think of a concrete example, please comment.

One last note: I’m not saying there is anything particularly wrong with Artificial Data Gravity; however, it is something to be aware of, as it is one of the behaviors/motivations exhibited by Data Gravity as a whole.