Deploying Network Services in OpenStack [Part 1: Challenges]

OpenStack Neutron provides multi-tenant-aware configuration APIs for load balancer, firewall and VPN network services. The reference implementation renders these services on a per-tenant basis in the router namespace on the network node. A small subset of vendors have provided drivers that allow their devices to be configured through these common APIs. There are a number of challenges with deploying these services. As enterprises start deploying OpenStack in their production clouds, they find it hard and frustrating to offer rich, differentiated network services to their customers.

In the first part of this multi-part blog series, we state the challenges with the current state of network services and make the case for a solution that is complementary to the Neutron project. We would like to acknowledge that some of the challenges are outside the charter of the Neutron project and that a couple of other OpenStack projects are attempting to address some of them. However, we are looking for a comprehensive solution that addresses all of the following for enterprises and provides the base solution in open source.

Complex to deliver network services

Neutron uses a plugin, agent and driver model as its reference architecture for delivering network services. There are also variants of this model, such as Octavia for LBaaS. Operators struggle to put together a solution because they have to deal with the complexity of managing these different variants. The reference implementation uses namespaces on the network node, which quickly becomes a bottleneck and does not scale. The LBaaS community is making a sincere attempt to provide an operator-grade load balancer through Octavia in the Mitaka release. However, we have yet to see wider adoption of the Octavia model by third-party vendors.

Limited vendor support

A limited number of vendors have integrated their drivers. One of the major drawbacks of the model is that it doesn't provide life cycle management for vendor devices. The term life cycle management broadly covers operations such as orchestration, licensing, configuration, upgrade and monitoring of service instances. Because of this limitation, these drivers make assumptions about life cycle events and expect that service instances have already been launched and licensed. The scope of the drivers is mainly limited to configuration.
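The gap between configuration and full life cycle management can be illustrated with a small sketch. It is not Neutron code: the State values, the ServiceInstance class and the configure_only_driver function are all hypothetical names chosen for illustration, and the linear launch, license, configure ordering is an assumption.

```python
from enum import Enum


class State(Enum):
    REQUESTED = "requested"
    LAUNCHED = "launched"
    LICENSED = "licensed"
    CONFIGURED = "configured"


# Allowed life cycle transitions (hypothetical, for illustration only).
NEXT_STATE = {
    State.REQUESTED: State.LAUNCHED,
    State.LAUNCHED: State.LICENSED,
    State.LICENSED: State.CONFIGURED,
}


class ServiceInstance:
    """Hypothetical record for one vendor service instance."""

    def __init__(self, name):
        self.name = name
        self.state = State.REQUESTED

    def advance(self):
        """Move to the next life cycle stage and return the new state."""
        if self.state not in NEXT_STATE:
            raise RuntimeError(f"{self.name} is already fully configured")
        self.state = NEXT_STATE[self.state]
        return self.state


def configure_only_driver(instance):
    """Mimic a configuration-only driver: it assumes launch and licensing
    were handled elsewhere and refuses to act otherwise."""
    if instance.state is not State.LICENSED:
        raise RuntimeError(
            f"{instance.name} is {instance.state.value}; "
            "the driver expects a launched and licensed instance"
        )
    return instance.advance()
```

In this sketch, something other than the driver has to call advance() twice before configure_only_driver succeeds. That missing "something" is the life cycle manager the current model does not provide.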

Lack of rich network service models

Neutron supports the L3 insertion model for load balancers and the router insertion model for VPN and firewalls. However, security-as-a-service use cases need richer service models such as transparent (IPS) and monitor (IDS) services. There is also a lot of interest in, and real need for, service chains in the community. However, Neutron SFC, which is expected to be released in Mitaka, supports port chaining but leaves out the notion of plumbing network services.

API limitations and lack of differentiating flavors

Network service APIs are limited and mostly provide basic functionality. They also do not give vendors a standard way to expose their differentiated offerings to customers. This lack of richness in the APIs takes away the advantage vendors have over open source alternatives. There was an attempt to enable this differentiation using flavors in Neutron, but flavors don't bring out all the unique capabilities of a vendor service.

Operational assurances

Network service assurance guarantees continuous availability as agreed in the operational contract. This includes features such as monitoring, high availability, auto scaling and self healing. Neutron doesn't provide such SLAs, which is a major roadblock to going into production. Additionally, different vendors provide different ways to enable this functionality. An abstraction is required to deliver the same consistent solution across all supported vendor and open source network services.
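As a rough illustration of what a vendor-neutral assurance abstraction might compute, the sketch below derives self-healing and scaling actions from a desired instance count and a health report. The reconcile function, the action names and the shape of the health mapping are assumptions made for this example and do not correspond to any existing Neutron API.

```python
def reconcile(desired, health):
    """Return the actions needed to reach `desired` healthy instances.

    `health` maps a (hypothetical) instance name to True if the instance
    is healthy and False if it has failed.
    """
    # Self healing: every failed instance is replaced in place.
    actions = [("replace", name) for name, ok in sorted(health.items()) if not ok]

    # Auto scaling: once replacements are done, the instance count equals
    # len(health), so only the difference to `desired` remains.
    current = len(health)
    if current < desired:
        actions.append(("scale_out", desired - current))
    elif current > desired:
        actions.append(("scale_in", current - desired))
    return actions
```

The point of the sketch is that the decision logic is independent of the vendor. Only the code that carries out a "replace" or "scale_out" action would need to be vendor specific.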

Limited visibility and analytics

Neutron provides limited metrics, such as bandwidth and connections, for network services. These metrics don't provide further insight into traffic and usage. Such advanced analytics would help operators take corrective action on their load balancer or firewall policies. Operators also need to correlate logs and metrics to gain further insight into the operation of a service and to troubleshoot it.

Lack of higher level policy

Neutron APIs are still complex for end users. End users who deploy applications in the cloud have to understand the complexities of deploying networks and network services. A policy is needed to specify intent as a higher-level abstraction, and that policy should render the resources required to realize the intent. Policy is also needed to enable a higher level of automation by deriving context and adapting to changes in the environment.
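A minimal sketch of the idea, assuming a very small intent vocabulary: the user states what the application needs, and a renderer expands that into the resources to create. The Intent class, the render function and the resource labels are hypothetical; they are illustrative strings, not Neutron API objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """Hypothetical user-facing intent: what the application needs."""
    app: str
    load_balanced: bool = False
    firewalled: bool = False


def render(intent):
    """Expand an intent into the ordered list of resources to create.

    The returned labels are illustrative only, not Neutron API objects.
    """
    resources = [f"network:{intent.app}", f"subnet:{intent.app}"]
    if intent.firewalled:
        resources.append(f"firewall:{intent.app}")
    if intent.load_balanced:
        resources.append(f"loadbalancer:{intent.app}")
    return resources
```

Here the end user only writes something like Intent("web", load_balanced=True) and never has to name a network or subnet explicitly. A real policy layer would also need to handle updates and environmental changes, which this sketch leaves out.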

Scale and deployment challenges

As discussed above, life cycle management goes beyond orchestration. Aspects such as configuration, assurance and visibility need to scale with the number of network service instances. The standard OpenStack model of running these management and control services on the OpenStack infrastructure management plane (under the cloud) will not scale. Network service instances run over the cloud (in the tenant plane) on virtual networks, so reaching them from under the cloud for configuration, monitoring and similar tasks requires special routing mechanisms such as floating IPs. This creates a number of challenges for operators. The model should also provide the flexibility of running separate network service controllers, for example one per domain. Therefore, we suggest that a combination of under-the-cloud and over-the-cloud deployment models provides enough flexibility and scale.

Summary

In this blog we analyzed the state of network services in OpenStack and identified the challenges. In subsequent blogs we will propose a new framework, as an OpenStack project complementary to the Neutron project, that addresses the challenges above and provides the base model to enable rich, differentiated offerings.