Comparing OpenStack Neutron ML2+OVS and OVN – Control Plane

We have done a lot of performance testing of OVN over time, but one major thing missing has been an apples-to-apples comparison with the current OVS-based OpenStack Neutron backend (ML2+OVS). I’ve been working with a group of people to compare the two OpenStack Neutron backends. This is the first piece of those results: the control plane. Later posts will discuss data plane performance.

Control Plane Differences

The ML2+OVS control plane follows a pattern seen throughout OpenStack: a series of agents written in Python. The Neutron server communicates with these agents using an RPC mechanism built on top of AMQP (RabbitMQ in most deployments, including our tests).

OVN takes a distributed, database-driven approach. Configuration and state are managed through two databases: the OVN northbound and southbound databases. These databases are currently based on OVSDB. Instead of receiving updates via RPC, components watch the relevant portions of the database for changes and apply them locally. More detail about these components can be found in my post about the first release of OVN, and even more in the ovn-architecture document.

OVN does not make use of any of the Neutron agents. Instead, all required functionality is implemented by ovn-controller and OVS flows. This includes things like security groups, DHCP, L3 routing, and NAT.
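To make the database-driven pattern described above more concrete, here is a toy sketch in Python. It is not OVSDB or ovn-controller code: `ToyDatabase`, `ToyController`, and the table names are hypothetical names chosen for illustration. It only shows the general idea that a component registers interest in part of a database, receives the current contents, and then applies later changes locally, so a component that starts late still converges on the same state.

```python
from collections import defaultdict


class ToyDatabase:
    """In-memory stand-in for a watchable database (hypothetical, not OVSDB)."""

    def __init__(self):
        self.tables = defaultdict(dict)    # table name -> {row key: row}
        self.watchers = defaultdict(list)  # table name -> callbacks

    def watch(self, table, callback):
        """Register interest in one table and replay its current rows."""
        self.watchers[table].append(callback)
        for key, row in self.tables[table].items():
            callback(key, row)

    def write(self, table, key, row):
        """Store a row, then notify only the watchers of that table."""
        self.tables[table][key] = row
        for callback in self.watchers[table]:
            callback(key, row)


class ToyController:
    """Stand-in for a per-host component that applies changes locally."""

    def __init__(self, name, db):
        self.name = name
        self.applied = {}
        db.watch("port_binding", self.on_change)

    def on_change(self, key, row):
        # A real controller would turn the row into local flows here.
        self.applied[key] = row


db = ToyDatabase()
db.write("port_binding", "port-1", {"chassis": "host-a"})

early = ToyController("host-a", db)  # sees port-1 through the initial replay
db.write("port_binding", "port-2", {"chassis": "host-b"})
db.write("logical_switch", "net-1", {"name": "demo"})  # not watched, ignored

late = ToyController("host-b", db)   # started late, still converges

print(early.applied == late.applied)  # True
print(sorted(late.applied))           # ['port-1', 'port-2']
```

With an RPC-style design, the sender has to deliver each update to each agent. In this model the database is the single source of truth and each watcher pulls what it cares about.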

Hardware and Software

Our testing was done in a lab using 13 machines which were allocated to the following functions:

Test Configuration

The tests were run using OpenStack Rally. We used the Browbeat project to easily set up, configure, and run the tests, as well as store, analyze, and compare results. The Rally portion of the Browbeat configuration was:

This configuration defines several scenarios to run. Each one is set to run 500 times, at three different concurrency levels. Finally, “rerun: 3” at the beginning says we run the entire configuration 3 times. This is a bit confusing, so let’s look at one example.

The “netcreate-boot” scenario creates a network and boots a VM on that network. The configuration results in the following execution:

Run 1
- Create 500 VMs, each on their own network, 8 at a time, and then clean up
- Create 500 VMs, each on their own network, 16 at a time, and then clean up
- Create 500 VMs, each on their own network, 32 at a time, and then clean up

Run 2
- Create 500 VMs, each on their own network, 8 at a time, and then clean up
- Create 500 VMs, each on their own network, 16 at a time, and then clean up
- Create 500 VMs, each on their own network, 32 at a time, and then clean up

Run 3
- Create 500 VMs, each on their own network, 8 at a time, and then clean up
- Create 500 VMs, each on their own network, 16 at a time, and then clean up
- Create 500 VMs, each on their own network, 32 at a time, and then clean up

In total, we will have created 4500 VMs.
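The arithmetic behind that total can be checked with a few lines of Python. This is only a sketch of the execution order described above, not Rally or Browbeat code. The constant and function names are my own, and the values (3 reruns, concurrency levels of 8, 16, and 32, 500 iterations) are the ones stated in this post.

```python
from itertools import product

RERUN = 3                       # "rerun: 3" in the configuration
CONCURRENCY_LEVELS = [8, 16, 32]
TIMES = 500                     # iterations per scenario per concurrency level


def execution_plan(rerun, concurrency_levels, times):
    """Yield (run number, concurrency, iterations) in execution order."""
    for run, concurrency in product(range(1, rerun + 1), concurrency_levels):
        yield run, concurrency, times


plan = list(execution_plan(RERUN, CONCURRENCY_LEVELS, TIMES))
for run, concurrency, times in plan:
    print(f"Run {run}: create {times} VMs, {concurrency} at a time, then clean up")

total_vms = sum(times for _, _, times in plan)
print(total_vms)  # 4500
```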

Results

Browbeat can store all Rally test results in Elasticsearch and then display them using Kibana. A live dashboard of these results is on elk.browbeatproject.org.

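The following tables show the average, 95th percentile, maximum, and minimum times for all APIs executed throughout the test scenarios. Times are in seconds, as reported by Rally. A positive % improvement means OVN was faster than ML2+OVS; a negative value means it was slower.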

| API | ML2+OVS Average | OVN Average | % improvement |
|---|---|---|---|
| nova.boot_server | 80.672 | 23.45 | 70.93% |
| neutron.list_ports | 6.296 | 6.478 | -2.89% |
| neutron.list_subnets | 5.129 | 3.826 | 25.40% |
| neutron.add_interface_router | 4.156 | 3.509 | 15.57% |
| neutron.list_routers | 4.292 | 3.089 | 28.03% |
| neutron.list_networks | 2.596 | 2.628 | -1.23% |
| neutron.list_security_groups | 2.518 | 2.518 | 0.00% |
| neutron.remove_interface_router | 3.679 | 2.353 | 36.04% |
| neutron.create_port | 2.096 | 2.136 | -1.91% |
| neutron.create_subnet | 1.775 | 1.543 | 13.07% |
| neutron.delete_port | 1.592 | 1.517 | 4.71% |
| neutron.create_security_group | 1.287 | 1.372 | -6.60% |
| neutron.create_network | 1.352 | 1.285 | 4.96% |
| neutron.create_router | 1.181 | 0.845 | 28.45% |
| neutron.delete_security_group | 0.763 | 0.793 | -3.93% |

| API | ML2+OVS 95th percentile | OVN 95th percentile | % improvement |
|---|---|---|---|
| nova.boot_server | 163.2 | 35.336 | 78.35% |
| neutron.list_ports | 11.038 | 11.401 | -3.29% |
| neutron.list_subnets | 10.064 | 6.886 | 31.58% |
| neutron.add_interface_router | 7.908 | 6.367 | 19.49% |
| neutron.list_routers | 8.374 | 5.321 | 36.46% |
| neutron.list_networks | 5.343 | 5.171 | 3.22% |
| neutron.list_security_groups | 5.648 | 5.556 | 1.63% |
| neutron.remove_interface_router | 6.917 | 4.078 | 41.04% |
| neutron.create_port | 5.521 | 4.968 | 10.02% |
| neutron.create_subnet | 4.041 | 3.091 | 23.51% |
| neutron.delete_port | 2.865 | 2.598 | 9.32% |
| neutron.create_security_group | 3.245 | 3.547 | -9.31% |
| neutron.create_network | 3.089 | 2.917 | 5.57% |
| neutron.create_router | 2.893 | 1.92 | 33.63% |
| neutron.delete_security_group | 1.776 | 1.72 | 3.15% |

| API | ML2+OVS Maximum | OVN Maximum | % improvement |
|---|---|---|---|
| nova.boot_server | 221.877 | 47.827 | 78.44% |
| neutron.list_ports | 29.233 | 32.279 | -10.42% |
| neutron.list_subnets | 35.996 | 17.54 | 51.27% |
| neutron.add_interface_router | 29.591 | 22.951 | 22.44% |
| neutron.list_routers | 19.332 | 13.975 | 27.71% |
| neutron.list_networks | 12.516 | 13.765 | -9.98% |
| neutron.list_security_groups | 14.577 | 13.092 | 10.19% |
| neutron.remove_interface_router | 35.546 | 9.391 | 73.58% |
| neutron.create_port | 53.663 | 40.059 | 25.35% |
| neutron.create_subnet | 46.058 | 26.472 | 42.52% |
| neutron.delete_port | 5.121 | 5.149 | -0.55% |
| neutron.create_security_group | 14.243 | 13.206 | 7.28% |
| neutron.create_network | 32.804 | 32.566 | 0.73% |
| neutron.create_router | 14.594 | 6.452 | 55.79% |
| neutron.delete_security_group | 4.249 | 3.746 | 11.84% |

| API | ML2+OVS Minimum | OVN Minimum | % improvement |
|---|---|---|---|
| nova.boot_server | 18.665 | 3.761 | 79.85% |
| neutron.list_ports | 0.195 | 0.22 | -12.82% |
| neutron.list_subnets | 0.252 | 0.187 | 25.79% |
| neutron.add_interface_router | 1.698 | 1.556 | 8.36% |
| neutron.list_routers | 0.185 | 0.147 | 20.54% |
| neutron.list_networks | 0.21 | 0.174 | 17.14% |
| neutron.list_security_groups | 0.132 | 0.184 | -39.39% |
| neutron.remove_interface_router | 1.557 | 1.057 | 32.11% |
| neutron.create_port | 0.58 | 0.614 | -5.86% |
| neutron.create_subnet | 0.42 | 0.416 | 0.95% |
| neutron.delete_port | 0.464 | 0.46 | 0.86% |
| neutron.create_security_group | 0.081 | 0.094 | -16.05% |
| neutron.create_network | 0.113 | 0.179 | -58.41% |
| neutron.create_router | 0.077 | 0.053 | 31.17% |
| neutron.delete_security_group | 0.092 | 0.104 | -13.04% |
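For readers who want to reproduce the “% improvement” column, the numbers in the tables are consistent with a simple relative reduction: (ML2+OVS time - OVN time) / ML2+OVS time. The sketch below checks that against three rows of the averages table. The function name is my own, and the formula is inferred from the published values rather than taken from the Browbeat tooling.

```python
def percent_improvement(ml2_ovs, ovn):
    """Relative reduction in time going from ML2+OVS to OVN, in percent."""
    return (ml2_ovs - ovn) / ml2_ovs * 100


# (ML2+OVS average, OVN average), copied from the averages table.
averages = {
    "nova.boot_server": (80.672, 23.45),
    "neutron.list_ports": (6.296, 6.478),
    "neutron.create_router": (1.181, 0.845),
}

for api, (ml2_ovs, ovn) in averages.items():
    print(f"{api}: {percent_improvement(ml2_ovs, ovn):.2f}%")
# nova.boot_server: 70.93%
# neutron.list_ports: -2.89%
# neutron.create_router: 28.45%
```

A positive value means OVN took less time than ML2+OVS for that API; a negative value means it took more.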

Analysis

The most drastic difference in results is for “nova.boot_server”. This is also the one piece of these tests that actually measures the time it takes to provision the network, rather than just the time it takes to load Neutron with configuration.

When Nova boots a server, it blocks waiting for an event from Neutron indicating that a port is ready before it sets the server state to ACTIVE and powers on the VM. Both ML2+OVS and OVN implement this mechanism. Our test scenario measured the time it took for servers to become ACTIVE.
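The effect of this blocking wait on boot time can be illustrated with a toy model. This is not Nova or Neutron code: the function names, the timer that stands in for the backend, and the timings are all assumptions made for the example. In Nova, the notification in question is the network-vif-plugged external event. The sketch only shows that when the boot path blocks on a "port is ready" signal, the time to reach ACTIVE tracks how long the network backend takes to provision the port.

```python
import threading
import time


def boot_server(port_ready, timeout):
    """Block until the port is reported ready, then report the final state."""
    start = time.monotonic()
    if not port_ready.wait(timeout):
        # The notification never arrived in time.
        return "ERROR", time.monotonic() - start
    return "ACTIVE", time.monotonic() - start


def simulate(provision_seconds, timeout=1.0):
    """Boot one server while a fake backend takes provision_seconds."""
    port_ready = threading.Event()
    # The timer stands in for Neutron sending the "port is ready" event.
    notifier = threading.Timer(provision_seconds, port_ready.set)
    notifier.start()
    try:
        return boot_server(port_ready, timeout)
    finally:
        notifier.cancel()


slow_state, slow_time = simulate(provision_seconds=0.2)
fast_state, fast_time = simulate(provision_seconds=0.05)
print(slow_state, fast_state)  # ACTIVE ACTIVE
print(slow_time > fast_time)   # True
```

In this model, the only way to make the slow case faster without giving up the readiness guarantee is to make the backend report the port ready sooner.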

Further tests were done on ML2+OVS, and we were able to confirm that disabling this synchronization between Nova and Neutron brought the ML2+OVS results on par with the OVN results. This confirmed that the extra time was indeed spent waiting for Neutron to report that ports were ready.

To be clear, you should not disable this synchronization. The only reason you can disable it is that not all Neutron backends support it (ML2+OVS and OVN both do). It was put in place to avoid a race condition: it ensures that the network is actually ready for use before a VM is booted. The issue is how long Neutron takes to provision the network for use. Further analysis is needed to break down where Neutron (ML2+OVS) is spending most of its time in the provisioning process.