Some Experiments in OpenStack

As a little side project, I’ve been toying with OpenStack to replace ESXi on a couple of old servers I’ve got in a lab. Now that it’s up and running, I must say I’m enjoying the flexibility it brings. While the documentation is extensive and you can get a working OpenStack implementation by simply following the instructions on their website, there are a few gotchas that certainly caught me out.

The first of these is making use of trunk ports to connect your VMs to the physical world. While you can successfully map onto sub-interfaces created through the /etc/network/interfaces file, mapping onto the parent interface in order to get the native VLAN doesn’t work. Using tcpdump to monitor the traffic on the VM’s tap interface showed some of the 802.1q tagged traffic hitting the VM untagged-side. The fix for this was to tag every VLAN I’m interested in and explicitly tell the Linux Bridge agent to use the sub-interfaces.
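As a rough sketch of what ended up working — interface and physnet names here are made up for illustration, not taken from my actual config:

```
# /etc/network/interfaces -- one sub-interface per VLAN of interest
auto eth1.101
iface eth1.101 inet manual
    vlan-raw-device eth1

auto eth1.102
iface eth1.102 inet manual
    vlan-raw-device eth1

# /etc/neutron/plugins/ml2/linuxbridge_agent.ini -- map each
# provider network onto its sub-interface rather than eth1 itself
[linux_bridge]
physical_interface_mappings = physnet-vlan101:eth1.101,physnet-vlan102:eth1.102
```

With each physnet pinned to a sub-interface, the bridge only ever sees untagged frames for that VLAN.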

Another issue that can catch you out is firewalling. By default, ping and SSH from external networks are blocked by the security group the VMs are added to. To fix this, simply create a new security group with an appropriate ruleset for your environment.
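Using the same neutron client as elsewhere in this post, something along these lines opens up ICMP and SSH (the group name is just illustrative):

```shell
# create a permissive group for lab use
neutron security-group-create lab-access

# allow ICMP (ping) in from anywhere
neutron security-group-rule-create --direction ingress --protocol icmp lab-access

# allow SSH in from anywhere
neutron security-group-rule-create --direction ingress --protocol tcp \
    --port-range-min 22 --port-range-max 22 lab-access
```

VMs then need to be launched into (or moved into) that group for the rules to apply.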

However, the one issue that had me really stumped was causing problems with ARP. I’m making use of our existing DHCP infrastructure. You can do this by disabling DHCP on the subnet in OpenStack, but you will run into issues. Specifically, you can see the VM complete the DHCP DORA process successfully and get an IP address. Great, but ping fails.
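Turning the built-in DHCP off is a one-liner with the neutron client (the subnet name is a placeholder):

```shell
neutron subnet-update SubnetName --enable-dhcp=False
```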

Digging in further, you can capture the traffic on the VM’s tap interface and see it responding to ARP requests.

Doing the same on the physical interface will show the ARP responses going missing!

ARP, Request who-has 10.12.12.123 tell osnode2, length 46
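The two captures that showed this were along these lines — the tap and NIC names will of course differ on your nodes:

```shell
# on the compute node: watch ARP on the VM's tap interface
tcpdump -n -e -i tap3f5a2b1c-9d arp

# the same filter on the physical NIC the bridge uplinks through
tcpdump -n -e -i eth1 arp
```

Requests appear in both captures, but the VM’s replies only ever show up on the tap side.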

Turns out this is down to the port security feature in Neutron. Normally, it prevents VMs from spoofing addresses. In our case, as the OpenStack assigned address is not used (we’re using external DHCP), we need to disable the feature. To do that, simply run the following command for each network:

neutron net-update NetworkName --port-security-enabled=False
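If you’re on a newer release that ships the unified client, the equivalent should be:

```shell
openstack network set --disable-port-security NetworkName
```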

I’m not normally a fan of disabling security features. Even the idea of turning off SELinux rather than tuning it feels like a cop out to me. However, in this controlled case, I’m not overly concerned about disabling it. After all, this isn’t a public cloud.

That said, it also has another sizable side-effect: the firewalling that OpenStack provides has to be disabled for any instance on the network in question. That means removing those instances from any security groups, or they will fail to deploy successfully.
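For ports that already exist, the same cleanup looks something like this (the port UUID is a placeholder):

```shell
# strip all security groups from the port, then turn port security off
neutron port-update --no-security-groups PORT_ID
neutron port-update PORT_ID --port-security-enabled=False
```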

On a more positive networking note, VXLAN just works. So long as each compute node has it enabled, and you provide a large enough VNI range (think VLAN IDs), it’ll do its job for any virtual networks you might want to create.
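The relevant bits of the ML2 and Linux Bridge agent configs end up looking something like this — the VNI range and local IP here are examples, not a recommendation:

```ini
# /etc/neutron/plugins/ml2/ml2_conf.ini
[ml2]
type_drivers = flat,vlan,vxlan
tenant_network_types = vxlan

[ml2_type_vxlan]
vni_ranges = 1:1000

# /etc/neutron/plugins/ml2/linuxbridge_agent.ini, per compute node
[vxlan]
enable_vxlan = True
local_ip = 10.12.12.21
```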

And moving on from networking to storage, I did run into a real issue with Cinder for block storage. In my first attempt it was either really picky about LVM Volume Group (VG) names (I may have missed a change somewhere!) or didn’t like sharing the same VG the system is currently running/booted from. Sticking with the naming convention in the documentation and using a separate VG seemed to work much better.
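A dedicated VG using the documented name sidesteps both possibilities — the disk device below is an example:

```shell
# carve out a dedicated volume group on a spare disk
pvcreate /dev/sdb
vgcreate cinder-volumes /dev/sdb

# then point the LVM backend at it in /etc/cinder/cinder.conf:
#   [lvm]
#   volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
#   volume_group = cinder-volumes
```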

The lack of redundancy built into Cinder does concern me slightly. While I understand that you’d want to build this in at a lower level (i.e. making storage redundant with a SAN/replication arrangement), it would be something to seriously consider on a real-world deployment. Though in that case, I get the impression that the object storage approach is probably the better option.