It's time to share some of my experiences, crazy ideas, tips and tricks !!!

Monday, March 13, 2017

vSphere Replication Traffic Isolation

Once again I was working on a vSphere Site Recovery Manager project using vSphere Replication as the replication technology, and I had to explain how to isolate replication traffic from other network traffic. Explaining it is never a problem, but I wondered how many people out there still have doubts about it, so this post is my way of trying to reach as many of you as I can.

But before we start, we should ask ourselves: why isolate the replication traffic?

Isolating the replication traffic can improve network performance in the data center because it is kept apart from other business-critical network traffic. You can then apply individual prioritization and QoS policies, or dedicate a physical uplink or an entire network to it. Monitoring and troubleshooting get easier because you know exactly what the traffic is for and where it flows, and security improves as well because the traffic types are not mixed. It’s all benefits ; )

The goal, again, is to isolate the replication traffic from other traffic.

Note: I’m intentionally hiding other services like vMotion or the VM networks to keep it simple, but imagine they are all running on their own segments.

Now let’s set up the environment properly.

ESXi Preparation:

vSphere Replication isolation works by sending and receiving the replication traffic through specific VMkernel adapters, as we will see below.

On each ESXi host, create 2 new VMkernel adapters. For each one, select the portgroup that corresponds to the VLAN ID of the replication segment at that site and configure an IP address for the adapter accordingly. (Don't forget to select the right service for each adapter.)

******* Updated information 07/12/2017 *******
I might be too conservative here by creating one VMkernel adapter for each traffic direction. While it allows more granular control, it also adds some complexity and takes more effort to set up.

The point is, if you want, you can create a single VMkernel adapter and enable both services on it.
*******

- One for outgoing traffic (vSphere Replication traffic)

- One for incoming traffic (vSphere Replication NFC traffic)

Note: We create both VMkernel adapters on each host so that replication can work in both directions, which means each host can be the source of a replication but also a destination.
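As a rough sketch of the adapter setup described above, the snippet below only builds the esxcli command lines you would run on a host; it does not connect to anything. The adapter names (vmk1, vmk2), the portgroup name "Replication" and the IP addresses are made-up examples, and I'm assuming a standard vSwitch portgroup (a distributed switch takes different options). The two tag names are the ones esxcli uses for the vSphere Replication and vSphere Replication NFC services.

```python
import shlex

# Tag names esxcli uses for the two replication services on a VMkernel adapter.
OUTGOING_TAG = "vSphereReplication"
INCOMING_TAG = "vSphereReplicationNFC"


def vmkernel_commands(vmk, portgroup, ip, netmask, tags):
    """Build the esxcli commands that create, address and tag one adapter."""
    pg = shlex.quote(portgroup)  # portgroup names may contain spaces
    commands = [
        f"esxcli network ip interface add --interface-name={vmk} --portgroup-name={pg}",
        f"esxcli network ip interface ipv4 set --interface-name={vmk} "
        f"--ipv4={ip} --netmask={netmask} --type=static",
    ]
    for tag in tags:
        commands.append(
            f"esxcli network ip interface tag add --interface-name={vmk} --tagname={tag}"
        )
    return commands


# Hypothetical values: adapter names, portgroup name and addresses are examples only.
two_adapters = (
    vmkernel_commands("vmk1", "Replication", "10.10.10.11", "255.255.255.0", [OUTGOING_TAG])
    + vmkernel_commands("vmk2", "Replication", "10.10.10.12", "255.255.255.0", [INCOMING_TAG])
)

# The simpler variant: one adapter with both services enabled.
single_adapter = vmkernel_commands(
    "vmk1", "Replication", "10.10.10.11", "255.255.255.0", [OUTGOING_TAG, INCOMING_TAG]
)

for line in two_adapters:
    print(line)
```

The single_adapter list is the "one VMkernel, both services" option; it is the same commands with two tags on one adapter.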

If you remember the ESXi TCP/IP stacks, there is no dedicated stack for replication, so the host would use the default gateway (on the management interface) to send replication traffic to a routed segment, and that’s not our goal.

In this case, we must add a static route to each and every ESXi host, telling it to reach the replication segment on the other site through the new VMkernel interface. KB2001426 is a very nice KB on how to add static routes to ESXi hosts.
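To make the routing argument concrete, here is a small sketch under assumed addressing (192.168.1.0/24 for management, 10.10.10.0/24 for the local replication segment, 10.20.20.0/24 for the replication segment on the other site; none of these come from a real environment). The next_hop helper is a simplified longest-prefix lookup, not how ESXi implements routing, and esxi_static_route only builds the esxcli command in the form shown in the KB.

```python
import ipaddress


def next_hop(routes, destination):
    """Return the label of the most specific route that matches destination."""
    dest = ipaddress.ip_address(destination)
    matching = [
        (ipaddress.ip_network(network), label)
        for network, label in routes
        if dest in ipaddress.ip_network(network)
    ]
    return max(matching, key=lambda item: item[0].prefixlen)[1]


def esxi_static_route(remote_network, gateway):
    """Build the esxcli command that adds an IPv4 static route."""
    network = ipaddress.ip_network(remote_network)  # rejects malformed prefixes
    gw = ipaddress.ip_address(gateway)
    return f"esxcli network ip route ipv4 add --gateway {gw} --network {network}"


# Hypothetical routing table of a host before the static route is added.
before = [
    ("0.0.0.0/0", "vmk0 via 192.168.1.1 (management default gateway)"),
    ("192.168.1.0/24", "vmk0 (directly connected)"),
    ("10.10.10.0/24", "vmk1 (directly connected)"),
]
# Same table once the static route to the remote replication segment exists.
after = before + [("10.20.20.0/24", "vmk1 via 10.10.10.1 (static route)")]

print(next_hop(before, "10.20.20.50"))  # leaves through the management interface
print(next_hop(after, "10.20.20.50"))   # leaves through the replication VMkernel
print(esxi_static_route("10.20.20.0/24", "10.10.10.1"))
```

Without the static route, the only match for the remote segment is the default route on the management interface; with it, the more specific /24 wins.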

vSphere Replication Preparation:

vR comes with a single vNIC, which is used for everything: management traffic (communication with vCenter and ESXi, plus coordination with other vRs) and replication traffic.

Since we want to isolate the traffic, we will add a second vNIC just for this purpose.

- First, shut down the vR appliance;

- Add a second vNIC;

- Power it on and access its VAMI console (https://"vr_ip":5480);

- On the Network tab, select Address;

- Scroll down to eth1 and add an IP address from the replication segment of that site;

As with ESXi, we don’t want to use the default gateway on the management segment to send the replication traffic, so we need to add a static route to the vR appliance as well.

Since it's a Linux box, we can add the static route information to the /etc/sysconfig/network/routes file.

- Restart network services.
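For the static route on the vR appliance, the sketch below formats one entry for /etc/sysconfig/network/routes. I'm assuming the SUSE-style layout of that file (destination, gateway, netmask, interface, in that order), and the addresses are made-up examples: 10.20.20.0/24 stands for the replication segment on the other site and 10.10.10.1 for the gateway on the local replication segment. Check the format against your appliance version before relying on it.

```python
import ipaddress


def suse_route_line(remote_network, gateway, interface):
    """Format one route entry as: destination gateway netmask interface."""
    network = ipaddress.ip_network(remote_network)  # rejects malformed prefixes
    gw = ipaddress.ip_address(gateway)
    return f"{network.network_address} {gw} {network.netmask} {interface}"


# Hypothetical values: remote replication segment reached through eth1.
line = suse_route_line("10.20.20.0/24", "10.10.10.1", "eth1")
print(line)
```

The resulting line is what you would append to the routes file before restarting the network services.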

Now with everything ready, let's see how the replication flows:

- Configure a VM for replication;

- Once an RPO is met, the ESXi host running the VM to be replicated sends its data through the VMkernel adapter set up for vSphere Replication traffic to the vSphere Replication Server on the destination site. (Dark Blue flow)

- The vSphere Replication Server on the destination site receives and buffers the data and then sends it to an ESXi host on the destination site, which receives it through the VMkernel adapter set up for vSphere Replication NFC traffic and saves the data to the destination datastore. (Red flow)

As we can see, the traffic is isolated from the management segment and, hopefully, from the others.

Others could argue that you should create one dedicated segment for vSphere Replication traffic (outgoing) and another for vSphere Replication NFC traffic (incoming). Personally I believe that, since it’s all replication traffic, breaking it down further just adds complexity, so I like to keep it simple with a single segment for all replication traffic, incoming and outgoing.

We are done. You can start a replication and check that the traffic flows the way it is supposed to.
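One simple pre-check, sketched below, is to confirm the tags on the replication VMkernel adapter and to ping the remote vR address through that specific adapter. The adapter name and IP address are hypothetical. A successful vmkping only shows that the path through that adapter works; it does not prove the replication traffic itself is using it, so watching the adapter's counters during a real replication is still worthwhile.

```python
def verification_commands(vmk, remote_ip, count=3):
    """Build commands to check an adapter's tags and its path to remote_ip."""
    return [
        # Lists the services (tags) enabled on the adapter.
        f"esxcli network ip interface tag get --interface-name={vmk}",
        # Sends pings out of the given VMkernel adapter only.
        f"vmkping -I {vmk} -c {count} {remote_ip}",
    ]


# Hypothetical values: vmk1 is the replication adapter, 10.20.20.5 the remote vR.
checks = verification_commands("vmk1", "10.20.20.5")
for command in checks:
    print(command)
```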

Hi Chris, you are absolutely right. If nothing is specified, the traffic will flow through the VMkernel adapter assigned to management and use whatever bandwidth is available to it. If that's your case, I suggest leveraging NIOC to prevent impact on other network traffic types.


Who am I

I’m an IT specialist with over 15 years of experience, ranging from IT infrastructure to management products, with troubleshooting and project management skills in medium to large environments.
Nowadays I'm working for VMware as a Consulting Architect, helping customers embrace the Cloud Era and succeed on their journey.
Despite the fact that I'm a VMware employee, these postings reflect my own opinion and do not represent VMware's position, strategies or opinions.
Reach me at @dumeirell