Virtualization has made disaster recovery far less complex for companies large and small. In fact, it’s not a stretch to say that virtualization has made disaster recovery “possible” at many companies, simply because prior to virtualization it was technically overwhelming and financially prohibitive. Thanks to the encapsulation of servers into virtual machines, which makes them hardware independent, and the ability to track changed blocks of those virtual machines, VMs can easily be replicated to another site, run on dissimilar servers and dissimilar storage, and then restarted at the push of a button. Disaster recovery has never been easier. However, you still need some tools to automate it and replicate the data. After all, DR isn’t just “built into the hypervisor”.

What About Replication?

Let’s start with getting your data (your virtual machines that contain your OS, applications, and data) from the primary datacenter to the secondary datacenter. Replication is what moves them from “point A to point B” but there are various ways to do it. Yes, you should still have those VMs backed up and stored offsite, even if you are using replication. When considering disaster recovery tools for vSphere infrastructures, many admins have heard of VMware’s Site Recovery Manager (SRM), which helps you to create, test, and execute vSphere recovery plans. However, many admins don’t fully know what SRM does and does not do. SRM isn’t an all-encompassing disaster recovery solution. SRM has always been a great solution for reviewing what VMs you have, determining what is most important, determining what is dependent on what, creating a recovery plan based on that info, testing that recovery plan, and, should the time come, executing the recovery plan. What’s missing? Actually getting the data from the primary to the secondary datacenter is not part of SRM. If there is no movement of data, then the VMs aren’t actually located at the recovery/secondary datacenter and would have to be recovered from offsite backup (which could take a long time and really negates the purpose of SRM’s DR orchestration).
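To make the “what is dependent on what” step more concrete, here is a minimal sketch of how a recovery plan's start-up order could be derived from a dependency map. It is not SRM's API or its internal logic. The VM names and the `recovery_order` helper are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical VMs: each key lists the VMs that must be running before it starts.
dependencies = {
    "web-01": {"app-01"},   # the web tier needs the application tier
    "app-01": {"db-01"},    # the application tier needs the database
    "db-01": set(),         # the database has no prerequisites
}


def recovery_order(deps):
    """Return VM names ordered so that every VM comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = recovery_order(dependencies)
print(order)  # ['db-01', 'app-01', 'web-01']
```

An orchestration tool adds much more on top of an ordering like this (testing, reporting, network changes), but the ordering itself is the core of a recovery plan.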

For a long time, replication was only available in high-end storage arrays as an add-on feature, all of which was very costly. However, with virtualization, more and more software-based replication options have become available. With traditional hardware-based replication in storage-area networks (SANs) being so costly, many companies are interested in today’s lower-cost and more easily implemented software-based replication solutions.

VMware offers vSphere Replication and Zerto offers Virtual Replication. Both are software-based replication solutions for vSphere. vSphere Replication is included with vSphere Essentials Plus and above but has a number of limitations. First off, as its name states, vSphere Replication handles replication only: it does not include all the orchestration, testing, reporting and enterprise-class DR functions that you would find in SRM. Instead, vSphere Replication replicates individual VMs within or across vSphere clusters. Certainly it is better than recovering from an offsite backup, but its recovery time and scalability (even when combined with SRM) may not be enough to satisfy your needs.

Zerto is sold on a per-VM basis and is compatible with vSphere Essentials and above. Zerto’s design is unique in that it is focused on protecting your company’s applications, not just VMs or LUNs. Zerto works by running a Zerto Virtual Manager (ZVM), which ties into vCenter as a plugin and manages the replication stack, along with Virtual Replication Appliances (VRAs) that the ZVM deploys on each vSphere host.

The VRAs can see the writes performed by virtual machines on the host and then send those changes over the network (LAN or WAN) to the secondary site. Because of Zerto’s close integration with vCenter and vCloud Director, it recognizes all vCenter constructs such as folders, resource pools, virtual networks, and datastore clusters, as well as vCloud Director constructs like Orgs, OrgVDCs, Org Networks, and Storage Profiles.
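The general idea of observing writes on the host and forwarding them to another site can be pictured with a toy “write splitter”. This is a conceptual sketch only, not Zerto's implementation. The `WriteSplitter` class, its methods, and the in-memory “replica” list are all hypothetical stand-ins for a real disk, appliance, and network link.

```python
from collections import deque


class WriteSplitter:
    """Toy model: every write lands on the local disk and is queued for the replica."""

    def __init__(self, num_blocks):
        self.local = [b""] * num_blocks   # stands in for the protected VM's disk
        self.outbox = deque()             # writes waiting to cross the LAN/WAN

    def write(self, block_no, data):
        # The local write completes immediately; replication happens asynchronously.
        self.local[block_no] = data
        self.outbox.append((block_no, data))

    def ship(self, replica):
        """Apply queued writes to the replica in their original order."""
        shipped = 0
        while self.outbox:
            block_no, data = self.outbox.popleft()
            replica[block_no] = data
            shipped += 1
        return shipped


splitter = WriteSplitter(num_blocks=8)
replica = [b""] * 8                       # stands in for the disk at the secondary site

splitter.write(2, b"orders")
splitter.write(5, b"invoices")
splitter.write(2, b"orders-v2")           # a later write to the same block

shipped = splitter.ship(replica)
print(shipped, replica == splitter.local)
```

Because writes are forwarded in order rather than collected on a schedule, the replica can trail the source by only as long as the queue takes to drain. That is the intuition behind an RPO measured in seconds.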

Zerto vs vSphere Replication (focusing around data replication):

| Feature | Zerto Replication | vSphere Replication |
| --- | --- | --- |
| Generally recommended for | Any size vSphere infrastructure | Small infrastructures, remote offices, and non-critical applications |
| Max number of VMs protected per appliance / per vCenter Server | 500 / 5000 | 500 / 500 |
| Linked clone and template support | Yes | No |
| Physical RDM and virtual RDM support | Yes, virtual and physical mode RDMs are supported | No/Yes: physical RDMs are not supported, but virtual RDMs are supported |
| RPO range | Seconds | 15 minutes to 24 hours |
| Application consistent? | Supports Windows VMs using VSS and Linux with application quiescence | Only with Windows VMs using VSS |
| How VMs are chosen | VMs can be organized into Virtual Protection Groups | VMs can be selected individually or by multi-select, but virtual protection grouping is not available |
| Automated failback support | Yes | No, VMware recommends SRM |
| Allows you to plan, test, and automate failover and failback | Yes | No, VMware recommends SRM |
| Compression included | Yes | No, and neither does SRM |
| RE-IP addressing of virtual machines | Yes | No, VMware recommends SRM |
| Cloning of recovery sites | Yes | No |
| Point in time recovery | Yes, up to 5 days with standard recovery, up to 1 year with extended recovery using the Offsite Backup feature | Yes, up to 24 snapshots |
| Compatible with vApps | Yes | No |
| vCloud Director integration | Yes | No |
| Snapshot-based? | No | Yes/No: VMware says that vSR doesn't use snapshot technology but does use a modified version of CBT; still, many of the same snapshot-based limitations apply. For example, vSR uses VM snapshots at the recovery site when recovering to a different point in time. |

What if I Add VMware’s Site Recovery Manager (SRM) to vSphere Replication?

While vSphere Replication may initially be appealing because it is included free with your edition of vSphere, most companies quickly realize that vSphere Replication lacks the DR plan design, testing, failover, and failback orchestrations that they require. Because of this, they usually consider using vSphere Replication with VMware’s Site Recovery Manager (SRM), or using SRM with array-based replication. SRM is a fully-featured disaster recovery planning, testing, and execution tool for vSphere environments. So how would Zerto compare to running SRM with vSphere Replication?

| Feature | Zerto | SRM with vSphere Replication |
| --- | --- | --- |
| Provides planning, testing, and execution of disaster recovery for vSphere | Yes | Yes |
| Designed for | Zerto was designed for hypervisor-based replication AND disaster recovery orchestration | SRM was designed for disaster recovery orchestration only |
| Licensed | Per-VM | Per-VM |
| Replication granularity | Per-VM and/or per-Virtual Protection Group | Per-VM or multi-select, but virtual protection grouping is not available |
| Configure consistency groups (virtual protection groups) | Yes | No |
| Replication recovery points | Yes, up to 5 days with standard recovery, up to 1 year with extended recovery using the Offsite Backup feature | Yes, up to 24 snapshots |
| Compatibility | Zerto works with ESX 4.0 U1 and above. Zerto can replicate between different versions of vCenter. | vSR works with ESX 5.x and above. SRM requires the same version of vCenter and SRM be installed at both sites. |
| Managed with | vSphere Client Plugin and stand-alone browser UI | vSphere Client Plugin |
| Replication is performed with | Zerto hypervisor-based replication | vSphere Replication |

As you can see, while SRM adds capabilities around the planning, testing, and execution of a disaster recovery, it can’t overcome the replication limitations of vSphere Replication. For this reason, most companies that use SRM have opted to use array-based replication (and they are paying a pretty penny for it). Zerto, on the other hand, requires no specific arrays at each datacenter, no array replication licenses, no storage reorganization for DR, and it supports multi-site replication at no additional cost (for a comparison between Zerto and SRM with array-based replication, see this post).

Comments

Most of the last sentence paints an inaccurate comparison between Zerto and SRM.

“Zerto, on the other hand, requires no specific arrays at each datacenter, no array replication licenses, no storage reorganization for DR,”

All of the above is true for SRM in conjunction with vSphere Replication. The paragraph starts out acknowledging that SRM supports both array based and vSphere replication (a flexibility Zerto lacks) but the next sentence seems to discount SRM’s vSphere Replication support from existence.

My last comment is that if you think customers are paying a pretty penny for array based replication, then with that in mind they are paying the King’s ransom and then some for SRM licensing. I’ll grant you that array based replication was at one time a costly proposition, but storage costs have come down considerably while performance and capacity capabilities continue to rise. Even with vSphere or Zerto replication, it’s very likely that the source and destination storage will be in the form of some sort of shared storage array, which accounts for a significant enough portion of what’s referred to as “a pretty penny”. What’s not in that picture is the array based replication license between two dissimilar arrays.

Just wanted to add a few comments. vSphere Replication is (as the article correctly states) not a fully featured DR orchestration solution. If that is what you need then of course you can layer SRM on top of vSphere Replication to add this capability. Consider first what your requirements are. For many customers, if you simply require point-to-point, asynchronous, storage-agnostic, VM-level replication, or single site replication, or a datacenter migration / collapse replication layer, or even a remote office replication solution, then vSphere Replication can be a great fit and is being used in this way by a lot of customers. More overview information here: http://pubs.vmware.com/vsphere-55/topic/com.vmware.vsphere.replication_admin.doc/GUID-C521A814-91E1-4092-BD29-7E2BA256E67E.html

There are some specific points I just wanted to provide some additional information on:

Physical RDM Support – Due to the way in which vSphere Replication tracks changes to vmdks (note we do NOT use the changed block tracking technique used by our backup solutions, even though some of the same terms do get interchanged), we do not support Physical Mode RDMs (Virtual are supported). vSphere Replication uses a filter within the vSphere kernel that is attached to each vmdk stack of the given VM being replicated. This allows us to very efficiently track the changes, mark the blocks and let the write continue (we don’t have to cache the changes); the result is minimal impact to the source host when replication is enabled. Of course, using this mechanism means that if the RDM is physical, IO would bypass the filter, hence we do not support physical RDMs with vSphere Replication.

RPO Range – vSphere Replication is asynchronous only so if your RPO requirements are sub 15 minutes then today vSphere Replication may not be for you. In this case solutions like Zerto and array based replication should be investigated.

Application Consistent – vSphere Replication can leverage VSS today though it is not required if you only desire crash consistency. Customers need to test VSS for themselves with their own workloads. In some implementations we have seen time taken for VSS to put the VM into a consistent state take longer than the RPO being attempted so if you consistently see RPO violations you may want to look inside windows to see how long VSS is taking. Note even if the time taken means vSphere Replication would violate the RPO for that transfer we will still complete the transfer, it will not be aborted, we simply generate alerts to inform the user there has been an RPO violation (basically the transfer at that time took too long i.e > RPO desired / 2) which should then be investigated.

How VMs are chosen – Replication only works whilst the VMs are powered on. Replication can be stopped/paused if needed, or reconfigured if you decide to change configuration options such as which disks are replicated, how often, what the target is, etc. vSphere Replication is per VM, although I guess really it’s per vmdk of the VM.

RE-IP addressing of virtual machines – if you need ip addressing to be automated or built into a workflow SRM will be needed. If you are simply using vSphere Replication on its own then post recovery the VM will be left disconnected from the network. At that point you can either connect the VM(s) to the desired port groups manually (or you could script that) or you can use your own scripted vCenter customisation to flip the addresses. Other techniques i’ve seen implemented (customers are creative) include things like DHCP/Mac translations, in house scripts and even VM’s configured with two nics where each nic is preconfigured to one site and not the other (in terms of subnet/network properties) and the customer simply has the non-relevant nic disconnected until the VM is recovered to the site where that nic is relevant at which point that nic is attached at power on. Obviously some of those options work better and are more manageable in environments where you are replicating smaller numbers of VMs, as your estate grows you may want to look into orchestrated offerings like Zerto or SRM for example.

Snapshot based – see my comments about physical RDM support. vSphere Replication, well how we track changes to the VM’s, is not based on VM snapshots and is not using the change block tracking (CBT) framework that the backup API uses at all.

Configure consistency groups – to add to this, consistency groups can be used with SRM today only if you are using array based replication.

Replication recovery points – see answer to “Point in time recovery”

Replication is performed with – as the article is about vSphere Replication this is correct however remember that SRM also supports an ecosystem of storage replication based solutions that can be used alongside vSphere Replication in the same implementation if required (to offer a service level approach). Note that you cannot (should not) attempt to replicate a VM using both techniques. A VM can be replicated by vSphere Replication or Array Based Replication but NOT both at the same time, do not attempt that as today that is not a supported use case.
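The RPO alert rule mentioned under “Application Consistent” above (a transfer that takes longer than half the desired RPO) can be written as a one-line check. This is only a reading of the commenter's informal description, not vSphere Replication's documented algorithm. The function name and the sample timings are hypothetical.

```python
def is_rpo_violation(transfer_seconds, rpo_seconds):
    """Flag a transfer that took longer than half of the desired RPO.

    Assumption: this mirrors the "> RPO desired / 2" rule of thumb quoted in
    the comment; the real product may apply additional conditions.
    """
    return transfer_seconds > rpo_seconds / 2


rpo = 15 * 60                        # a 15-minute RPO, expressed in seconds
print(is_rpo_violation(400, rpo))    # False: 400 s is under the 450 s threshold
print(is_rpo_violation(500, rpo))    # True: 500 s exceeds the 450 s threshold
```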

I would like to ask a question regarding segregation of the replication traffic from the regular production traffic to our servers. So currently we are using SRM 5.1 to replicate through an enhanced MPLS circuit, but the replication traffic is taking up to 97% of the 46 Mbps bandwidth. Would it be possible with either product Zerto or SRM to segregate this traffic into another dedicated point to point circuit to our DR site? (please note that our DR site is also connected to the MPLS network through a 100 Mbps circuit). And if so, how can you accomplish this? through the application or through the network?

December 9, 2014

David Davis, VCP, VCAP-DCA

Likely there are a couple of different ways to accomplish this: 1) through the network, using policy-based routing / traffic filtering; 2) through the applications themselves, perhaps selecting different IP destinations for SRM or Zerto traffic in the application that would take a different network path. It’s just something that the network person and virtualization person would have to sit down, look at the design, and discuss together. I’m sure it’s doable. Hope that helps! Thanks for reading!

How many vCenters are supported as Source and Target? For example, I have a requirement that 50 Source Sites all replicate to a single target in IDC. Is this possible with Zerto?

December 29, 2015

Gregg Duncan

50+ source sites replicating to a single target is supported by Zerto.

October 10, 2016

David Chung

A little late to this blog, but as both a VMware SRM and Zerto user, I can say Zerto is a much more feature-packed, all-in-one solution. Sure, SRM can work with storage replication* or vSphere Replication, but if you want a simple solution that replicates and orchestrates the recovery, I think Zerto wins out.
