The automatic activation of the storage volumes and bringing up of all VMs. The recovery plans and resource mapping are good features.
With the resource mapping we can use different networks to bring up our VMs.
We can easily create a test environment with the SRM testing functionality. In combination with EMC RecoverPoint, we can also start up a virtual machine at a given point in time.
Automatic guest customization through a GUI is a missing feature. Also, the new version of SRM is web-only and is not as stable as the old client version.
A couple of times the web interface crashed and we needed to restart the services.
7 years.
No issues encountered.
No, but the new web client is sometimes 'buggy.'
No issues encountered.
We needed to use customer service a couple of times while upgrading versions but all went fine.
Technical Support:8/10
No previous solution used.
It wasn't complex.
We used a vendor team and their expertise was good.
This product makes it easy for us to have a managed DR. Also, the point-in-time functionality to bring up VMs is a cost-saving method for us during the upgrade / testing period.
No other options evaluated.
Read and follow the admin guide; otherwise, when you miss a step, you can spend a long time searching for the cause. Also use the correct storage adapter from your storage vendor.
Disaster recovery of a failed site.
None, I implement this product for customers.
Storage integration.
5 years.
The integration (SRAs) with storage and its versions.
None encountered.
None encountered.
9 out of 10.
Technical Support:9 out of 10.
Just a regular backup product. But this wasn't a disaster recovery solution.
An SRM implementation is not difficult. Getting the business requirements is more of a challenge.
I was the implementor.
For most customers about 2 years.
I don’t know as it depends on the customer.
Nope, I was asked to implement this solution.
Know your business requirements.
Instant VMware integration and a wide integration with storage vendors.
The company was able to provide customers with the required RPO that was not possible with traditional backup solutions.
Stability, speed.
4 years.
Bugs bugs bugs...
Sometimes the VM would not fail over.
There is no backwards compatibility between different versions. This made scaling a bit difficult, as we needed to upgrade existing infrastructure all the time.
3/10, VMware support was not really proficient in the product, as most of the time we would give them the solution.
Technical Support:3/10, VMware support was not really proficient in the product, as most of the time we would give them the solution.
No.
The integration with EMC RecoverPoint was problematic in the beginning, with VMs not always failing over and slow recovery times.
Yes, Zerto, Veeam.
POC and test.
Centralized recovery plans for thousands of VMs, Non-disruptive recovery testing, Automated DR workflows.
Lowers the cost of DR management, Eliminates complexity and risk of manual processes, Enables fast and highly predictable RTOs.
In my opinion, it would help if VMware added a function to detect business-critical applications such as Oracle and Exchange, to help monitor these applications for disaster recovery.
7 years on many international projects.
In the earlier versions I had some issues; however, all of them are resolved now.
No issues with stability.
No issues with scalability.
Excellent. I had some issues that were beyond my knowledge to troubleshoot, and VMware customer service solved the problem remotely.
Technical Support:Excellent
Yes, I used other products like storage replication or other software like "Double-Take." The problem with storage replication was that it was risky and unstable to manually bring the application up on the DR site, besides taking more time to restore.
With other software, like Double-Take, we needed to put a lot of effort into each application separately, which makes the solution more complex.
In some basic installations, it is very straightforward, but for enterprise customers it makes sense to do some extra steps to protect applications and boot order.
Both. In my experience, vendor teams like HP, EMC or NetApp didn’t have much experience with this product; especially over the last 5 years, I have mainly had to help them understand the solution.
Based on the average cost of downtime in a disaster and how automation can help bring the business back online, SRM can reduce the cost by nearly 50 percent; moreover, you don’t need to have SAN storage at the DR site.
Setup cost was based on the number of VMs and the protection plan. If communication with the DR site has no issues, all setup can normally be finished within two weeks, and the cost is around $300-$350 per day.
For customers who want to protect a small number of applications, I recommend going with the vendor’s own disaster recovery solution, like Oracle Data Guard for Oracle DB, Microsoft Exchange replication, or SQL log shipping for Microsoft SQL products.
VMware SRM can handle all of the challenges of replication and disaster recovery in a simple way.
Ability to support DR.
We have a fully functioning DR environment.
The setup documentation needs to be improved greatly.
Only the fact that I had to scour the Internet to find the proper setup information.
No issues encountered.
No issues encountered.
Average.
Technical Support:Well the tech I spoke to did not know how the setup should be implemented, until I found the literature myself.
Did not use a previous solution.
Complex - settings for the SQL DB were not documented - had to search a lot for those.
I implemented the product.
Very good - we have a DR platform that we've tested with a live production workflow.
No others were evaluated.
Gather the proper setup information.
The recovery plans and customization options. I also love the re-protect feature for easier failback.
We were able to migrate 100 virtual servers to our St. Louis office with very little downtime.
I would like to see the database come built in to the product. I would also prefer that the SRM server come as a virtual appliance instead of a Windows VM. I would also like to see a way to control bandwidth usage.
3 years.
No, the process is very well documented.
No, but you need to have a good understanding of your bandwidth between sites. SRM does not have a native way to throttle bandwidth usage.
No.
Excellent.
Technical Support:Excellent.
We looked at various types of solutions like Veeam, DoubleTake and Neverfail.
It was very straightforward.
In-house.
We were able to reduce the cost of our overall DR infrastructure while improving our recovery time. We are seeing a 25% return.
Yes, Veeam, Neverfail and DoubleTake.
Take the time to understand the change rate of your servers, your bandwidth capabilities and the recovery objectives for your DR plan.
Most VMware administrators have heard of Site Recovery Manager (SRM). SRM has been the standard in disaster recovery for some time. It plays into VMware’s parent company’s (EMC) product line, traditionally leveraging storage based replication. This architecture leverages write journaling technology we spoke of in our first article in the series, so Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs) could be very aggressive.
The downside to this architecture is that the customer has to have similar storage arrays at both the production and disaster recovery sites. If, for example, the customer had a Fibre Channel array on the production side and a lower-grade NFS array from a different vendor on the other side, SRM was not compatible. Bummer…
VMware however released vSphere replication in the vSphere 5 family suite and allowed administrators to replicate their virtual machines without common storage subsystems. What this means is that you could have your traditional fibre channel SAN on the production side, and NFS, or internal storage on your disaster recovery site. The underlying storage type is completely irrelevant as long as the workload is supported. This is a gift for DR budgets everywhere. Additionally you can recover to previous points in time using snapshots at the recovery site much the same as you would use a traditional snapshot.
SRM in this configuration sits on top of vSphere replication instead of the RPAs that are common in array-to-array based architectures. These replication appliances are Linux virtual machines that are deployed in the VMware environment. I will give VMware a large amount of credit here: where some competing technologies are cumbersome to install, vSphere replication installation takes only a few mouse clicks. Your vSphere replication appliances are functional in just a few minutes. Replication can be configured through the VMware fat client or the web client.
So what’s the catch? vSphere replication falls into the snap and replicate category. This means that RTOs and RPOs won’t be as aggressive as with array-to-array based replication, or hypervisor technologies that use write journaling. The best RTO and RPO currently achievable with SRM over vSphere replication is 15 minutes. There are rumors that this will be coming down to 5 minutes in the future, but it’s only a rumor at this point. Also, if you are trying to move to the web client, you will be dismayed to learn that SRM can still only be managed through the VMware fat client. I don’t know too many administrators who are excited about the web client, but it’s a relevant piece of information for your day-to-day work.
So what about the licensing and additional costs? There are pros and cons to the vSphere replication / SRM model.
The virtual appliances are Linux based – pro
This means there aren’t additional Windows licenses required to operate the environment. Some of the other products use Windows based virtual appliances. When you have to stand up more Windows servers you have to patch and manage them, which adds to the cost of the solution. SRM can generally be installed on the Windows system that vCenter runs on. If you’re using the Linux based vCenter appliance, SRM isn’t compatible. I would expect this to be resolved soon, as VMware is trying to eliminate the need for Windows systems in the environment.
The base vSphere replication is free – pro
Yes, you heard that correctly: vSphere replication is free. If you have lower priority virtual machines, you don’t have to buy SRM licenses for them. This means you can save money and buy SRM licenses (sold in packs of 25) only for your mission critical VMs.
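As a rough illustration of the pack-based licensing described above, here is a minimal sketch that rounds the number of SRM-protected VMs up to whole packs. The pack size of 25 comes from the article; the function name and the example VM count are hypothetical.

```python
import math

PACK_SIZE = 25  # SRM licenses are sold in packs of 25, per the article


def srm_packs_needed(protected_vms: int, pack_size: int = PACK_SIZE) -> int:
    """Return how many license packs cover the given number of SRM-protected VMs."""
    if protected_vms < 0:
        raise ValueError("protected_vms must be non-negative")
    # Partial packs cannot be bought, so round up.
    return math.ceil(protected_vms / pack_size)


# Hypothetical environment: only 30 VMs are mission critical and need SRM.
print(srm_packs_needed(30))  # 2
```

VMs that are replicated with vSphere replication alone, without SRM orchestration, would not count toward `protected_vms` in this sketch.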
SRM is the orchestration tool on top of vSphere replication – neutral
SRM and all of its power can be scoped down to only the systems you need it for. I personally like the flexibility and choice; most companies don’t need to replicate all of their virtual machines with very tight RTOs and RPOs. If you are trying to replicate your entire VMware environment, you may be better off with a solution that licenses by socket, as it may be more cost effective.
Snap and replicate technology – con
At the end of the day snap and replicate technologies are limited. Because the recovered virtual machine ends up with snapshots, scalability can be an issue. Let’s look at an example.
VMware recommends that you keep at most 21 snapshots when using vSphere replication. More snapshots than this can lead to snapshot consolidation issues. If you wanted to have a recovery point every hour, you wouldn’t be able to recover your virtual machine to a point further back than 21 hours. This is a limitation of any snapshot-based replication technology, not a deficiency within SRM or vSphere replication.
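The arithmetic in that example generalizes to any recovery point interval. The sketch below follows the article's own reasoning (one retained snapshot per interval, capped at 21); the function name is made up for illustration.

```python
from datetime import timedelta

MAX_SNAPSHOTS = 21  # recommended maximum cited in the article


def recovery_window(interval: timedelta, max_snapshots: int = MAX_SNAPSHOTS) -> timedelta:
    """How far back you can recover when one snapshot is kept per interval."""
    return interval * max_snapshots


print(recovery_window(timedelta(hours=1)))     # 21:00:00
print(recovery_window(timedelta(minutes=15)))  # 5:15:00
```

In other words, tightening the recovery point interval shrinks how far back you can go: a recovery point every 15 minutes leaves only a little over five hours of history.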
Scalability – neutral
The upper limit for SRM with vSphere replication is 1000 virtual machines. This will suit most enterprises; however, for very large scale deployments this may not be enough. SRM with storage array replication, for example, can support up to 1500 virtual machines. This limit is roughly what you would get with any other snap and replicate technology. In my personal experience Veeam starts to have problems after 300 virtual machines in a single instance.
Speaking of Veeam this is the next technology that we will discuss. Veeam is a good product that not only provides DR capabilities, but also a very mature backup solution. Join us for our next article in the series.
Originally published here: https://simplecontinuity.com/dr-for-vmware-srm-on-vsphere-replication/
Disaster recovery planning is something that seems challenging for all businesses. Virtualization, in addition to its operational flexibility and cost reduction benefits, has helped companies improve their DR posture. Virtualization has made it easier to move machines from production to recovery sites, but many of the disaster recovery tools today still function at the storage layer. Legacy technologies like storage array snapshots and LUN based replication restrict the configuration options of upstream technologies like VMware Storage DRS. If you wanted to replicate a virtual machine, you had to replicate the entire LUN it resided on. You weren’t free to leverage Storage DRS for its automated performance balancing features, because a VM could be migrated from one LUN to another, mucking up your storage based replication.
Fortunately over the past few years there’s been great advancement in hypervisor based replication technologies. There’s a wealth of competing products vying for customer attention. As always competition drives innovation and value for the consumer. This will be the first of a 4 part blog series that looks at various hypervisor based disaster recovery products. Note this isn’t a review of backup products which is a separate category, we are looking at products specifically designed to assist companies in a disaster scenario.
Before talking about products, however, we should understand their underlying architectures and how they relate to their storage based predecessors. Like storage based technologies, hypervisor based replication technologies currently come in two flavors:
Snap and replicate
Write journaling
These technologies should be very familiar to storage administrators. Write journaling is a newer technology, and the market leader is currently EMC’s RecoverPoint product. Different storage arrays all have slightly different terms for snap and replicate technologies, but the principles are the same. It’s important to understand this because the technologies will dictate how tightly you can define your recovery time objectives (RTOs) and recovery point objectives (RPOs).
First we will cover snap and replicate technologies. Snap and replicate at the hypervisor level works similarly to its storage counterpart. Instead of taking a snapshot of a storage LUN on a scheduled basis, VMware takes a snapshot of the virtual machine’s disks on a scheduled basis. This allows products to copy those disks off of the primary storage media to a secondary location. A nice benefit of using VMware snap and replicate technologies is that you can use completely different types of storage systems on the production and DR systems. You can use an enterprise class SAN in the production datacenter, and internal storage if desired at the disaster recovery location. As long as the storage subsystem is supported by VMware and has the proper performance characteristics, the technology works. Typically a technology called changed block tracking keeps track of any data that may change during the backup window.
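To make the snap and replicate idea concrete, here is a toy model of the cycle described above: writes are tracked per block, a snapshot freezes the set of changed blocks, and only those blocks are copied to the DR side. This is a simplified sketch for illustration only; the class and function names are invented and do not correspond to VMware's actual changed block tracking API.

```python
class TrackedDisk:
    """Toy virtual disk that remembers which blocks changed since the last cycle."""

    def __init__(self, num_blocks: int):
        self.blocks = [b""] * num_blocks
        self.changed = set()  # indexes of blocks written since the last snapshot

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data
        self.changed.add(index)

    def snapshot_changes(self) -> dict:
        """Freeze the changed blocks (the 'snap') and reset tracking."""
        delta = {i: self.blocks[i] for i in sorted(self.changed)}
        self.changed.clear()
        return delta


def replicate(delta: dict, replica: list) -> int:
    """Apply a delta to the DR copy (the 'replicate'); return blocks copied."""
    for index, data in delta.items():
        replica[index] = data
    return len(delta)


primary = TrackedDisk(8)
replica = [b""] * 8  # the DR copy can live on any storage type

primary.write(2, b"db")
primary.write(5, b"log")
print(replicate(primary.snapshot_changes(), replica))  # 2

primary.write(5, b"log2")
copied = replicate(primary.snapshot_changes(), replica)
print(copied)                     # 1
print(replica == primary.blocks)  # True
```

Note that the replica is only as fresh as the last completed cycle, which is why the schedule interval bounds the achievable RPO.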
Write splitting is the second technology we will examine. Like snap and replicate technologies write splitting at the hypervisor level doesn’t require the same storage type at the primary and secondary sites. Write splitting at the hypervisor level is a fairly new technology, but it’s been developed by the same team that developed write splitting at the storage layer. When I evaluate a technology I like to know there’s a history of success from the team that’s created it.
Virtual machine write journaling works differently than storage based write journaling. Instead of having a physical appliance that sits in front of your storage arrays, the write splitting occurs inside the ESXi kernel. Because the technology is splitting every write, there are some significant technical benefits. As a general rule, snap and replicate technologies can in best case scenarios only achieve 15 minute RTOs and RPOs. Write journaling under best case scenarios can deliver RTOs and RPOs from 5 to 10 seconds.
While there is certainly an RTO and RPO benefit to the write journaling technology, there are other things to consider. Hero numbers are great for the marketing team, but anyone who’s worked in operations knows that what really matters about the product generally isn’t on a spec sheet. All of the products we will talk about work differently, but they all seek to achieve the same result. The supporting infrastructure and associated management costs for all of these products are critical.
Every technology we’re examining works on a management server / replication server architecture. Some of these packages use Windows proxies while other products use Linux based proxies. If you’re planning a massive DR project, consider that there may be dozens of Windows licenses you have to account for, time to patch and manage those virtual machines, etc. If you fall into the scope of PCI you will most likely be required to manage anti-virus and some sort of log monitoring on all those Windows servers; whereas, on Linux systems anti-virus is more of an “option” according to PCI. Also, Linux has native syslog capabilities built in whereas Windows does not. All of these factors can add to or reduce the total cost of ownership of a disaster recovery product.
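By contrast with a scheduled snapshot, a write journal records every write as it happens, so a recovery can replay the journal up to any chosen moment. The following is a toy sketch of that idea, assuming writes arrive in timestamp order; the class and method names are hypothetical and this is not how any specific product implements its journal.

```python
class WriteJournal:
    """Toy journal: every write is split off to the DR side with a timestamp."""

    def __init__(self):
        self.entries = []  # (timestamp, block_index, data), in arrival order

    def split_write(self, timestamp: float, index: int, data: bytes) -> None:
        self.entries.append((timestamp, index, data))

    def recover(self, num_blocks: int, point_in_time: float) -> list:
        """Rebuild the disk as it looked at point_in_time by replaying the journal."""
        disk = [b""] * num_blocks
        for timestamp, index, data in self.entries:
            if timestamp > point_in_time:
                break  # entries are assumed to be in timestamp order
            disk[index] = data
        return disk


journal = WriteJournal()
journal.split_write(1.0, 0, b"good")
journal.split_write(2.0, 1, b"good")
journal.split_write(3.0, 0, b"corrupt")

# Recover to just before the bad write landed.
print(journal.recover(2, point_in_time=2.5))  # [b'good', b'good']
print(journal.recover(2, point_in_time=3.0))  # [b'corrupt', b'good']
```

Because every write is captured individually, the recovery point can sit between any two writes rather than only on a snapshot boundary.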
Through the rest of this series we will look at three products that are the leaders in the disaster recovery space for VMware.
VMware SRM (running on top of vSphere replication)
Veeam Backup & Replication
Zerto Virtual Replication
It goes without saying that another factor to consider is the price of the solution. Generally, the tighter the RTO and RPO the solution provides, the more expensive it will be. However, list pricing isn’t always cut and dried when you weigh total cost of ownership against the potential gains in RTO and RPO. In addition, various software vendors’ pricing models lend themselves to a specific virtual machine configuration. If you have a virtual environment with fewer, larger servers, product X may be more favorable from a cost perspective. If you have a virtual environment with many smaller servers, product Y’s pricing model may be more favorable.
See the above chart for a quick-and-dirty comparison of the three technologies we will be diving into over the next few weeks in our series.
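To make the earlier point about pricing models concrete, here is a minimal sketch that compares a per-VM license model against a per-socket one. The prices and environment sizes are placeholders invented for the example, not vendor list prices, and the function name is hypothetical.

```python
def cheaper_model(vm_count: int, socket_count: int,
                  price_per_vm: float, price_per_socket: float) -> tuple:
    """Return the cheaper licensing model and its total cost (ties favor per-VM)."""
    per_vm_total = vm_count * price_per_vm
    per_socket_total = socket_count * price_per_socket
    if per_vm_total <= per_socket_total:
        return ("per-VM", per_vm_total)
    return ("per-socket", per_socket_total)


# Placeholder prices: 100 per VM, 1000 per socket, on the same 8-socket cluster.
print(cheaper_model(20, 8, 100, 1000))   # ('per-VM', 2000)
print(cheaper_model(400, 8, 100, 1000))  # ('per-socket', 8000)
```

The same hardware flips from one model to the other purely on VM density, which is why it pays to run your own numbers before comparing list prices.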
Disaster recovery is a challenging project, but thankfully there are more options than ever for businesses to select from. Many of them are technically sound and will accomplish business goals. Many times it comes down to selecting the right architecture and price model for your business.
Originally published here: https://simplecontinuity.com/disaster-recovery-for-vmware
Nice article - I have recently been looking at vSAN as a viable option for a lab POC. Some DRaaS customers need to replicate/recover specific workloads outside of the SRM protection groups so they can control failover testing. In the real world I do not see many customers using vSphere native replication in conjunction with SRA SAN-layer replication. vSAN requires a minimum of 3 hosts and works with vSphere replication.
vSphere replication being free to use is a pro for sure, but it has limited use cases as far as enterprise production recovery. Perhaps the combination of vSphere replication and vSAN is the low-cost future of DRaaS?