What is our primary use case?
We do a semiannual disaster recovery test, usually one in January and another in September, where we fail our entire company over to our Arizona DR facility and run the business out of that location for the day. Zerto allows us to migrate 58 machines over to Arizona so that we can do that.
How has it helped my organization?
Using Zerto for our disaster recovery drills has given us a successful disaster recovery solution. We are able to fail over anytime, day or night, to run our applications out of our Arizona facility. Within a 15 or 20 minute time frame, we can have those application servers up and running in Arizona. It is just a huge help to have a successful, reliable disaster recovery solution that we know at any point in time, within 15 or 20 minutes, can be running out of a different location.
Most of the time, this is at least a two-person job. Previously, when we had a disaster recovery drill, it would take two of us working for three or four hours just to get applications up and running in Arizona. Now, for the disaster recovery drill, I'm able to finish my work in about 30 minutes and be available onsite to assist anybody else as needed. Its ease of use and its reliability as a disaster recovery solution have become invaluable to us.
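To put the time savings above into numbers, here is a rough sketch of the arithmetic. It only uses the figures quoted in this review (two people for three to four hours before, about 30 minutes of my own work now); the function names are made up for illustration, and it assumes the 30 minutes covers the same task as before.

```python
def person_hours(people, hours_each):
    """Total effort in person-hours for a drill."""
    return people * hours_each


def reduction_factor(before, after):
    """How many times less effort the new process takes."""
    return before / after


# Figures from the review: two people for three to four hours with the old process.
before_low = person_hours(2, 3)
before_high = person_hours(2, 4)

# About 30 minutes for one person now (assumed to cover the same task).
after = person_hours(1, 0.5)

print(reduction_factor(before_low, after))   # 12.0
print(reduction_factor(before_high, after))  # 16.0
```

By that estimate, the application failover part of the drill takes roughly 12 to 16 times less effort than it used to.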
What is most valuable?
There is built-in active logging if needed for a longer retention period. If we fail a machine over and are just doing tests on it, we can fail it right back at the end of the failover without much issue. We couldn't do that with SRM. We can keep track within the activity log of what is going on with the VM, then fail it back prior to the one-hour time frame that we have set up, without having to worry about losing data during our tests or production failover drills.
The product is very easy to use. On a scale of one to 10, I'd say it's a nine as far as ease of use goes. In order to do an update in our old product (SRM), we basically had to take down almost our entire vCenter to be able to do the updates. Whereas, I can do updates to our Zerto product within 30 minutes to both our ZVMs in Massachusetts and Arizona. We haven't had problems troubleshooting after doing upgrades. Within five minutes, we can configure a whole new cluster solution and work on getting it synced out to Arizona.
It transfers up-to-the-minute files. Therefore, if something were to happen and the business were to go down in Massachusetts due to a server failure, we could simply fire up those VMs in Arizona within approximately five minutes. The data protection level is top-notch. We haven't lost any machines, data, or VMs during the course of utilizing this product.
What needs improvement?
The alerting doesn't give you enough information about what exactly is going on when an issue comes up. We do get alerts inside of our vCenter, but the error message isn't accurate enough to tell us what's going on without actually having to log in to Zerto to determine what's causing the issue.
Another issue with the alerting is that it will pause a job without telling us. E.g., if we have something running from Massachusetts to Arizona, and a VM has been removed, updated, or moved to a new location in vCenter, it pauses the VPG that the VM resides in but never gives us a notification that it's been paused. Therefore, if we had an issue during the course of the day, such as a power event, and we needed to gain access to those VMs in some sort of catastrophe, we wouldn't be able to get access to them because that job was paused and we were never notified about it. It would be a big problem if a VM needed to be recovered and we didn't have those resources available.
It would be great to get more precise alerting to be able to allow us to troubleshoot a bit better. Or have the application at least give us a heads up, "A VPG job has been paused." Right now, it's sort of a manual process that we have to monitor ourselves, which is not a great way to do things if you have a superior disaster recovery solution.
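As a rough idea of the kind of check we end up doing by hand, here is a minimal sketch. It is not Zerto's API: the `VpgStatus` record, the state labels, and the VPG names are all hypothetical, and it assumes you already have some way of collecting the current state of each VPG into a list.

```python
from dataclasses import dataclass


@dataclass
class VpgStatus:
    name: str   # VPG name (hypothetical example values below)
    state: str  # assumed labels such as "replicating" or "paused"


def find_paused(vpgs):
    """Return the names of VPGs whose state is 'paused' (case-insensitive)."""
    return [v.name for v in vpgs if v.state.strip().lower() == "paused"]


def build_alert(vpgs):
    """Build a heads-up message, or return None if nothing is paused."""
    paused = find_paused(vpgs)
    if not paused:
        return None
    return "A VPG job has been paused: " + ", ".join(sorted(paused))


# Hypothetical snapshot of VPG states, however they were collected.
snapshot = [
    VpgStatus("prod-01", "replicating"),
    VpgStatus("dmz-01", "Paused"),
]
print(build_alert(snapshot))  # A VPG job has been paused: dmz-01
```

Run on a schedule and wired to email or chat, something like this would give the heads-up that we currently have to go looking for ourselves.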
For how long have I used the solution?
What do I think about the stability of the solution?
The stability is rock-solid. Nothing has gone down since we installed it; there has been no downtime.
Typically, once a quarter, we have an update. Last year we were at version 7.5, then we recently updated to 8.0. On top of that, they release security patches and other fixes for bugs they find in the program. Right now, there is a U4 version that's out, which we will be updating to this quarter.
In the U4 version, there are security enhancements because a lot of zero-day issues are being found in a lot of applications. Zerto is making more security modifications and enhancements to the encryption between one location and another, so somebody can't hack your data and access it while it's in transit.
What do I think about the scalability of the solution?
Scalability is very easy. We are going through a POC right now because we want to branch out to the cloud. Just getting that set up and going through the process was about 60 minutes.
It's very scalable and extendable. We can do one-to-many solutions, as far as where our disaster recovery is going. This is what we wanted. We would never have been able to do that with our SRM product.
There are two engineers trained to use the product. I'm the primary contact for the application and do most of the work on the product. One of the storage guys handles a lot of the storage set up on the back-end with me. We have at least two people trained on each application that we have in-house. Both of us are in charge of making sure the application is up-to-date and doing what it's supposed to be doing.
How are customer service and technical support?
Zerto's technical support is very good. They are very reliable and always very pleasant to deal with. We've never had an issue working with them. They usually come back with the precise solution to whatever we are troubleshooting.
Our issues are usually self-inflicted. E.g., we remove a host from the cluster to upgrade it or do something else with it and don't follow the correct procedure that's needed to shut down the Zerto appliance correctly. If somebody doesn't follow that procedure, because they either don't know how, weren't aware of it, or just skip that step, then it causes problems inside of Zerto. This will pause jobs and the VPG will no longer be accessible on that host. Sometimes it's easy to get it back up and running again. Usually, when you put a new piece of hardware in the cluster that has a different set of hardware parameters, the appliance will be missing because it was taken out with the old hardware. Usually, you need to get their technical support involved to troubleshoot the issue and get the VPG back online again on the new hardware. As I said, it's self-inflicted most of the time because steps are missed in our processes.
The documentation that we got from them was in-depth and works well when needed. If you follow the steps correctly, you will have success. If you don't follow the steps, that's when problems develop. Therefore, it's not a fault in their documentation; it's the fault of the user who's not following the proper steps. It doesn't happen often. I think we have contacted technical support only three times in the two years that we've had the product.
Which solution did I use previously and why did I switch?
For eight years prior to using Zerto, we used a product called SRM, which is part of VMware. We finally switched over to Zerto after having them come in and do a presentation for us. This was after trying for about a year to convince our vice president to allow us to migrate over to a different platform.
The reason why we used SRM was that it was built into our VMware vCenter licensing. We never had a successful DR test during the previous couple of years with SRM. By switching over to the Zerto product a year and a half ago, we were able to run a successful disaster recovery test within three months of switching over. We had our first successful disaster recovery test in two and a half years because Zerto made our life so much easier and helped us get servers over to a new location almost seamlessly.
In order to be able to have a successful disaster recovery, we need to be able to successfully migrate 58 servers from our Massachusetts location to Arizona. On previous attempts, we got about half the stuff over there, then we'd fail. In other scenarios we would get everything over there but some of the machines wouldn't come up because of the way they were configured. One time, the business was down for about half the morning because it took us that long to get the stuff back up and running using SRM. This was a real pain point for us, getting this product in place and working successfully. It took Zerto to be able to finally get us to do that. It's been a lifesaver. All we had with SRM was nothing but headaches.
How was the initial setup?
The initial setup was very straightforward. We had everything running in half an hour. It got deployed with two virtual machines (ZVMs): One got deployed in Massachusetts and another in our Arizona location. From there, we deploy appliances to each one of the hosts that's inside of the clusters that we are managing for our disaster recovery solution.
Within 30 minutes, we had it deployed to our entire production cluster and its hosts. After that, we just started creating jobs, which took quite a while because we have a lot of large servers. However, that's not the fault of the Zerto application; it comes down to the size of the VMs we have in production.
For our implementation strategy, we just mimicked what we had in place for our SRM environment. Our 58 machines are spread across different clusters: some in our DMZ, some in our prod and some in our WebSphere clusters. After that, we ran two tests to ensure that we were able to fail over to our Arizona location then fail back without any changes or modifications to the VMs. Once we did that, we started rolling out to each of the clusters, one Virtual Protection Group (VPG) at a time. I think we now have 23 VPGs total.
What about the implementation team?
We worked with an outside vendor, Daymark, which handles a lot of our work. They work with Zerto directly. When we set it up originally, we had a Zerto technician on the call as well as a Daymark technician on-site working with us.
Our experience with Daymark has been very good. We love working with them and try to use them as much as we can for our integration and infrastructure work. They are a very good company that is easy to deal with. Thanks to Rick and Matt for a great working relationship.
What was our ROI?
We have seen huge ROI.
It used to be a three-person job, and now it only takes one person to manage and run the process. The failback is the same thing. We've never had any issues with stuff coming back out of Arizona to our Massachusetts location. Within 15 to 20 minutes, we can have our servers successfully migrated back, then up and running just as they were originally without having too many conflicts or configuration issues.
The solution has helped us reduce downtime in any situation that we have come across, thus far, for disaster recovery at a 4:1 ratio.
We are an insurance company. Therefore, if we're down for an hour, it's thousands of dollars being lost. E.g., people can't pay their insurance bills, open new policies, or get the support they need for an accident.
These things have been invaluable to us:
- Not having to have so many bodies onsite during a disaster recovery drill.
- Not having to worry about multiple people dealing with the application.
- The product's reliability of always being up and running and not having any issues with it.
What's my experience with pricing, setup cost, and licensing?
It's very equitable, otherwise we wouldn't do it. It is licensed per host used. Therefore, it's very cost-efficient as far as the licensing goes. For the amount of stuff that we have configured and what we're utilizing it for, the licensing is not very expensive at all.
There is a one-time cost for maintenance and support. We have a three-year contract that we will have to renew when those three years are up. There is also licensing on top of that for whatever product you are using, depending on the host configurations.
Which other solutions did I evaluate?
Right now, we use Veritas. We will be evaluating Veeam and Rubrik as new solutions for our backups in the next quarter or so, and we may also decide to use Zerto. The three of them are in the mix right now for when we decide to switch over vendors for a better backup solution.
Zerto gives you the ability to utilize it as a backup solution, but it's not a true backup solution because it can't do file level backups. If you want a particular file off of a server, it can't do that for you. What it can do is give you the whole server, then you need to go back and pull that file off it. Mainly for that reason, we haven't chosen to use Zerto and may never use Zerto as our backup solution. The other solutions allow us to get a file level backup.
What other advice do I have?
Don't hesitate. Go out and do it now. Don't wait two years like we did. Push harder in order to be able to get the solution in place, especially since we know it will work better for you. Don't just take, "No," for an answer from senior management.
The application is phenomenal. They continually add new things, more plugins, and modifications to the way things work. It just gets better as they go.
We don't plan to use the solution for long-term retention at this time, but we are looking at going into a hybrid cloud solution in the near future. We may use long-term retention there to make a duplicate copy of everything we have in our Massachusetts data center in the cloud, whether that be an Azure or Amazon location.
While I can't really speak to whether it would allow us to do it, the application is set up to create a duplicate of the actual servers in Arizona. That's how it works so quickly. If we ever had a problem, I could always revert back from the duplicates that we have out in Arizona using the application, if necessary. Luckily, we haven't had a need for that, and hopefully never do.
I would rate this solution as a nine (out of 10).
Which deployment model are you using for this solution?
On-premises
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.