As promised, this is the next instalment in my Veeam posts,
specifically on Remote Backup (backup to a remote site across the WAN).
This has been a learning experience even though I already had one
site doing remote backup for almost a year. The goal with this
particular project is to leverage as much goodness as possible out of
Veeam v7 for our customer. This meant learning how to make best use of
some of the features of Veeam that are either new to v7 or are things I
hadn’t really looked at in the past. This particular customer has Veeam
Enterprise Essentials, licensing that lines up with their VMware
vSphere Essentials licensing. They do NOT have the WAN Accelerator
goodness that comes with the PLUS versions so it will be interesting to
see how we do with traffic over the WAN.
The very first thing that I have learned from this project is to
ensure the underlying VMware ESXi environment is healthy (or Hyper-V if
that is your platform). An obvious statement but something that can
bite you in the ass if you are not careful. And I got bit in the ass
much to my chagrin. I had all sorts of weird Veeam issues happening
until I sorted out the underlying ESXi issues which mostly revolved
around problems with VMware Tools in the guests. All problems were
solved by upping the patch level on ESXi and refreshing VMware Tools in
the guests.
The next thing I learned was the Veeam documentation can sometimes be
a bit “thin”. I needed to use the seeding function for backups which
is NOT the same as the seeding function for replicas, at least from a
process point of view. My perusal of the docs seemed to indicate that
to seed a backup you need to copy a FULL backup (a Veeam .vbk file) into
the remote repository (created when you create the remote Veeam
agent). You then scan the remote repository in the Veeam console which
should “import” the backup into the overall Veeam config, then you can
use the “map backup” function to point at that backup as the seed data
and you are good to go. Well, yes and no …
I tried all of that and sneaker-netted my VBK file over to the remote
site, copied it into the repository then went around in 27 circles
trying to get the blasted thing to scan and import. Epic fail.
Nothing. I finally gave up and contacted my good buddy at Veeam, Matt
Price, for some advice. It seems I was doing about 75% of what I needed
to do. Matt directed me to create a net-new backup job for each of the
VMs I want backed up to the remote site because, once the data is
seeded at the remote site, that job gets repointed (remapped)
to the remote repository. That way there are no conflicts with the
existing backup job, which continues to run at the main site. He also
advised me to set the remote job as a “forward incremental” with a
weekly synthetic backup so that we end up with a full backup at the
remote site and keep the repository somewhat pruned. So that was “ah
ha!” moment number one. “Ah ha!” moment number two was when Matt pointed
out that you have to copy both the .VBK file AND the metadata file to
the remote repository in order for the rescan and import to work.
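Before kicking off the rescan, it is easy to check that both files actually made the trip. The following is only a minimal Python sketch: the function name `seed_folder_ready` is made up for illustration, and it assumes the metadata file carries a `.vbm` extension alongside the `.vbk` full backup. It does not talk to Veeam at all; it only looks at file extensions in a folder.

```python
from pathlib import Path

def seed_folder_ready(folder):
    """Return (ok, missing) for a seed folder.

    ok is True when the folder holds at least one file of each required type.
    missing lists the extensions that were not found.
    """
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    suffixes = {p.suffix.lower() for p in files}
    # .vbk = full backup file (named in the post)
    # .vbm = metadata file (extension is an assumption of this sketch)
    missing = sorted({".vbk", ".vbm"} - suffixes)
    return (not missing, missing)
```

Calling `seed_folder_ready()` with the path of the copied job folder returns `(False, ['.vbm'])` if only the VBK was copied, which is exactly the situation that sent me around in circles.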
So, I followed Matt’s instructions to a “T” and created my two new
jobs. After they ran I copied the data via sneaker-net out to the
remote repository and rescanned the repository. Voila! Things
rescanned properly and the backups imported properly. I then modified
each of the two jobs and used the “map backup” function to point to the
remote repository backup. This effectively “seeded” the backup and I
set the jobs to run. I ended up with a 50/50 status … one job ran
perfectly and the other messed up.
The job that ran perfectly was for the SBS 2011 VM and it copied
about 8GB of changed data out over a roughly 4 hour period (about par
for the 5 megabit uplink provided by Shaw). I have to play with this
backup some more in order to decide if it is going to be a daily run vs a
weekly run but I am happy with the results so far.
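As a back-of-the-envelope check on that run, here is a minimal Python sketch that turns "data moved over time" into an average link speed. It assumes decimal units (1 GB = 8,000 megabits) and ignores protocol overhead, so treat the result as approximate.

```python
def effective_mbps(gigabytes, hours):
    """Average throughput in megabits per second (decimal units)."""
    megabits = gigabytes * 8 * 1000   # 1 GB = 8,000 megabits
    seconds = hours * 3600
    return megabits / seconds

# The SBS 2011 job: about 8 GB in roughly 4 hours
print(f"{effective_mbps(8, 4):.2f} Mbps")   # prints 4.44 Mbps
```

That works out to a bit under the 5 megabit ceiling, which is why the run looks about par for the uplink.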
The job that messed up seemed to fail, once again, due to issues
with the underlying VM. The Veeam job seemed to cause issues with the
VM’s network stack although I cannot say why as the VM backs up properly
when running the local Veeam job. I killed the running job as it had
slowed to a crawl, fixed the network issues then tried re-running the
job, a couple of times, as it turns out. When I finally got networking
sorted out it seemed like I had totally messed up the remote “seed”
backup as Veeam appeared to be trying to send ALL data back out to the
remote site rather than just the data changed since the seed backup was
taken, which is what I would have expected. I have killed the whole job, deleted
the job data (in the local repository as well as the remote repository) and am
trying again from scratch. I’m hoping that I have fixed all of the
underlying problems with the VM and that this job will now work as well
as the job for the other VM.
So, there you have it. I’ll blog the results of my (hopefully) fixed second VM job as things progress.
I’ve managed to get a lot farther with the backup process. All of
the issues with the second VM have been fixed and the remote backup of
that VM has been successful. Throughput for it is the same as that for
the first VM so I’m pretty happy overall.
A few things learned along the way (above and beyond what was listed in the last update):
- The load placed on the WAN connections can be pretty noticeable. In
this case, the remote site is at an owner’s house and he definitely
knows when the backup is running as his Internet connection (from
machines in the house) becomes pretty “pokey”. I’m assuming it is much
the same at the source end of the backup. It might pay to have
multiple ISP feeds to support remote backup if you are planning to move
any amount of data. At the same time, it might also pay to have a
separate backup network established at the source site in order to keep
performance disruptions to a minimum on the production LAN.
- Veeam can gracefully recover from a failed backup run, even on first
run of a “seeded” backup. My backup today was disrupted at the target
end, probably when house users “messed” with things on the house LAN
during backup. The “retry” feature of the backup worked well,
thankfully.
- 5 Mbps as an upload speed doesn’t cut it anymore (are you listening
Shaw, Bell, Telus and all the other Canadian ISPs?). We need better
upload speeds, especially for this kind of application. The upload speed
will be the limiting factor in any remote backup/replication scenario.
While Veeam has all sorts of tricks to help “slim” the datastream, the
bottom line is there is only so much data that can be crammed through a
tiny pipe over a given time period. I am truly envious of the speeds my
American cousins usually enjoy on their connections. So, the extra
cost of Veeam Enterprise Plus, with its WAN acceleration, will probably be worth it for any of us
that have slowish connections and/or large amounts of backup/replication
traffic.
That’s it for now. I’m going to twiddle and tune the backup and will
post another update once I have the schedule and such figured out.
I’ve had this remote backup in place now for about a month and a half
and the results have definitely been “mixed”. In fact, I have shut down
the remote backup at this point pending moving the customer to a trial
of the Enterprise Plus product that enables the WAN acceleration feature
of Veeam.
I guess I should back up a bit …
The customer has standard Shaw Business service in the office and the
same service at the owner’s house (remote backup site). Upload speeds
on these services are 5 Mbps max (burst) with, probably, closer to 4 Mbps
sustained being the norm. So, in practice, this really means that the
most that could be transferred out over the WAN connections is about 2
GB per hour (the 8 GB in roughly 4 hours I saw earlier is right in line with that). This is not a large amount in terms of backup, even if
Veeam is doing its dedupe and compression “thing” to make the data
transfer as small as possible.
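A quick way to check the hourly figure for any link speed is sketched below in Python. It assumes decimal units (1 GB = 1,000 MB), a link that is fully available to the backup, and no protocol overhead, so real numbers will come in somewhat lower.

```python
def gb_per_hour(mbps):
    """Convert a link speed in megabits/s to decimal gigabytes per hour."""
    megabytes_per_second = mbps / 8      # 8 bits per byte
    return megabytes_per_second * 3600 / 1000

for speed in (4, 5):
    print(f"{speed} Mbps -> {gb_per_hour(speed):.2f} GB/hour")
# 4 Mbps -> 1.80 GB/hour
# 5 Mbps -> 2.25 GB/hour
```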
I have seen some surprisingly large transfers take place on both the
SBS server VM as well as the LOB app VM and the size does not
necessarily correlate with what we have seen with data updates on the
servers. Apparently, other factors can also affect the actual “changed
blocks” that Veeam tracks on a VM including A/V scanning activity
(!!!). The net result is we have had a lot of job failures due to
timeouts or other network issues related to the sheer volume of data
being sent out over what amounts to a very small pipe.
This is by no means a slam against Veeam; rather, it is a cautionary
reminder that there is a lot more going on “under the covers” than you
might think and it highlights the reason why Veeam makes such a big deal
about the new WAN acceleration feature. I’m hoping that it will allow
me to overcome the inherent limitations of the “thin” Internet
connection. It also serves to highlight an issue with the whole concept
of “Cloud backup” (meaning backup over the Internet to wherever) in that
you need bandwidth to make it all work. Regardless of the
service or the technology used, in the end it all comes down to how much
data you can squirt out over your WAN links in a given timeframe. The
“tricks” used by programs/services like Veeam to thin down the
data stream are all well and good but if it comes down to you having to
move 20GB of data in a specific timeframe and you can only move 10,
well, you have a problem.
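The "20 versus 10" problem can be put into numbers with a small Python sketch. The function names and the 6-hour window are hypothetical, chosen only for illustration. The sketch assumes decimal units and a link dedicated to the backup for the whole window.

```python
def transferable_gb(mbps, window_hours):
    """Decimal gigabytes a link can move in the given window."""
    return mbps / 8 * 3600 * window_hours / 1000

def required_reduction(data_gb, mbps, window_hours):
    """Factor by which the data must shrink to fit the window.

    A result of 1.0 or less means the data already fits.
    """
    return data_gb / transferable_gb(mbps, window_hours)

# Hypothetical case: 20 GB of changes, 4 Mbps sustained, 6-hour window
capacity = transferable_gb(4, 6)
factor = required_reduction(20, 4, 6)
print(f"{capacity:.1f} GB fits; need {factor:.2f}x reduction")
# prints: 10.8 GB fits; need 1.85x reduction
```

If dedupe, compression, and WAN acceleration together cannot reliably deliver that reduction factor, the job will not finish inside the window no matter how it is scheduled.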
Anyway, I’m going to bump them up to Enterprise Plus and we’ll see how it all goes. Stay tuned for the next update!
After installing Veeam Enterprise
Plus and enabling the WAN Acceleration feature the size of the backups
did start to drop but not enough to overcome the limitations of the
single Shaw “pipes” at either end of the installation. My customer was
seeing Internet performance “crawl” at each location while backups were
running and, because we couldn’t get backups to “thin out” enough to run
to completion within the defined backup windows, we decided to shut down
the remote backups for now. This was also the case at the second
installation where we had hoped to do replication to a remote DR site;
the single Shaw pipes at each end were just not sufficient to handle the
traffic generated by Veeam as well as all of the other production
traffic.
I want to state emphatically that this is not a failure of Veeam, not
in any way, shape or form. Rather, it is a stark reminder that
“inexpensive” low-bandwidth pipes can’t hope to meet the growing data
transmission needs of many organizations. Bandwidth “rules” and it will
make or break a project like the ones I’ve been attempting.
We are going to look into the possibility of installing additional
Shaw pipes to handle only the Veeam traffic as the process itself works
and there is massive value in having backup data automatically “live” in
multiple locations. And we are going to pray to the network gods that
Shaw will actually roll out the long-promised upgrade in bandwidth and
upload speeds that we have all been waiting on for oh so long.
If you have been following my blog over
the last little while you will know that I have faced a few challenges
with Veeam in terms of setting up remote jobs (backups across the WAN).
Well, I think I finally have some good news.
I was originally trying to do full Veeam backups of ESXi VMs across
the WAN, meaning I was targeting a “normal” Veeam backup job directly at
the remote site. I had managed to do this with a particular customer’s VM under
Veeam 6.5 and I thought the same process would work for others but I
kept hitting roadblocks. At first I thought it was something to do with
changes in Veeam 7.0 but it turns out that it was far more
complicated than that. As I indicated in earlier posts, the customer
that I was really having problems with had fairly large VMs with a lot
of changing daily data, at least in one VM. Trying to back them up in a
“normal” fashion across the WAN usually failed miserably and yet things
seemed to work okay with the other customer. It turns out that, 1) I was
labouring under a misconception about the best way to perform Veeam
backups across the WAN and, 2) it was just sheer dumb luck that backups
for the first customer (the one started under 6.5) actually worked!
Now that I have jumped through many more hoops at both customers, as well as sorted through issues with Veeam support, it seems that I may have some answers.
First and foremost, if you want to perform Veeam backups across the WAN you should be using Backup Copy Jobs
as an adjunct to your “on LAN” Veeam job and NOT try to target a direct
job out over the WAN (unless, of course, you have massive big pipes).
The reason for this is that the Backup Copy Job is actually a backup of
an existing backup and not a backup of
a running VM. It copies the last completed
backup of a VM, runs between the Veeam server and the Veeam
repositories, and does not touch the running VM. The end result is a
fully intact and verified copy (hence the Backup Copy moniker)
of your backup in your remote repository. This is particularly nice as
it puts no load on your production VM or the hypervisor host, which
matters if you have a busy VM that gets impacted
when a snapshot is taken or removed during busy operational periods.
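To make the "backup of a backup" idea concrete, here is a purely conceptual Python sketch of picking the last completed backup of a VM as the thing to copy. The `RestorePoint` record and `latest_completed` function are hypothetical and exist only for this illustration. This is not Veeam's API or its actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

@dataclass
class RestorePoint:          # hypothetical record, not a Veeam type
    vm: str
    finished: datetime
    completed: bool          # False for a failed or still-running backup

def latest_completed(points: List[RestorePoint], vm: str) -> Optional[RestorePoint]:
    """Pick the newest completed backup of a VM, or None if there is none."""
    done = [p for p in points if p.vm == vm and p.completed]
    return max(done, key=lambda p: p.finished, default=None)
```

The point of the sketch is that nothing in the selection touches the VM itself: the input is a list of backups already sitting in the local repository, so the running workload never sees a snapshot from the copy process.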
Secondly, if you are serious about all of this then you want
to upgrade your Veeam license to Enterprise Plus so that you get the
WAN accelerator feature. Backup Copy Jobs are currently the only
jobs that actually use the WAN Accelerator and from what I’ve seen it
can make a very large difference in the amount of data that is actually
transferred across the WAN. I originally thought the WAN Accelerator
worked with all types of Veeam jobs, including Replication, but that is
incorrect. I have been told that Veeam is considering how the WAN
Accelerator feature might be enabled for other jobs but, for now, only
Backup Copy Jobs leverage the capability.
I now have Backup Copy Jobs running happily, and the remaining hair
on my head is safe from being torn out any further. My thanks to
Michael Strine at Veeam Support for helping to get me pointed in the
right direction and for patiently working through the whole process with
me.
Disclaimer: My company is partners with several vendors including Veeam.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Great article. We have been considering Veeam and wondered about the WAN accelerator. We were also told that Veeam doesn't really do Grandfather-father-son backups - don't know if that is fact or not. I like the idea of the SureBackup and the Virtual Lab. I also like the idea of instant recovery. Any thoughts?
We are currently using VMware 5.1 and Dell AppAssure. It seems to work OK but at times seems to take a lot of resources. When replication across the WAN is "caught up" it works really well. If anything forces a base image of a server, it can be weeks before it gets back on track.
Thanks for your time.
Steve