Backup device for local or colo storage
-
I'm forking to a new thread. Will post a link shortly.
-
New topic discussing just the goals of this project.
http://mangolassi.it/topic/6453/backup-and-recovery-goals
-
@scottalanmiller said:
Wouldn't you carry off daily?
Sorry, just saw this. It's a nuisance to have to swap a tape or drive daily to do it. Our current plan is to carry off weekly.
-
@DustinB3403 said:
Cost consciousness.
Is there that much added value in doubling what we have for those "if" events?
Remember this post when you ask for a full second server to run your VM environment.
-
@DustinB3403 said:
@coliver Possibly.
The biggest bottleneck with the existing backup solution is the server performing the work, which is just constantly getting hit.
Port bonding on the new setup would reduce some cost, at the price of reducing what we can run VM-wise since those ports would be tied up.
What do you mean? You typically bond all the NICs in a VM host together and all the VMs on the host share the pipe.
Next question: do you really use 800 Mb/s (realistic throughput from 1 Gb ports) on each server at the same time?
-
I've never bonded all of the NICs as we haven't had the need for it.
In most cases we've simply allocated a specific NIC for a specific number of VMs.
-
Unless you need to leave bandwidth overhead for something, why split it?
It's just like how you always use OBR10 (one big RAID 10) unless you have a specific reason not to.
-
Why bond when I'm still only capable of pushing 1Gb/s at best?
-
@DustinB3403 said:
I've never bonded all of the NICs as we haven't had the need for it.
Aren't we seeing bottlenecks, though? Bonding is a standard best practice.
-
@DustinB3403 said:
Why bond when I'm still only capable of pushing 1Gb/s at best?
What is limiting you to 1Gb/s if not the GigE link?
-
And you bond for failover, not just speed.
-
@Dashrender said:
What do you mean? You typically bond all the NICs in a VM host together and all the VMs on the host share the pipe.
Up to four NICs.
-
The switches between all of the separate devices are the limitation, aren't they?
Plus this is all existing equipment that's set up weird. With the new equipment I can get all of this sorted.
-
Assuming the switches (possibly a new switch) understand link bonding (aggregation), they will treat the 4 lines as one.
So you have two servers on the same switch, with 4 cables going to one server and 4 cables going to the other. This would allow the servers to talk to each other at 4 Gb.
-
@Dashrender said:
Assuming the switches (possibly a new switch) understand link bonding (aggregation), they will treat the 4 lines as one.
So you have two servers on the same switch, with 4 cables going to one server and 4 cables going to the other. This would allow the servers to talk to each other at 4 Gb.
Wouldn't that really be 2.4Gb/s, not 4Gb/s, assuming you realistically only get 800Mb/s?
-
@DustinB3403 said:
@Dashrender said:
Assuming the switches (possibly a new switch) understand link bonding (aggregation), they will treat the 4 lines as one.
So you have two servers on the same switch, with 4 cables going to one server and 4 cables going to the other. This would allow the servers to talk to each other at 4 Gb.
Wouldn't that really be 2.4Gb/s, not 4Gb/s, assuming you realistically only get 800Mb/s?
LOL - yeah, but when you write it, you would write 4 Gb, because that's what the links are.
-
@DustinB3403 said:
@Dashrender said:
Assuming the switches (possibly a new switch) understand link bonding (aggregation), they will treat the 4 lines as one.
So you have two servers on the same switch, with 4 cables going to one server and 4 cables going to the other. This would allow the servers to talk to each other at 4 Gb.
Wouldn't that really be 2.4Gb/s, not 4Gb/s, assuming you realistically only get 800Mb/s?
3.2Gb/s? Math fail.
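To spell it out, a quick sketch of the arithmetic, using the ~800Mb/s-per-link figure assumed earlier in the thread:

```python
# Aggregate throughput of a 4-port bond: nominal line rate vs. the
# ~800 Mb/s realistic per-link figure assumed above in this thread.
links = 4
nominal_gbps = 1.0      # line rate of each GigE port
realistic_gbps = 0.8    # ~800 Mb/s usable per port (assumption from above)

print(f"Nominal aggregate:   {links * nominal_gbps:.1f} Gb/s")    # 4.0 Gb/s
print(f"Realistic aggregate: {links * realistic_gbps:.1f} Gb/s")  # 3.2 Gb/s

# Caveat: with typical LACP hashing, a single flow still rides one link,
# so you only see the aggregate across multiple concurrent flows.
```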
-
@scottalanmiller said:
@DustinB3403 said:
@Dashrender said:
Assuming the switches (possibly a new switch) understand link bonding (aggregation), they will treat the 4 lines as one.
So you have two servers on the same switch, with 4 cables going to one server and 4 cables going to the other. This would allow the servers to talk to each other at 4 Gb.
Wouldn't that really be 2.4Gb/s, not 4Gb/s, assuming you realistically only get 800Mb/s?
3.2Gb/s? Math fail.
Yeah math fail... sorry internet...
-
My two bits:
Definitely ditch the full-backup process and go with a continuous incremental process. StorageCraft will do continuous incremental backups of your virtual or physical systems as frequently as every 15 minutes. These are byte/sector-level files, so they're small and efficient.
Just to run the math (assuming you have 24TB of data to back up), here are three quick examples:
-
Create a weekly full backup. This produces 24TB x 4 weeks = 96TB of backup files each month. Even with good compression, you're still looking at a lot of storage and network traffic when replicating these offsite.
-
Initial full and then weekly incremental backups every Saturday. Let's assume a constant change rate of around 20%/month to keep this simple, which means that every weekly incremental would be about 5% x 24TB = 1.2TB in size. The first month would be 24TB (base full) + 1.2TB x 3 weeks, or 27.6TB. Every subsequent month would only be 4 x 1.2TB, or 4.8TB of storage. Compression would further reduce the storage requirements.
-
Now for the really slick option... since this is a continuous rate of change, we can increase the recovery points (capture incremental files more frequently) without affecting storage much. For example, each 15-minute incremental file would be approximately 24TB x (0.05/week) x (1 week / 7 days) x (1 day / 24 hours) x (1 hour / 4 backups) = about 1.8GB per incremental. The advantage here is that you store about the same amount of data as in option #2, but you have granular recovery points every 15 minutes of every day in the week. So you can select a very specific point in time to recover.
Obviously, this math is over-simplified and you should benchmark your own numbers. But even with a simplified model it should be obvious that periodic full backups are much more storage-intensive than incremental backups, and that a continuous incremental scheme provides powerful granular recovery through the sheer number of recovery points generated.
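If you want to check the arithmetic yourself, here's a rough Python sketch of the three options (the 24TB and 5%/week figures are just the assumptions above; plug in your own):

```python
# Back-of-the-envelope storage math for the three backup schemes above.
DATA_TB = 24.0            # total data to protect (assumption from this post)
WEEKLY_CHANGE = 0.05      # ~20%/month change rate, i.e. ~5%/week (assumption)

# Option 1: weekly fulls, 4 weeks retained.
option1_tb = 4 * DATA_TB                                  # 96 TB/month

# Option 2: initial full, then weekly incrementals.
weekly_inc_tb = DATA_TB * WEEKLY_CHANGE                   # 1.2 TB/week
first_month_tb = DATA_TB + 3 * weekly_inc_tb              # 27.6 TB
steady_month_tb = 4 * weekly_inc_tb                       # 4.8 TB/month

# Option 3: 15-minute incrementals -- same weekly total, finer granularity.
points_per_week = 7 * 24 * 4                              # 672 recovery points
per_point_gb = weekly_inc_tb * 1000 / points_per_week     # ~1.8 GB each

print(f"Option 1: {option1_tb:.0f} TB/month")
print(f"Option 2: {first_month_tb:.1f} TB first month, then {steady_month_tb:.1f} TB/month")
print(f"Option 3: {per_point_gb:.1f} GB per incremental, {steady_month_tb:.1f} TB/month total")
```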
The only reason I see people do periodic fulls is that their backup process rolls up these continuous incremental files into a synthetic full, which means that if corruption gets into just one recovery point, the synthetic full is now corrupt. Essentially, the periodic full backup is their way of re-basing the backup chain to keep corruption out.
(This became longer than I expected... maybe I should've made the value bigger than "2 bits")
Cheers!
-