Syncing massive amounts of changing data to BackBlaze B2 via Linux
-
@DustinB3403 How much of each file is really changing? Can you use de-dup to drastically reduce the amount of data being transferred?
-
@JasGot said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@DustinB3403 How much of each file is really changing? Can you use de-dup to drastically reduce the amount of data being transferred?
Since each source file is unique, not much, I would suspect. This isn't for a typical file share with basic documents.
-
Of course I could use the native sync interface; I just don't know how performant that's going to be with these types of files at these sizes.
-
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
There are multiple ways of doing this with tools like Duplicati, CloudBerry, etc. (the integrations page goes on forever).
In any scenario, if you had a high-turnover SMB share with large files (some of which might be 10GB+ individual files) and multiple terabytes' worth of change in a week, how would you go about getting the data to B2?
Down is the other half of the battle, which can be discussed afterwards.
Using a command-line tool like rsync is one option, although I'm not sure how effective it would be over a long duration.
To me it does sound like a cloud backup solution is the wrong solution for that use case.
Have you done the math? Multiple TBs each week, say 5 TB per week: that is about 700 GB per day, 30 GB per hour, 500 MB per minute, or 8 MB per second. So you need an average of 80 Megabit per sec 24/7 to upload that amount of data.
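For anyone who wants to redo that arithmetic with their own weekly volume, here is a minimal sketch. The function name `required_rates` is made up for illustration, and it assumes decimal units (1 TB = 1000 GB) and a perfectly even upload around the clock, which a real link will not give you.

```python
# Rough sustained upload rate needed for a given weekly data volume.
# Assumptions (not from the thread): decimal units (1 TB = 1000 GB,
# 1 GB = 1000 MB) and an upload spread perfectly evenly over 24/7.

def required_rates(tb_per_week: float) -> dict[str, float]:
    """Break a weekly volume down into per-day, per-hour, ... rates."""
    gb_per_day = tb_per_week * 1000 / 7
    gb_per_hour = gb_per_day / 24
    mb_per_minute = gb_per_hour * 1000 / 60
    mb_per_second = mb_per_minute / 60
    # 8 bits per byte; protocol overhead is deliberately ignored here.
    mbit_per_second = mb_per_second * 8
    return {
        "GB/day": gb_per_day,
        "GB/hour": gb_per_hour,
        "MB/minute": mb_per_minute,
        "MB/second": mb_per_second,
        "Mbit/second": mbit_per_second,
    }

if __name__ == "__main__":
    for unit, value in required_rates(5).items():
        print(f"{unit}: {value:.1f}")
```

For 5 TB per week this works out to roughly 714 GB per day and about 66 Mbit/s of raw payload, so the 80 Megabit figure leaves some headroom for overhead and uneven throughput.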
-
@Pete-S said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
There are multiple ways of doing this with tools like Duplicati, CloudBerry, etc. (the integrations page goes on forever).
In any scenario, if you had a high-turnover SMB share with large files (some of which might be 10GB+ individual files) and multiple terabytes' worth of change in a week, how would you go about getting the data to B2?
Down is the other half of the battle, which can be discussed afterwards.
Using a command-line tool like rsync is one option, although I'm not sure how effective it would be over a long duration.
To me it does sound like a cloud backup solution is the wrong solution for that use case.
Have you done the math? Multiple TBs each week, say 5 TB per week: that is about 700 GB per day, 30 GB per hour, 500 MB per minute, or 8 MB per second. So you need an average of 80 Megabit per sec 24/7 to upload that amount of data.
Bandwidth isn't an issue; the goal is to offload the data once the working files are collected and simply store them in a safe, relatively low-cost space without having to build something.
I too understand that onsite backup would be great, but it is unrealistic to build, as the cost of the storage alone would be far too high.
-
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@Pete-S said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
There are multiple ways of doing this with tools like Duplicati, CloudBerry, etc. (the integrations page goes on forever).
In any scenario, if you had a high-turnover SMB share with large files (some of which might be 10GB+ individual files) and multiple terabytes' worth of change in a week, how would you go about getting the data to B2?
Down is the other half of the battle, which can be discussed afterwards.
Using a command-line tool like rsync is one option, although I'm not sure how effective it would be over a long duration.
To me it does sound like a cloud backup solution is the wrong solution for that use case.
Have you done the math? Multiple TBs each week, say 5 TB per week: that is about 700 GB per day, 30 GB per hour, 500 MB per minute, or 8 MB per second. So you need an average of 80 Megabit per sec 24/7 to upload that amount of data.
Bandwidth isn't an issue; the goal is to offload the data once the working files are collected and simply store them in a safe, relatively low-cost space without having to build something.
I too understand that onsite backup would be great, but it is unrealistic to build, as the cost of the storage alone would be far too high.
I do understand what you're saying but I do think bandwidth is an issue. You might have the bandwidth but do you have that bandwidth consistently 24/7 all the way to Backblaze servers?
-
@Pete-S 1Gbe symmetric 24/7
-
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@Pete-S 1Gbe symmetric 24/7
So when you upload to Backblaze you get 1Gbit/s?
-
@Pete-S said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@Pete-S 1Gbe symmetric 24/7
So when you upload to Backblaze you get 1Gbit/s?
I haven't specifically checked, but when we get to L3 we do have 1GbE.
-
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@Pete-S said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@Pete-S 1Gbe symmetric 24/7
So when you upload to Backblaze you get 1Gbit/s?
I haven't specifically checked, but when we get to L3 we do have 1GbE.
You could do a simple test here:
https://www.backblaze.com/speedtest/
I'm not sure it will tell the complete story though. I understand that Backblaze only has one datacenter, in Sacramento, California. I don't know how many hops away that is for you. Any congestion, traffic shaping, etc. on the way will lower your bandwidth.
-
@Pete-S At my workstation I'm getting 225Mbit/s down and 155Mbit/s up (clearly not symmetrical there. . .) but not bad either considering I have nothing special configured for my workstation.
On a second test I noticed this: "A connection of 152.8 Mbps upload would backup 1,650 GB in a day"
So this very well could be feasible to do.
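As a sanity check on that speed test number, here is a small sketch of the conversion from link speed to daily volume, plus how long a week's worth of data would take at that rate. The function names are hypothetical, and it assumes decimal units and that the measured speed holds steady for the whole upload, which is an optimistic assumption.

```python
# Convert a measured upload speed into daily volume and upload duration.
# Assumptions (not from the thread): decimal units and a constant rate.

SECONDS_PER_DAY = 86_400

def gb_per_day(mbit_per_second: float) -> float:
    """GB uploaded in 24 hours at a constant rate in Mbit/s."""
    mb_per_second = mbit_per_second / 8  # 8 bits per byte
    return mb_per_second * SECONDS_PER_DAY / 1000

def days_to_upload(volume_tb: float, mbit_per_second: float) -> float:
    """Days needed to upload volume_tb terabytes at a constant rate."""
    return volume_tb * 1000 / gb_per_day(mbit_per_second)

if __name__ == "__main__":
    speed = 152.8  # Mbit/s, the upload figure from the speed test
    print(f"{gb_per_day(speed):.0f} GB per day")
    print(f"{days_to_upload(5, speed):.1f} days for 5 TB")
```

At 152.8 Mbit/s this gives about 1,650 GB per day, matching the speed test page, and roughly 3 days for the 5 TB per week used as an example earlier in the thread.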
-
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@Pete-S At my workstation I'm getting 225Mbit/s down and 155Mbit/s up (clearly not symmetrical there. . .) but not bad either considering I have nothing special configured for my workstation.
On a second test I noticed this: "A connection of 152.8 Mbps upload would backup 1,650 GB in a day"
So this very well could be feasible to do.
What you get is totally dependent upon so many factors, and you know you can't control those factors over the internet.
-
@Dashrender said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@Pete-S At my workstation I'm getting 225Mbit/s down and 155Mbit/s up (clearly not symmetrical there. . .) but not bad either considering I have nothing special configured for my workstation.
On a second test I noticed this: "A connection of 152.8 Mbps upload would backup 1,650 GB in a day"
So this very well could be feasible to do.
What you get is totally dependent upon so many factors, and you know you can't control those factors over the internet.
I understand that, but those speeds meet or exceed what would be created within a week. If the backup process took 2-3 days to complete, that would be fine.
-
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@Dashrender said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@Pete-S At my workstation I'm getting 225Mbit/s down and 155Mbit/s up (clearly not symmetrical there. . .) but not bad either considering I have nothing special configured for my workstation.
On a second test I noticed this: "A connection of 152.8 Mbps upload would backup 1,650 GB in a day"
So this very well could be feasible to do.
What you get is totally dependent upon so many factors, and you know you can't control those factors over the internet.
I understand that, but those speeds meet or exceed what would be created within a week. If the backup process took 2-3 days to complete, that would be fine.
If you already have B2, the best thing you could do, I think, is run it for a week and see how far it gets.
-
@DustinB3403 said in Syncing massive amounts of changing data to BackBlaze B2 via Linux:
@Pete-S At my workstation I'm getting 225Mbit/s down and 155Mbit/s up (clearly not symmetrical there. . .) but not bad either considering I have nothing special configured for my workstation.
On a second test I noticed this: "A connection of 152.8 Mbps upload would backup 1,650 GB in a day"
So this very well could be feasible to do.
Yes, that's not too bad. It could work. As @dafyre and others mentioned, you should give it a try.
$0.005 per GB is $5 per TB. So get an account and upload 2TB of random data to see how long it takes. It's only going to cost you 10 bucks to find out.
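To put numbers on that test, here is a quick sketch of what the 2TB trial would cost and roughly how long it would run. The price is the figure quoted in this thread and may not match current pricing; the function names are made up, and the upload speed is the 152.8 Mbit/s speed test result, assumed constant.

```python
# Estimate cost and duration of a trial upload.
# Assumptions: the per-GB price is the one quoted in the thread (it may
# be out of date), units are decimal, and the upload rate is constant.

PRICE_PER_GB_USD = 0.005  # figure quoted in the thread

def storage_cost_usd(size_tb: float) -> float:
    """Cost of storing size_tb terabytes at the quoted per-GB price."""
    return size_tb * 1000 * PRICE_PER_GB_USD

def upload_hours(size_tb: float, mbit_per_second: float) -> float:
    """Hours needed to upload size_tb terabytes at a constant rate."""
    megabits = size_tb * 1_000_000 * 8  # TB -> MB -> Mbit
    return megabits / mbit_per_second / 3600

if __name__ == "__main__":
    print(f"Cost: ${storage_cost_usd(2):.2f}")
    print(f"Upload time: {upload_hours(2, 152.8):.1f} hours")
```

With those assumptions the 2TB test costs about $10 and takes a bit over 29 hours, so the result would be known within a couple of days.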