Safe to have a 48TB Windows volume?
-
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@PhlipElder said in Safe to have a 48TB Windows volume?:
@jim9500 said in Safe to have a 48TB Windows volume?:
It seems like I remember Scott Miller talking about combining enterprise hardware + a SAS/SATA controller + Linux for storage, versus a proprietary hardware RAID controller.
@Donahue - Yes. I have a similar setup as an offsite backup several miles away for disaster recovery, hardware failure, etc. I know RAID != backups.
What's the air-gap to protect against an encryption event if any?
LOL. I like that term. "Encryption Event"
It implies, quite correctly, that many of those problems are not exactly malware. Many are just bad system design.
Indeed. We've "heard" of cloud vendors that have lost both their own and their tenants' environments due to an encryption event, which implies improper setup and procedures.
As far as the backup server pulling the data onto itself, one needs to make sure no credentials are saved anywhere. All it takes is one lazy tech doing so and the baddies are in. Rotating that password regularly would help stem that.
Gostev (Veeam) has a regular newsletter and mentioned that offlining the backup server, having it fire up to do its pulls and then shut itself back down once done, would be one way of maintaining an air-gap.
EDIT: Setting the "Cannot Save Credentials" policy for RDS in the local GPMC would work too.
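For illustration only, the pull-then-offline cycle described above could be sketched roughly like this. The copy loop is a simplified stand-in for real backup tooling (Veeam, rsync, etc.), and the `shutdown` command is the Windows one; paths and the overall flow are hypothetical:

```python
import shutil
import subprocess
from pathlib import Path


def pull_backup(source: Path, dest: Path) -> int:
    """Copy new or changed files from source to dest; returns files copied.

    The backup host initiates this copy itself (pull model). Nothing on
    the production network can write to dest directly, and no production
    credentials are stored on the production side.
    """
    copied = 0
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        target = dest / src_file.relative_to(source)
        # Copy only if the file is new or newer than our stored copy.
        if not target.exists() or target.stat().st_mtime < src_file.stat().st_mtime:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, target)  # copy2 preserves timestamps
            copied += 1
    return copied


def run_cycle(source: Path, dest: Path, shutdown: bool = False) -> None:
    """One backup window: wake, pull, then optionally power back off."""
    pull_backup(source, dest)
    if shutdown:
        # After the pull completes, power the box back down so it sits
        # offline (effectively air-gapped) between backup windows.
        subprocess.run(["shutdown", "/s", "/t", "0"], check=True)
```

The key design point is direction: the backup server reaches out on its own schedule, so a compromised production host has no standing path (and no saved credentials) to reach the backups.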
-
@jim9500 said in Safe to have a 48TB Windows volume?:
@PhlipElder said in Safe to have a 48TB Windows volume?:
What's the air-gap to protect against an encryption event if any?
What's the air-gap to protect against an encryption event if any?
My backup server has access to the rest of the network - but it pulls the backups to itself vs backups being pushed. The rest of the network can't directly write to it. My backups happen weekly - so my hope is that I would recognize what was happening to my live network before it was backed up.
I have been contemplating doubling my backup storage space to make sure I have enough space to store older file revisions in a ransomware situation.
Is it a backup or just a copy? If it's a backup (thinking something like Veeam here), then keeping multiple restore points on the backup server won't need double the space of two full copies; it only needs the amount of typical change between backups. Though I'd go for twice that difference, so you can take a backup, add a second, add a third, then delete the oldest, and so on. That way you'll always have two 'copies' on the backup server.
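To make the space math concrete, here is a rough back-of-envelope model of that retention scheme. The 5% weekly change rate is a made-up example figure, not anything from the thread:

```python
def retention_space(full_size_tb: float, change_rate: float, copies: int) -> float:
    """Estimate space for one full backup plus (copies - 1) extra
    restore points, where each extra point only costs the fraction of
    data that typically changes between backup runs (simplified model)."""
    return full_size_tb + (copies - 1) * full_size_tb * change_rate


# 48 TB of data, ~5% weekly change, keep three restore points so that
# after adding a new one and pruning the oldest you always have two:
space = retention_space(48, 0.05, 3)
print(space)  # 48 + 2 * 2.4 = ~52.8 TB, far less than doubling to 96 TB
```

This is why deduplicating/incremental backup targets rarely need "2x the data" for two restore points; the cost scales with churn, not with full-copy count.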
-
@scottalanmiller said in Safe to have a 48TB Windows volume?:
NTFS has improved a lot over the years. This is definitely a big volume for NTFS to handle. ZFS is better designed for volumes of this size.
You are correct, with your triple mirrored (and hot spare!) setup, it's your filesystem, not your array, that you have to worry about. You have definitely managed to shift the risk from the RAID to the FS.
This isn't insanely big, but having Windows manage storage always gives me a little moment of pause. Storage is not their strong suit and has weakened, rather than improved, in recent years. ReFS has had issues, the recent releases have had their own issues even with NTFS, and their software RAID has had big-time issues (you aren't using that here, so it's not applicable either). This is just generally an area that Microsoft struggles with and doesn't seem to see as critical, so they tend to poo-poo reliability concerns to focus on other areas.
If I were doing storage this large, I would almost certainly be using XFS on hardware RAID, given your setup. XFS is faster than NTFS and pretty much bulletproof.
I agree. At the last place I worked, we did 96TB arrays on RAID 10 with XFS.
-
@scottalanmiller I somehow missed this reply. This is the answer I was looking for. The great news is that my hardware will likely stay (almost) the same when I need to upgrade.
-
@Dashrender said in Safe to have a 48TB Windows volume?:
@jim9500 said in Safe to have a 48TB Windows volume?:
@PhlipElder said in Safe to have a 48TB Windows volume?:
What's the air-gap to protect against an encryption event if any?
What's the air-gap to protect against an encryption event if any?
My backup server has access to the rest of the network - but it pulls the backups to itself vs backups being pushed. The rest of the network can't directly write to it. My backups happen weekly - so my hope is that I would recognize what was happening to my live network before it was backed up.
I have been contemplating doubling my backup storage space to make sure I have enough space to store older file revisions in a ransomware situation.
Is it a backup or just a copy?
There isn't a difference. Backups are just decoupled copies.
-
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@DustinB3403 said in Safe to have a 48TB Windows volume?:
Doesn't ntfs have a limit of 16TB per volume?
NTFS volume limit is 256TB in older systems.
NTFS has an 8PB volume limit in modern ones.
The one caveat to NTFS volumes as far as size goes is the 64TB limit for Volume Shadow Copy (VSS) snapshots. A lot of products rely on VSS.
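Those volume-size numbers follow from NTFS's 32-bit cluster addressing, so a quick sketch of the arithmetic may help. The cluster sizes used are the long-standing 64 KB maximum and the larger 2 MB maximum added in newer Windows releases:

```python
def ntfs_max_volume_bytes(cluster_size_bytes: int) -> int:
    """NTFS addresses clusters with 32-bit numbers, so the volume
    ceiling is (2**32 - 1) clusters times the cluster size."""
    MAX_CLUSTERS = 2**32 - 1
    return MAX_CLUSTERS * cluster_size_bytes


TB = 1024**4
PB = 1024**5
print(ntfs_max_volume_bytes(64 * 1024) / TB)    # ~256 TB with 64 KB clusters
print(ntfs_max_volume_bytes(2 * 1024**2) / PB)  # ~8 PB with 2 MB clusters
```

Same filesystem, same 32-bit cluster index; the jump from 256TB to 8PB comes entirely from allowing bigger clusters.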
-
@PhlipElder said in Safe to have a 48TB Windows volume?:
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@DustinB3403 said in Safe to have a 48TB Windows volume?:
Doesn't ntfs have a limit of 16TB per volume?
NTFS volume limit is 256TB in older systems.
NTFS has an 8PB volume limit in modern ones.
The one caveat to NTFS volumes as far as size goes is the 64TB limit for Volume Shadow Copy (VSS) snapshots. A lot of products rely on VSS.
Major caveat there!
-
@jim9500 said in Safe to have a 48TB Windows volume?:
Have any of you used 48TB Windows volumes? Any resources on risk analysis vs ZFS?
I have two that are close to 60 TB. But they are REFS and hold a lot of large virtual disks.
REFS on 2019 is what I would wait for, for bare file storage.
Are you on 2019 now or looking to move off of a Windows file server?
-
@Obsolesce said in Safe to have a 48TB Windows volume?:
@jim9500 said in Safe to have a 48TB Windows volume?:
Have any of you used 48TB Windows volumes? Any resources on risk analysis vs ZFS?
I have two that are close to 60 TB. But they are REFS and hold a lot of large virtual disks.
REFS on 2019 is what I would wait for, for bare file storage.
Are you on 2019 now or looking to move off of a Windows file server?
ReFS has a bad track record. It's got a future, but has been pretty lacking and presents a bit of risk. Microsoft has had a disastrous track record with storage recently, even if ReFS is supposed to get brought to production levels with 2019, 2019 is questionably production ready. Remember... data loss is why it was pulled out of production in the first place.
-
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@Obsolesce said in Safe to have a 48TB Windows volume?:
@jim9500 said in Safe to have a 48TB Windows volume?:
Have any of you used 48TB Windows volumes? Any resources on risk analysis vs ZFS?
I have two that are close to 60 TB. But they are REFS and hold a lot of large virtual disks.
REFS on 2019 is what I would wait for, for bare file storage.
Are you on 2019 now or looking to move off of a Windows file server?
ReFS has a bad track record. It's got a future, but has been pretty lacking and presents a bit of risk. Microsoft has had a disastrous track record with storage recently, even if ReFS is supposed to get brought to production levels with 2019, 2019 is questionably production ready. Remember... data loss is why it was pulled out of production in the first place.
It's been great in my experience. Though, I'm using it in such a way that the risk is worth the benefits... replication and backup repositories. It's been 100% solid. And like I said, it's all huge files stored on it, probably not the use case where you've seen data loss result. I haven't seen that anywhere, so I'm only taking your word for it unless you have links for me to do some reading. Not dumb stuff from Tom's or whatever; reputable scenarios in correct use cases.
-
https://docs.microsoft.com/en-us/windows-server/storage/refs/refs-overview
All I need from it is g2g.
-
Running a chkdsk on a volume that size can take days.
-
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@Obsolesce said in Safe to have a 48TB Windows volume?:
@jim9500 said in Safe to have a 48TB Windows volume?:
Have any of you used 48TB Windows volumes? Any resources on risk analysis vs ZFS?
I have two that are close to 60 TB. But they are REFS and hold a lot of large virtual disks.
REFS on 2019 is what I would wait for, for bare file storage.
Are you on 2019 now or looking to move off of a Windows file server?
ReFS has a bad track record. It's got a future, but has been pretty lacking and presents a bit of risk. Microsoft has had a disastrous track record with storage recently, even if ReFS is supposed to get brought to production levels with 2019, 2019 is questionably production ready. Remember... data loss is why it was pulled out of production in the first place.
If you're talking about why 2019 (and Windows 10 1809) were pulled, that data loss had nothing to do with REFS. Additionally, REFS was removed from Windows 10 for all editions except Workstation.
-
@Dashrender said in Safe to have a 48TB Windows volume?:
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@Obsolesce said in Safe to have a 48TB Windows volume?:
@jim9500 said in Safe to have a 48TB Windows volume?:
Have any of you used 48TB Windows volumes? Any resources on risk analysis vs ZFS?
I have two that are close to 60 TB. But they are REFS and hold a lot of large virtual disks.
REFS on 2019 is what I would wait for, for bare file storage.
Are you on 2019 now or looking to move off of a Windows file server?
ReFS has a bad track record. It's got a future, but has been pretty lacking and presents a bit of risk. Microsoft has had a disastrous track record with storage recently, even if ReFS is supposed to get brought to production levels with 2019, 2019 is questionably production ready. Remember... data loss is why it was pulled out of production in the first place.
If you're talking about why 2019 (and Windows 10 1809) were pulled, that data loss had nothing to do with REFS. Additionally, REFS was removed from Windows 10 for all editions except Workstation.
I never said it did. Why would it need to be? There are issues with Microsoft and storage in general, problems with ReFS in general, and problems with 2019 in regards to storage. What more do you need to be wary of?
-
@Obsolesce said in Safe to have a 48TB Windows volume?:
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@Obsolesce said in Safe to have a 48TB Windows volume?:
@jim9500 said in Safe to have a 48TB Windows volume?:
Have any of you used 48TB Windows volumes? Any resources on risk analysis vs ZFS?
I have two that are close to 60 TB. But they are REFS and hold a lot of large virtual disks.
REFS on 2019 is what I would wait for, for bare file storage.
Are you on 2019 now or looking to move off of a Windows file server?
ReFS has a bad track record. It's got a future, but has been pretty lacking and presents a bit of risk. Microsoft has had a disastrous track record with storage recently, even if ReFS is supposed to get brought to production levels with 2019, 2019 is questionably production ready. Remember... data loss is why it was pulled out of production in the first place.
It's been great in my experience. Though, I'm using it in such a way that the risk is worth the benefits... replication and backup repositories. It's been 100% solid. And like I said, it's all huge files stored on it, probably not the use case where you've seen data loss result. I haven't seen that anywhere, so I'm only taking your word for it unless you have links for me to do some reading. Not dumb stuff from Tom's or whatever; reputable scenarios in correct use cases.
The problem with storage is that we expect durability of something like seven nines as a "minimum" for being production ready. That means that no matter how many people have "good experiences" with it, those anecdotes tell us nothing. It's the people having issues with it that matter. And ReFS lacks the stability, safety, and recoverability necessary for it to be considered production ready as a baseline by normal people.
But even systems that lose data 90% of the time work perfectly for 10% of people.
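To put numbers on the "nines" argument, here is a small sketch of the expected-incident arithmetic. The deployment counts are made up purely for illustration:

```python
def expected_failures(durability: float, deployments: int) -> float:
    """With per-deployment annual durability p, the expected number of
    deployments suffering data loss in a year is deployments * (1 - p)."""
    return deployments * (1.0 - durability)


seven_nines = 0.9999999
# At seven nines, even a million deployments should see almost no
# data-loss incidents per year:
print(expected_failures(seven_nines, 1_000_000))  # ~0.1 incidents

# The flip side of the "works for me" anecdote: a system that loses
# data for 90% of its users still works perfectly for the other 10%,
# so individual success stories prove nothing about durability.
print(expected_failures(0.10, 100))  # ~90 of 100 deployments lose data
```

This is why the failure reports, not the success anecdotes, carry the statistical signal when the target is seven nines.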
-
@Obsolesce said in Safe to have a 48TB Windows volume?:
@jim9500 said in Safe to have a 48TB Windows volume?:
Have any of you used 48TB Windows volumes? Any resources on risk analysis vs ZFS?
I have two that are close to 60 TB. But they are REFS and hold a lot of large virtual disks.
REFS on 2019 is what I would wait for, for bare file storage.
Are you on 2019 now or looking to move off of a Windows file server?
ReFS is supported for production workloads on Storage Spaces Direct and Storage Spaces. With the Server 2019 generation of ReFS, Microsoft has relented to some degree and stated that ReFS can be used on a SAN, but for archival purposes only. No workloads on SAN. Period.
There are a lot of features within ReFS that need to reach a lot deeper into the storage stack, hence the Storage Spaces/Storage Spaces Direct requirement.
-
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@Obsolesce said in Safe to have a 48TB Windows volume?:
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@Obsolesce said in Safe to have a 48TB Windows volume?:
@jim9500 said in Safe to have a 48TB Windows volume?:
Have any of you used 48TB Windows volumes? Any resources on risk analysis vs ZFS?
I have two that are close to 60 TB. But they are REFS and hold a lot of large virtual disks.
REFS on 2019 is what I would wait for, for bare file storage.
Are you on 2019 now or looking to move off of a Windows file server?
ReFS has a bad track record. It's got a future, but has been pretty lacking and presents a bit of risk. Microsoft has had a disastrous track record with storage recently, even if ReFS is supposed to get brought to production levels with 2019, 2019 is questionably production ready. Remember... data loss is why it was pulled out of production in the first place.
It's been great in my experience. Though, I'm using it in such a way that the risk is worth the benefits... replication and backup repositories. It's been 100% solid. And like I said, it's all huge files stored on it, probably not the use case where you've seen data loss result. I haven't seen that anywhere, so I'm only taking your word for it unless you have links for me to do some reading. Not dumb stuff from Tom's or whatever; reputable scenarios in correct use cases.
The problem with storage is that we expect durability of something like seven nines as a "minimum" for being production ready. That means that no matter how many people have "good experiences" with it, those anecdotes tell us nothing. It's the people having issues with it that matter. And ReFS lacks the stability, safety, and recoverability necessary for it to be considered production ready as a baseline by normal people.
But even systems that lose data 90% of the time work perfectly for 10% of people.
The problem I have with this perspective is that some of us have direct contact with folks who have had their SAN storage blow up on them, yet nothing is ever seen in public. One that does come to mind is the Australian Government's very public SAN blow-out a few years ago.
There is no solution out there that's perfect. None. Nada. Zippo. Zilch.
All solutions blow up, have failures, lose data, and outright stop working.
Thus, in my mind citing up-time, reliability, or any other such statistic is a moot point. It's essentially useless.
The reality for me is this (and maybe my perspective is coloured by the fact that I've been on so many calls over the years with the other end at their wit's end over a solution that has blown up on them): no amount of marketing fluff promoting a product as five nines or whatever has an ounce/milligram of credibility to stand on. None.
The only answer that has any value to me at this point is this: Are the backups test-restored to bare metal or a bare hypervisor? Has your hyper-scale whatever been tested to fail over without data loss?
The answer to the first question is a percentage I'm interested in and could probably guess. We all know the answer to the second question as there have been many public cloud data loss situations over the years.
[/PONTIFICATION]
-
@PhlipElder said in Safe to have a 48TB Windows volume?:
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@Obsolesce said in Safe to have a 48TB Windows volume?:
@scottalanmiller said in Safe to have a 48TB Windows volume?:
@Obsolesce said in Safe to have a 48TB Windows volume?:
@jim9500 said in Safe to have a 48TB Windows volume?:
Have any of you used 48TB Windows volumes? Any resources on risk analysis vs ZFS?
I have two that are close to 60 TB. But they are REFS and hold a lot of large virtual disks.
REFS on 2019 is what I would wait for, for bare file storage.
Are you on 2019 now or looking to move off of a Windows file server?
ReFS has a bad track record. It's got a future, but has been pretty lacking and presents a bit of risk. Microsoft has had a disastrous track record with storage recently, even if ReFS is supposed to get brought to production levels with 2019, 2019 is questionably production ready. Remember... data loss is why it was pulled out of production in the first place.
It's been great in my experience. Though, I'm using it in such a way that the risk is worth the benefits... replication and backup repositories. It's been 100% solid. And like I said, it's all huge files stored on it, probably not the use case where you've seen data loss result. I haven't seen that anywhere, so I'm only taking your word for it unless you have links for me to do some reading. Not dumb stuff from Tom's or whatever; reputable scenarios in correct use cases.
The problem with storage is that we expect durability of something like seven nines as a "minimum" for being production ready. That means that no matter how many people have "good experiences" with it, those anecdotes tell us nothing. It's the people having issues with it that matter. And ReFS lacks the stability, safety, and recoverability necessary for it to be considered production ready as a baseline by normal people.
But even systems that lose data 90% of the time work perfectly for 10% of people.
The problem I have with this perspective is that some of us have direct contact with folks who have had their SAN storage blow up on them, yet nothing is ever seen in public. One that does come to mind is the Australian Government's very public SAN blow-out a few years ago.
There is no solution out there that's perfect. None. Nada. Zippo. Zilch.
All solutions blow up, have failures, lose data, and outright stop working.
Thus, in my mind citing up-time, reliability, or any other such statistic is a moot point. It's essentially useless.
Not at all. Reliability stats are SUPER important. There's a ton of value there. When we are dealing with systems expected to have durability like this, those stats tell us a wealth of information. You can't dismiss the only data we have on reliability. It's far from useless.
-
@PhlipElder said in Safe to have a 48TB Windows volume?:
The reality for me is this (and maybe my perspective is coloured by the fact that I've been on so many calls over the years with the other end at their wit's end over a solution that has blown up on them): no amount of marketing fluff promoting a product as five nines or whatever has an ounce/milligram of credibility to stand on. None.
Agreed, but that's why knowing that stuff can't be five nines, based on the stats we've collected, is so important.
-
@PhlipElder said in Safe to have a 48TB Windows volume?:
The only answer that has any value to me at this point is this: Are the backups test-restored to bare metal or a bare hypervisor? Has your hyper-scale whatever been tested to fail over without data loss?
I think this is a terrible approach. It leads to building systems that we'd mathematically or statistically expect to fail. If this were our true thought process, we'd skip tried-and-true systems like RAID, because we'd not trust them (even with studies that show how reliable they are) just because some unethical SAN vendor made up false reliability stats and hid failures from the public to trick us. We can't allow an emotional reaction to salespeople pushing clearly false data to lead us into doing something dangerous.
There is a lot of real, non-vendor information out there in the industry. And a lot of just common sense. And some real studies on reliability that are actually based on math. We don't have to be blind or emotional. With good math, observation, elimination of marketing information, logic, and common sense, we can have a really good starting point. Are we still partially blind? Of course. But can we start from an educated point with a low level of risk? Absolutely.
Basically, just because you can still have an accident doesn't mean that you shouldn't keep wearing your seatbelt and avoid hitting pot holes.