Hot Swap SSD for HDD in a RAID 1 array
-
Doing the drive swap on a RAID 1 is not too bad, since the "stress" will all be on the SSDs, which don't really stress. This isn't a parity array, so there is no parity rebuild to stress anything. And since the existing drives are healthy, a single drive copy is not a big deal. It's no worse than running a backup.
-
@Dashrender said:
For those that don't like this idea - do you also say the same thing about Drobo? They even recommend this process when you need to increase your storage with new drives.
That's extremely different. Doing this like Paul wants to do on a RAID 1 going from Winchesters to SSDs is not a problem. Doing it on a Drobo is horrible. Drobo recommending that process is one of the things we point to as a major problem with their "recommending bad things to look good."
It's no different than how they "recommend" mismatched drives. It kills the performance and lowers reliability. They act like allowing bad practices is a "feature" and ignore that every RAID device ever allowed the same things and it is only good practice that tells people not to do it. Drobo isn't a bad product, but their advice as to how to use their products is based around recommending reckless, crazy behaviour in order to make people who believe that black boxes are magic think that somehow a Drobo is not subject to the same stresses as other RAID arrays. Spoiler alert: it is.
On a Drobo, doing a drive size increase with RAID 5 or RAID 6 is just terrible. I mean really, really terrible. If you have a B800i and wanted to move from 4TB to 6TB drives for example, you are breaking a RAID 6 array (at best, RAID 5 at worst) and doing a 24TB resilver.... eight times!! That's 192TB failure domain using SATA drives. You are looking at likely months of rebuilding, during which you are either on RAID 5 or RAID 0, under extreme stress with the box essentially offline during the process.
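The arithmetic behind that 192TB figure can be sketched roughly as below. The bay count and the 24TB per resilver come from the example above; the rebuild rate is a made-up assumption purely for illustration, since real rates vary a lot with hardware and load.

```python
# Rough sketch of the resilver arithmetic for the B800i example above.
BAYS = 8            # drives swapped one at a time
RESILVER_TB = 24    # data resilvered per swap, per the example above


def total_resilver_tb(bays=BAYS, per_swap_tb=RESILVER_TB):
    """Total data rewritten when every drive is swapped in turn."""
    return bays * per_swap_tb


def rebuild_days(total_tb, rate_mb_s):
    """Days needed to move total_tb at a sustained rate (decimal units)."""
    seconds = total_tb * 1_000_000 / rate_mb_s
    return seconds / 86_400


total = total_resilver_tb()
print(f"Total resilvered: {total} TB")

# ASSUMPTION: 30 MB/s is a hypothetical sustained rebuild rate, not a
# measured Drobo figure. Substitute whatever your own box achieves.
print(f"At 30 MB/s: about {rebuild_days(total, 30):.0f} days")
```

Even a much more optimistic rate still leaves the array degraded for weeks across the eight swaps.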
-
@scottalanmiller Ditto Synology
-
@MattSpeller said:
@scottalanmiller Ditto Synology
I don't think that they push it in the same way. They say that they can do it, but I've never seen them get all weird about pushing it as a reason to choose them.
-
@scottalanmiller They pimp their SHR raid-5 pretty hard. It and RAID-5 are the default comparison on their site, though you can change it.
-
SHR is fine. It's a virtualization layer on top of the RAID (well, below it technically), same as Drobo. That bit is good. It's Drobo pushing people to use mismatched drives, swap them to grow the system, and so on that is so bad.
-
@MattSpeller said:
@scottalanmiller They pimp their SHR raid-5 pretty hard. It and RAID-5 are the default comparison on their site, though you can change it.
What does @scottalanmiller always preach about taking advice from someone trying to sell you something?
-
@travisdh1 said:
What does @scottalanmiller always preach about taking advice from someone trying to sell you something?
OK I'll take the hit on that one.
But I still think it's OK to do what the OP wanted as a way to move to SSDs.
-
@Dashrender said:
@travisdh1 said:
What does @scottalanmiller always preach about taking advice from someone trying to sell you something?
OK I'll take the hit on that one.
But I still think it's OK to do what the OP wanted as a way to move to SSDs.
Yes, Paul's way of doing it is really not a problem.
-
@scottalanmiller Interesting. It has 4GB at the moment. I certainly could up the memory pretty easily. I have a second DC, so I could just take it down, add more RAM and reboot in about 10 minutes. How much do you think?
Here's the stats:
Disk Throughput: 12.00 MB/s
Average IO Size: Read 40.82 KB / Write 24.63 KB
IOPS: 4 at 95%
Average Latency: 5 ms reads / 4 ms writes
Read/Write Ratio: 86% / 14%
Average Queue Depth: 0.11
Total Local Capacity: 68.00 GB
Free Local Capacity: 36.00 GB (53%)
Used Local Capacity: 32.00 GB (47%)
Peak/Min CPU: 10% / 0%
Peak/Min Memory: 2.49 GB / 0.91 GB
Peak/Min Memory In Use: 3.28 GB / 1.70 GB
-
@pchiodo Where is the IOPS number? I only see a 4. I think that it got cut off.
-
@scottalanmiller Here is a link to the stats - It's the total read latency that's raised this issue in the first place.
EDIT: So, if it's memory that's causing this in the first place, I should just be able to increase RAM, and not have to swap drives, correct?
-
@pchiodo said:
@scottalanmiller Here is a link to the stats - It's the total read latency that's raised this issue in the first place.
You actually have an IOPS at 95th Percentile of.... 4. Not 4,000.... just 4. This is the lowest IOPS I've ever heard of. Literally.
That latency is because your disks are likely spinning down and going to sleep.
Typically you move to SSD because you have pushed your disks beyond the IOPS that they can do. That's not an issue here. Each drive can deliver 100 - 200 IOPS easily with disk cache pushing bursts much higher.
Don't go to SSD, it wouldn't do anything for that system. It's idle and running entirely from memory as it is. The disk latency might be "high", but it is never going to disk.
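To put numbers on how idle that is, here is a quick sketch comparing the measured 95th percentile IOPS against the 100 - 200 IOPS range quoted above. It conservatively counts only one drive's worth of capacity, which is an assumption made for simplicity.

```python
# Compare the measured IOPS against what a single spinning drive can deliver.
MEASURED_IOPS_P95 = 4          # from the posted stats
PER_DRIVE_IOPS = (100, 200)    # range quoted above for each drive


def utilization(measured, capacity):
    """Fraction of the available IOPS actually being used."""
    return measured / capacity


for cap in PER_DRIVE_IOPS:
    used = utilization(MEASURED_IOPS_P95, cap)
    print(f"Against a {cap} IOPS drive: {used:.0%} used")
```

At a few percent of what the existing disks can do, there is nothing for an SSD to speed up.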
-
@pchiodo said:
EDIT: So, if it's memory that's causing this in the first place, I should just be able to increase RAM, and not have to swap drives, correct?
Doesn't look like there is an issue there, either. You are never hitting the disks.
-
@pchiodo said:
@travisdh1 - We have a pretty heavy reliance on the DC, more notably the DNS due to some in house app dependencies. Our performance results from recent DPACK testing showed that this DC would benefit from SSDs vs HDDs.
Are you feeling a lag in AD and DNS? What is pushing you to improve the performance of this system? The DPACK doesn't provide performance information, it provides capacity planning information only. The DPACK shows that the current system is underutilized, not taxed at all. The end users should be providing the feedback as to whether the system is performing too slowly. Are they complaining?
-
@scottalanmiller We occasionally have issues on the shop floor with relatively slow response (4-5 seconds vs. sub-second) and when we track it down, it always points to name resolution as the hang-up. We have a shop floor app that uses hostnames, and seems to hit the DNS quite frequently. Like I said, on occasion, this will appear to slow down, and cause these intermittent lags.
-
@pchiodo said:
@scottalanmiller We occasionally have issues on the shop floor with relatively slow response (4-5 seconds vs. sub-second) and when we track it down, it always points to name resolution as the hang-up. We have a shop floor app that uses hostnames, and seems to hit the DNS quite frequently. Like I said, on occasion, this will appear to slow down, and cause these intermittent lags.
DNS resolution of internal machines or DNS resolution of external domains? Even with 100,000 internal machines, a DNS table should be in memory and essentially instant. If DNS is a delay, likely something is wrong that needs to be fixed, not that the system isn't fast enough.
-
@scottalanmiller Well, I guess I will have to dig further. I don't have any glaring DNS errors, and everything else works fine. Might have to point at the programming people and dig through that. The DBMS, and the Application server for the Shop floor app run exceptionally well, and we have never had an issue with performance on those. But, I suppose this could be a client side issue.
-
@pchiodo could be client side, could be hitting a DNS entry that is missing, could be going to something external that has a delay while it waits for remote machines to respond.
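One way to narrow those possibilities down is to time lookups from an affected client. This is only a minimal sketch: it uses the operating system's resolver through Python's standard `socket` module, so it measures what the client sees (including any local cache or hosts file), not the DNS server alone. The `HOSTNAMES` list is a placeholder; put in the names the shop floor app actually uses.

```python
import socket
import time

# PLACEHOLDER: replace with the hostnames the shop floor app resolves.
HOSTNAMES = ["localhost"]


def time_lookup(hostname):
    """Return (seconds, addresses) for one lookup; addresses is None on failure."""
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(hostname, None)
        addresses = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        # Name did not resolve (missing entry, unreachable server, etc.)
        addresses = None
    return time.perf_counter() - start, addresses


def report(hostnames):
    """Print how long each name took and what it resolved to."""
    for name in hostnames:
        elapsed, addresses = time_lookup(name)
        result = ", ".join(addresses) if addresses else "FAILED"
        print(f"{name}: {elapsed * 1000:.1f} ms -> {result}")


report(HOSTNAMES)
```

A name that fails or takes seconds while the others come back in milliseconds points at a missing or external entry rather than at the DC's disks.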
-
pchiodo - Is your app only looking for local DNS entries, or is it looking for internet-based entries as well?