Configure Software RAID 1 in CentOS
-
@Dashrender said:
I understand the Schrodinger's cat reference and agree (mostly). But just because something says it's working doesn't mean it is; there are times when it isn't, yet nothing bad is reported to indicate as such.
Agreed, and you can do this one time to see if the process works conceptually when the device is not in production. Just attach the disk to another system and view the contents. But you can't do it for running systems, it is not a sustainable process.
But this is a case where "it says it is working" is all that you get. If you don't trust your RAID, stop using it and find one you do trust. That mechanism is doing a real test and is the best possible indicator. If the best possible isn't good enough, stop using computers. There's no alternative.
There is also a certain value to... if it is good enough for Wall St., the CIA, NASA, Canary Wharf, the military, hospitals, nuclear reactors and other high-demand scenarios, isn't it a bit silly to not trust it somewhere else?
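For anyone wondering what "it says it is working" looks like in practice on MD RAID: the kernel reports array state in /proc/mdstat. Below is a minimal Python sketch, assuming the usual layout of that file, that flags arrays with a missing member. The sample text is made up for illustration (it is not output from any real system), and `degraded_arrays` is a hypothetical helper name, not part of any tool.

```python
import re

# Illustrative sample in the general format of /proc/mdstat (invented values).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdb2[1]
      976630464 blocks super 1.2 [2/1] [_U]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose member map shows a missing disk."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # Array header lines look like "md0 : active raid1 ...".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line carries a member map such as [UU] (all present)
        # or [_U] (one member missing).
        member_map = re.search(r"\[([U_]+)\]", line)
        if current and member_map and "_" in member_map.group(1):
            degraded.append(current)
    return degraded


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real system you would pass in the contents of /proc/mdstat instead of the sample string. `mdadm --detail` on the array device reports the same state in more detail.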
-
@Dashrender said:
From what I THINK Scott is saying, the only way you could test this system is by leaving it running 100% of the time and pulling a drive out while running, and putting a new drive in also, while the system is running.
No, what I am saying is that RAID can never be tested in a live system by examining removed disks. Ever. It tells you the past, not the current or the future. So doing so puts you at risk without validating anything useful. It's a flawed concept to attempt.
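If the goal is to confirm that the mirrors match on a running array, MD RAID exposes its own consistency check through sysfs, so no disk needs to come out. A rough Python sketch, assuming an array named md0 and the standard /sys/block/<array>/md/ layout; the function names are hypothetical helpers, and the `sysfs_root` parameter exists only so the code can be tried against a scratch directory:

```python
from pathlib import Path


def request_check(array="md0", sysfs_root="/sys/block"):
    """Ask md to start a consistency check of the running array (needs root)."""
    path = Path(sysfs_root) / array / "md" / "sync_action"
    path.write_text("check\n")


def read_mismatch_count(array="md0", sysfs_root="/sys/block"):
    """Read the mismatch counter that md recorded during its last check."""
    path = Path(sysfs_root) / array / "md" / "mismatch_cnt"
    return int(path.read_text().strip())
```

The check runs in the background, and its progress shows up in /proc/mdstat. Read the counter only after the check has finished. The array stays intact and online the whole time.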
-
@Dashrender said:
Now for a question. @scottalanmiller if my above example happens and he pulls sda (which holds the boot partition), and the re mirroring (is it called resilvering in RAID 1(10)?) is complete, there still won't be a boot partition so if the server has to be rebooted it will fail, right?
Correct, boot partitions need to be handled manually.
-
So the next question - Why are you using MX RAID instead of hardware RAID?
If I have to guess, it's because this is a test box, probably an old PC that doesn't have real RAID in it, so you can't test real RAID.
Testing MX RAID does not validate hardware RAID, so this test is also moot, assuming the production box will have hardware RAID.
-
@scottalanmiller said:
@Dashrender said:
From what I THINK Scott is saying, the only way you could test this system is by leaving it running 100% of the time and pulling a drive out while running, and putting a new drive in also, while the system is running.
No, what I am saying is that RAID can never be tested in a live system by examining removed disks. Ever. It tells you the past, not the current or the future. So doing so puts you at risk without validating anything useful. It's a flawed concept to attempt.
With hardware RAID, you shut down the system and boot the system from either drive. In a software solution like the one in question, that does not appear to be the case. This is what I was getting at. I wasn't talking at all about how useful this test would or wouldn't be.
-
@Dashrender said:
With hardware RAID, you shut down the system and boot the system from either drive. In a software solution like the one in question, that does not appear to be the case. This is what I was getting at. I wasn't talking at all about how useful this test would or wouldn't be.
In hardware RAID you would break the array and cause the same problems. Sure it would boot, but it would not test what you thought you were testing and it would leave you with a broken array. There is no value to the test but a lot of risk.
-
@Dashrender said:
So the next question - Why are you using MX RAID instead of hardware RAID?
If I have to guess, it's because this is a test box, probably an old PC that doesn't have real RAID in it, so you can't test real RAID.
Testing MX RAID does not validate hardware RAID, so this test is also moot, assuming the production box will have hardware RAID.
MD RAID is completely real and very enterprise. This isn't Windows, no reason to avoid MD RAID in production.
-
@Dashrender said:
So the next question - Why are you using MX RAID instead of hardware RAID?
If I have to guess, it's because this is a test box, probably an old PC that doesn't have real RAID in it, so you can't test real RAID.
Testing MX RAID does not validate hardware RAID, so this test is also moot, assuming the production box will have hardware RAID.
Hardware RAID vs. software RAID isn't as big a deal as it used to be. The only big issues are that there is no hardware cache, your CPU will take a slight performance hit, and rebuild times may be slightly longer. None of these is a big deal.
-
@scottalanmiller said:
@Dashrender said:
With hardware RAID, you shut down the system and boot the system from either drive. In a software solution like the one in question, that does not appear to be the case. This is what I was getting at. I wasn't talking at all about how useful this test would or wouldn't be.
In hardware RAID you would break the array and cause the same problems. Sure it would boot, but it would not test what you thought you were testing and it would leave you with a broken array. There is no value to the test but a lot of risk.
What is it you think @Lakshmana is testing? Let's assume I asked this same question. The only things I would be testing are: can either disk boot back up to the previous state? Do both drives have the same data as of the time I took them offline? I'm not sure what else I would be testing. If I saw that one drive didn't have any data on it but the other did, I would know there was something wrong with the RAID system.
Now that said, I've personally NEVER tested a RAID system, hardware or software, to this degree. I just trust that it's working out of the box, and so far I've never been let down - one drive fails, I replace it, some time later another drive fails, I replace it, etc., and my server experiences no downtime.
But just because I trust the system doesn't mean everyone does. So doing this test on a system before it goes live in production (but never while in production) isn't unreasonable if the manager wants it.
-
@thecreativeone91 said:
Hardware RAID vs. software RAID isn't as big a deal as it used to be. The only big issues are that there is no hardware cache, your CPU will take a slight performance hit, and rebuild times may be slightly longer. None of these is a big deal.
Actually, rebuild times have been, on average, faster with software RAID since around 2001. The Pentium III was the first CPU where RAID typically rebuilt faster in software than in hardware, because the main CPU was just so much faster than the offload RAID processing unit.
-
@Dashrender said:
What is it you think @Lakshmana is testing?
What they are trying to do is "look at the files" to see if they replicated. But this only tells them that they DID replicate on an old array that they've now blown away in order to test this.
What you CAN do, and this is not advised, is power down the system, remove the disk, attach it to a secondary system, observe it read only, replace it and power back on.
-
@scottalanmiller said:
@Dashrender said:
So the next question - Why are you using MX RAID instead of hardware RAID?
If I have to guess, it's because this is a test box, probably an old PC that doesn't have real RAID in it, so you can't test real RAID.
Testing MX RAID does not validate hardware RAID, so this test is also moot, assuming the production box will have hardware RAID.
MD RAID is completely real and very enterprise. This isn't Windows, no reason to avoid MD RAID in production.
Well in RAID 1/10 I suppose the added load today probably isn't an issue for the processor compared to say a RAID 6. But have processors become so powerful that on SMB systems we no longer need to worry about performance drain from doing RAID 6?
-
@Dashrender said:
But just because I trust the system doesn't mean everyone does. So doing this test on a system before it goes live in production (but never while in production) isn't unreasonable if the manager wants it.
No, it is very unreasonable. Just because people lack trust doesn't mean that it is reasonable to not trust things. Literally millions of these are in use and have been for decades and work every day. Not trusting this is completely unreasonable and irrational. There are so many places to place your worries that are realistic. Spinning wheels trying to validate an irrational lack of faith in something so insanely well proven is completely unreasonable.
-
@Dashrender said:
@scottalanmiller said:
@Dashrender said:
So the next question - Why are you using MX RAID instead of hardware RAID?
If I have to guess, it's because this is a test box, probably an old PC that doesn't have real RAID in it, so you can't test real RAID.
Testing MX RAID does not validate hardware RAID, so this test is also moot, assuming the production box will have hardware RAID.
MD RAID is completely real and very enterprise. This isn't Windows, no reason to avoid MD RAID in production.
Well in RAID 1/10 I suppose the added load today probably isn't an issue for the processor compared to say a RAID 6. But have processors become so powerful that on SMB systems we no longer need to worry about performance drain from doing RAID 6?
Your load is likely less in SMB. I think a lot of the fuss has been that some admins don't understand the tools of software RAID and how to use them, as it can be seen as more complex than just going into your hardware RAID boot ROM and setting up a LUN.
Many SANs are using only software RAID.
-
@Dashrender said:
Well in RAID 1/10 I suppose the added load today probably isn't an issue for the processor compared to say a RAID 6. But have processors become so powerful that on SMB systems we no longer need to worry about performance drain from doing RAID 6?
That was in 2001!! RAID 7, which uses way more processor power than anything else, is software only! There is no need for SMB to be a factor; RAID has the same load and impact no matter what the environment size. It is the array size that makes the difference and these vary little between company sizes. You've not needed to worry about the "drain" of any RAID level for nearly a decade and a half. And fifteen years ago it was only small Windows-based systems on Intel Pentium II and lower that were an issue. Enterprise servers have always been pure software RAID, even twenty years ago.
-
@scottalanmiller said:
@Dashrender said:
Well in RAID 1/10 I suppose the added load today probably isn't an issue for the processor compared to say a RAID 6. But have processors become so powerful that on SMB systems we no longer need to worry about performance drain from doing RAID 6?
That was in 2001!! RAID 7, which uses way more processor power than anything else, is software only! There is no need for SMB to be a factor; RAID has the same load and impact no matter what the environment size. It is the array size that makes the difference and these vary little between company sizes. You've not needed to worry about the "drain" of any RAID level for nearly a decade and a half. And fifteen years ago it was only small Windows-based systems on Intel Pentium II and lower that were an issue. Enterprise servers have always been pure software RAID, even twenty years ago.
I've never worked on an enterprise system before - not in my wheelhouse, so I've never seen non-hardware RAID systems. I knew it was much less of an issue, but didn't consider it a complete non-issue.
-
@scottalanmiller said:
@Dashrender said:
But just because I trust the system doesn't mean everyone does. So doing this test on a system before it goes live in production (but never while in production) isn't unreasonable if the manager wants it.
No, it is very unreasonable. Just because people lack trust doesn't mean that it is reasonable to not trust things. Literally millions of these are in use and have been for decades and work every day. Not trusting this is completely unreasonable and irrational. There are so many places to place your worries that are realistic. Spinning wheels trying to validate an irrational lack of faith in something so insanely well proven is completely unreasonable.
What does testing this once or twice before a company goes live hurt other than setup time/tech time? I'm guessing that after seeing several of these solutions go into place the manager would probably just move on and not require it in the future - but I could be wrong.
-
@Dashrender said:
@scottalanmiller said:
@Dashrender said:
But just because I trust the system doesn't mean everyone does. So doing this test on a system before it goes live in production (but never while in production) isn't unreasonable if the manager wants it.
No, it is very unreasonable. Just because people lack trust doesn't mean that it is reasonable to not trust things. Literally millions of these are in use and have been for decades and work every day. Not trusting this is completely unreasonable and irrational. There are so many places to place your worries that are realistic. Spinning wheels trying to validate an irrational lack of faith in something so insanely well proven is completely unreasonable.
What does testing this once or twice before a company goes live hurt other than setup time/tech time? I'm guessing that after seeing several of these solutions go into place the manager would probably just move on and not require it in the future - but I could be wrong.
Are you going to test it once again after you rebuild the array you just broke? If not, what's the guarantee it's still working? It's an endless cycle, since you have to break it to check it and then you are starting the process over.
-
@thecreativeone91 said:
@Dashrender said:
@scottalanmiller said:
@Dashrender said:
But just because I trust the system doesn't mean everyone does. So doing this test on a system before it goes live in production (but never while in production) isn't unreasonable if the manager wants it.
No, it is very unreasonable. Just because people lack trust doesn't mean that it is reasonable to not trust things. Literally millions of these are in use and have been for decades and work every day. Not trusting this is completely unreasonable and irrational. There are so many places to place your worries that are realistic. Spinning wheels trying to validate an irrational lack of faith in something so insanely well proven is completely unreasonable.
What does testing this once or twice before a company goes live hurt other than setup time/tech time? I'm guessing that after seeing several of these solutions go into place the manager would probably just move on and not require it in the future - but I could be wrong.
Are you going to test it once again after you rebuild the array you just broke? If not, what's the guarantee it's still working? It's an endless cycle, since you have to break it to check it and then you are starting the process over.
Of course not, and while I see your point, it's a cyclical thing - but for someone who is unfamiliar with the system, if they want to prove it to themselves once before production, I don't see the harm. But if I were their IT team, after showing this manager 2 or 3 times (different servers) I would have another conversation with them about dropping this need, since they've been shown that the technology works and needs to be trusted on its own merit.
-
@Dashrender said:
@scottalanmiller said:
@Dashrender said:
But just because I trust the system doesn't mean everyone does. So doing this test on a system before it goes live in production (but never while in production) isn't unreasonable if the manager wants it.
No, it is very unreasonable. Just because people lack trust doesn't mean that it is reasonable to not trust things. Literally millions of these are in use and have been for decades and work every day. Not trusting this is completely unreasonable and irrational. There are so many places to place your worries that are realistic. Spinning wheels trying to validate an irrational lack of faith in something so insanely well proven is completely unreasonable.
What does testing this once or twice before a company goes live hurt other than setup time/tech time? I'm guessing that after seeing several of these solutions go into place the manager would probably just move on and not require it in the future - but I could be wrong.
As long as you are doing it ONLY before... you are only wasting time, not really hurting anything. But it is a process that cannot carry over to a live system.