Server and Storage Redundancy
-
Cross posting for someone that I've been helping out:
User, new to virtualization, is looking to set up a new environment and wants to ensure redundancy. Further details not yet provided. The proposed solution was two servers, two SAN switches and two HP MSA SAN devices, all redundant. A 2-2-2 design. Each layer redundant, three layers.
-
Well you have a few issues here. The first is that you have not identified why a company with only one server would need redundancy. Sure, you might need that, but the chances are extremely low. Only about 1-5% of the SMB market has need for redundancy on their servers, HA. You should start with this analysis. Don't assume that you are a special case, assume that you are not. SMBs just can't cost justify this in nearly all cases. Everyone thinks that they are that special case, but obviously, most are not.
-
You have two servers (at most) here. So SAN or any external storage is completely out of the question. You can't consider that until you have at least four servers and typically would not consider it until you were closer to a dozen or more. At three or fewer, it simply isn't a rational option.
http://www.smbitjournal.com/2013/06/when-to-consider-a-san/
So remove SAN, NAS or any other external storage from your thinking. It has no place here. Your idea of having two SANs is correct if you had the dozen or more physical host servers to attach to them. That would make sense. So good job on that part, most people miss that. But SAN itself isn't an option at this size.
-
You should have no switches. Even if you needed external storage, which is totally out of the question here, it would be DAS and DAS doesn't use switches.
http://www.smbitjournal.com/2012/08/choosing-a-storage-type/
So we can totally eliminate the switches and the SAN or external storage from this discussion.
-
So you are left with one or two servers, that's all of the hardware that we need to discuss. Remember solution elegance is key and money always matters. We've already trimmed your budget by at least 60%, possibly far more.
So the first thing that you have to decide is if you need high availability. Start by thinking of what standard availability means... the servers themselves deliver pretty close to five nines of availability (assuming high end servers like HPE ProLiant and Dell PowerEdge.) So do you need dramatically more uptime than five nines? Do the math, likely you do not. For nearly all SMBs and even many larger businesses, a single server (mainframe architecture) and a good backup is all that you need. You are looking at only a few hours of downtime, on average, per decade. Not too shabby for very cheap.
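To "do the math" on nines, here is a minimal sketch that converts an availability figure into expected downtime. The function name and the set of nines printed are illustrative choices, not anything from the original post. Five nines works out to roughly 5.26 minutes of downtime per year, so "a few hours per decade" sits between four and five nines.

```python
# Convert "N nines" of availability into expected downtime.
# Assumes a 365.25-day year; the helper name is hypothetical.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(nines: int) -> float:
    """Expected annual downtime for N nines (e.g. 5 -> 99.999% available)."""
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for nines in (3, 4, 5, 6):
        per_year = downtime_minutes_per_year(nines)
        per_decade_hours = per_year * 10 / 60
        print(f"{nines} nines: {per_year:9.3f} min/year, "
              f"{per_decade_hours:7.3f} hours/decade")
```

Compare the hours-per-decade column against what an hour of downtime actually costs the business before paying for anything beyond a single good server and a backup.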
If you do need HA, as unlikely as that is, then the only rational approach for two to three servers (and often for several more) is replicated local storage. This lets you get to a dramatically higher degree of availability than your 2x2x2 design at a fraction of the cost. Instead of six devices that can potentially fail, there are only two. Instead of a lot of complexity, there is very little. Instead of cutting corners on non-business class storage, you can be on highly reliable enterprise storage. The amount that this is better is dramatic.
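A rough way to see why fewer layers can win: redundant pairs multiply in series, so the weakest layer dominates the whole chain. The sketch below assumes independent failures and perfect failover, and every per-device availability in it is a made-up illustrative number, not a measurement of any real product.

```python
# Series/parallel availability sketch. All device figures are ASSUMED
# for illustration only; failures are treated as independent and
# failover as instantaneous and flawless, which real gear is not.

def redundant_pair(availability: float) -> float:
    """Availability of two identical devices where either one suffices."""
    return 1 - (1 - availability) ** 2


def in_series(*layers: float) -> float:
    """Availability of a chain in which every layer must be up."""
    result = 1.0
    for layer in layers:
        result *= layer
    return result


# Hypothetical per-device availabilities.
SERVER = 0.9999   # assumed enterprise server with local storage
SWITCH = 0.9999   # assumed SAN switch
SAN = 0.999       # assumed entry-level SAN array

# 2-2-2 design: three redundant layers, all required.
three_layer = in_series(redundant_pair(SERVER),
                        redundant_pair(SWITCH),
                        redundant_pair(SAN))

# Two servers with replicated local storage: one redundant layer.
two_server = redundant_pair(SERVER)

if __name__ == "__main__":
    print(f"2-2-2 design:        {three_layer:.8f}")
    print(f"two servers + RLS:   {two_server:.8f}")
```

Under these assumptions the chain can never be more available than its least reliable pair, so adding layers only helps if each added layer is at least as reliable as what it fronts.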
-
Now, you did not mention your hypervisor choice. I'll throw some ideas out there.
- XenServer: XS is 100% free including all of the tools that you need for it. It is super easy to use and amazingly powerful. Full replicated local storage is included in it, as are high availability features. It has very high performance and is very mature. It has, by far, the best centralized management options (for free no less) of any of its competitors. Easiest to use, most powerful and robust option in my experience.
- Hyper-V: HV is 100% free but does not include as many ancillary tools or features. For replicated local storage you will use StarWind, which is free but third party. You will get full HA with RLS for zero cost here as well. More complex, less performant, less mature. But definitely your second choice. Bigger learning curve in my experience.
- VMware: Not free, will be a few thousand dollars to get fewer features than either XenServer or Hyper-V. Very mature. You will use StarWind here again, but StarWind does not have the same level of integration here as on Hyper-V so while it is the best choice, it is not as ideal as it is on Hyper-V. So very high cost, loss of features, loss of storage performance.
None of these come with support in the base price, all allow you to pay for support after installing. XenServer has excellent support from Citrix and numerous third parties. VMware has excellent support from, well, VMware, of course. Hyper-V is mired with Microsoft's notoriously bad support.
-
It should be noted that the HPE MSA devices are not appropriate gear to consider in a redundant or high availability setup. They are extremely fragile gear on their own, so a very bad building block for a redundant solution. I am unaware whether their array to array failover is functional or even an option, but their dual controller fragility and general reliability problems are well documented. And these are just rebranded Dot Hill devices, not things that anyone should be deploying. Reusing an existing one as a backup target is just fine, but buying new, even for that, is ill advised. The cost is too high, the reliability too low. From what I have seen anecdotally, these may be the lowest reliability product pushed to the mainstream business market. These are dramatically below ReadyNAS or Synology in reliability. HPE makes tons of awesome stuff, but not these. In their defence, they don't make these.
-
@scottalanmiller said in Server and Storage Redundancy:
You should have no switches. Even if you needed external storage, which is totally out of the question here, it would be DAS and DAS doesn't use switches.
http://www.smbitjournal.com/2012/08/choosing-a-storage-type/
So we can totally eliminate the switches and the SAN or external storage from this discussion.
There are DAS switches, but they are for somebody who a) is smart enough to deploy Clustered Storage Spaces with many JBODs and servers and b) is not scared of vendor lock-in (LSI makes them, and maybe one or two more companies).
-
Scott, you had a conversation with yourself...
-
@DustinB3403 said in Server and Storage Redundancy:
Scott, you had a conversation with yourself...
I do that all the time. It doesn't get really confusing till me, myself, and I are all talking at the same time
-
@DustinB3403 said in Server and Storage Redundancy:
Scott, you had a conversation with yourself...
What's wrong with that?
-
@travisdh1 said in Server and Storage Redundancy:
@DustinB3403 said in Server and Storage Redundancy:
Scott, you had a conversation with yourself...
I do that all the time. It doesn't get really confusing till me, myself, and I are all talking at the same time
That's how Ethernet (original one) was invented ;))
-
@KOOLER said in Server and Storage Redundancy:
@scottalanmiller said in Server and Storage Redundancy:
You should have no switches. Even if you needed external storage, which is totally out of the question here, it would be DAS and DAS doesn't use switches.
http://www.smbitjournal.com/2012/08/choosing-a-storage-type/
So we can totally eliminate the switches and the SAN or external storage from this discussion.
There are DAS switches, but they are for somebody who a) is smart enough to deploy Clustered Storage Spaces with many JBODs and servers and b) is not scared of vendor lock-in (LSI makes them, and maybe one or two more companies).
DAS switches just turn the DAS into a SAN, though.
-
Hi Scott,
I will eliminate the switches and SAN if the bosses approve, which they might as it would be more cost effective, but it might take a while.
I will probably just take your advice and go with 2 servers, plus another server for scaling in the future.
-
@softbank21 said in Server and Storage Redundancy:
Hi Scott,
I will eliminate the switches and SAN if the bosses approve, which they might as it would be more cost effective, but it might take a while.
I will probably just take your advice and go with 2 servers, plus another server for scaling in the future.
Welcome to MangoLassi!
-
@softbank21 said in Server and Storage Redundancy:
I will probably just take your advice and go with 2 servers, plus another server for scaling in the future.
With either VMware or Hyper-V (I'd push for Hyper-V, more features, huge cost savings, better storage handling at this scale) you can use StarWind and we'll loop in @kooler who can talk about their advantages, features, etc.
-
@scottalanmiller said in Server and Storage Redundancy:
@softbank21 said in Server and Storage Redundancy:
I will probably just take your advice and go with 2 servers, plus another server for scaling in the future.
With either VMware or Hyper-V (I'd push for Hyper-V, more features, huge cost savings, better storage handling at this scale) you can use StarWind and we'll loop in @kooler who can talk about their advantages, features, etc.
I disagree with how everyone here pushes StarWind as this wonder answer. It is a great product, but Hyper-V Replica works quite well for the SMB that only needs a single server compute node, yet wants to have some failover redundancy. It is simple, baked in, and no third party tools are required.
Replication from HV01 to on-site HV02.
Replication from HV02 to off-site HV03.
Using IPsec on a pair of ERLs with WAN links of 100/100 Mbps fiber and 50/50 Mbps fiber.
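For sizing the off-site leg of a setup like this, a back-of-the-envelope estimate of how long one replication delta takes to ship is useful. The sketch below is only that: the delta size and the efficiency factor for tunnel and protocol overhead are assumptions, and it assumes the two quoted links are the two ends of the tunnel, so the slower one is the bottleneck.

```python
# Estimate time to ship one replication delta over a WAN tunnel.
# Delta size and efficiency are ASSUMED values for illustration.

def transfer_seconds(delta_megabytes: float,
                     link_mbps: float,
                     efficiency: float = 0.8) -> float:
    """Seconds to move delta_megabytes over a link of link_mbps megabits/s.

    efficiency is a hypothetical fudge factor for IPsec and protocol overhead.
    """
    megabits = delta_megabytes * 8
    return megabits / (link_mbps * efficiency)


# The tunnel can go no faster than the slower of its two ends.
bottleneck_mbps = min(100, 50)

if __name__ == "__main__":
    for delta_mb in (100, 500, 5000):
        secs = transfer_seconds(delta_mb, bottleneck_mbps)
        print(f"{delta_mb:5d} MB delta: about {secs:7.1f} s")
```

If the estimate for a typical delta approaches the replication interval, the off-site copy will fall behind and the effective RPO grows.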
-
@JaredBusch Hyper-V Replica is nice for that scenario, however I don't see why StarWind isn't a good fit here. I mean if OP can dedicate any 2 servers for storage, they can run the free version of StarWind, which effectively gives them better redundancy than Hyper-V Replica, without any additional investment. The big question here is how many servers they will be going with once OP talks this through with management.
-
@ardeyn said in Server and Storage Redundancy:
@JaredBusch Hyper-V Replica is nice for that scenario, however I don't see why StarWind isn't a good fit here. I mean if OP can dedicate any 2 servers for storage, they can run the free version of StarWind, which effectively gives them better redundancy than Hyper-V Replica, without any additional investment. The big question here is how many servers they will be going with once OP talks this through with management.
I believe that his opinion is that the additional investment is in effort and that the effort of setting up the better reliability of StarWind is not worth it. Which is a valid argument, especially given that nearly everyone setting up redundancy at this scale doesn't actually need it anyway.
But I'm of the opinion that once going this route and buying all of that gear, removing the data loss component is typically, but not always, worth a little extra effort. Plus the additional positioning to be ready to scale to another host when the time comes.
-
@JaredBusch said in Server and Storage Redundancy:
I disagree with how everyone here pushes StarWind as this wonder answer. It is a great product, but Hyper-V Replica works quite well for the SMB that only needs a single server compute node, yet wants to have some failover redundancy. It is simple, baked in, and no third party tools are required.
I'll cross-post from SpiceWorks, as I've recently been writing a Veeam vs. Hyper-V Replica wrap-up post and it's mostly what OP is asking here as well.
-->
You come up with a set of numbers:
- RTO (how much downtime can you afford?) and RPO (how much data can you lose?)
https://en.wikipedia.org/wiki/Recovery_time_objective
https://en.wikipedia.org/wiki/Recovery_point_objective
- Budget
So...
...if you can't afford much downtime and you can't afford to lose data, you implement some BC strategy: Windows failover cluster, HA VMs, FCI with SQL Server (if you don't do AAG, which means paying huge $$$ for SQL Server Enterprise), etc., and these need HA storage (physical or virtual). Then you probably need some of our stuff (or a competitor's: HP VSA is a good option if you care, and the upcoming Windows Server 2016 will have similar storage technology if you go Datacenter all-around and increase the number of licensed nodes). Most of these options are paid, except maybe self-made ones (you pay with your own labor) and the free 1TB HP VSA; you can also get free hyperconverged StarWind on special request (search this forum, there's a SpiceWorks path for that). This approach may add some $$$ requirements (+CapEx) and will for sure make you work more to set up and support the final solution (+OpEx). <-- Do only if you REALLY NEED THIS !!
...if you can afford reasonable downtime (minutes, half an hour, maybe hours: the time to recover your VM from some backup, which depends on the amount of data your VMs hold) and you can lose some data on manual failover or a replication increment (again, typically some minutes), go w/out shared storage and run VMs from DAS, but always use some good VM backup. Obviously Veeam is a good candidate for that. <-- Do this ALWAYS !! You can't run production w/out a backup solution !!
+ you need some backup storage and I don't see you mentioning that you have any. You can't back up to yourself, so a cheap two-disk NetGear or maybe some off-site cloud space should help.
I don't recommend using Hyper-V Replica because it a) officially doesn't play nice with some apps (Exchange, SQL Server in some scenarios, any VMs depending on each other), b) steals IOPS from your system (twice as many writes are now needed; RAID10 already halves write performance, so you get 1/4 of your disks' write performance, close to the RAID5 write penalty), c) has a dangerous "autofailover" option implemented with a PowerShell script (see the link below in the P.S.; you really don't want to do that, as there's no split-brain protection, and the orchestration site should be separate, say running in Azure, but that's more $$$ and labor), and d) makes you think you're safe with VM replication while you really need a BACKUP. <-- Use Veeam Backup & Replication for both VM replication and VM backup, this money definitely pays you back.
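The write-penalty arithmetic in point b) can be sketched as follows. This uses the commonly quoted write penalties (2 for RAID10, 4 for RAID5) and takes the post's claim that replication doubles writes at face value as a simple multiplier; the raw IOPS figure is an arbitrary illustrative number.

```python
# Effective write IOPS under a RAID write penalty plus a write multiplier.
# Penalties are the commonly quoted rules of thumb; the raw IOPS value
# and the 2x replication multiplier are assumptions for illustration.

RAID_WRITE_PENALTY = {"RAID10": 2, "RAID5": 4}


def effective_write_iops(raw_iops: float,
                         raid_level: str,
                         write_multiplier: int = 1) -> float:
    """Write IOPS left after the RAID penalty and any extra write copies."""
    penalty = RAID_WRITE_PENALTY[raid_level]
    return raw_iops / (penalty * write_multiplier)


if __name__ == "__main__":
    raw = 1000  # hypothetical aggregate raw IOPS of the disk set
    print("RAID10 alone:        ", effective_write_iops(raw, "RAID10"))
    print("RAID10 + 2x writes:  ", effective_write_iops(raw, "RAID10", 2))
    print("RAID5 alone:         ", effective_write_iops(raw, "RAID5"))
```

In this simple model, RAID10 with doubled writes lands at one quarter of raw write IOPS, the same fraction as plain RAID5.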
That's it
TL;DR: Don't buy StarWind unless you really need to and buy Veeam always.
P.S. In case you decide to implement a crazy "two Hyper-V hosts replicate to each other and here's my poor man's pseudo-HA system" scenario, here's a link for you:
https://blogs.technet.microsoft.com/keithmayer/2012/10/05/automated-disaster-recovery-testing-and-fa...
Again, I don't recommend doing that and there are reasons above why exactly.
<--