Hyper-V High availability? or only VMware
-
Can you shed some light based on the requirements I stated earlier? I don't need us to have our own life support (although it would be nice to have such a feature), but we should not stop our production due to an internal issue that is preventable. That is what this post is all about. True HA seems to cost both arms, legs, and some limbs for an SMB. If we need to get 3 servers with more storage then I will take that into consideration. As SAM pointed out:
@scottalanmiller said:
Exactly. 95% of the risk is in overspending, technical debt or becoming reliant on a third party to handle what could be simple and internal. But definitely, with a good backup and general data protection strategy, HA is massive overkill for a normal SMB so the risky "anti-HA" IPOD / SAN design generally only introduces a kind of risk that probably wasn't important anyway.
I do not want to overspend on something that can be done and deliver a similar result for less. I have many more areas I could use some more budget on.
@scottalanmiller said:
In comparing Hyper-V and VMware, there is only one practical approach today for an HA cluster at two nodes and that is using StarWind (which is free) to handle the replicated local storage. StarWind is more stable and performant on Hyper-V than on VMware. This is a result of an architectural difference, namely VMware's decision to not allow StarWind into the kernel space. The result is that, given VMware does not have an equivalent product, in the two node space Hyper-V's technology is just as good but the available components and real world options put Hyper-V as a clearly superior technical option to VMware even if VMware was free, which it is not.
What kind of HDD type is recommended for StarWind VSAN? RAID10 with at least 3TB storage space. SATA 7.2K or SAS 10K/15K? I doubt we can afford SSD.
-
@LAH3385 said:
Ultimately, I am trying to prove that Hyper-V is better in our scenario and we do not need to spend a fortune for it.
The onus should be completely on VMware to show how it is even a viable option, which I don't believe that it is. It is about $5K more and even at that higher price it delivers a technically inferior solution. VMware carries no benefits here, only technical and financial downsides. I'd question how it would even make the consideration list let alone how better solutions justify against it. Hyper-V should only need to show that it is better than XenServer. VMware is the fourth option, the "only when nothing else is available" option.
-
@LAH3385 said:
Budget is still a thing so I cannot spend a lot. I'm trying to get a solution that is less than $15K total (excluding any labor).
Storage space: 2TB used.. so 4-6TB in RAID10 (SAS10K or SATA7.2K for Hyper-V HA?) 2 hosts with RAM 32GB on each. CPU single E5-2620-v3.
While high for your budget, you might also want to look at Scale. Their entry point solution is very similar in hardware here, can be delivered in a two compute node and three storage node configuration for Windows users like yourself, is full HA and is designed to be super simple so that zero labor or consulting would be needed. It's much higher than your desired price point and obviously much higher than what you could spend with other solutions. But if you want something where third party vendor support would never be needed, it is well worth considering.
-
@LAH3385 said:
Hyper-V clustering seems ideal but can it achieve what VMware high availability does?
Reverse that, can VMware achieve what Hyper-V does here? And the answer is: not quite.
-
@LAH3385 said:
We want to have both servers running at all times. Putting both servers as virtual instances onto the same host is not ideal to my boss.
Two things to point out here:
- With the SAN solution, everything would have been dependent on a single host. So the proposed solution is ruled out by your boss, I presume?
- This is not goal level thinking. This is focusing on redundancy as a proxy for reliability. Make sure to read that link.
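To put rough numbers on the "redundancy as a proxy for reliability" point, here is a minimal sketch of the usual series/parallel availability arithmetic. The availability figures are hypothetical, picked only for illustration, and the model ignores switches, failover success rates and correlated failures.

```python
def series(*availabilities):
    """All components must be up, so availabilities multiply."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel(*availabilities):
    """Any one component is enough, so unavailabilities multiply."""
    down = 1.0
    for a in availabilities:
        down *= 1.0 - a
    return 1.0 - down


# Hypothetical figures, not measurements of any real product.
HOST = 0.999
SAN = 0.999

single_host = HOST                                      # one host, local storage
two_hosts_one_san = series(parallel(HOST, HOST), SAN)   # redundant hosts, single SAN

print(f"single host with local storage: {single_host:.6f}")
print(f"two hosts + one shared SAN:     {two_hosts_one_san:.6f}")
```

Under these assumptions the two-host design can never be more available than the single SAN it depends on, which is why counting redundant hosts says little about overall reliability.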
-
@LAH3385 said:
What I mean by High Availability is for our production team to keep on working without interruption. Currently our file server is on the same server as DC, AD, DHCP, DNS, etc. Back in July, AD got corrupted and went into a BSOD loop. This caused our production to freeze for half a day before we were able to get the backup restored.
That incident cost us potentially thousands of dollars in only half a day. If it happens again and it goes down for days then we may be out of business. With that said, we are looking into redundant servers or high availability.
AD is HA at the application level. Even if you have the most HA Hyper-V or VMware platform you would never have AD utilize it. AD would always be set to run as normal. That you ran into this issue would not be resolved by having HA and in this particular case could have been resolved by having two AD VMs on a single host.
In fact, if you had full VMware Fault Tolerance, your AD BSOD would have replicated to the other host and your VMware would have extended the problem rather than solving it! This is a great example of how HA is something you do, not something that you buy. You need to design your HA solution workload by workload. Tools like Hyper-V can be really important parts of that design, but rarely will it be the primary one.
-
@LAH3385 said:
Can you shed some light based on the requirements I stated earlier?
I just skimmed the SW thread. Your answer is drop the entire project and fire the current IT consultant.
Then start over with someone who is not out to screw you over.
-
@LAH3385 said:
What do we want? We want a server that can be a RAID 5 but for server.
Just be aware that the issue you described would be subject to the same kinds of issues that RAID is. In your example, AD failed above the platform level, the OS itself failed. So the platform HA would have done nothing to protect you. Platform level HA protects exclusively against hardware failures, not software ones. That's why you only do that when application level HA is not available or reasonable - because it only solves half or less of the issues.
In the same way RAID is great - until the issue is files deleted from the disk, cryptoware or file system corruption. RAID will, just like the HA virtualization solutions, replicate the problem to all nodes instantly leaving you with nothing working no matter how much you spend on the HA.
These are cases where application level HA would solve most problems and a good backup system would solve others.
-
Love this thread - it's finally getting down to "thinking about the problem, not just choosing a solution based on those already given to you"
You mentioned that you had an outage because you had AD corruption. What caused that corruption? How will having anything you've asked for so far prevent or solve this problem in the future?
Things we don't know - Did you only have one AD server? If yes, would having a second AD server have solved this issue?
You've told us your current file storage is on the AD server itself. OK, that's easy to solve: make sure to put it on its own VM in the future. You might find yourself needing a lot of Windows licensing here depending on your setup. If you're expecting a full failover situation, you'll need the same number of licenses for each server. Assuming you only need one VM per host, you'll need to purchase 1 Windows Server license per host, but if you need two or more, you'll need at least two Windows Server licenses per host to allow the failover/maintenance to happen legally.
Also, do you need real HA? Can you afford 10 mins of downtime while you boot up another VM on the other host? etc etc.
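The licensing count above can be sketched as a quick calculation. This assumes Windows Server Standard rights of two VMs per license and that every VM may end up on one surviving host; the function name and the two-host default are made up for illustration, and actual terms (per-processor or per-core rules, edition) should be confirmed with your reseller.

```python
import math

# Assumption: one Windows Server Standard license covers two VMs on a host.
VMS_PER_STANDARD_LICENSE = 2


def licenses_per_host(vms_per_host, hosts=2):
    """Licenses each host needs so it can legally run every VM after a failover."""
    worst_case_vms = vms_per_host * hosts  # all VMs land on one surviving host
    return math.ceil(worst_case_vms / VMS_PER_STANDARD_LICENSE)


for vms in (1, 2, 3):
    print(vms, "VM(s) per host ->", licenses_per_host(vms), "license(s) per host")
```

The point is that each host has to be licensed for the worst case, not for what it runs on a normal day.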
-
@LAH3385 said:
I do not want to overspend on something that can be done and deliver similar result for less. I have many more area I could use some more budget on.
Nor do we want you to. In many cases this will come down to not buying more or little more but rather planning better, changing how and what you implement and being far more thoughtful rather than looking at HA as a "solution." HA as a concept is awesome and you should always work towards it, all other factors being equal.
So that's what we need to do. It might make sense to make a separate thread for a number of workloads (maybe one thread per workload) and link here for a higher level description, and we can break down each workload and how it should or could be addressed.
Active Directory, for example, needs to be thought about uniquely. It's actually the easiest to deal with as normal SMBs all have HA for AD - but typically the "vendor's sales guy idea" of what to do not only triples your cost, it very often breaks the HA you already had!
-
@LAH3385 said:
What kind of HDD type is recommended for StarWind VSAN? RAID10 with at least 3TB storage space. SATA 7.2K or SAS 10K/15K? I doubt we can afford SSD.
If VMware is on the table, you can afford SSDs no problem. Not that you need them, just considering the one guarantees the budget for the other, if that makes sense.
StarWind does not recommend RAID 10 normally. Normally they would push towards RAID 6 or less.
Adding in @StarWind_Software @KOOLER @original_anvil
-
@scottalanmiller said:
StarWind does not recommend RAID 10 normally. Normally they would push towards RAID 6 or less.
Wait, what? StarWind recommends RAID 6 for the sync'ed underbelly of your VM infrastructure?
-
@Dashrender said:
@scottalanmiller said:
StarWind does not recommend RAID 10 normally. Normally they would push towards RAID 6 or less.
Wait, what? StarWind recommends RAID 6 for the sync'ed underbelly of your VM infrastructure?
Often they recommend RAID 0. I, however, do not.
-
@scottalanmiller said:
@Dashrender said:
@scottalanmiller said:
StarWind does not recommend RAID 10 normally. Normally they would push towards RAID 6 or less.
Wait, what? StarWind recommends RAID 6 for the sync'ed underbelly of your VM infrastructure?
Often they recommend RAID 0. I, however, do not.
Wow.. I guess that would be the really poor man's option.. but if you are that poor.. why do you have two servers? why not just one that costs less than the total cost of two but more powerful (if needed) than the single? Seems like the wrong way to go about things.
This reminds me of @scottalanmiller's "all eggs in one basket aren't really worse than splitting them over two baskets" post.
-
@LAH3385 to determine the storage needs (drives, RAID, etc.) we would need some good info about the storage capacity and IOPS that are needed. It is very possible that normal SATA or NL-SAS drives will do the trick. For file servers and AD, slow SATA is more than enough.
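As a rough way to compare options once capacity and IOPS needs are known, here is a minimal sizing sketch. The per-disk IOPS values and write penalties are common rules of thumb rather than vendor specifications, and the six 2 TB disks and 30% write mix are hypothetical inputs, not figures from this thread.

```python
# Rule-of-thumb per-disk IOPS; assumptions for illustration, not vendor specs.
DISK_IOPS = {"SATA 7.2K": 75, "SAS 10K": 125, "SAS 15K": 175}
# Back-end I/Os generated by one front-end write (commonly quoted penalties).
WRITE_PENALTY = {"RAID10": 2, "RAID6": 6}


def usable_tb(level, disks, disk_tb):
    """Usable capacity in TB for an array of identical disks."""
    if level == "RAID10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return disks // 2 * disk_tb
    if level == "RAID6":
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (disks - 2) * disk_tb
    raise ValueError(f"unsupported RAID level: {level}")


def frontend_iops(level, disks, iops_per_disk, write_fraction):
    """Approximate host-visible IOPS for a given read/write mix."""
    raw = disks * iops_per_disk
    backend_per_frontend = (1 - write_fraction) + write_fraction * WRITE_PENALTY[level]
    return raw / backend_per_frontend


# Hypothetical array: six 2 TB disks, 30% writes.
disks, disk_tb, write_fraction = 6, 2, 0.3
for level in WRITE_PENALTY:
    for disk_type, iops in DISK_IOPS.items():
        estimate = frontend_iops(level, disks, iops, write_fraction)
        print(f"{level:6} {disk_type:9} {usable_tb(level, disks, disk_tb)} TB usable, "
              f"~{estimate:.0f} IOPS")
```

Measured IOPS from the current servers should replace the assumed read/write mix before any of this is used to pick drives.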
-
@Dashrender said:
@scottalanmiller said:
@Dashrender said:
@scottalanmiller said:
StarWind does not recommend RAID 10 normally. Normally they would push towards RAID 6 or less.
Wait, what? StarWind recommends RAID 6 for the sync'ed underbelly of your VM infrastructure?
Often they recommend RAID 0. I, however, do not.
Wow.. I guess that would be the really poor man's option.. but if you are that poor.. why do you have two servers? why not just one that costs less than the total cost of two but more powerful (if needed) than the single? Seems like the wrong way to go about things.
Because many people worry solely about compute node failure and nothing else, just like the logic that leads people to spend a fortune on an inverted pyramid while having huge risk from a single, cheap, fragile SAN - they get sidetracked thinking about a single failure mode rather than focusing on overall reliability.
But keep in mind, StarWind with RAID 0 is still overall RAID 01. But I would almost want RAID 6 in there myself to avoid node failover caused by storage whenever possible. Resulting in RAID 61.
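A small sketch of that reasoning, assuming two nodes whose local arrays are synchronously mirrored. The model is deliberately simplified: RAID 0 tolerates no local disk failures, RAID 6 tolerates two, and all names are made up for illustration.

```python
# Disk failures a node's local array survives before that node's storage is lost.
TOLERATED = {"RAID0": 0, "RAID6": 2}


def node_storage_up(node_alive, failed_disks, local_raid):
    """A node serves storage only if it is running and its array is intact."""
    return node_alive and failed_disks <= TOLERATED[local_raid]


def cluster_storage_up(nodes, local_raid):
    """nodes: list of (node_alive, failed_disks) tuples, one per mirrored node."""
    return any(node_storage_up(alive, failed, local_raid) for alive, failed in nodes)


scenarios = {
    "one disk fails on node A": [(True, 1), (True, 0)],
    "node B is down, then one disk fails on node A": [(True, 1), (False, 0)],
}
for name, nodes in scenarios.items():
    for raid in TOLERATED:
        a_alive, a_failed = nodes[0]
        print(f"{name} | local {raid}: "
              f"node A storage up={node_storage_up(a_alive, a_failed, raid)}, "
              f"cluster storage up={cluster_storage_up(nodes, raid)}")
```

With local RAID 0, a single disk failure takes that node's storage down and forces a failover; with local RAID 6, the same failure is absorbed inside the node and the mirror is only needed for whole-node problems.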
-
@scottalanmiller said:
But keep in mind, StarWind with RAID 0 is still overall RAID 01. But I would almost want RAID 6 in there myself to avoid node failover caused by storage whenever possible. Resulting in RAID 61.
I was thinking the same thing. I'd really hate to lose a node, then lose the other server because of a drive failure that I was unlucky enough to have during a node failure.
But even that seems really undesirable (RAID 6 that is) because of the performance penalties - I 'feel' like a single server would be better in general in that case with RAID 10 Spinning Rust, or RAID 5 SSD.
-
I wonder if @scottalanmiller would still recommend OBR 10 instead of RAID 6 for use with StarWind?
-
We have 2 main production teams: I'll call them A and B for simplicity.
A requires File Server as they only need to gather documents and other stuff. Applications that they need are Chrome, Adobe, Office.
B requires some File Server and DB access (Access, SQL, some other accounting programs). B is more mission critical. For B, the server cannot go down during production.. period. B is what really requires HA.
Currently both A and B are on different physical servers but B still has some files on the A server. When server B goes down, it causes DB corruption. The fix is easy and only takes 30 minutes to relink files and restore some as needed from backup.
The AD corruption back in July made the File Server inaccessible and that was what really dealt the most damage. If File Server is the only Mission Critical workload then failover DFS should be enough, but my boss wants the OS to be failover-able.
-
The question is why? I'm guessing because he's old school (kinda like me). But he and I both need to join the 21st century.
Using Application Level failover is much better than using hardware fail over whenever possible. Of course it's not always possible, so we have hardware fail over as another thing we can add to the reliability chain.