@scottalanmiller said in StarWind HCA is one of the 10 coolest HCI systems of 2019 (so far):
Maybe you are running into these problems due to bad products, planning, or design, but you are foisting problems you've had onto everyone else, when we aren't experiencing these problems.
You are "projecting". I'm not saying you or people you know having implemented HC and had it not work well. But you are confusing a bad implementation or planning with the architecture being bad. The two are different things.
Or maybe I'm just speaking from experience, and there's plenty of it. Local storage that is not shared is great, but it doesn't scale, and it pretty much kills all the nice features you get in a virtualized DC - live migration, HA, all those things you don't care about in SMBs, I suppose. Getting that storage replicated in a scalable fashion is hard; simple CBT pushed over the network (which is basically what the folks at Linbit do) does not scale. And hard tasks require resources, and those resources have to come from somewhere.
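To put rough numbers on the scaling claim (these figures are purely illustrative, not a measurement of DRBD or any specific product): when every changed block has to be mirrored over a single replication link, resync time after a failure grows linearly with capacity, while the link does not.

```python
# Back-of-envelope: how long a full resync of a mirrored volume takes
# when every block has to be pushed over a single replication link.
# Numbers are purely illustrative, not measurements of any product.

def resync_hours(volume_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Hours to re-mirror volume_tb of data over a link_gbps link."""
    volume_bits = volume_tb * 1e12 * 8          # TB -> bits
    usable_bps = link_gbps * 1e9 * efficiency   # protocol/overhead haircut
    return volume_bits / usable_bps / 3600

for tb in (2, 20, 100):
    print(f"{tb:>4} TB over 10 GbE: ~{resync_hours(tb, 10):.1f} h")
```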
Ceph and Gluster are both known to be bad for that; that should have been known going into the project. That someone didn't head you off at the pass shows that the mistakes and oversights were happening early on. We could easily have warned you that it wasn't meant to have good local performance.
Mixing any distributed storage solution with any other workload is known to be bad; this is exactly what I'm saying. I've come into those projects after they were already implemented, and got things working by breaking up those overloaded hosts into hardware that was doing one job, and doing it well, on either side.
Okay... so the big question is... since this is not part of HC... why? Stop doing that. You can never have a rational, useful discussion about HC until you talk about HC and not something else.
But I am, at least at scale. DRBD and similar systems do not scale. When things are small (SMB level again) this is peanuts; we can do anything, because our tasks are smaller than the hardware we can get. What happens at scale, though?
No one anywhere recommends 200 nodes in a single cluster. If you think this is a good idea, SAN, SDS, HC, or otherwise, we are on different pages. That's a scale that literally no one, not MS, not StarWind, not VMware, not Red Hat, recommends as a single failure domain. Do they support it? Yup. Do they think you are crazy? Yup.
200 nodes is small for the scale I typically deal with. Red Hat has solutions that can deal with that kind of scale easily, and I know of a few other companies that do. MS, VMware, and probably StarWind do not, because of the nature of their clustering implementations, but that's basically all about how you manage locking.
Even in the enterprise, which you claim to know, they often use workload scopes of this size for performance and safety. The larger the pool, the bigger the problems.
Not really. In a large pool, a dead node simply gets replaced; the effect is very small.
If you want to get into giant pools you have to pick your battles.
I usually am in those numbers, but ok
Let's talk reasonable size, like 10-80 nodes. If you need screaming performance that no SAN can match, then you are looking at StarWind network RAID, which does that.
OK, so we have network RAID: a bunch of blocks get streamed to other nodes when writes occur on one. When all there is is pushing blocks across, things are simple. What happens when a node dies and I suddenly have to rebalance the data distribution? How is consistency kept? How does the system decide which blocks get streamed where? Even in a 10-node cluster it would be plain stupid to keep all the data replicated everywhere; 10x the data on local disks would be far too expensive.
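To illustrate what I mean by placement and rebalance cost, here's a toy sketch. The hash-ranked placement and replication factor of 3 are my assumptions for illustration, not how StarWind or anyone else actually does it:

```python
# Sketch of how a distributed block store might place replicas and what a
# node failure costs. Hash-based placement with replication factor 3 is an
# assumption for illustration, not any particular vendor's implementation.
import hashlib

NODES = [f"node{i}" for i in range(10)]
REPLICAS = 3  # 3x raw capacity, not 10x "everything everywhere"

def place(block_id: str, nodes: list[str], replicas: int = REPLICAS) -> list[str]:
    """Pick `replicas` distinct nodes for a block, by hash ranking."""
    ranked = sorted(nodes, key=lambda n: hashlib.sha256(f"{block_id}:{n}".encode()).hexdigest())
    return ranked[:replicas]

blocks = [f"blk{i}" for i in range(100_000)]
placement = {b: place(b, NODES) for b in blocks}

# If node0 dies, every block that had a replica there must be copied
# somewhere else to restore the replica count -- that is the rebalance traffic.
to_repair = [b for b, owners in placement.items() if "node0" in owners]
print(f"replication factor {REPLICAS} -> {REPLICAS}x raw disk, not {len(NODES)}x")
print(f"blocks needing re-replication after losing node0: "
      f"{len(to_repair)} of {len(blocks)} ({100 * len(to_repair) / len(blocks):.0f}%)")
```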
If you just want cheap pooled storage, you look at Ceph (and there are accelerators for that to make it fast, if you need to).
Here we have a distributed system which wants at least a core per OSD, and 32GB of RAM, just to get started properly. In SMB, I doubt you see many monstrous hypervisors with hundreds of cores, so what is left to run your actual VMs?
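Napkin math on what that leaves you, with every number here being an assumption for illustration rather than official Ceph sizing guidance:

```python
# Rough budget for a hyperconverged host: what's left for VMs once the
# storage layer takes its cut. All figures are illustrative assumptions,
# not official Ceph sizing guidance.

host_cores, host_ram_gb = 32, 256          # a mid-range hypervisor
osds = 8                                   # one OSD per local disk
cores_per_osd, ram_per_osd_gb = 1, 4       # assumed per-daemon cost
mon_cores, mon_ram_gb = 2, 8               # assumed monitor/management overhead

storage_cores = osds * cores_per_osd + mon_cores
storage_ram = osds * ram_per_osd_gb + mon_ram_gb

print(f"cores left for VMs: {host_cores - storage_cores} of {host_cores}")
print(f"RAM left for VMs:   {host_ram_gb - storage_ram} GB of {host_ram_gb} GB")
```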
At truly giant scale, the only real benefit to totally external storage is when speed and reliability are so unimportant that you are willing to sacrifice them to a huge degree to save a few dollars. But with HC's low cost today, even that is approaching the impossible.
Only a good storage fabric can give you excellent speed and very low latencies, and as for reliability, you can build whatever you want on the SAN side, depending on your requirements. The only good thing about HC is local storage access, and that isn't really far ahead of any decent fabric anyway, if at all.
Because you don't have all those things. Pushing large chunks over the network takes next to no resources. No idea where you think the overhead comes from, but for most of us, things like copying data are not high-overhead activities. It's a dedicated network in most cases, with offload engines on the NICs, and things like tiering take extremely little overhead (if done well). These just aren't CPU- or RAM-intensive activities.
That is simply not true. Pushing large amounts of data over the network is not cheap, and that is just the simple streaming case. When you start running synchronization and tiering, things get harder. And when you have to rebalance (which Ceph does often) you need even more resources. Yes, you can dedicate NICs to just that (and those NICs will not be there to provide more bandwidth to the workload traffic), but in order to push large amounts of data into the NICs you also need CPU cycles and RAM. It's CS 101: there are no free rides.
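Roughly, using the old rule of thumb of about 1 CPU-Hz per bit/s of TCP throughput (an assumption for the sketch; offload engines lower it, while checksums, encryption, and replication logic raise it):

```python
# Napkin math on the CPU cost of pushing storage traffic through a NIC.
# The classic ~1 CPU-Hz per bit/s of TCP throughput rule of thumb is used
# here as a stated assumption; offloads lower it, checksums/replication
# logic raise it.

CYCLES_PER_BIT = 1.0        # assumed, see note above
CORE_GHZ = 2.5              # a typical server core

def cores_consumed(throughput_gbps: float,
                   cycles_per_bit: float = CYCLES_PER_BIT,
                   core_ghz: float = CORE_GHZ) -> float:
    """Approximate cores burned just to move `throughput_gbps` of traffic."""
    cycles_per_sec = throughput_gbps * 1e9 * cycles_per_bit
    return cycles_per_sec / (core_ghz * 1e9)

for gbps in (10, 25, 40):
    print(f"{gbps:>2} Gbps of rebalance traffic ~ {cores_consumed(gbps):.1f} cores")
```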
Those aren't HA, so not applicable to the discussion. HCI is assumed to be replicating to other nodes, something those providers don't provide. They are stand alone compute nodes. Very different animal.
My point exactly. If HC was so great, why wouldn't they be using it?