ZFS Based Storage for Medium VMWare Workload
-
@scottalanmiller said:
If you were going to go with RLS, which is completely crazy given the scenario and historically accepted risk, then the best investment would be to do the following:
- Replace all nodes with adequately sized nodes built on the HP DL380 G9 platform or the Dell R730xd platform. These have enough compute to replace several of your nodes in one, enough memory to handle all of your needs, and more than 600% greater per-node storage capacity!
- Move to either Hyper-V + StarWind or XenServer + DRBD (HA-Lizard)
- Make two clusters of two servers each, keeping every software piece free and simple
That would cost a lot more than his current $14,000 budget (assuming that number was a budget number).
-
@donaldlandru said:
- Add a second SAN that replicates with the first (HP MSA easy to do, not so nice price tag)
I've never seen someone do this successfully. That doesn't suggest that it doesn't work, but are you sure that the MSA series will do SAN mirroring with fault tolerance? I'm not confident that that is a feature (but certainly not confident that it isn't). Double check that to be sure, as I talk to MSA users daily and no one has ever led me to believe that this was even an option.
I know that Dell's MD series cannot do this, only the EQL series.
-
@scottalanmiller said:
@donaldlandru said:
Ok, if we split this into two separate topics, the only unmitigated failure point in operations is the single SAN. Two options to mitigate the risk are:
Not currently; you had said that your nodes do not have the tools or the overhead to absorb the load from a failed node, correct? That makes the risk of those nodes failing unmitigated as well. You only have enough nodes to handle your capacity, not enough to use them for failure mitigation.
In Operations, the two-node cluster, I said they do have the necessary resources to absorb the other node failing. It is the development "cluster that isn't a cluster" that cannot absorb.
-
@Dashrender said:
That would cost a lot more than his current $14,000 budget (assuming that number was a budget number).
Yes, but it would cost far less than what he was proposing. My original recommendations were to lower his cost while improving reliability. Then he leapt to the Ferrari scenario, so I proposed another solution that still beats that one, maintaining the Ferrari features while spending only a fraction as much money.
-
@donaldlandru said:
In Operations, the two-node cluster, I said they do have the necessary resources to absorb the other node failing. It is the development "cluster that isn't a cluster" that cannot absorb.
Oh okay. So mitigated where it matters, I assume, and unmitigated where it doesn't matter so much. That I was not clear about.
-
@scottalanmiller said:
Going to VSAN, Starwind, DRBD, etc. would be an "orders of magnitude leap" that is not warranted. It just can't make sense. What you have today and what you are talking about moving to are insanely "low availability." Crazy low. And no one had any worries or concerns about that, right?
That's just it - the company probably thinks they have that super high level of availability and the fact that they've never had a failure feeds that fire of belief.
This has always been the issue I've had when I try to redesign/spend more money on a new solution. I get the push back - "well, we did it that cheap way before and it worked for 8 years, why do I suddenly now need to use this other way, clearly the old cheap way works."
-
@Dashrender said:
@scottalanmiller said:
Going to VSAN, Starwind, DRBD, etc. would be an "orders of magnitude leap" that is not warranted. It just can't make sense. What you have today and what you are talking about moving to are insanely "low availability." Crazy low. And no one had any worries or concerns about that, right?
That's just it - the company probably thinks they have that super high level of availability and the fact that they've never had a failure feeds that fire of belief.
This has always been the issue I've had when I try to redesign/spend more money on a new solution. I get the push back - "well, we did it that cheap way before and it worked for 8 years, why do I suddenly now need to use this other way, clearly the old cheap way works."
This -- all day long! It worked for the 7 years before you got here, it will keep working long after. I fight this fight every day
-
@scottalanmiller From previous posts, it sounds like they are most concerned with the Dev environment right now, since the Ops cluster appears to be OK.
-
@donaldlandru said:
@Dashrender said:
@scottalanmiller said:
Going to VSAN, Starwind, DRBD, etc. would be an "orders of magnitude leap" that is not warranted. It just can't make sense. What you have today and what you are talking about moving to are insanely "low availability." Crazy low. And no one had any worries or concerns about that, right?
That's just it - the company probably thinks they have that super high level of availability and the fact that they've never had a failure feeds that fire of belief.
This has always been the issue I've had when I try to redesign/spend more money on a new solution. I get the push back - "well, we did it that cheap way before and it worked for 8 years, why do I suddenly now need to use this other way, clearly the old cheap way works."
This -- all day long! It worked for the 7 years before you got here, it will keep working long after. I fight this fight every day
All you need is one good outage and the tune changes. Or at least that's what I experienced at my last position.
-
@scottalanmiller said:
@donaldlandru said:
- Add a second SAN that replicates with the first (HP MSA easy to do, not so nice price tag)
I've never seen someone do this successfully. That doesn't suggest that it doesn't work, but are you sure that the MSA series will do SAN mirroring with fault tolerance? I'm not confident that that is a feature (but certainly not confident that it isn't). Double check that to be sure, as I talk to MSA users daily and no one has ever led me to believe that this was even an option.
I know that Dell's MD series cannot do this, only the EQL series.
In real life I am not sure if it works; on paper it does. It is a false sense of security, but the MSA does have active/active controllers built in (10Gb iSCSI), redundant power supplies, and of course the disks are in a RAID. The risks that are not mitigated by the single chassis are:
- Chassis failure (I am sure it can happen, but the only parts in the chassis are the backplane and some power routing)
- Software bug -- the most likely failure to occur
- Human error (oops, I just unplugged the storage chassis)
All in all, I think the operations side is pretty well protected, minus the three risks listed above. It is two nodes that can absorb either node failing, running on redundant 10-gig top-of-rack switches and redundant 1-gig switches. Also, backups are done and tested with Veeam. Am I missing something here?
Unless I am mistaken, and Scott please correct me if I am, it is the three node development cluster that is in sorry shape.
-
In your Dev environment, you have 3 servers... with 288GB of RAM, 64GB of RAM, and 16GB of RAM... Assume RAM compatibility... What happens if you balance out those three servers and get them at least close to having the same amount of RAM?
Does that help you at all? If that is a good idea, then why not look at converting them to XenServer and switching to Local Storage? You could then replicate the VMs to each of the three hosts, or you could set up HA-Lizard.
-
If I am not mistaken, for planned outages you can actually migrate VMs from one XenServer host to another (this is also true for Hyper-V, IIRC) without having shared storage... So for maintenance, you can live migrate from one XenServer host to another (it would copy the storage too)... Whether or not that is feasible depends on the size of your VMs and the speed of your network... among other things.
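For what it's worth, here is a rough sketch of how that could be scripted against the XenAPI Python bindings. The host addresses, credentials, VM name, SR label, and network label below are all placeholders, and this is meant to illustrate the Storage XenMotion call, not serve as a tested procedure:

```python
# Rough sketch: live-migrate a VM between two XenServer hosts with no shared
# storage (Storage XenMotion). All addresses, credentials and labels are
# placeholders -- adjust for your environment and test on something disposable.
import XenAPI

src = XenAPI.Session("https://xs-src.example.local")   # hypothetical source host
dst = XenAPI.Session("https://xs-dst.example.local")   # hypothetical destination host
src.xenapi.login_with_password("root", "password")
dst.xenapi.login_with_password("root", "password")

try:
    vm = src.xenapi.VM.get_by_name_label("dev-vm-01")[0]        # VM to move
    host = dst.xenapi.host.get_all()[0]                          # receiving host
    sr = dst.xenapi.SR.get_by_name_label("Local storage")[0]     # target local SR
    net = dst.xenapi.network.get_by_name_label("Pool-wide network associated with eth0")[0]

    # The destination hands back a migration token, then migrate_send moves the
    # VM live and copies its virtual disks to the chosen SR on the other side.
    dest = dst.xenapi.host.migrate_receive(host, net, {})

    vdi_map = {}
    for vbd in src.xenapi.VM.get_VBDs(vm):
        vdi = src.xenapi.VBD.get_VDI(vbd)
        if vdi != "OpaqueRef:NULL":      # skip empty CD drives
            vdi_map[vdi] = sr

    src.xenapi.VM.migrate_send(vm, dest, True, vdi_map, {}, {})
finally:
    src.xenapi.session.logout()
    dst.xenapi.session.logout()
```

If you would rather not script it, the xe CLI's vm-migrate command wraps the same operation. Either way, the size/speed caveat above still applies, since every virtual disk gets copied over the wire.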
-
@coliver said:
@donaldlandru said:
@Dashrender said:
@scottalanmiller said:
Going to VSAN, Starwind, DRBD, etc. would be an "orders of magnitude leap" that is not warranted. It just can't make sense. What you have today and what you are talking about moving to are insanely "low availability." Crazy low. And no one had any worries or concerns about that, right?
That's just it - the company probably thinks they have that super high level of availability and the fact that they've never had a failure feeds that fire of belief.
This has always been the issue I've had when I try to redesign/spend more money on a new solution. I get the push back - "well, we did it that cheap way before and it worked for 8 years, why do I suddenly now need to use this other way, clearly the old cheap way works."
This -- all day long! It worked for the 7 years before you got here, it will keep working long after. I fight this fight every day
All you need is one good outage and the tune changes. Or at least that's what I experienced at my last position.
Of course, that part is pretty obvious. That outage also helps the company come to a more realistic understanding of its uptime needs, and what types of outages it can really handle. But many companies have never had to deal with one, so we are stuck where we are.
-
@dafyre said:
In your Dev environment, you have 3 servers... with 288GB of RAM, 64GB of RAM, and 16GB of RAM... Assume RAM compatibility... What happens if you balance out those three servers and get them at least close to having the same amount of RAM?
Does that help you at all? If that is a good idea, then why not look at converting them to XenServer and switching to Local Storage? You could then replicate the VMs to each of the three hosts, or you could set up HA-Lizard.
The two smaller servers pre-date my time with the company and were likely back-of-the-truck specials. Both of these are slated to be replaced next year with a single server with similar specs to the big server. The smallest one is already maxed out, and it doesn't make sense to upgrade the other one just to retire it.
I also don't need HA on these (I don't have HA today on these), so I think this is an opportunity to move to a different platform.
-
@donaldlandru said:
@Dashrender said:
@scottalanmiller said:
Going to VSAN, Starwind, DRBD, etc. would be an "orders of magnitude leap" that is not warranted. It just can't make sense. What you have today and what you are talking about moving to are insanely "low availability." Crazy low. And no one had any worries or concerns about that, right?
That's just it - the company probably thinks they have that super high level of availability and the fact that they've never had a failure feeds that fire of belief.
This has always been the issue I've had when I try to redesign/spend more money on a new solution. I get the push back - "well, we did it that cheap way before and it worked for 8 years, why do I suddenly now need to use this other way, clearly the old cheap way works."
This -- all day long! It worked for the 7 years before you got here, it will keep working long after. I fight this fight every day
We average more than ten years from 'just a server'. High Availability is for when you need MORE than that.
-
@scottalanmiller said:
We average more than ten years from 'just a server'. High Availability is for when you need MORE than that.
I'm not really sure what this means?
Any Tier one, and possibly most Tier two, servers should last 10 years. Is that 10 years without a single failure? I'm guessing not.
-
@donaldlandru said:
In real life I am not sure if it works; on paper it does. It is a false sense of security, but the MSA does have active/active controllers built in (10Gb iSCSI), redundant power supplies, and of course the disks are in a RAID. The risks that are not mitigated by the single chassis are:
Not active/active. It has codependent controllers that fail together. It's the opposite of what people expect when they say "redundant". It's the "two straw houses next door in a fire" scenario. Having two houses is redundant, but if they are both made of straw and there is a fire, the redundant house will provide zero protection while very likely making a fire that much more likely to happen or to spread. Active/Active controllers from HP start in the 3PAR line, not the MSAs.
All that other redundant stuff is a red herring. EVERY enterprise server has all of that redundancy, but without the cripplingly dangerous dual controllers. That makes any normal server MORE reliable than the MSA, not less. If anyone talks to you about the "redundant" parts in an MSA, you are getting a sales pitch from someone trying very hard to trick you, unless they point out that every server has those things, so this is "just another server".
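To put some back-of-the-envelope numbers on why shared-fate controllers don't buy what people assume they do (the figures here are made up purely for illustration, not MSA measurements):

```python
# Illustrative only: made-up availability figures, not vendor data.
# The point: "two controllers" only adds availability if they can fail independently.

controller = 0.999         # hypothetical availability of one controller
shared_path = 0.9995       # hypothetical availability of the shared chassis/firmware

# Truly independent pair: the system is down only if BOTH controllers are down.
independent_pair = 1 - (1 - controller) ** 2

# Codependent pair: a fault in the shared component takes out both controllers,
# so the pair can never be more available than that shared piece.
codependent_pair = shared_path * independent_pair

print(f"single controller:  {controller:.6f}")        # 0.999000
print(f"independent pair:   {independent_pair:.6f}")  # 0.999999
print(f"shared-fate pair:   {codependent_pair:.6f}")  # ~0.999499, capped by shared_path
```

The exact numbers don't matter; the shared component sets the ceiling, which is exactly the straw-house problem above.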
-
@donaldlandru said:
@dafyre said:
In your Dev environment, you have 3 servers... with 288GB of Ram, 64GB of RAM, and 16 GB of RAM... Assume RAM compatibility... What happens if you balance out those three servers and get them at least close to having the same amount of RAM?
Does that help you at all? If that is a good idea, then why not look at converting them to XenServer and switching to Local Storage? You could then replicate the VMs to each of the three hosts, or you could set up HA-Lizard.
The two smaller servers pre-date my time with the company and were likely back-of-the-truck specials. Both of these are slated to be replaced next year with a single server with similar specs to the big server. The smallest one is already maxed out, and it doesn't make sense to upgrade the other one just to retire it.
I also don't need HA on these (I don't have HA today on these), so I think this is an opportunity to move to a different platform.
Something to consider here is Scale. It would be a forklift operation, more or less, but it would let you consolidate everything and get HA thrown in, all for one price. It would not be cheap, but you could move workloads over as needed. Start with three nodes to replace the Dev environment up front, then move over the Ops environment as you can.
It easily doesn't fit the budget, but it makes this model really easy and gives you all of the features that you want with essentially zero effort.
-
@dafyre said:
If I am not mistaken, for planned outages you can actually migrate VMs from one XenServer host to another (this is also true for Hyper-V, IIRC) without having shared storage... So for maintenance, you can live migrate from one XenServer host to another (it would copy the storage too)... Whether or not that is feasible depends on the size of your VMs and the speed of your network... among other things.
Yes, because XS includes "Storage vMotion" functionality for free.
-
@donaldlandru said:
- Chassis failure (I am sure it can happen, but the only parts in the chassis are the backplane and some power routing)
- Software bug -- the most likely failure to occur
- Human error (oops I just unplugged the storage chassis)
Chassis failure is uncommon, but common enough that it gets discussed on SW regularly as people have their units die. We only see that every so many months, but it does happen and is not to be ignored. This one issue puts this into a "blade" risk scenario, and we've seen, just this month, people lose entire blade enclosures because of backplane or control issues. It's a small risk in the relative sense but a very real one.
Software bugs are huge on the MSA and any device in this class. They are magnified by the dual controllers and so become extremely risky, causing outages at a pace that seems to dramatically outscale standard servers.
Human error is big and I've seen some pretty dramatic ones. It's more likely on an MSA than on local storage.