Understanding Server 2012r2 Clustering
-
Anyway, back to my original question and why you shouldn't install Exchange or SQL on a SAN. If you don't use application level clustering and you use local storage then there is risk of data loss as a result of hardware failure. That risk is the same as the risk of data loss with HA and a SAN, right? Unless you would always restore from backup after a crash, which I wouldn't. I would allow SQL and Exchange to recover from the crash and hope no data loss has occurred. In this situation, your recovery time with HA is significantly quicker than with local storage. Which is the point of HA, isn't it?
What am I missing?
-
@Carnival-Boy said:
Anyway, back to my original question and why you shouldn't install Exchange or SQL on a SAN. If you don't use application level clustering and you use local storage then there is risk of data loss as a result of hardware failure. That risk is the same as the risk of data loss with HA and a SAN, right?
That's correct. In both cases you lack protection against crash cases.
Although it should be pointed out that using HA could actually automate a disaster and keep things running even though there is data loss, possibly unknown data loss. So you would have to decide for your organization whether silently losing data is better or worse than extra downtime while people decide what to do.
In finance, you go down rather than lose financial data silently. When an outage like this happens, you stay down until humans decide it is okay to continue. In another environment, you have to weigh the benefits and risks.
-
@Carnival-Boy said:
What am I missing?
That the cost of a SAN and HA is extremely high. If they were free, they might make a lot of sense. But when you have to buy a SAN, introduce more points of failure altogether (an inverted pyramid is more likely to experience a failure than a stand alone server) and then have to buy HA, you are lowering your overall availability. So you are not getting the "HA point" since you are moving away from HA.
That this entire thing doesn't deliver the "point" is one thing. That it doubles the corruption points is an additional thing.
HA "is" about uptime. But if dataloss is acceptable you can do all kinds of things to get HA that are cheaper and easier. Most people assume the point of HA is to keep things running while not losing data.
Whether dataloss is acceptable or not, adding the SAN and HA product for databases is an inverted pyramid and is not a path towards high availability but a path away from it.
That it confuses people into thinking it is a DAG replacement is another layer of issues above and beyond the core ones.
-
@Carnival-Boy said:
I would allow SQL and Exchange to recover from the crash and hope no data loss has occurred. In this situation, your recovery time with HA is significantly quicker than with local storage. Which is the point of HA, isn't it?
But you are effectively inducing a crash. You have three points of potential failure at least (each server and the SAN.) Each one is just as likely to fail as the next. In the real world where people are building at this scale, SANs are actually more likely to fail than servers but you can spend big bucks to fix this problem - but why spend lots of money to not be as good as DAG? Just setting money on fire, never a good business decision.
So we aren't doing HA, we are below SA (standard availability.) Less available than a normal server. That there are two points of data corruption (if the SAN fails OR if the host fails) instead of just one is an additional risk beyond the normal problems of an inverted pyramid.
So if the point of HA is uptime, then this architecture does not make sense. If the point is to be financially wise, this is the worst option. If the point is high availability without data loss, it just gets worse and worse.
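To put rough numbers on the inverted pyramid argument, here is a minimal sketch of the usual series/parallel availability arithmetic. The 0.999 figure is an assumption for illustration only, as is the idea that every box (each host and the SAN) is equally available and fails independently. It also generously assumes failover between the two hosts always works, and it ignores data corruption entirely.

```python
def series(*availabilities):
    """All components must be up, so availabilities multiply."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel(*availabilities):
    """Any one component being up is enough (perfect failover assumed)."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down


# Assumption: each host and the SAN have the same availability.
BOX = 0.999

standalone = BOX
# Two hosts that can fail over to each other, both depending on one SAN.
inverted_pyramid = series(parallel(BOX, BOX), BOX)

print(f"stand alone server: {standalone:.6f}")
print(f"two hosts + SAN:    {inverted_pyramid:.6f}")
```

Under these assumptions the two hosts together are almost never both down, but the single SAN underneath them caps the whole stack at slightly less than one box on its own. Anything that makes failover imperfect only widens the gap.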
-
Another thing to consider, what else are you protecting with the SAN and HA? If you are only protecting ONE SQL server and ONE Exchange server, the costs of purchasing an additional SQL license and an additional Exchange license so you can run software HA instead is much less expensive than the SAN will be. Granted you'll need twice the amount of disk, but that is also likely much less expensive.
This whole situation is definitely what Scott was talking about earlier in 'doing IT.' While I'm annoyed at my own ignorance and lack of thought process, reading Scott's posts has definitely added to my understanding and helped me to change my thinking process. Not saying I always see it his way, but I try to be more encompassing in my thought process - but even my current phone project is showing that I'm still lacking.
-
Just in case it wasn't obvious what the "suggested" options were, because people often say to me "Okay, so the SAN is wrong, but what is right?"
In this case there are two generally good options, depending on business needs.
If HA is really needed, if uptime and data loss are bad and it is worth a lot of money to ensure that this does not happen, then you should trust your vendor and do things their way. While I'm down on white papers and whatnot, a good vendor is going to provide a way for them to make money and for you to get what you want. Microsoft is excellent at this. They make a product specifically designed to make Exchange as reliable as possible that they provide themselves. They will support you if you do other things, but they only provide one solution here that is their own... DAG. If HA, fault tolerance and data protection are important DAG is the answer.
For the vast majority of SMB (and even many enterprises) DAG is too much money to protect too little. That's fine and assumed. That almost anyone should be using DAG is not the suggestion, only that spending more than DAG to get less is not the answer. For most people, the answer is as simple as "just use a stand alone server." A stand alone server is a fraction of the cost of the SAN approach both in hardware and in software (does not require HA licensing for VMware.) And it has 67% fewer points of failure (one instead of three.) And it has no chained dependencies. And it is much simpler to manage making human failure less likely. And it has fewer points of potential data corruption (one instead of two.) These things add up. Save money, better performance, more reliable in both uptime and in data protection. It's a pretty big win unless you compare it to DAG which costs more but has better uptime and data protection.
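The counts above are easy to check. This is just a sketch of the arithmetic using the numbers from this thread (three boxes versus one, two corruption points versus one); the `Design` class and `percent_fewer` helper are names made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    failure_points: int     # boxes whose failure causes an outage or a failover
    corruption_points: int  # boxes whose crash can corrupt the database


def percent_fewer(baseline: int, candidate: int) -> float:
    """How many percent fewer points the candidate has than the baseline."""
    return 100.0 * (baseline - candidate) / baseline


san_ha = Design("two hosts + SAN", failure_points=3, corruption_points=2)
standalone = Design("stand alone server", failure_points=1, corruption_points=1)

fewer_failures = percent_fewer(san_ha.failure_points, standalone.failure_points)
fewer_corruption = percent_fewer(san_ha.corruption_points, standalone.corruption_points)

print(f"{fewer_failures:.0f}% fewer points of failure")
print(f"{fewer_corruption:.0f}% fewer points of potential corruption")
```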
Only a DAG cluster and a stand alone server have clear "winning" use cases.
-
@Dashrender said:
While I'm annoyed at my own ignorance and lack of thought process, reading Scott's posts has definitely added to my understanding and helped me to change my thinking process. Not saying I always see it his way, but I try to be more encompassing in my thought process - but even my current phone project is showing that I'm still lacking.
I'm lacking too, but I'm not down about it. I just like getting free advice from Scott. Companies pay thousands for the kind of advice that we get here for free.
I only have around 100 users, so neither SAN/HA nor DAGs/clustering was ever a real consideration for me. But that user count could be going up to 200 soon through mergers and acquisitions (or a new job), so it's something I'm trying to understand a bit more about.
It seems to me that we should avoid treating SAN/HA as an alternative to DAG/application clustering as really they're different solutions to different problems. SAN/HA addresses minimising downtime whilst DAG/clustering also addresses data loss. So you need to identify your problem and then select your solution, rather than picking your solution first.
-
I definitely agree with Scott and have learned a lot from him through threads here and on SW!
I am a firm believer in the right tools for the right job. The SAN vs Local Storage (DAG, replication, etc, etc) will always be a decision to be made. And I am coming to agree more and more with @scottalanmiller that Local Storage + Replication + Good Backups will almost always be the better answer for an SMB.
My take-away from my first and only SAN deployment (as smoothly as it went, it could have been much, much, much worse) is that you must define what you are looking for clearly (to yourself, your team, and your vendor of choice). You must also do your own research about the products the vendor recommends. We were recommended a number of solutions that were FT only and not true HA. Once your vendor has made you a quote for "Product X", check and make sure the product's company defines things the same way that you do. If you think "Product Y" may be a better fit for what you are wanting, then ask your vendor about it.
Never be afraid to second guess yourself, your vendor, or your peers in a respectful way.
-
@Carnival-Boy said:
It seems to me that we should avoid treating SAN/HA as an alternative to DAG/application clustering as really they're different solutions to different problems. SAN/HA addresses minimising downtime whilst DAG/clustering also addresses data loss. So you need to identify your problem and then select your solution, rather than picking your solution first.
Let me add "at that scale" to that. HA addresses minimising downtime, but SAN does not. SAN actually exacerbates downtime, all other things being equal. Many people, especially sales people, conflate SAN with "buying super expensive equipment", which is not the same thing. You can buy $60K SANs that are crazy reliable. This is very true. But for even better uptime than two hosts connected to a $60K SAN, you could just buy a $60K server that has fewer points of failure but matches or beats the expensive SAN in all the ways that it is reliable. That's part of the sales trick, making SANs look more reliable than servers by comparing entry level servers to high end SANs. But at the same levels, they are equally reliable (or slightly weighted to the advantage of servers due to massively larger volumes) so anything that is done to make a SAN reliable can be done to make a single server more reliable.
So while HA is all about uptime and a SAN may or may not be a part of that strategy, the real value to a SAN is in scalability. SANs are more flexibly scalable than any other solution. Scalable means in number of supported physical hosts. So if you need to scale to very large host counts, SAN is the obvious choice. That's SAN's one strong suit - and it is a big one. But when you don't need that one thing, SAN lacks its major "pro" and just comes with the "cons."
-
@Carnival-Boy said:
So you need to identify your problem and then select your solution, rather than picking your solution first.
That's huge. So many SMBs (and I assume enterprises but I see it rarely there) approach with the solution and then try to figure out how to make it work. I see it constantly where people start with "I need a SAN" and can't tell you why they feel that they need or want it. The desire for the SAN is often the starting point of the conversation rather than a clear business need. So when you ask what their goal is or how they are servicing the business, they are lost.
-
@scottalanmiller said:
@Carnival-Boy said:
So you need to identify your problem and then select your solution, rather than picking your solution first.
That's huge. So many SMBs (and I assume enterprises but I see it rarely there) approach with the solution and then try to figure out how to make it work. I see it constantly where people start with "I need a SAN" and can't tell you why they feel that they need or want it. The desire for the SAN is often the starting point of the conversation rather than a clear business need. So when you ask what their goal is or how they are servicing the business, they are lost.
If you can't answer the "why do I need $product" question, then chances are you really don't need it. I think @Carnival-Boy got it spot on though. Identify the problem and find products that solve that problem... Not buy the square peg and try to make it fit into the smaller moon-shaped hole.
-
Maybe I missed it, but it seems there is a huge amount of discussion around HA and SANs on Spiceworks, but relatively little on application clustering and DAGs or SQL Server resilience. So this thread is something of an eye-opener for me.
If you work on the premise that databases aren't a good fit for VMware HA and shouldn't be installed on a (non-redundant) SAN then I believe you rule out nearly every mission critical system an SMB is likely to use.
Thinking of my own environment, I have ERP, Sharepoint, EDM, Exchange and AD, all of which are databases. It's only really the file server that isn't (and a lot of that is moving to the Sharepoint and EDM servers). The other servers aren't really mission critical and/or are fairly static, like print servers, so can be fired up from a backup very easily without significant loss of data.
So protecting and managing databases becomes the key. And the resellers I've dealt with seem to have very little knowledge or experience of SQL Server. Possibly because they're from a hardware background, or possibly because there isn't much money in SQL Server in the SMB marketplace. So they're pushing a hardware solution when a software solution is what SMBs really need (I guess).
-
@Carnival-Boy said:
Maybe I missed it, but it seems there is a huge amount of discussion around HA and SANs on Spiceworks, but relatively little on application clustering and DAGs or SQL Server resilience. So this thread is something of an eye-opener for me.
I'm on a good percentage of those discussions (I think) and DAG does not come up too often. Very often the people considering SANs are not actually listing their workloads and are only, so they say, trying to get the platform to HA, without really considering whether that fixes the workloads or not. But for SMBs, applications that have DAG available to them are few. Exchange tends to be hosted. SQL Server tends to be the Express (no DAG) edition.
But lots of workloads are similar, even without DAG. Active Directory Domain Controllers don't use DAG but have their own application layer failover. Same for MySQL and other databases.
-
@Carnival-Boy said:
If you work on the premise that databases aren't a good fit for VMware HA and shouldn't be installed on a (non-redundant) SAN then I believe you rule out nearly every mission critical system an SMB is likely to use.
Absolutely. This is what I've been saying to SMBs - SAN is for one purpose only: scale. Spiceworks even had me give a webinar in February about that. SMBs look to enterprises for "what to do" but enterprises all have scale (by definition) and SAN is about cost savings for them. SMBs tend to look at SAN not understanding its purpose and being confused by enterprise storage consolidation and thinking that that somehow applies to them, which it does not. Not that no SMB should have SAN, but it is few and far between and always for the purpose of storage consolidation at scale. SAN does not provide reliability. SAN hurts reliability, but can be beneficial for other reasons and can be made (at cost) to overcome the inherent reliability concerns.
HA is not as much a "never for SMBs" thing as SANs, but it should be rare. SMBs often think that they need HA far beyond the needs of huge enterprises like investment banks (think Canary Wharf) and other enormous, big money loss outage companies. Some need HA and some get HA for cheap which changes the equation (Active Directory HA is super cheap) but going after platform HA (what VMware offers) rarely does what they think and almost never makes sense. It's really focused on technology like web servers where load balancers have not been implemented.
-
@Carnival-Boy said:
Thinking of my own environment, I have ERP, Sharepoint, EDM, Exchange and AD, all of which are databases. It's only really the file server that isn't (and a lot of that is moving to the Sharepoint and EDM servers). The other servers aren't really mission critical and/or are fairly static, like print servers, so can be fired up from a backup very easily without significant loss of data.
File servers are typically "good" HA workloads. Minor risk for data loss and data loss is typically really tiny (like one file or two that you can restore as a single file from backup rather than the whole system). There are other solutions to consider. DFS on Windows, for example, and full fault tolerance with DRBD on Linux or HAST on FreeBSD. These can be complicated to implement, but are all free. In cases where you might spend a lot of money on VMware's HA offering, you could go beyond HA to complete fault tolerance without using VMware, for free (but with some effort.)
So in that case, it is a trade off depending on your needs.
-
@Carnival-Boy said:
So protecting and managing databases becomes the key. And the resellers I've dealt with seem to have very little knowledge or experience of SQL Server. Possibly because they're from a hardware background, or possibly because there isn't much money in SQL Server in the SMB marketplace. So they're pushing a hardware solution when a software solution is what SMBs really need (I guess).
The profit margin and "not my fault, call the vendor" benefits of a SAN are enormous. A single sale might set up a salesman for a month or two without needing to make another sale. And the "blame the vendor" benefits are huge. It's very hard to complain to your salesman when a SAN fails, they have another throat to give you to choke. Whereas if they recommend something else, the only throat to choke might be theirs, and they don't want to deal with that even if it is in your interest.
You are correct, there is very little call and very little money in SQL reliability for the SMB. It would be rather surprising if any salesman had been trained on it, and they often might not even be aware of how databases really work. Database skills tend to be more enterprise leaning.
-
@scottalanmiller said:
But for SMBs, applications that have DAG available to them are few. Exchange tends to be hosted. SQL Server tends to be the Express (no DAG) edition.
Oh really? What's the definition of SMB? I don't have a lot of experience. Most of the companies I've worked for have around 100 users. In Europe we use the term SME which I think is generally up to 250 employees. I'd have thought anything over 50 users and you would be avoiding Express for mission critical applications and anything over 150 users and you'd be giving serious consideration to database availability and using Enterprise licencing and features.
-
@Carnival-Boy said:
Oh really? What's the definition of SMB? I don't have a lot of experience. Most of the companies I've worked for have around 100 users. In Europe we use the term SME which I think is generally up to 250 employees. I'd have thought anything over 50 users and you would be avoiding Express for mission critical applications and anything over 150 users and you'd be giving serious consideration to database availability and using Enterprise licencing and features.
There is no hard and fast rule, but pretty typically SMB is 20 - 500 users (but lots of people disagree, many say 1 - 200, IBM says 2,000 - 5,000, etc. IBM considers anything under 2,000 to not be a business.) SME, in the US, is the category above SMB. Sort of. But the names, of course, make no sense at all. But SME is generally used to refer to larger, maybe 250 - 1,000, person companies.
But regardless of those murky definitions, most companies under 500 users that I see tend to use Express. That doesn't mean that lots don't pay for a full edition, but most tend to try to go for free. Database needs for SMBs tend to be pretty light and only so many workloads use SQL Server.
At 100 users I'm not sure I've seen enterprise licensing in more than one or two companies. It's decently rare there, I think.
Now what really tends to make sense for companies in that range is to often not have SQL Server at all since it is extremely expensive and the prices really only tend to make sense for larger businesses where the benefits of SQL Server can be leveraged. You can get the high end features of SQL Server for free from players like PostgreSQL.
-
If you think about SQL Server availability cost, getting an HA SQL Server setup is very costly. In a typical (what's typical?!?!) 100 person company, the cost can be brutal. Data loss might be of serious concern, but generally uptime is not (is four hours once a decade worth tens of thousands of dollars to protect against?) A good, single server setup for a database with good RAID and good backups can reduce the risk of data loss to extremely low levels while keeping availability at perfectly acceptable levels (for most companies.)
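That question can be roughed out as an expected cost comparison. This is only a sketch: every figure below is a made-up placeholder (a few hours of outage once a decade, a per-hour cost of downtime, an HA price spread over a five year lifespan), so plug in your own numbers. It also only prices downtime, not data loss.

```python
def expected_annual_outage_cost(outage_hours, outages_per_year, cost_per_hour):
    """Average yearly cost of downtime if nothing is done to prevent it."""
    return outage_hours * outages_per_year * cost_per_hour


def ha_pays_for_itself(annual_ha_cost, outage_hours, outages_per_year, cost_per_hour):
    """True only if the downtime avoided is worth more than HA costs per year."""
    avoided = expected_annual_outage_cost(outage_hours, outages_per_year, cost_per_hour)
    return avoided > annual_ha_cost


# Hypothetical inputs, not real quotes or measurements.
OUTAGE_HOURS = 4           # length of one outage
OUTAGES_PER_YEAR = 0.1     # once a decade
COST_PER_HOUR = 1_000      # what an hour of downtime costs the business
ANNUAL_HA_COST = 20_000 / 5  # HA spend spread over five years

yearly_risk = expected_annual_outage_cost(OUTAGE_HOURS, OUTAGES_PER_YEAR, COST_PER_HOUR)
worth_it = ha_pays_for_itself(ANNUAL_HA_COST, OUTAGE_HOURS, OUTAGES_PER_YEAR, COST_PER_HOUR)

print(f"expected downtime cost per year: ${yearly_risk:,.0f}")
print(f"HA cost per year:                ${ANNUAL_HA_COST:,.0f}")
print(f"HA pays for itself: {worth_it}")
```

With these placeholder numbers the downtime being protected against is worth a fraction of what HA costs each year. A business with a much higher cost per hour would get the opposite answer, which is why this has to be worked out per workload.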
It takes a bit of risk analysis and every business (and workload) is unique. But pretty often, HA is not needed. Even for 1,000 person companies for nearly all workloads.
-
@scottalanmiller said:
most companies under 500 users that I see tend to use Express.
Gosh. Obviously 500 users doesn't mean 500 concurrent users, but still, I wouldn't want 500 users accessing Express. That performance can't be great.
For ERP systems where a new system can run into hundreds of thousands of dollars, the cost of SQL Server Standard is pretty trivial in my opinion. It's less than $10k for a basic 2 core licence.