The Inverted Pyramid of Doom Challenge
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
Backups - (Some Hypervisors have changed block tracking so a backup takes minutes, others don't meaning a full can take hours). BTW I hear Hyper-V is getting this in 2016 (Veeam had to write their own for their platform).
Sure... but what does the application layer care? Either the application takes care of its own backups and doesn't care what the hypervisor does, or it relies on IT to handle backups and it isn't any of their concern either.
Again, this is an application vendor or programmer trying to get involved in IT decisions, processes and designs. Do you let the company that makes your sofa determine how big your fireplace has to be because "they want to ensure that you are cozy?"
Application owners have RPO/RTOs, and they often expect the infrastructure people to take care of that. (When I have a 5TB OLTP database, in-guest options generally fail to deliver somehow.)
If I buy a couch or desk that's massive for a tiny apartment I could see the sales guy asking how big my doorway is to make sure they can deliver it. Otherwise I'll be saying "GALLERY FURNITURE SUCKS THEY SELL COUCHES THAT DON'T WORK". This is what users, application owners, and infrastructure people do today. Vendors MUST protect their name. I'm not saying these whiners make any sense, but people do this.
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
For the 5 years I consulted, "why is this slow" was one of the most common engagements. 9/10 of the time I was chasing some crazy application issue, it had nothing to do with the application. Generally it was staring people in the face, had a giant RED alarm, and was fairly obvious (disk latency isn't supposed to be 1200ms, and NL-SAS drives shouldn't be used for DBs in 5 billion dollar companies, yo?). Assuming that internal IT doesn't understand what it will take to deliver their applications may look like vendors crossing a line, but it is CRITICAL to being a successful application vendor. I've seen users, IT and C-suite trash applications that worked fine, but the infrastructure was all wrong....
And that's why external IT consulting was brought in. Not a random application vendor.
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
Application owners have RPO/RTOs, and they often expect the infrastructure people to take care of that. (When I have a 5TB OLTP database, in-guest options generally fail to deliver somehow.)
Yup, and if they outsource IT to the application vendor, that SLA isn't owned by the actual IT department but by someone who came in, put in something new and ran away. If the application owners need a reliable RPO/RTO, they need to work with IT, not work against them.
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
Application owners have RPO/RTOs, and they often expect the infrastructure people to take care of that. (When I have a 5TB OLTP database, in-guest options generally fail to deliver somehow.)
Yup, and if they outsource IT to the application vendor, that SLA isn't owned by the actual IT department but by someone who came in, put in something new and ran away. If the application owners need a reliable RPO/RTO, they need to work with IT, not work against them.
It's the reality in most companies. Software vendors' requirements are not rooted in how IT SHOULD be run, but in how it is. I agree with you in principle (it shouldn't matter); I've just seen hundreds of counterexamples that would have destroyed these companies' names.
There was a thread on SW recently where someone said "NIMBLE SUCKS I DON'T GET THE IOPS I WAS PROMISED". The next post was his Nimble sales rep posting "So I see you're at 20% load, your IO latency is .5 ms currently and while your 220C model is one of our smaller ones we have far larger ones. If you're having any problems please call us and we will help you." I laughed, but it made me realize the damage that incompetent IT does to the name of a product or application. We are at the point that a sales rep would rather piss off a customer and call them out as an idiot (he was nice about it) than risk their company's name being dragged through the mud.
The "IT guy is always right" attitude in IT bothers me. It's part of why I always enjoyed arguing with you (and others internally), as it's the only way to challenge my ideas and learn and think of new things. It's also part of the reason I enjoyed consulting (although I did learn a lot of tact about how to carefully make people think it was their idea, or gently expose why what they were doing was hilariously a bad idea).
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
If I buy a couch or desk that's massive for a tiny apartment I could see the sales guy asking how big my doorway is to make sure they can deliver it.
Would you honestly still do business with a furniture store that didn't state the size (that's demanding certain performance, the results not the means) but rather demanded that you buy a certain make or style of door regardless of the fact that the one that you had would have been big enough? Because that's the comparison.
Making sure that the SIZE is right I always agreed to. Demanding only doors from certain vendors be used is where the insanity happens. Or forcing you to install a new door because they don't trust your measurements.
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
If I buy a couch or desk that's massive for a tiny apartment I could see the sales guy asking how big my doorway is to make sure they can deliver it.
Would you honestly still do business with a furniture store that didn't state the size (that's demanding certain performance, the results not the means) but rather demanded that you buy a certain make or style of door regardless of the fact that the one that you had would have been big enough? Because that's the comparison.
Making sure that the SIZE is right I always agreed to. Demanding only doors from certain vendors be used is where the insanity happens. Or forcing you to install a new door because they don't trust your measurements.
I think we've reached stasis here. I've provided examples where the platform matters.
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
It's the reality in most companies. Software vendors' requirements are not rooted in how IT SHOULD be run, but in how it is.
It's mandated shadow IT. It's a subversive approach. One nice thing for internal IT is that for any failing of the system you get to run the vendor through the wringer. But I want products for my customers that are based around them being successful, not assuming their failure. I have very different goals (I want the company to succeed) than the software vendors (they couldn't care less if it works, only that they don't get blamed).
I don't blame vendors for taking advantage of bad business processes. Suckers deserve to be suckered; they have no one to blame but themselves. But my point is that good IT would be working on protecting their businesses from these processes, and good management would be tasking them to do so.
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
There was a thread on SW recently where someone said "NIMBLE SUCKS I DON'T GET THE IOPS I WAS PROMISED". The next post was his Nimble sales rep posting "So I see you're at 20% load, your IO latency is .5 ms currently and while your 220C model is one of our smaller ones we have far larger ones. If you're having any problems please call us and we will help you." I laughed, but it made me realize the damage that incompetent IT does to the name of a product or application. We are at the point that a sales rep would rather piss off a customer and call them out as an idiot (he was nice about it) than risk their company's name being dragged through the mud.
That's not incompetence, though. That's just someone lying. There is a difference.
-
One last thought...
IF the reason that Xen has 2% market share is because there is NO LOGICAL REASON for vSphere or paid Hyper-V (with VMM to manage) then that means 98% of IT people are idiots. If 98% are idiots, wouldn't that mean they should be outsourcing their IT as much as possible to their vendors or others? (and therefore not deploy Xen).
Catch-22
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
There was a thread on SW recently where someone said "NIMBLE SUCKS I DON'T GET THE IOPS I WAS PROMISED". The next post was his Nimble sales rep posting "So I see you're at 20% load, your IO latency is .5 ms currently and while your 220C model is one of our smaller ones we have far larger ones. If you're having any problems please call us and we will help you." I laughed, but it made me realize the damage that incompetent IT does to the name of a product or application. We are at the point that a sales rep would rather piss off a customer and call them out as an idiot (he was nice about it) than risk their company's name being dragged through the mud.
That's not incompetence, though. That's just someone lying. There is a difference.
I learned years ago to never ascribe to malice what you can attribute to ignorance in this industry. He likely was unhappy the number in his dashboard didn't say 100K!
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
I think we've reached stasis here. I've provided examples where the platform matters.
Okay, I'll buy that. The platform matters when internal IT has failed and you outsourced to an external IT department who has an interest in selling you something that you don't need to make extra money on probably the sale and definitely the consulting. Yes, I agree, but I don't agree that that doesn't match my original point. It's not in the interest of the customer, but there is a reason why they feel that they have to do it based on other decisions made in the same way.
Do you feel, however, that since this discussion is based on scale for the context of the original question, that there is ever a realistic time that this happens at three or fewer compute nodes? We are talking about three nodes for an entire business here. What business, anywhere, is that small and deploying systems where vendors interact with them in this manner? I'm not saying that theoretically it isn't possible, but this thread is asking for an example where this has ever happened.
Outside of pure theory, and even there I feel that it is hard to theorize, who has products that need these kinds of things while being so small as to not have benefits of the IPOD due to scale?
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
I learned years ago to never ascribe to malice what you can attribute to ignorance in this industry. He likely was unhappy the number in his dashboard didn't say 100K!
And likewise, my rule of thumb is that willful ignorance is one of the worst forms of malice.
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
One last thought...
IF the reason that Xen has 2% market share is because there is NO LOGICAL REASON for vSphere or paid Hyper-V (with VMM to manage) then that means 98% of IT people are idiots. If 98% are idiots, wouldn't that mean they should be outsourcing their IT as much as possible to their vendors or others? (and therefore not deploy Xen).
Catch-22
Agreed. That makes total sense. And I agree 100%. Almost all (and I totally mean that, something like 98%) of IT should be outsourced. The industry should shrink dramatically, the smaller pool of people who remain should be consolidated into fewer shops and those shops should demand far higher levels of excellence and continued training and raise salaries significantly as the industry tends to lose people who are really valuable because they can often make more money elsewhere and choose to.
However, this is what I'm talking about: assuming that bad decisions will be made, we then make bad recommendations based on that. It's not valuable to make recommendations to the 98%, and there is no point in doing so. They neither look for nor listen to good advice. Good advice always exists solely for those that will take it. It's the same discussion as "is college valuable for you (in IT)." If every single person ever listened to that advice, it would be self-defeating in weird ways. But they don't; advice around it is for the .1% who might listen and enact change. Good advice remains good advice; that there are reasons why bad decisions are made doesn't stop them from being bad decisions.
I'm never sure if I can explain what I mean here well, but I see it a lot in IT - people give bad advice (could be in any arena, I see it constantly, though) based on the idea that "well they won't listen to my actual good advice." Sure, we know that they won't, but should we make bad recommendations because of that?
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
The "IT guy is always right" attitude in IT bothers me.
Me too, that's why I rarely feel that IT should be allowed to decide things. But IT is the sole critical information holder for a lot of things in the business, performance, cost and security being key ones. No other department has a view into these factors. It is the job of business management to ensure that they have skilled IT that knows its role in the business, to pass decisions through them and to listen to them. And it is IT's job to understand that it is part of the context of the business and only knows certain factors.
That's why an IT veto is important because IT can identify scams, incompetence, lack of industry standards, bad practices and such that other departments would likely not even understand (what do you MEAN VB6 isn't a common thing any longer!!) - because they can filter out options that have no place being considered. Whereas the business needs to make most decisions because they know what is valuable to the business. Neither can act alone. The idea with much of this, though, is that IT would be bypassed and much of the most important IT decisions be made by people who are not IT.
Imagine doing this to HR or accounting. Would other departments demand that HR violate good practices around compensation or reporting? Or that accounting not correctly record expenditures or not pay all taxes? (These do happen, and it's normally very bad.)
-
This is my quote from the original challenge: "We all (I hope by now) know that SANs have their place and a super obvious one that explains why enterprises use them almost universally and know why that usage has no applicability to normal SMBs - scale."
I agree with why lots of shops might deploy systems like you are describing, even if I generally don't agree with that decision, but I'm pretty confident that the use cases that you are describing @John-Nicholson are tied, nearly universally, to a scale that would already prompt a SAN-based infrastructure (or similar.)
Have you seen these in small environments where the scale did not exist to warrant a SAN otherwise?
-
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
Security - Guest introspection support to hit compliance needs, micro-segmentation requirements (EPIC has drop-in templates for NSX, possibly HCI at some point). If you want actual micro-segmentation and inspection on containers there isn't anything on the market that competes with Photon yet. At some point there may be ACI templates, but that will require network hardware lock-in (Nexus 9K) and that's even crazier (applications defining what switch you can buy!).
Really, once you even start to think of defining the storage, you have to define the switches too. Once you are into that realm of not allowing IT to screw anything up, you can't let them screw anything up. You'd really want to be defining cabling, UPS and more as well.
-
I'm definitely not trying to say that there are absolutely zero potential use cases for an IPOD, but only that they are so rare below the "scale" line (the drop dead number is three and the general rule of thumb is twelve) that I'm wondering if anyone has a real world example.
Even within the described examples, they are theoretical I believe, and very unlikely. How many of these have been observed?
It's worth pointing out the use case and adding it as an aside to a recommendation document. I'll give a completely different example, that hopefully explains my thinking...
If someone is deciding on if they want to attend university or not, we generally focus on things like the time and money, career advantages and such. But there are career choices, like doctor, lawyer or teacher, that require a degree and it is not in any way a decision, it simply is a requirement. That doesn't imply that the degree is useful for that field outside of the requirement, but the requirement is the requirement. So if an IPOD is a non-IT requirement without business context, it does not fall into the business context. While this should be assumed as being an exception to any case of business logic, it should probably be explicitly stated nonetheless because people often forget about the cases outside of the decision matrix - or focus solely on them and think that cases outside of the logic pool influence those within it (e.g. the career success of doctors tells us nothing about the value of a college degree outside of that one case where it is a requirement.)
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
This is my quote from the original challenge: "We all (I hope by now) know that SANs have their place and a super obvious one that explains why enterprises use them almost universally and know why that usage has no applicability to normal SMBs - scale."
I agree with why lots of shops might deploy systems like you are describing, even if I generally don't agree with that decision, but I'm pretty confident that the use cases that you are describing @John-Nicholson are tied, nearly universally, to a scale that would already prompt a SAN-based infrastructure (or similar.)
Have you seen these in small environments where the scale did not exist to warrant a SAN otherwise?
Have you seen a FlexPod or Vblock? Part of CI is defining the network switches and configuration of the fabric. The argument is that even with a 20% capital markup the time to outcome outweighs the do-it-yourself approach, and historically they are right. The difference is that going from CI to HCI has moved the time to value down exponentially. I think the logical progression for HCI vendors in some areas is to do just this.
Scale (long before you worked with them) in the old GPFS days had a stricter HCL for switches than any other iSCSI storage vendor I had ever seen. You know what, there was a reason. Scale-out systems are incredibly vulnerable to shifty low-end switching. I even tried deploying one with 3750Xs (much more expensive than the 2910ALs, but practically much slower) and performance was awful until the switches were replaced. The funny thing was the customer tried to blame Scale (and not the slow Cisco switches that the network team was in love with). I would argue Scale is "ahead of the curve" in having HCLs and restrictions on outside factors that can make them look bad (this was something like 5 years ago).
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
This is my quote from the original challenge: "We all (I hope by now) know that SANs have their place and a super obvious one that explains why enterprises use them almost universally and know why that usage has no applicability to normal SMBs - scale."
I agree with why lots of shops might deploy systems like you are describing, even if I generally don't agree with that decision, but I'm pretty confident that the use cases that you are describing @John-Nicholson are tied, nearly universally, to a scale that would already prompt a SAN-based infrastructure (or similar.)
Have you seen these in small environments where the scale did not exist to warrant a SAN otherwise?
The deployment of shared storage (in whatever form it takes) I view as being about a demand for operational flexibility more than just a flat driver for availability of apps that can't HA themselves. The assumption that a SAN is driven entirely by scale alone, or HA alone, is a false assumption.
Until HCI became more mainstream a DAS synchronous array (like his HUS) was really as bullet proof as you could get and still enjoy the operational benefit. Still for some customers (My old industry of TAS) that "pet rock" type deployment still has value over HCI.
-
2 nodes is still limited for HCI. For mission-critical environments with low skills in house (which is the TAS industry), true 2-node shared-nothing systems still pose some risk (quorum is not a concept they easily grasp, and in-house staff are fully capable of screwing it up and split-braining things). Shared DAS is far harder to @#$@ up. HCI can scale down, but as you have to have witness systems (and DRBD's common use with a heartbeat of an IP I find to be a bullshit excuse for a witness system, as it lacks state) there is extra complexity that can be beyond many steady-state ops teams.
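To make the quorum and witness point concrete, here is a minimal sketch of majority voting when the link between two nodes is cut. The function names and vote counts are illustrative assumptions, not how any particular HCI product or DRBD actually implements it.

```python
def has_quorum(votes_seen: int, total_votes: int) -> bool:
    """A node may keep writing only if it sees a strict majority of all votes."""
    return 2 * votes_seen > total_votes


def nodes_allowed_to_write(votes_seen_by_node: dict[str, int], total_votes: int) -> list[str]:
    """Return the nodes that still hold quorum.

    votes_seen_by_node maps each node to the number of votes it can
    currently reach, counting its own vote.
    """
    return [
        node
        for node, seen in votes_seen_by_node.items()
        if has_quorum(seen, total_votes)
    ]


# Two nodes, no witness, replication link cut: each node sees only itself.
# Neither side has a majority, so a strict cluster stops entirely. A cluster
# that ignores quorum would let both sides write, which is split brain.
no_witness = nodes_allowed_to_write({"node_a": 1, "node_b": 1}, total_votes=2)

# Two nodes plus a witness (three votes): node_a can still reach the witness,
# node_b cannot, so exactly one side keeps running.
with_witness = nodes_allowed_to_write({"node_a": 2, "node_b": 1}, total_votes=3)

print(no_witness)
print(with_witness)
```

The witness only helps if it is a real third vote that both nodes agree on. A bare "can I ping this IP" check has no memory of which side it already sided with, which is the objection to a stateless heartbeat target.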
-
White glove service (including onsite no matter where the hell the deployment is) has power. There's a lot you can do remote, but when the storage layer goes down, having a technician onsite who's badged and knows WTF they are doing is a powerful force. The farther up the stack you go this becomes less critical but for low level services that are stateful (and network used to access remotely) there is still value in good on site techs.
Now I recognize that most of this can be mitigated by outsourcing infrastructure operations. But at that point why would I buy a Scale cluster (vs. just putting my stuff in one of the major public clouds, or with one of the THOUSANDS of VCAN partners who can hit my niche of compliance, operations capabilities, PaaS, and geographic connectivity requirements)? I'm seeing hundreds of public cloud providers adopt HCI, and realistically, if the benefits (cost, scaling, etc.) can be had in a hosted environment where there is better operations and engineering, I would argue the real counter argument for small shops is not HCI (with something like Scale) vs. SAN, but rather HCI vs. hosted.
Now maybe that is Scale's end game (have some key MSP partners they drive customers towards using), but building multi-tenant tools is hard, and that's a heavy arms race that will require quite a bit of capital to get right and would stray so far from their target market that I don't see it.
-
@scottalanmiller said in The Inverted Pyramid of Doom Challenge:
@John-Nicholson said in The Inverted Pyramid of Doom Challenge:
I think we've reached stasis here. I've provided examples where the platform matters.
Okay, I'll buy that. The platform matters when internal IT has failed and you outsourced to an external IT department who has an interest in selling you something that you don't need to make extra money on probably the sale and definitely the consulting. Yes, I agree, but I don't agree that that doesn't match my original point. It's not in the interest of the customer, but there is a reason why they feel that they have to do it based on other decisions made in the same way.
Do you feel, however, that since this discussion is based on scale for the context of the original question, that there is ever a realistic time that this happens at three or fewer compute nodes? We are talking about three nodes for an entire business here. What business, anywhere, is that small and deploying systems where vendors interact with them in this manner? I'm not saying that theoretically it isn't possible, but this thread is asking for an example where this has ever happened.
Outside of pure theory, and even there I feel that it is hard to theorize, who has products that need these kinds of things while being so small as to not have benefits of the IPOD due to scale?
Sure. If the customer has Oracle or SQL 2016 or Windows Datacenter or other per-socket licensing, simply shaving down from 3 nodes to 2 could be significant.
I've also seen companies where they had a site with very low compute requirements (port facility) where they needed to scale deep (400TB for video archive). Replicated local or RAIN is crazy more expensive on this (they spent 87K on a HUS with a RAID 60 style DP pool for this if memory serves; good luck buying 800 or 1600TB of disks for that price).
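A rough sketch of the raw-capacity arithmetic behind that comparison follows. It assumes the 800TB and 1600TB figures correspond to keeping two and four full copies of the 400TB, and it assumes a hypothetical 8 data + 2 parity disk group for the RAID 60 style pool; the real group width on that HUS isn't stated, and spares and formatting overhead are ignored.

```python
def raw_tb_replicated(usable_tb: float, copies: int) -> float:
    """Raw capacity needed when every block is stored `copies` times."""
    return usable_tb * copies


def raw_tb_parity(usable_tb: float, data_disks: int, parity_disks: int) -> float:
    """Raw capacity needed for parity groups of data_disks + parity_disks."""
    return usable_tb * (data_disks + parity_disks) / data_disks


USABLE_TB = 400  # video archive size from the post

two_copies = raw_tb_replicated(USABLE_TB, copies=2)
four_copies = raw_tb_replicated(USABLE_TB, copies=4)
# Hypothetical 8+2 dual-parity groups, striped together RAID 60 style.
dual_parity = raw_tb_parity(USABLE_TB, data_disks=8, parity_disks=2)

print(f"2 copies:        {two_copies:.0f} TB raw")
print(f"4 copies:        {four_copies:.0f} TB raw")
print(f"8+2 dual parity: {dual_parity:.0f} TB raw")
```

Under those assumptions the parity pool needs about 500TB of raw disk against 800TB or 1600TB for replication, which is where the price gap at this depth comes from.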
I know you like trying to find absolute rules for the SMB (which to be honest they kind of need, because if there's anything I learned from consulting in that space, or watching random SpiceWorks comments, it is that everyone is at a subconscious level drawn towards awful ideas), but we are running increasingly into a world of workloads and needs that have no relation at all to what that company's or site's industry or employee count is. Simple exclusionary rules make even less sense.
It's like decisions on RAID for storage systems. Increasingly the historic rules (deploy RAID 10, and size for capacity) are becoming awful advice, and with most modern storage systems it's not even something you can choose anymore, as the decision is abstracted at a RAIN level (or in the case of most modern storage appliances it is a fixed erasure code set based on a stripe width of their NVRAM's ability to destage a write). The real savior of the SMB here is platforms, appliances and systems that remove the ability to go off and do something stupid, rather than "hard and fast rules" that increasingly don't matter (or are wrong).