Massive Azure Outage and No Support
-
What's worse is that I've been warning for months that they were going to screw this up. For about six months we've been trying to get them to fix their account issues because we knew that they were getting this all messed up and were going to do something stupid. Their account managers kept saying that everything was "fixed" or everything was "fine."
Now we've been down for over eight hours and they refused to even talk to the technical people until the account was fixed. The account that was "fixed" for the last six months?
I am so not happy. That they didn't fix the account before this, when I've been warning them over and over that this was risky, makes it impossible for this to be an accident. They decided to play fast and loose with our account because they were lazy or didn't care. Everyone who works there leaves so quickly they probably thought (and were probably right) that it would be "someone else's problem" when it fell apart.
-
Did you have some customers hosted on Azure or was this exclusively NTG stuff?
-
@coliver said:
Did you have some customers hosted on Azure or was this exclusively NTG stuff?
Customers too.
-
@scottalanmiller said:
@coliver said:
Did you have some customers hosted on Azure or was this exclusively NTG stuff?
Customers too.
That really isn't good.
-
I'm told that they have decided that it is okay to turn them back on now. We will see.
-
@scottalanmiller said:
I'm told that they have decided that it is okay to turn them back on now. We will see.
Waiting... with bated breath.
-
Stuff like this makes me glad we have (most) of our junk internal. So much trust to put in another company!
Edit: advantages to cloud are obvious, don't get me wrong.
-
I've run into the "multiple accounts on a vendor's backend" problem before.
Hell, Disney has this problem too. If you register at their streaming movies website it never asks you to create a username, which of course leads you to assume that your email address is your username. But if you set up an account on store.disney.com they ask for a username or an email address.
This, combined with the fact that their authentication process actually all goes through GO.COM, left me without access to my store account for 4 days. Additionally, since I use LastPass, it wasn't able to log me in correctly because of the whole damned GO.COM problem: not all logon prompts provide the correct information as to where the credentials are going, so LastPass didn't offer me the needed information.
-
@MattSpeller said:
Stuff like this makes me glad we have (most) of our junk internal. So much trust to put in another company!
Edit: advantages to cloud are obvious, don't get me wrong.
On-prem for the WIN!!!!
-
@MattSpeller said:
Stuff like this makes me glad we have (most) of our junk internal. So much trust to put in another company!
Edit: advantages to cloud are obvious, don't get me wrong.
In this case, from what I know... it's not the cloud that has the issue... it's the account manager and billing department.
-
@gjacobse Account managers and billing departments and the janky support are all part and parcel of it. I'd argue it's 100% a cloud issue. Just not a technical IT issue.
-
@MattSpeller said:
@gjacobse Account managers and billing departments and the janky support are all part and parcel of it. I'd argue it's 100% a cloud issue. Just not a technical IT issue.
It's a Microsoft issue. Anyone seen something like this with Rackspace, Amazon, Digital Ocean, etc.? This is twice in a month with MS on an issue like this.
-
Now, to buy time, they are claiming that they are turning the systems on. But it has been fifteen minutes and not a single one is back on yet. We are at nine hours down now and no answers, no fixes, no nothing.
-
@scottalanmiller Totally agree it's MS that screwed up, just pointing out that this is a little-talked-about issue with cloud computing. Maybe it should be called "Trust in Other Companies" computing. The other companies you mention still have billing departments and account managers and they're all human and capable of (unintentionally) giving you a real bad day. How do you build reliability into the cloud from outside?
-
@MattSpeller said:
@scottalanmiller Totally agree it's MS that screwed up, just pointing out that this is a little-talked-about issue with cloud computing. Maybe it should be called "Trust in Other Companies" computing. The other companies you mention still have billing departments and account managers and they're all human and capable of (unintentionally) giving you a real bad day. How do you build reliability into the cloud from outside?
Well, one option is crossing cloud boundaries; a lot of companies do this. They host half on Amazon and half on Rackspace, for example.
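As a rough illustration of the idea, here is a minimal sketch of picking whichever provider is currently answering. Everything in it is an assumption for the sake of example: the endpoint URLs are made up, `pick_provider` is a hypothetical helper, and the health check is passed in as a function so you can plug in whatever probe you actually trust. It is not how any particular company does it.

```python
from typing import Callable, Iterable, Optional


def pick_provider(
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint whose health check passes, or None."""
    for url in endpoints:
        try:
            if is_healthy(url):
                return url
        except Exception:
            # A check that blows up is treated the same as an unhealthy provider.
            continue
    return None


# Hypothetical endpoints: the same workload hosted with two separate providers.
ENDPOINTS = [
    "https://app.provider-a.example/health",
    "https://app.provider-b.example/health",
]

# Simulate provider A being down (for any reason, technical or billing).
down = {"https://app.provider-a.example/health"}
active = pick_provider(ENDPOINTS, lambda url: url not in down)
print(active)
```

In real use the `is_healthy` argument would be an actual HTTP or TCP probe with a timeout. The point is only that the decision lives outside either provider, so one vendor's account mess doesn't take everything down with it.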
-
You can have multiple accounts with a single provider. Unfortunately Azure causes problems with this and you can't safely use that technique with Microsoft because of the whacky authentication stuff that they try to do. Their authentication is why this all happened. They get so weird about how that stuff works that even they can't figure out which account is which.
-
This thread makes me glad we only have colocated systems. We did tons of research into the colo, and we are 200% happy with our choice. They have all the niceties... onsite generators, 2x power feeds, each w/ UPS redundancy, redundant network feeds.... I feel like our little, specialized cloud hosting platform we provide is more robust than MS now.
-
@RojoLoco said:
@MattSpeller said:
Stuff like this makes me glad we have (most) of our junk internal. So much trust to put in another company!
Edit: advantages to cloud are obvious, don't get me wrong.
On-prem for the WIN!!!!
Yes and no. On premises, you and your team, however small, are responsible. At the last two nonprofits I worked for I was the sole IT person, supporting (x) staff, (x) servers and (x) locations.
I walked in on Wednesday to a server crash - I walked out of the office at 1pm on Friday.
Ideally, in a hosted arrangement, you would have more than just 2 or 3 people to work on the issue, therefore allowing the task of rebuilding / replacing a system to be balanced - sleep is a good thing to have. You start to make mistakes after so many hours of being awake and dealing with a single issue.
-
@gjacobse said:
I walked in on Wednesday to a server crash - I walked out of the office at 1pm on Friday.
The cost of being a one person show is high, no doubt. I still love it.
-
@MattSpeller said:
@gjacobse said:
I walked in on Wednesday to a server crash - I walked out of the office at 1pm on Friday.
The cost of being a one person show is high, no doubt. I still love it.
I really love my team. That's not to say I didn't enjoy being the sole person. But sometimes you need to have a second pair of eyes to catch that obscure error you missed looking over the logs for the 24th time.