    Azure Outage... Again

    IT Discussion
    microsoft azure
    • wirestyle22W
      wirestyle22 @scottalanmiller
      last edited by

      @scottalanmiller said in Azure Outage... Again:

      @aaronstuder said in Azure Outage... Again:

      I just created a VM - no problem seen here....

      What we've learned from rooms full of people who see Azure outages all the time is that nearly all outages are very localized. They seem to have their systems set up in such a way that even platform level issues only appear to certain blocks of people. Probably people running on different servers or whatever (not for the VMs, obviously, but for the console.) This lets MS claim zero outages even in the face of hundreds of pissed off customers in a single room talking about how they often have outages.

      Literally paying for a service that brings your entire company (more or less) to a standstill. I hope the client isn't a production company.

      • gjacobseG
        gjacobse @Alex Sage
        last edited by

        @aaronstuder said in Azure Outage... Again:

        I just created a VM - no problem seen here....

        Ah - but which site are you running from?

        • A
          Alex Sage @gjacobse
          last edited by

          @gjacobse East US

          • scottalanmillerS
            scottalanmiller @Alex Sage
            last edited by

            @aaronstuder said in Azure Outage... Again:

            @gjacobse East US

            How can you tell? I would guess that that is the one that we are on too. I know what datacenters the VMs were in, but not which datacenter is providing the console.

            • scottalanmillerS
              scottalanmiller
              last edited by

              We can't even open a ticket about our VMs being down. MS Support is offline for us, too. Their outage with Azure goes so deep in their infrastructure that it disables their technical problem reporting system. So we can't file an outage, because Azure is down that hard (for us.)

              This is how most Azure outages have been, both for us and for people that we have spoken to about it. This seems to be very common: they don't just lose the VMs, they lose everything, including all visibility and communications channels.

              • scottalanmillerS
                scottalanmiller @Alex Sage
                last edited by

                @aaronstuder said in Azure Outage... Again:

                @scottalanmiller said in Azure Outage... Again:

                Microsoft says that they will respond to the outage in eight hours.

                Eight hours for Azure being 100% down (within our visible scope?) That's a pretty awful SLA for just checking their tickets to see that their platform has gone offline (within some scope.)

                For all Internet facing Virtual Machines that have two or more instances deployed in the same Availability Set, we guarantee you will have external connectivity at least 99.95% of the time. - http://uptime.is/99.95

                Problem there is that MS lies through their teeth about not having outages and refuses to acknowledge them. And so far, almost all outages that we've seen remove:

                • Ability to open tickets
                • Ability to report outages
                • Ability to track downtime
                • All Availability Sets and Failover

                So those SLAs are totally fake. They don't even have systems for working with them. Paying for extra support would be crazy as they don't respect the support systems that they have.
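
For scale, here is a rough sketch of what the 99.95% figure quoted above allows in downtime, compared with the eight-hour response time. It assumes the percentage is measured over a plain 30-day or 365-day window; the function name and the window lengths are illustrative choices, not taken from Microsoft's SLA text.

```python
def allowed_downtime_minutes(sla_percent, period_days):
    """Minutes of downtime that still satisfy an uptime percentage over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


# Assumed measurement windows (illustrative): a 30-day month and a 365-day year.
monthly = allowed_downtime_minutes(99.95, 30)
yearly = allowed_downtime_minutes(99.95, 365)

# The eight-hour response time mentioned in the thread, in minutes.
response_wait = 8 * 60

print(f"30-day budget:  {monthly:.1f} min")
print(f"365-day budget: {yearly:.1f} min")
print(f"8-hour wait:    {response_wait} min")
```

Under those assumptions the budget is about 21.6 minutes per 30-day month and about 4.4 hours per year, so an eight-hour wait for a first response is already longer than a full year's allowance.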

                • A
                  Alex Sage @scottalanmiller
                  last edited by

                  @scottalanmiller Have you tried from the US?

                  • scottalanmillerS
                    scottalanmiller @Alex Sage
                    last edited by

                    @aaronstuder said in Azure Outage... Again:

                    @scottalanmiller Have you tried from the US?

                    Yup, NY and KY.

                    • scottalanmillerS
                      scottalanmiller
                      last edited by

                      And the client sees it down from PA and MD.

                      • gjacobseG
                        gjacobse
                        last edited by

                        [Screenshot: 2016-04-26 11_54_43-NTG - PURPLEPRINCESS - Connected.png]

                        This is what we are showing, and I believe we have four systems running under Azure.

                        • scottalanmillerS
                          scottalanmiller
                          last edited by

                          Way more than four.

                          • A
                            Alex Sage
                            last edited by

                            What's the status of your subscription?

                            • scottalanmillerS
                              scottalanmiller @Alex Sage
                              last edited by

                              @aaronstuder said in Azure Outage... Again:

                              What's the status of your subscription?

                              Can't check on it. The outage has taken out the system that shows it.

                              • scottalanmillerS
                                scottalanmiller
                                last edited by

                                Which is what we see with most outages: they lose some core database that reports subscriptions, and this cascades to the console and on to the VMs. This is just me guessing, but it is probably a database instance that handles the subscription data, or some data that builds the subscription, that has failed, and all of the other outages are likely from dependencies on that system. We've seen that, or almost exactly that, a few times, and tons of other companies (hundreds) that we have interfaced with (mostly via MS conferences) have reported this exact problem as the one they see most often.
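
To make the guessed cascade concrete, here is a minimal sketch of how one failed component can take down everything that depends on it, directly or indirectly. The service names and the dependency map are hypothetical, made up to mirror the guess above; nothing here reflects how Azure is actually built.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration only):
# each service lists the services it needs in order to work.
DEPENDS_ON = {
    "subscription_db": [],
    "console": ["subscription_db"],
    "vms": ["subscription_db"],
    "support_tickets": ["console"],
}


def affected_by(failed, depends_on):
    """Return the set of services that go down when `failed` goes down."""
    # Invert the map: service -> services that depend on it.
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    # Breadth-first walk outward from the failed service.
    down = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, []):
            if service not in down:
                down.add(service)
                queue.append(service)
    return down


print(sorted(affected_by("subscription_db", DEPENDS_ON)))
print(sorted(affected_by("console", DEPENDS_ON)))
```

In this toy map, losing the subscription database takes out all four services, while losing only the console takes out the console and ticketing but leaves the VMs running.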

                                • A
                                  Alex Sage @scottalanmiller
                                  last edited by

                                  @scottalanmiller said in Azure Outage... Again:

                                  Can't check on it. The outage has taken out the system that shows it.

                                  [Screenshot: upload-f39f0a08-79e7-49c5-b108-42e8f96ed4af]

                                  • gjacobseG
                                    gjacobse
                                    last edited by

                                    [Screenshot: 2016-04-26 12_05_15-NTG - PURPLEPRINCESS - Connected.png]

                                    • A
                                      Alex Sage @gjacobse
                                      last edited by

                                      @gjacobse That seems like an issue.

                                      • scottalanmillerS
                                        scottalanmiller @Alex Sage
                                        last edited by

                                        @aaronstuder said in Azure Outage... Again:

                                        @gjacobse That seems like an issue.

                                        Yes, that's why we think that their loss of subscription data is the core of the issue. Their VMs are dependent on the subscription data but they can't keep their subscription data working.

                                        • wirestyle22W
                                          wirestyle22 @scottalanmiller
                                          last edited by

                                          @scottalanmiller said in Azure Outage... Again:

                                          @aaronstuder said in Azure Outage... Again:

                                          @gjacobse That seems like an issue.

                                          Yes, that's why we think that their loss of subscription data is the core of the issue. Their VMs are dependent on the subscription data but they can't keep their subscription data working.

                                          How would they have configured this? Wouldn't any of their servers be clustered within multiple data centers? How does this happen with such a huge service?

                                          • scottalanmillerS
                                            scottalanmiller @wirestyle22
                                            last edited by

                                            @wirestyle22 said in Azure Outage... Again:

                                            @scottalanmiller said in Azure Outage... Again:

                                            @aaronstuder said in Azure Outage... Again:

                                            @gjacobse That seems like an issue.

                                            Yes, that's why we think that their loss of subscription data is the core of the issue. Their VMs are dependent on the subscription data but they can't keep their subscription data working.

                                            How would they have configured this? Wouldn't any of their servers be clustered within multiple data centers? How does this happen with such a huge service?

                                            They have several known issues in this system. My guess is that either they have another external system that manipulates this one, feeds in bad data and causes outages that way, or the code of the system that interacts with it has bugs and causes issues that way. The former, I think, is far more likely based on a few factors - namely that account "type" often affects this. For example, because we are an MS Partner, there have been reports that some partner system has regularly connected to Azure's database and caused it to become corrupted.

                                            No amount of clustering, multiple data centers or keeping servers up can fix this problem in the least. The problem is, from what we've been told, all from their workflows and security. Basically they have an unhealthy, non-working system that is given permission to control Azure and has been known to "randomly" cause Azure to totally fail.
