    Vendor Mistake - VMware Infrastructure Decisions

    IT Discussion
    Tags: dell, infinio, vendors, vmware, storage
    • StorageNinja (Vendor) @KOOLER

      @KOOLER said in Vendor Mistake - VMware Infrastructure Decisions:

      @NetworkNerd said in Vendor Mistake - VMware Infrastructure Decisions:

      Before I started here a couple of months ago, my boss purchased a couple of Dell R630s and a PowerVault MD3820i (20 drive bays) to be our new infrastructure at HQ. We have dual 10Gb PowerConnect switches and two UPS devices, each connected to a different circuit. The plan is to rebuild the infrastructure on vSphere Standard (licenses already purchased) and have a similar setup in a datacenter somewhere (replicate the SANs, etc.). We're using AppAssure for backups (again, already purchased).

      The PowerVault has 16 SAS drives that are 1.8 TB 7200 RPM SED drives and 4 SAS drives that are 400 GB SSD for caching. Well, we made disk groups and virtual disks using the SEDs (letting the SAN manage the keys), but it turns out we cannot use the SSDs they sent us for caching. In fact, they don't have SED SSDs for this model SAN.

      At the time the sale was made, Dell assured my boss everything would work as he requested (being able to use the SSDs for caching with the 7200 RPM SED drives). Now that we know this isn't going to be the case, we have some options.

      First, they recommended we trade in the PowerVault for a Compellent and EqualLogic. The boss did not want that because he said you are forced to use RAID 6 on those devices and cannot go with RAID 10 in your disk groups. As another option, Dell recommended we put the SSDs in our two hosts and use Infinio so we can do caching with the drives we have. In this case we would make Dell pay for the Infinio licenses and possibly more RAM since they made the mistake.

      But I'm wondering if perhaps there is another option. Each server has 6 drive bays. So we have 20 drives total. Couldn't we have Dell take the SAN back, give us another R630, and pay for licenses of VMware vSAN for all 3 hosts? Each server has four 10 Gb NICs and two 1 Gb NICs. That might require we get additional NICs. But in this case, I'm not sure drive encryption is an option or if we can utilize the SEDs at all.

      I've not double-checked the vSAN HCL or anything for the gear in our servers as this is just me spitballing. Is there some other option we have not considered? We're looking to get the 14 TB or so of usable space that RAID 10 will provide, but the self-encrypting drives were deemed a necessity by the boss. And without some type of caching, we will not hit our IOPS requirements.

      Any advice is much appreciated.

      Keep R630s, refund PowerVault, refund AppAss. Get VMware VSAN and Veeam (accordingly).

      I've got a non-trivial number of R630s in my lab running vSAN. You'll ideally want the HBA 330 (you can settle for the PERC H730 if you already have it), but otherwise the server works fine. The only limits compared with the R730/R730XD are fewer drive bays and no GPU support.
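
For anyone checking the math on the "14 TB or so" figure quoted above, here is a quick sketch. It assumes all 16 of the 1.8 TB drives go into RAID 10 with no hot spares, and it counts decimal TB before any formatting overhead. The function name is only illustrative.

```python
def raid10_usable_tb(drive_count: int, drive_tb: float) -> float:
    """Usable capacity of a RAID 10 set: half of raw, since every drive is mirrored."""
    if drive_count < 4 or drive_count % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return drive_count * drive_tb / 2


# 16 x 1.8 TB SED drives from the original post (assumes no hot spares).
usable = raid10_usable_tb(16, 1.8)
print(f"{usable:.1f} TB usable")  # prints "14.4 TB usable"
```

Holding drives back as hot spares would lower that number, so treat it as an upper bound.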

      • StorageNinja (Vendor) @scottalanmiller

        @scottalanmiller said in Vendor Mistake - VMware Infrastructure Decisions:

        @NetworkNerd said in Vendor Mistake - VMware Infrastructure Decisions:

        @scottalanmiller said in Vendor Mistake - VMware Infrastructure Decisions:

        @NetworkNerd said in Vendor Mistake - VMware Infrastructure Decisions:

        @scottalanmiller said in Vendor Mistake - VMware Infrastructure Decisions:

        @NetworkNerd said in Vendor Mistake - VMware Infrastructure Decisions:

        I'm also assuming you are turning RAID off on each host so Starwind can provide RAIN for you (thus creating the storage pool).

        No, you leave RAID enabled on the hosts and Starwind provides Network RAID. There is no RAIN here.

        So you'd leave RAID on and then make a small local VMFS datastore for the Starwind VM to run on so that Starwind can use the rest of the unformatted storage on the host for its network RAID?

        You just follow the Starwind install guide. But yes, that is what is going on.

        After reading each of these, I finally understand how it works:
        http://www.vladan.fr/starwind-virtual-san-product-review/
        http://www.vladan.fr/starwind-virtual-san-deployment-methods-in-vmware-vsphere-environment/
        https://www.starwindsoftware.com/technical_papers/HA-Storage-for-a-vSphere.pdf

        So, in a nutshell, you do use RAID on the host as you normally would and even provision VMware datastores as you normally would. It's the VMDKs you present to the Starwind VM that get used as your virtual iSCSI target. And you can add in the cache size of your choice from the SSD datastores on your ESXi host.

        So if I'm patching servers like I should, I'd have to patch the VMs running Starwind as well. Oh man would I hate to install a patch from MS that bombs my storage. I guess theoretically that isn't too different from installing some firmware on a physical SAN that has certain bugs in it. If one Starwind VM gets rebooted, you still have your replication partner presenting storage to the hosts and are ok.

        Right. And Hyper-V alone has very tiny, solid patches. Nothing like patching the OS.

        Hyper-V with a console is just as big as Windows Server from a patching perspective, and even Core installs see patches regularly (often monthly). The install footprint of the ~150 MB VMkernel is tiny compared with the 10 GB+ for Hyper-V Core installs. ESXi regularly goes ~6 months without needing a patch. Most of the patch surface is in upper-stack components.

        • StorageNinja (Vendor) @NetworkNerd

          @NetworkNerd said:

          @scottalanmiller said in Vendor Mistake - VMware Infrastructure Decisions:

          @John-Nicholson said in Vendor Mistake - VMware Infrastructure Decisions:

          @scottalanmiller If you're doing a 3-node vSAN for a low-cost deployment you should go single socket and get more cores per proc. That leaves you room to scale later and cuts the vSAN cost in half.

          They are likely stuck here with whatever was already bought. But good info for a greenfield deployment. Or if they manage to return these for three R730 for example.

          I'm not entirely certain we'll be stuck with what we bought. My boss and I were on a conference call with folks from Dell yesterday afternoon. They were talking about different options in SAN devices that would meet our requirements (whether it was Compellent, EMC, etc.), but the biggest issue was that these options were so expensive. Again, not one of them mentioned the potential for a VSAN deployment, so we brought it up (using either VMware VSAN or Starwind). The Dell team has to go back and redesign a quote for gear that would better support a VSAN deployment. In their words, they would likely have to return the servers and the PowerVault we have right now (not sure about the other gear - PowerConnect switches, TrippLite devices, APC PDUs, AppAssure appliance, and IP KVM switch).

          I'll be curious to see what comes back when they re-quote.

          Honestly, it may just be that the inside team isn't familiar with it yet (they just reassigned who has to know which products, and people are flying all over the place doing training). Worst case, call the VMware inside SDS desk (they are in Austin, right across the parking lot from Spiceworks HQ). Those guys have been piecing together vSAN quotes and have heads dedicated to working with your Dell team to make sure stuff is good.

          Now off to pack for ANZ for 2 weeks to go do some of the mentioned training....

          • scottalanmiller @StorageNinja

            Any recommendation of Hyper-V obviously means without a console.

            • scottalanmiller @StorageNinja

              You think Dell engineers don't know about VMware? That seems... terrifying 😉

              • NetworkNerd @Net Runner

                @Net-Runner said in Vendor Mistake - VMware Infrastructure Decisions:

                I have a VMware-based cluster of two ready-nodes purchased from Starwind https://www.starwindsoftware.com/starwind-hyperconverged-appliance half a year ago so I will try to share my experience on that matter. These are completely DELL-based and the pricing is very fair compared to what DELL OEM-partners want for the same configurations.
                As already mentioned above, in this particular scenario, StarWind runs inside a VM on each host. The underlying storage is presented over a standard datastore. Alternatively, you can pass the whole RAID controller through to the StarWind VM if your ESX boots from USB/SD/SATADOM/whatever, which is a common and good practice nowadays. Using hardware RAID makes the overall performance of a single server much faster than you can achieve with the software RAIN provided by either VMware vSAN or MSFT S2D (I've done some benchmarking on that matter).
                ESX hosts are connected over iSCSI to both StarWind VMs simultaneously. These VMs mirror the internal storage and present it back to ESX as a single MPIO-capable iSCSI device. Since a round robin policy is used, there is no storage failover if one StarWind VM is gracefully restarted for patching or the whole physical host suddenly dies. In the case of a single-host power outage, only the migration of production VMs takes place while storage remains active, which I find quite awesome.
                Another thing that I enjoy about StarWind is that it uses RDMA-capable networks (I have Mellanox ConnectX-3) for synchronization, which leaves a lot of CPU resources for primary tasks instead of serving storage requests.
                Right now I am waiting for the Linux-based StarWind VSA implementation, which is said to be arriving soon.
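
The round robin behaviour described in the quoted post can be pictured with a toy model. This is only a sketch of the idea: I/O alternates across both paths while both StarWind VMs are up, and simply continues on the surviving path when one goes away. It is not how the ESXi path selection policy is actually implemented, and the path names and the function are hypothetical.

```python
from itertools import cycle


def round_robin_io(paths, healthy, n_requests):
    """Assign n_requests I/Os to paths in rotation, skipping paths not in `healthy`."""
    if not any(p in healthy for p in paths):
        raise RuntimeError("no healthy path: storage unavailable")
    rotation = cycle(paths)
    served = []
    while len(served) < n_requests:
        path = next(rotation)
        if path in healthy:
            served.append(path)
    return served


# Hypothetical names for the two mirrored StarWind VMs.
paths = ["starwind-vm-a", "starwind-vm-b"]

# Both VMs up: requests alternate between the two paths.
print(round_robin_io(paths, {"starwind-vm-a", "starwind-vm-b"}, 4))

# One VM restarted for patching: every request lands on the surviving path.
print(round_robin_io(paths, {"starwind-vm-b"}, 4))
```

In this model the device stays reachable as long as at least one mirror partner is healthy, which matches the point made in the quoted post.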

                What license of vSphere came with that, and what version of vSphere are you running on the ready nodes?

                • NetworkNerd

                  Here's an update for folks following this thread. I was told Dell found a 1.6 TB SED SSD certified with another Dell storage appliance which uses the same firmware and controllers as our PowerVault MD3820i. They think it may work with our configuration and have shipped us one to test. If that does not work correctly, we will continue to look at VSAN options (VMware or Starwind).

                  • NetworkNerd

                    Here's the latest: We had been doing some testing with Infinio while waiting for the SSD from Dell, using diskspd inside a VM that had VMDKs on multiple datastores that were LUNs on the SAN. Their read caching works very well if you need something for that purpose.

                    Dell sent us the 1.6 TB SED SSD mentioned above (not officially a supported configuration), but it actually made the SAN overall slower using our benchmarking tools and would only apply the cache to one of the controllers for whatever reason.  Dell understood and was willing to help us pursue additional options to make it right.

                    During the process of Dell trying to get us one of those drives that they thought would work in our SAN, I had mentioned we should look at VMware VSAN.  But they were quoting it using exact ready node configurations (dual CPU sockets per node), which would have put us over our vSphere licensed limit for this location (4 sockets) in addition to having to purchase VSAN Standard licenses.  I suggested single socket and 4 hosts.  There are SED options that will work with VSAN, but it really limits you in terms of choices.
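
The licensing constraint here is simple socket arithmetic. A rough sketch follows. The 4-socket limit and the single-socket, 4-host idea come from the post above. The 3-host count for the dual-socket quote is an assumption (the post does not say how many ready nodes were quoted), and the function and variable names are made up for illustration.

```python
LICENSED_SOCKETS = 4  # vSphere licensed limit for this location, per the post


def sockets_required(hosts: int, sockets_per_host: int) -> int:
    """Total CPU sockets that need a per-socket license."""
    return hosts * sockets_per_host


def fits_license(hosts: int, sockets_per_host: int, licensed: int = LICENSED_SOCKETS) -> bool:
    """True if the cluster stays within the licensed socket count."""
    return sockets_required(hosts, sockets_per_host) <= licensed


# Assumed: three dual-socket ready nodes (node count not stated in the post).
print(fits_license(hosts=3, sockets_per_host=2))  # False: 6 sockets > 4

# Suggested in the post: four single-socket hosts.
print(fits_license(hosts=4, sockets_per_host=1))  # True: 4 sockets <= 4
```

Since vSAN is also licensed per socket, halving the sockets per host halves that line item too, which is the point made earlier in the thread about going single socket.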

                    As far as the end solution goes, it looks like we'll get bumped to Enterprise Plus in our vSphere licensing to take advantage of VM Encryption as well as getting VSAN Standard for each host for a hybrid config.  That way we can use larger spinning disks in the hosts and let the software handle the encryption.  We will have to have an external KMS which will also be provided as part of the solution.

                    The only thing to answer now is whether VxRail does the trick or we go with some kind of modified ready node / build your own host for VSAN. The SAN we have now and 2 R630 hosts plus two of the 10 Gb PowerConnect switches will go back to Dell to exchange.

                    Starwind was a consideration, but it did not seem as easy to manage and maintain as VSAN for a 4-node configuration to get the storage capacity needed.

                    • NetworkNerd

                      After many conversations between my boss and his Dell team, here's what we're getting (as of early next week):

                      • Upgrade to vSphere Enterprise Plus and vSAN Standard for 6 sockets
                      • Four Dell R730s with single socket Xeon procs and 10 drives each (8 10K SAS 1.2 TB non-SED HDDs, 2 SSDs) for two vSAN disk groups per host, running ESXi on mirrored SD cards
                      • Hytrust for KMS (to be used with VM Encryption) with support paid for by Dell

                      Here's what we are returning:

                      • Dell PowerVault MD3820i, all SAS SEDs in it, all SSDs that were originally for caching
                      • Two PowerEdge R630 servers with dual socket Xeon processors and no internal drives
                      • Two PowerConnect N4032 switches that were slated for connectivity to the SAN only

                      We will be keeping 2 of the PowerConnects we originally ordered to stack and use for the VSAN cluster here at HQ.

                      We originally had vSphere Standard and vCenter Standard for 6 sockets (4 sockets for here at HQ and 2 sockets for the DR site). Those 6 sockets will still be split as 4 at HQ and 2 at the DR site, making a 4-node vSAN cluster at HQ and a 2-node cluster (with witness) at the DR site. We're keeping the AppAssure appliance as well.
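
As a back-of-the-envelope check on the HQ cluster, here is a sketch of the capacity math. The host and drive counts come from the list above. The storage policy is an assumption: failures to tolerate (FTT) of 1 with mirroring, which keeps two copies of every object. The even split of 4 HDDs plus 1 cache SSD per disk group is also inferred rather than stated. In a hybrid config the SSDs act as cache and do not add capacity. Slack space and on-disk overhead are ignored.

```python
HOSTS = 4            # single-socket R730s at HQ
HDDS_PER_HOST = 8    # 1.2 TB 10K SAS capacity drives
HDD_TB = 1.2
SSDS_PER_HOST = 2    # cache tier, one per disk group
FTT = 1              # assumption: mirroring with failures to tolerate = 1

disk_groups_per_host = SSDS_PER_HOST
hdds_per_group = HDDS_PER_HOST // disk_groups_per_host  # assumes an even split

raw_tb = HOSTS * HDDS_PER_HOST * HDD_TB
usable_tb = raw_tb / (FTT + 1)  # each object is stored FTT + 1 times

print(f"{hdds_per_group} HDDs per disk group")
print(f"{raw_tb:.1f} TB raw, about {usable_tb:.1f} TB before overhead")
```

Under those assumptions the result lands above the roughly 14 TB target mentioned at the start of the thread, with some room left for overhead.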

                      So with the vSAN 6.6 release just this week, it means we will be on the bleeding edge once everything is configured. The setup would probably make a great series of blogs assuming I have the time to write them.

                      Thanks to everyone here for the help and advice. I'm excited to play with the new toys!

                      • scottalanmiller

                        Cool. They seem to have really come through.
