Someone doesn't like local storage for large amounts of data
-
@dafyre said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@Dashrender said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@olivier said in Someone doesn't like local storage for large amounts of data:
The thing, initially, was about having VMs with large VDIs, which for me is not a good practice.
But if you need to store a large amount of data, it's better to connect to a remote file share in the VM and keep small system disks (except for db/web usage, which is not huge in general).
That's all.
edit: is it more clear now?
Let's see if I reword it correctly....
If your VM needs a lot of file storage.... then it is better to mount that from a file server rather than keeping it in the original VM?
Ok, I get that, but this goes against the "new fangled" HCI (call it what you want) use of local "attached" storage?
It doesn't, not really. That's what caught @Dashrender; it's more two things...
- Split up workloads to keep the size of individual loads down
- Resort to raw storage when containerized storage gets too large and the above cannot be actioned
What does resort to raw storage mean?
Use direct access to the storage rather than a VDI. The size of the VDI is the concern.
So with Xen, as an example, you can use a raw LVM partition for a VM rather than a VDI file. This fixes the large VDI problem.
A typical setup would be to have one smaller VDI, say 20GB, for the OS and then a raw partition, say 30TB, for the files.
Why complicate things like that? Why not just make a 20GB LVM for the OS, and a 30TB LVM for the data?
How does it complicate anything? Having two LVMs is just as complicated, if not more so.
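For anyone following along, the two-volume layout being discussed (a small OS volume plus a large data volume handed to the VM as a raw block device) might be sketched like this. It is a dry run only: nothing is executed, it just builds the commands. The volume group name `vg0`, the LV names, and the guest device names are made up for illustration, and the `phy:` disk strings assume a plain Xen host using the xl toolstack rather than XenServer.

```python
def lvcreate_cmd(vg, name, size):
    """Build (but do not run) an lvcreate command for one logical volume."""
    return ["lvcreate", "-L", size, "-n", name, vg]


def xl_disk_spec(vg, name, guest_dev):
    """Build an xl-style disk entry that passes the LV to the guest as a raw block device."""
    return f"phy:/dev/{vg}/{name},{guest_dev},w"


# Hypothetical names and sizes: 20G for the OS, 30T for the files.
layout = [
    ("fileserver-os", "20G", "xvda"),
    ("fileserver-data", "30T", "xvdb"),
]

commands = [lvcreate_cmd("vg0", name, size) for name, size, _ in layout]
disks = [xl_disk_spec("vg0", name, dev) for name, _, dev in layout]

if __name__ == "__main__":
    for cmd in commands:
        print(" ".join(cmd))
    print("disk =", disks)
```

Either way there are two block devices to manage; the only question in the thread is whether the small one lives in a VDI file or on its own LV.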
-
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
Isn't this 2016 or did I read my calendar wrong?
Are you surprised that enormous files are a problem?
-
@Dashrender said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@Dashrender said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@olivier said in Someone doesn't like local storage for large amounts of data:
The thing, initially, was about having VMs with large VDIs, which for me is not a good practice.
But if you need to store a large amount of data, it's better to connect to a remote file share in the VM and keep small system disks (except for db/web usage, which is not huge in general).
That's all.
edit: is it more clear now?
Let's see if I reword it correctly....
If your VM needs a lot of file storage.... then it is better to mount that from a file server rather than keeping it in the original VM?
Ok, I get that, but this goes against the "new fangled" HCI (call it what you want) use of local "attached" storage?
It doesn't, not really. That's what caught @Dashrender; it's more two things...
- Split up workloads to keep the size of individual loads down
- Resort to raw storage when containerized storage gets too large and the above cannot be actioned
What does resort to raw storage mean?
Use direct access to the storage rather than a VDI. The size of the VDI is the concern.
So with Xen, as an example, you can use a raw LVM partition for a VM rather than a VDI file. This fixes the large VDI problem.
A typical setup would be to have one smaller VDI, say 20GB, for the OS and then a raw partition, say 30TB, for the files.
How does this make the situation any better? It still takes hours to migrate that data from one host to another. Does being raw somehow enable faster access to that data?
Technically faster, but that's not the reason. Stability is the driver here.
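To put rough numbers on the "hours to migrate" point, here is a back-of-envelope sketch. It assumes decimal terabytes and a hypothetical efficiency factor for protocol and disk overhead; the link speeds are illustrative and not from anyone's actual environment.

```python
def transfer_hours(size_tb, link_gbps, efficiency=0.7):
    """Estimate hours to copy size_tb terabytes over a link_gbps link.

    efficiency is an assumed fraction of line rate actually achieved.
    """
    size_bits = size_tb * 1e12 * 8
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return size_bits / usable_bits_per_second / 3600


if __name__ == "__main__":
    for gbps in (1, 10):
        hours = transfer_hours(30, gbps)
        print(f"30 TB over {gbps} Gbps: about {hours:.1f} hours")
```

The estimate is the same whether the 30TB sits in a VDI file or on a raw volume, which fits the point that raw is about stability rather than migration speed.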
-
This whole topic came up under the discussion about Continuous Replication.
Dustin wants to replicate his VMs from his primary to a secondary. His main VM is a filer; it has several (like 4-7) large (500 GB+) volumes.
From what I can tell, @olivier was suggesting that the 'data' of this server should be elsewhere, not on the VM host. Of course the data could be on a SAN/NAS/other VM host, whatever, but @olivier was suggesting that it shouldn't be on the XS in question because it would be too big and slow. OK, fine, but Dustin still wants to protect his data and have very small downtime windows. So, if Dustin puts the data portion on another XS box, Dustin would still need/want to CR that host to another host allowing him to spin up that data very quickly in the case of a failure.
So in the case stated above, what benefit is there in putting the data anywhere other than on the VM? No matter what, Dustin wants the full data in two places (live and near-live).
Does this make sense?
-
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
Isn't this 2016 or did I read my calendar wrong?
Are you surprised that enormous files are a problem?
No, I am not surprised at the file size. I'm a lil' baffled that you want to use a raw partition.
-
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
Isn't this 2016 or did I read my calendar wrong?
Are you surprised that enormous files are a problem?
No, I am not surprised at the file size. I'm a lil' baffled that you want to use a raw partition.
But aren't the two things one and the same? If you have a limitation on file sizes, raw is the only other option.
-
@scottalanmiller Isn't that part of why we all virtualize, so we don't have to deal with raw?
-
Dustin would still need/want to CR that host to another host allowing him to spin up that data very quickly in the case of a failure.
Use tools built for that. GFS2, Gluster, Ceph, Swift, Cinder, etc. The VM would remount after booting in the new host and the storage still fails over.
-
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller Isn't that part of why we all virtualize, so we don't have to deal with raw?
No, the idea of putting local storage into files is a post-virtualization concept.
-
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller Isn't that part of why we all virtualize, so we don't have to deal with raw?
You can still snapshot raw. Raw can be an image file or a volume or a full disk. Raw doesn't mean not virtualized.
-
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
Dustin would still need/want to CR that host to another host allowing him to spin up that data very quickly in the case of a failure.
Use tools built for that. GFS2, Gluster, Ceph, Swift, Cinder, etc. The VM would remount after booting in the new host and the storage still fails over.
Agreed. The problem being run into here is one of replication capacity, and it affects a NAS the same way it affects a VM. So you solve both in the same way.
In a VM, you turn to Gluster, et al. In physical you turn to Exablox or similar.
-
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller Isn't that part of why we all virtualize, so we don't have to deal with raw?
You can still snapshot raw. Raw can be an image file or a volume or a full disk. Raw doesn't mean not virtualized.
And we did, a lot, prior to virtualizing.
-
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller Isn't that part of why we all virtualize, so we don't have to deal with raw?
You can still snapshot raw. Raw can be an image file or a volume or a full disk. Raw doesn't mean not virtualized.
And we did, a lot, prior to virtualizing.
Ya I still do for our workstations.
-
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
Dustin would still need/want to CR that host to another host allowing him to spin up that data very quickly in the case of a failure.
Use tools built for that. GFS2, Gluster, Ceph, Swift, Cinder, etc. The VM would remount after booting in the new host and the storage still fails over.
Agreed. The problem being run into here is one of replication capacity, and it affects a NAS the same way it affects a VM. So you solve both in the same way.
In a VM, you turn to Gluster, et al. In physical you turn to Exablox or similar.
We have two Isilons coming. One is here and ready to be installed. Much easier than managing all of that myself.
Our guys can generate about 20TB a week between them all.
-
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
Dustin would still need/want to CR that host to another host allowing him to spin up that data very quickly in the case of a failure.
Use tools built for that. GFS2, Gluster, Ceph, Swift, Cinder, etc. The VM would remount after booting in the new host and the storage still fails over.
Agreed. The problem being run into here is one of replication capacity, and it affects a NAS the same way it affects a VM. So you solve both in the same way.
In a VM, you turn to Gluster, et al. In physical you turn to Exablox or similar.
We have two Isilons coming. One is here and ready to be installed. Much easier than managing all of that myself.
Our guys can generate about 20TB a week between them all.
Looked at Isilon a bit a few weeks ago. Definitely nice gear there.
-
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
Dustin would still need/want to CR that host to another host allowing him to spin up that data very quickly in the case of a failure.
Use tools built for that. GFS2, Gluster, Ceph, Swift, Cinder, etc. The VM would remount after booting in the new host and the storage still fails over.
Agreed. The problem being run into here is one of replication capacity, and it affects a NAS the same way it affects a VM. So you solve both in the same way.
In a VM, you turn to Gluster, et al. In physical you turn to Exablox or similar.
We have two Isilons coming. One is here and ready to be installed. Much easier than managing all of that myself.
Our guys can generate about 20TB a week between them all.
Looked at Isilon a bit a few weeks ago. Definitely nice gear there.
For the price it should be.
-
@stacksofplates Oh yeah, we didn't go with it, not cost effective at all. The price was a bit crazy.
-
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@stacksofplates Oh yeah, we didn't go with it, not cost effective at all. The price was a bit crazy.
We shaved it down a bit by supplying our own rack, power cables, PDU, etc. They tried to throw all of that in the quote.
Power cables at ~$60 apiece add up.
-
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller said in Someone doesn't like local storage for large amounts of data:
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
Dustin would still need/want to CR that host to another host allowing him to spin up that data very quickly in the case of a failure.
Use tools built for that. GFS2, Gluster, Ceph, Swift, Cinder, etc. The VM would remount after booting in the new host and the storage still fails over.
Agreed. The problem being run into here is one of replication capacity, and it affects a NAS the same way it affects a VM. So you solve both in the same way.
In a VM, you turn to Gluster, et al. In physical you turn to Exablox or similar.
We have two Isilons coming. One is here and ready to be installed. Much easier than managing all of that myself.
Our guys can generate about 20TB a week between them all.
They're pretty nice. We'd gotten demo units. We use the VMAX though.
-
@stacksofplates said in Someone doesn't like local storage for large amounts of data:
@FATeknollogee said in Someone doesn't like local storage for large amounts of data:
@scottalanmiller Isn't that part of why we all virtualize, so we don't have to deal with raw?
You can still snapshot raw. Raw can be an image file or a volume or a full disk. Raw doesn't mean not virtualized.
Raw's biggest limitation is that you can't Storage vMotion it. You can vMotion the pointer, but if you are retiring a SAN or something, you will be doing it manually.