Cross Posting - Storage Spaces Conundrum
-
@munderhill said in Cross Posting - Storage Spaces Conundrum:
The bottleneck is the drives; they just can't read and write fast enough....
@munderhill said
- The system has been designed by our technical partner specialists, but I am not sure it is the best way forward or meets our needs long term, so that is why I am asking for assistance!
- No the system is not virtual.
Are you talking about the current system or the new system in terms of not being virtual?
"Serving up 80% small files 1 - 2mb and 20% large files 10GB+. "
How many users?
If you want a resilient system and performance, why not go for 2-4 storage nodes with the data replicated between them, so that the workload is shared between the devices? It is transparent to the users and programs, but in the background it is working.
And even if an entire RAID controller dies, or the server spontaneously fails, the second one carries on.
There are lots of ways of doing it, many different providers but I would be hesitant to put all my eggs in one basket with a single server. Think about a different approach.
-
@Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:
@munderhill said in Cross Posting - Storage Spaces Conundrum:
The bottleneck is the drives; they just can't read and write fast enough....
@munderhill said
- The system has been designed by our technical partner specialists, but I am not sure it is the best way forward or meets our needs long term, so that is why I am asking for assistance!
- No the system is not virtual.
Are you talking about the current system or the new system in terms of not being virtual?
"Serving up 80% small files 1 - 2mb and 20% large files 10GB+. "
How many users?
If you want a resilient system and performance, why not go for 2-4 storage nodes with the data replicated between them, so that the workload is shared between the devices? It is transparent to the users and programs, but in the background it is working.
And even if an entire RAID controller dies, or the server spontaneously fails, the second one carries on.
There are lots of ways of doing it, many different providers but I would be hesitant to put all my eggs in one basket with a single server. Think about a different approach.
Doesn't this double the cost, or more? You'd need at least two times the storage.
-
@Dashrender said
Doesn't this double the cost, or more? You'd need at least two times the storage.
Depends on the model you use and how you do it. If you look at a Scale system, you have 3 nodes which combine to give you more usable storage than a single node.
Yes it could drive the cost of the storage higher but if speed and reliability are the primary goals, 1 node won't cut it.
-
@munderhill said in Cross Posting - Storage Spaces Conundrum:
@DustinB3403 Hopefully some answers to your questions.
- I need speed over capacity, so I had to go for the biggest 10k disks, as the budget doesn't allow for 15k.
- It is not configured at the moment, but it would consist of a RAID controller and 3x expansion units.
- The system has been designed by our technical partner specialists, but I am not sure it is the best way forward or meets our needs long term, so that is why I am asking for assistance!
- No the system is not virtual.
- I have an old HP P2000 SAN with Near-Line SAS 7k disks that is struggling with the demands. The volume is 30 TB.
- I need at least 50 TB to start but I anticipate that to double in a year.
Hope this helps.
Well, I think one part of the performance problem is that SAN. How is it connected to the hosts? Going to that many drive shelves doesn't make sense to me; at some point the external connections (even if they are SAS/SATA) become a bottleneck. Stick to what people are recommending here rather than what a vendor is telling you to get!
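To put rough numbers on that shelf/uplink concern, here is a quick back-of-envelope sketch; the per-drive throughput and SAS uplink speeds are assumptions for illustration, not figures from the OP's configuration:

```python
# Back-of-envelope only: per-drive throughput and SAS uplink speeds below are
# assumed figures, not anything from the OP's actual configuration.
drives = 70
mb_per_drive = 150                       # assumed sustained MB/s per 10k SAS drive
uplinks = {
    "6 Gb SAS x4 wide port":  4 * 600,   # ~600 MB/s usable per 6 Gb lane
    "12 Gb SAS x4 wide port": 4 * 1200,  # ~1,200 MB/s usable per 12 Gb lane
}

aggregate = drives * mb_per_drive
print(f"Aggregate drive throughput: ~{aggregate:,} MB/s")
for link, bandwidth in uplinks.items():
    print(f"{link}: ~{bandwidth:,} MB/s "
          f"(the drives could saturate it ~{aggregate / bandwidth:.1f}x over)")
```

Even on the optimistic link, a single external connection can't carry what 70 spindles can deliver, which is the point about piling on expansion shelves.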
-
@Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:
@Dashrender said
Doesn't this double the cost, or more? You'd need at least two times the storage.
Depends on the model you use and how you do it. If you look at a Scale system, you have 3 nodes which combine to give you more usable storage than a single node.
Yes it could drive the cost of the storage higher but if speed and reliability are the primary goals, 1 node won't cut it.
Why won't one node do it? Where is the bottleneck in one node? We really don't know enough from the OP to know where the bottleneck really is - we've only been told that it's the disk throughput, but that's really not enough information. Number of users simultaneously accessing, how much data, how it's accessed, etc. Maybe the real bottleneck is the network; we just don't have enough information.
Or is reliability the issue? You mentioned that losing a RAID card wouldn't remove access to data. That only happens if a) you have redundant RAID cards in front of that storage, or b) you have two copies of the data (meaning at least two times the needed storage).
Since the general consensus around these parts is that RAID cards don't fail often, it's not something you make redundant within a single box. So that only leaves b - two copies of the data.
-
@travisdh1 said in Cross Posting - Storage Spaces Conundrum:
@munderhill said in Cross Posting - Storage Spaces Conundrum:
@DustinB3403 Hopefully some answers to your questions.
- I need speed over capacity, so I had to go for the biggest 10k disks, as the budget doesn't allow for 15k.
- It is not configured at the moment, but it would consist of a RAID controller and 3x expansion units.
- The system has been designed by our technical partner specialists, but I am not sure it is the best way forward or meets our needs long term, so that is why I am asking for assistance!
- No the system is not virtual.
- I have an old HP P2000 SAN with Near-Line SAS 7k disks that is struggling with the demands. The volume is 30 TB.
- I need at least 50 TB to start but I anticipate that to double in a year.
Hope this helps.
Well, I think one part of the performance problem is that SAN. How is it connected to the hosts? Going to that many drive shelves doesn't make sense to me; at some point the external connections (even if they are SAS/SATA) become a bottleneck. Stick to what people are recommending here rather than what a vendor is telling you to get!
I don't follow this. Can't SANs have terabits of connectivity to their servers? Sure, it's possible the OP has a single 1 Gb connection from his server to his SAN, but it's also possible that he has three 10 Gb connections. Again, we don't have enough information.
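To show how far apart those two guesses are, a quick sketch; the link counts and efficiency factor are hypothetical, since we genuinely don't know the connection:

```python
# Hypothetical front-end numbers: we don't know how the SAN is connected,
# so this only contrasts the two guesses above.
def usable_mb_per_s(gbps, links=1, efficiency=0.9):
    """Rough conversion of a link speed in Gb/s to usable MB/s."""
    return gbps * 1000 / 8 * efficiency * links

print(f"1 x 1 Gb link:   ~{usable_mb_per_s(1):.0f} MB/s")
print(f"3 x 10 Gb links: ~{usable_mb_per_s(10, links=3):.0f} MB/s")
```

That is roughly a 30x difference, which is exactly why the connection detail matters before blaming the drives.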
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
Or is reliability the issue? You mentioned that losing a RAID card wouldn't remove access to data. That only happens if a) you have redundant RAID cards in front of that storage, or b) you have two copies of the data (meaning at least two times the needed storage).
The MD2000 units are known to not even do that properly. The "redundant" RAID cards are active/passive instead of active/active, which actually increases the risk of a RAID controller failure. I'm guessing that red herring came from a salesperson when the current storage was purchased. Let's be glad for the OP that they are moving off of it.
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
Why won't one node do it? Where is the bottleneck in one node? We really don't know enough from the OP to know where the bottleneck really is - we've only been told that it's the disk throughput, but that's really not enough information. Number of users simultaneously accessing, how much data, how it's accessed, etc. Maybe the real bottleneck is the network; we just don't have enough information.
With the amount of storage the OP is trying to attain, capacity itself is the bottleneck on a single host.
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
@travisdh1 said in Cross Posting - Storage Spaces Conundrum:
@munderhill said in Cross Posting - Storage Spaces Conundrum:
@DustinB3403 Hopefully some answers to your questions.
- I need speed over capacity, so I had to go for the biggest 10k disks, as the budget doesn't allow for 15k.
- It is not configured at the moment, but it would consist of a RAID controller and 3x expansion units.
- The system has been designed by our technical partner specialists, but I am not sure it is the best way forward or meets our needs long term, so that is why I am asking for assistance!
- No the system is not virtual.
- I have an old HP P2000 SAN with Near-Line SAS 7k disks that is struggling with the demands. The volume is 30 TB.
- I need at least 50 TB to start but I anticipate that to double in a year.
Hope this helps.
Well, I think one part of the performance problem is that SAN. How is it connected to the hosts? Going to that many drive shelves doesn't make sense to me; at some point the external connections (even if they are SAS/SATA) become a bottleneck. Stick to what people are recommending here rather than what a vendor is telling you to get!
I don't follow this. Can't SANs have terabits of connectivity to their servers? Sure, it's possible the OP has a single 1 Gb connection from his server to his SAN, but it's also possible that he has three 10 Gb connections. Again, we don't have enough information.
I have to stop burying my questions: How is it connected to the hosts?
Yeah, we can't really know until more details are forthcoming.
-
Just doing basic calculations (regardless of risks)
70x 1.8 TB drives in RAID 5 with a spare drive is 122,400 GB (122.4 TB)
In RAID 10 that's 63,000 GB (63 TB)
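The same math as a quick sketch; the hot-spare handling is an assumption (one spare counted for the RAID 5 figure, none for the RAID 10 figure):

```python
# Reproduces the capacity figures above. Spare handling is assumed: one hot
# spare for the RAID 5 layout, no spare for the RAID 10 layout.
drives, size_tb = 70, 1.8

raid5_usable = (drives - 1 - 1) * size_tb   # minus 1 hot spare, minus 1 parity drive
raid10_usable = (drives // 2) * size_tb     # mirrored pairs across all 70 drives

print(f"RAID 5 + spare: {raid5_usable:.1f} TB ({raid5_usable * 1000:,.0f} GB)")
print(f"RAID 10:        {raid10_usable:.1f} TB ({raid10_usable * 1000:,.0f} GB)")
```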
-
@Dashrender said
Why won't one node do it? Where is the bottleneck in one node?
Because you can increase speed by lowering the demand on a single device.
If I take 100 users and put them on a single box, that box needs to be able to cope with all 100 hammering it at once.
Two boxes make that job easier, and now two separate items have to fall over and die before you have a reliability issue with uptime.
Again, you could trade a single really expensive node for 4 cheaper nodes that each only go some of the way there but combine to give you a bigger total storage pot.
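A rough sketch of both halves of that argument, the load split and the uptime math, assuming fully replicated data, independent node failures, and a made-up 99% per-node availability:

```python
# Illustrative only: 100 users comes from the example above; the 99% per-node
# availability and the independence of failures are assumptions.
users = 100
node_availability = 0.99

for nodes in (1, 2, 4):
    load_per_node = users / nodes
    # with the data replicated everywhere, access is lost only if every node is down
    downtime_prob = (1 - node_availability) ** nodes
    print(f"{nodes} node(s): ~{load_per_node:.0f} users each, "
          f"availability ~{(1 - downtime_prob) * 100:.6f}%")
```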
-
For the amount of storage he's talking about, if you're looking at scale-out, I can definitely say Exablox would be good here.
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
@Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:
@munderhill said in Cross Posting - Storage Spaces Conundrum:
The bottleneck is the drives; they just can't read and write fast enough....
@munderhill said
- The system has been designed by our technical partner specialists, but I am not sure it is the best way forward or meets our needs long term, so that is why I am asking for assistance!
- No the system is not virtual.
Are you talking about the current system or the new system in terms of not being virtual?
"Serving up 80% small files 1 - 2mb and 20% large files 10GB+. "
How many users?
If you want a resilient system and performance, why not go for 2-4 storage nodes with the data replicated between them, so that the workload is shared between the devices? It is transparent to the users and programs, but in the background it is working.
And even if an entire RAID controller dies, or the server spontaneously fails, the second one carries on.
There are lots of ways of doing it, many different providers but I would be hesitant to put all my eggs in one basket with a single server. Think about a different approach.
Doesn't this double the cost, or more? You'd need at least two times the storage.
Not necessarily. You already need full redundancy even with only a single point of failure. Going to stuff like Exablox takes you to triple mirroring only, so it's not double where you started. Others can do it with only normal mirroring, like Scale does.
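A rough sketch of what those mirroring factors mean for usable space, using an assumed round raw figure and the replication factors mentioned above:

```python
# Illustrative only: the raw figure is a round number (roughly 70 x 1.8 TB) and
# the copy counts are taken from the discussion, not from any vendor spec.
raw_tb = 126

copies_per_layout = {
    "Single box, RAID 10 (the redundancy you'd buy anyway)": 2,
    "2-way replication across nodes (Scale-style)":          2,
    "3-way mirroring across nodes (Exablox-style)":          3,
}

for layout, copies in copies_per_layout.items():
    print(f"{layout}: ~{raw_tb / copies:.0f} TB usable from {raw_tb} TB raw")
```

So 2-way replication costs the same fraction as the RAID 10 you would already be running, and even 3-way mirroring needs about 1.5x that baseline in raw disk rather than double.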
-
@Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:
@Dashrender said
Doesn't this double the cost, or more? You'd need at least two times the storage.
Depends on the model you use and how you do it. If you look at a Scale system, you have 3 nodes which combine to give you more usable storage than a single node.
Scale does RAIN with only the overhead of RAID 10 across the entire cluster, so you never lose more storage than you would have already lost.
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
Or is reliability the issue? You mentioned that losing a RAID card wouldn't remove access to data. That only happens if a) you have redundant RAID cards in front of that storage, or b) you have two copies of the data (meaning at least two times the needed storage).
Or don't use RAID cards
-
@Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:
@Dashrender said
Why won't one node do it? Where is the bottleneck in one node?
Because you can increase speed by lowering the demand on a single device.
But you can also increase speed by making one device faster: scale up rather than scale out. Scaling up is going to get really costly here; he needs a VMAX to keep scaling up. But it can be done.
We used to do single SANs with 96 Gb/s of FC connected to them. They were pretty fast.
-
@dafyre said in Cross Posting - Storage Spaces Conundrum:
For the amount of storage he's talking about, if you're looking at scale-out, I can definitely say Exablox would be good here.
That's what I would think.
-
@DustinB3403 said in Cross Posting - Storage Spaces Conundrum:
Hello Spiceheads (from here),
I am currently looking at implementing a large file server. I have a Lenovo server with 70x 1.8 TB 10k SAS drives attached via DAS. This server will be used as a file server, serving up 80% small files 1-2 MB and 20% large files 10 GB+.
What I am not sure about is how to provision the drives. Do I use RAID? Should I use Storage Spaces? Or should I go with something else like ScaleIO, OpenIO, StarWind, etc.?
I am looking for a solution that is scalable, so I could increase the volume if I wanted to, and I was also thinking about a little future-proofing by setting this up so I could scale it out later.
This does need to be resilient, with a quick turnaround should a disk go down, and it also needs to be scalable.
Looking forward to hearing your views.
StarWind assumes you use some local RAID (hardware or software). We do replication, and per-node redundancy is handled by RAID. So we do RAID61 (RAID1-over-RAID6) for SSDs and HDDs, RAID51 (RAID1-over-RAID5) for SSDs, RAID01 (RAID1-over-RAID0) for SSDs and HDDs (3-way replication is recommended), and RAID101 (RAID1-over-RAID10) for HDDs and SSDs. It's very close to what SimpliVity does, if you care. ScaleIO does 2-way replication at a smaller block level and needs no local RAID (but they take one node away from the capacity equation, so from 3 nodes of raw you'll really get [(3-1)/2] you can use). OpenIO is something I've never seen before you posted so I dunno what they do.
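A quick usable-capacity sketch of the schemes described above; the disk count, disk size, and node counts are assumptions for illustration, not the OP's spec:

```python
# Rough usable-capacity comparison of the schemes described above. Disk count,
# disk size, and node counts are assumed figures for illustration only.
disks_per_node, disk_tb = 12, 1.8
raw_per_node = disks_per_node * disk_tb          # 21.6 TB raw per node

# Local RAID usable fractions for a 12-disk node
local_raid = {
    "RAID 6":  (disks_per_node - 2) / disks_per_node,   # 2 parity disks
    "RAID 5":  (disks_per_node - 1) / disks_per_node,   # 1 parity disk
    "RAID 10": 0.5,                                      # mirrored pairs
}

# StarWind-style: 2-way replication (RAID 1) across two nodes, each running
# local RAID, so usable space is one node's post-RAID capacity.
for name, fraction in local_raid.items():
    print(f"2-node replication over {name}: ~{raw_per_node * fraction:.1f} TB usable")

# ScaleIO-style: 2-way replication at block level, with roughly one node's
# worth of capacity held back, i.e. (n - 1) / 2 nodes' worth usable.
n = 3
print(f"ScaleIO, {n} nodes: ~{(n - 1) * raw_per_node / 2:.1f} TB usable")
```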
-
@KOOLER said in Cross Posting - Storage Spaces Conundrum:
OpenIO is something I've never seen before you posted so I dunno what they do.
We have it running here. They are here in ML, too.
-
@scottalanmiller said in Cross Posting - Storage Spaces Conundrum:
@KOOLER said in Cross Posting - Storage Spaces Conundrum:
OpenIO is something I've never seen before you posted so I dunno what they do.
We have it running here. They are here in ML, too.
That's interesting! Nice to see more storage startups from Europe (France?).
Are they VM-running, or do they have a native port?