Cross Posting - Storage Spaces Conundrum
-
I cannot imagine a file server needing that kind of speed. Are these files being read and written constantly or something?
A DB or logging server I could see, but not a normal file server.
-
@munderhill said in Cross Posting - Storage Spaces Conundrum:
@DustinB3403 Hopefully some answers to your questions.
- I need speed over capacity so had to go for the biggest 10k disk as the budget doesn't allow for 15k.
- It is not configured at the moment but it would consist of a RAID controller and 3x expansion units.
- The system has been designed by our technical partner specialists but I am not sure it is the best way forward or meets our needs long term so that is why I am asking for assistance!
- No, the system is not virtual.
- I have an old HP P2000 SAN with Near-Line SAS 7k disks that is struggling with the demands. The volume is 30 TB.
- I need at least 50 TB to start but I anticipate that to double in a year.
Hope this helps.
Do you have IOPS numbers? How is the old SAN configured?
Too many questions to do more than speculate.
-
@munderhill said in Cross Posting - Storage Spaces Conundrum:
@scottalanmiller Thanks for the feedback. My long term goal was to purchase more of these thus providing scale out. RAID 10 is the plan at this scale. I have read a lot of threads with your comments about the different scale out options you use or suggest. Buying 3x servers with 24 disks is a possibility for sure. What would be your suggestion going down the 3 server route?
Dell R730xd is really nice for building your own storage. HPE ProLiant DL380 Gen9 is quite nice, too. Although for scale out, I'd often lean to Supermicro. Talk to OpenIO, their product likely fits your needs very well for build-your-own scale out of this nature. You need their enterprise product with the network file system option to do what you want.
-
@munderhill said in Cross Posting - Storage Spaces Conundrum:
@Dashrender Yes, I think that SSD for the amount of storage we need is going to be too expensive. I have £39k to spend on the entire solution at this time. This is not to say we won't have more money available later. But I also need to think of backups etc in that same pot of cash.
The bottleneck is the drives they just can't read and write fast enough....
I think that that might give you the ability to move to Exablox and have a totally built out, totally supported solution in your price envelope. If not, it should be close. In the US the price would be $30K for the three base units then the drives added on top of that. You would use NL-SAS drives at a fraction of the cost per TB of what you are looking at now.
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
Scott would have to say if RAID 5 or 6 is doable with SSD at this scale?
RAID 6, yes.
-
@munderhill said in Cross Posting - Storage Spaces Conundrum:
The bottleneck is the drives they just can't read and write fast enough....
@munderhill said
- The system has been designed by our technical partner specialists but I am not sure it is the best way forward or meets our needs long term so that is why I am asking for assistance!
- No, the system is not virtual.
Are you talking about the current system or the new system in terms of not being virtual?
"Serving up 80% small files 1 - 2mb and 20% large files 10GB+. "
How many users?
If you want a resilient system and performance, why not go for 2-4 storage nodes with the data replicated between them, so that the workload is shared between the devices? It is transparent to the users and programs, but in the background it is working.
And even if an entire RAID controller dies, or the server spontaneously fails, the second one carries on.
There are lots of ways of doing it, many different providers but I would be hesitant to put all my eggs in one basket with a single server. Think about a different approach.
-
@Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:
@munderhill said in Cross Posting - Storage Spaces Conundrum:
The bottleneck is the drives they just can't read and write fast enough....
@munderhill said
- The system has been designed by our technical partner specialists but I am not sure it is the best way forward or meets our needs long term so that is why I am asking for assistance!
- No, the system is not virtual.
Are you talking about the current system or the new system in terms of not being virtual?
"Serving up 80% small files 1 - 2mb and 20% large files 10GB+. "
How many users?
If you want a resilient system and performance, why not go for 2-4 storage nodes with the data replicated between them, so that the workload is shared between the devices? It is transparent to the users and programs, but in the background it is working.
And even if an entire RAID controller dies, or the server spontaneously fails, the second one carries on.
There are lots of ways of doing it, many different providers but I would be hesitant to put all my eggs in one basket with a single server. Think about a different approach.
Doesn't this double or more the cost? You'd need at least two times the storage.
-
@Dashrender said
Doesn't this double or more the cost? You'd need at least two times the storage.
Depends on the model you use and how you do it. If you look at a Scale system, you have 3 nodes which combine to give you more usable storage than a single node.
Yes it could drive the cost of the storage higher but if speed and reliability are the primary goals, 1 node won't cut it.
-
@munderhill said in Cross Posting - Storage Spaces Conundrum:
@DustinB3403 Hopefully some answers to your questions.
- I need speed over capacity so had to go for the biggest 10k disk as the budget doesn't allow for 15k.
- It is not configured at the moment but it would consist of a RAID controller and 3x expansion units.
- The system has been designed by our technical partner specialists but I am not sure it is the best way forward or meets our needs long term so that is why I am asking for assistance!
- No, the system is not virtual.
- I have an old HP P2000 SAN with Near-Line SAS 7k disks that is struggling with the demands. The volume is 30 TB.
- I need at least 50 TB to start but I anticipate that to double in a year.
Hope this helps.
Well, I think one part of the performance problem is that SAN. How is it connected to the hosts? Going to that many drive shelves doesn't make sense to me; at some point the external connections (even if they are SAS/SATA) become a bottleneck. Stick to what people are recommending here rather than what a vendor is telling you to get!
-
@Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:
@Dashrender said
Doesn't this double or more the cost? You'd need at least two times the storage.
Depends on the model you use and how you do it. If you look at a Scale system, you have 3 nodes which combine to give you more usable storage than a single node.
Yes it could drive the cost of the storage higher but if speed and reliability are the primary goals, 1 node won't cut it.
Why won't one node do it? Where is the bottleneck in one node? We really don't know enough from the OP to know where the bottleneck really is - we've only been told that it's the disk throughput, but that's really not enough information. Number of users simultaneously accessing, how much data, how it's accessed, etc. Maybe the real bottleneck is the network, we just don't have enough information.
Of course reliability is an issue. But you mentioned losing a RAID card wouldn't remove access to data. That only happens if a) you have redundant RAID cards in front of that storage, or b) you have two copies of the data (meaning at least two times the needed storage).
Since the general consensus around these parts is that RAID cards don't fail often, it's not something you make redundant within a single box. So that only leaves b - two copies of the data.
-
@travisdh1 said in Cross Posting - Storage Spaces Conundrum:
@munderhill said in Cross Posting - Storage Spaces Conundrum:
@DustinB3403 Hopefully some answers to your questions.
- I need speed over capacity so had to go for the biggest 10k disk as the budget doesn't allow for 15k.
- It is not configured at the moment but it would consist of a RAID controller and 3x expansion units.
- The system has been designed by our technical partner specialists but I am not sure it is the best way forward or meets our needs long term so that is why I am asking for assistance!
- No, the system is not virtual.
- I have an old HP P2000 SAN with Near-Line SAS 7k disks that is struggling with the demands. The volume is 30 TB.
- I need at least 50 TB to start but I anticipate that to double in a year.
Hope this helps.
Well, I think one part of the performance problem is that SAN. How is it connected to the hosts? Going to that many drive shelves doesn't make sense to me; at some point the external connections (even if they are SAS/SATA) become a bottleneck. Stick to what people are recommending here rather than what a vendor is telling you to get!
I don't follow this. Can't SANs have terabytes of connectivity to their servers? Sure, it's possible the OP has a single 1 Gb connection from his server to his SAN, but it's also possible that he has three 10 Gb connections. Again we don't have enough information.
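To put the two connectivity scenarios above side by side, here is a rough sketch that converts link speed in gigabits per second to an ideal-case MB/s figure. It assumes decimal units and ignores protocol overhead entirely, so real throughput would be lower; the function names are made up for illustration, and the 10 GB file size is taken from the OP's "large files 10GB+" comment.

```python
def link_mb_per_s(gbit_per_s, links=1):
    """Ideal-case aggregate throughput in MB/s (decimal units, no protocol overhead)."""
    return gbit_per_s * links * 1000 / 8


def transfer_seconds(file_gb, mb_per_s):
    """Best-case time to move one file of file_gb gigabytes at the given rate."""
    return file_gb * 1000 / mb_per_s


single_1g = link_mb_per_s(1)            # one 1 Gb link
triple_10g = link_mb_per_s(10, links=3)  # three 10 Gb links, assuming perfect aggregation

print(f"1 x 1 Gb:  {single_1g:.0f} MB/s, 10 GB file in {transfer_seconds(10, single_1g):.1f} s")
print(f"3 x 10 Gb: {triple_10g:.0f} MB/s, 10 GB file in {transfer_seconds(10, triple_10g):.1f} s")
```

The gap between those two cases is why the connection details matter before blaming the disks.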
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
Of course reliability is an issue. But you mentioned losing a RAID card wouldn't remove access to data. That only happens if a) you have redundant RAID cards in front of that storage, or b) you have two copies of the data (meaning at least two times the needed storage).
The MD2000 units are known to not even do that properly. The "redundant" RAID cards are active/passive instead of active/active, which actually increases the risk of a RAID controller failure. I'm guessing that red-herring was from a sales person when the current storage was purchased. Let's be glad for the OP that they are moving off of it.
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
Why won't one node do it? Where is the bottleneck in one node? We really don't know enough from the OP to know where the bottleneck really is - we've only been told that it's the disk throughput, but that's really not enough information. Number of users simultaneously accessing, how much data, how it's accessed, etc. Maybe the real bottleneck is the network, we just don't have enough information.
Storage capacity is the bottleneck with a single host with the amount of storage the OP is trying to attain.
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
@travisdh1 said in Cross Posting - Storage Spaces Conundrum:
@munderhill said in Cross Posting - Storage Spaces Conundrum:
@DustinB3403 Hopefully some answers to your questions.
- I need speed over capacity so had to go for the biggest 10k disk as the budget doesn't allow for 15k.
- It is not configured at the moment but it would consist of a RAID controller and 3x expansion units.
- The system has been designed by our technical partner specialists but I am not sure it is the best way forward or meets our needs long term so that is why I am asking for assistance!
- No, the system is not virtual.
- I have an old HP P2000 SAN with Near-Line SAS 7k disks that is struggling with the demands. The volume is 30 TB.
- I need at least 50 TB to start but I anticipate that to double in a year.
Hope this helps.
Well, I think one part of the performance problem is that SAN. How is it connected to the hosts? Going to that many drive shelves doesn't make sense to me; at some point the external connections (even if they are SAS/SATA) become a bottleneck. Stick to what people are recommending here rather than what a vendor is telling you to get!
I don't follow this. Can't SANs have terabytes of connectivity to their servers? Sure, it's possible the OP has a single 1 Gb connection from his server to his SAN, but it's also possible that he has three 10 Gb connections. Again we don't have enough information.
I have to stop burying my questions: How is it connected to the hosts?
Yeah, we can't really know until more details are forthcoming.
-
Just doing basic calculations (regardless of risks):
70 x 1.8 TB drives in RAID 5 with a spare drive is 122,400 GB (about 122 TB)
In RAID 10 (all 70 drives, no spare) that's 63,000 GB (63 TB)
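The arithmetic behind those two figures can be sketched as below. This is only a capacity calculation under simple assumptions: one big RAID 5 set loses one drive's worth of space to parity, RAID 10 keeps half of the active drives, and the function names are invented for the example. It says nothing about whether a 69-drive RAID 5 is a sane idea.

```python
def raid5_usable_gb(total_drives, drive_gb, spares=0):
    """Usable space of a single RAID 5 set: one drive's worth goes to parity."""
    active = total_drives - spares
    if active < 3:
        raise ValueError("RAID 5 needs at least 3 active drives")
    return (active - 1) * drive_gb


def raid10_usable_gb(total_drives, drive_gb, spares=0):
    """Usable space of RAID 10: drives are mirrored in pairs, so half is usable."""
    active = total_drives - spares
    pairs = active // 2  # an odd drive out cannot form a mirror pair
    if pairs < 2:
        raise ValueError("RAID 10 needs at least 4 active drives")
    return pairs * drive_gb


DRIVE_GB = 1800  # 1.8 TB drives, decimal units

print(raid5_usable_gb(70, DRIVE_GB, spares=1))   # 122400
print(raid10_usable_gb(70, DRIVE_GB))            # 63000
print(raid10_usable_gb(70, DRIVE_GB, spares=1))  # 61200
```

Note that the 63,000 GB RAID 10 figure only works out if all 70 drives are in the array; holding one back as a spare drops it to 61,200 GB.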
-
@Dashrender said
Why won't one node do it? Where is the bottleneck in one node?
Because you can increase speed by lowering the demand on a single device.
If I take 100 users and put them to a single box, that box needs to be able to cope with 100 all hammering it at once.
Two boxes make that job easier, and now two separate items have to fall over and die before you have a reliability issue with uptime.
Again, you could replace a single really expensive node with 4 cheaper nodes, each of which goes only some of the way there, but which combine into a bigger total storage pot.
-
For the amount of storage he's talking, if you're looking at scale-out, I can definitely say Exablox would be good here.
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
@Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:
@munderhill said in Cross Posting - Storage Spaces Conundrum:
The bottleneck is the drives they just can't read and write fast enough....
@munderhill said
- The system has been designed by our technical partner specialists but I am not sure it is the best way forward or meets our needs long term so that is why I am asking for assistance!
- No, the system is not virtual.
Are you talking about the current system or the new system in terms of not being virtual?
"Serving up 80% small files 1 - 2mb and 20% large files 10GB+. "
How many users?
If you want a resilient system and performance, why not go for 2-4 storage nodes with the data replicated between them, so that the workload is shared between the devices? It is transparent to the users and programs, but in the background it is working.
And even if an entire RAID controller dies, or the server spontaneously fails, the second one carries on.
There are lots of ways of doing it, many different providers but I would be hesitant to put all my eggs in one basket with a single server. Think about a different approach.
Doesn't this double or more the cost? You'd need at least two times the storage.
Not necessarily. You need full redundancy to do this even with only a single point of failure. Going to stuff like Exablox takes you to triple mirroring only, so not doubling over where you started. Others can do it with only normal mirroring, like Scale does.
-
@Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:
@Dashrender said
Doesn't this double or more the cost? You'd need at least two times the storage.
Depends on the model you use and how you do it. If you look at a Scale system, you have 3 nodes which combine to give you more usable storage than a single node.
The Scale does RAIN with only the overhead of RAID 10 across the entire cluster. So you never lose more storage than you would have already lost.
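As a rough illustration of the overhead being compared here, the sketch below works out how much raw disk is needed for a given usable capacity when every block is stored two times (mirroring, the same overhead as RAID 10) or three times (triple mirroring). The 50 TB and 100 TB targets come from the OP's stated needs; the function name is hypothetical, and real products add their own metadata and spare-capacity overhead that is not modelled.

```python
def raw_needed_tb(usable_tb, copies):
    """Raw capacity required when every block is stored `copies` times."""
    if copies < 1:
        raise ValueError("need at least one copy of the data")
    return usable_tb * copies


schemes = (
    ("two copies (mirroring / RAID 10-style overhead)", 2),
    ("three copies (triple mirroring)", 3),
)

for usable_tb in (50, 100):  # starting need, then the anticipated doubling
    for label, copies in schemes:
        raw = raw_needed_tb(usable_tb, copies)
        print(f"{usable_tb} TB usable with {label}: {raw} TB raw")
```

So a cluster that mirrors across nodes costs no more raw capacity than a single box already running RAID 10, while triple mirroring costs half as much again.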
-
@Dashrender said in Cross Posting - Storage Spaces Conundrum:
Of course reliability is an issue. But you mentioned losing a RAID card wouldn't remove access to data. That only happens if a) you have redundant RAID cards in front of that storage, or b) you have two copies of the data (meaning at least two times the needed storage).
Or don't use RAID cards.