Designing for tech startup: Network, AD, Backup etc
-
@JaredBusch said in Designing for tech startup: Network, AD, Backup etc:
A petabyte is an insane amount of data for a medium sized company (500 you said right?).
You would be looking at 250 8TB drives in RAID 10 or 168 12TB drives in RAID 10.
Yeah, my math was giving errors long before I got this result, but that is an insane number of drives.
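For reference, the back-of-the-envelope math behind those drive counts (a quick sketch, assuming 1 PB means 1000 TB usable, plain mirrored pairs, and no hot spares):

```python
# Rough drive-count math for 1 PB usable in RAID 10 (mirrored pairs).
# Drive sizes are marketing TB; no filesystem overhead or hot spares.
USABLE_TB = 1000  # 1 PB usable

for drive_tb in (8, 12):
    raw_tb = USABLE_TB * 2           # RAID 10 mirrors everything: 2x raw
    drives = -(-raw_tb // drive_tb)  # ceiling division
    if drives % 2:                   # mirrored pairs need an even count
        drives += 1
    print(f"{drive_tb} TB drives: {drives} drives ({drives * drive_tb} TB raw)")
```

That gives 250 x 8TB or 168 x 12TB, matching the numbers above.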
@Pete-S said in Designing for tech startup: Network, AD, Backup etc:
The setup with firewalls, etc. looks a lot like our colo setup, except that we have more hosts but no end users and no POE switches.
Firewalls will need to sync states between each other, which is usually on its own interface. So an arrow between the firewalls.
Good point. I'll make a note of this in the plan.
@Pete-S said in Designing for tech startup: Network, AD, Backup etc:
Don't you want to spread out the POE switches? Run two 10GbE links to every POE switch and place them close to the end users. Single mode fiber if there is any distance involved.
Overall, they likely could / would be. Depends on the layout of the office / building. Anything beyond the magic copper limit (100 m) would be fiber to another POE switch.
@travisdh1 said in Designing for tech startup: Network, AD, Backup etc:
Backup wise, I'd seriously consider tape at that sort of scale.
Designing storage systems at that scale is something that @StorageNinja deals with.
Thanks - noted.
@Pete-S said in Designing for tech startup: Network, AD, Backup etc:
1000 TB of storage is a lot. But if you think about it, it's 62.5 x 16TB of usable storage.
So say 70 drives. That's 84 drives in RAID 6. Probably divided up into a number of arrays, but just to get a feel for how many drives you need.
84 drives is not a terrible amount. You can get 36 3.5" drives in 4U on a standard server. So 3 servers or 12U, that's just 1/3 of a rack. If you use SSDs it will be more compact and much higher performance, but way more expensive.
I believe that mirrors what @JaredBusch was getting at drive-count-wise. I've worked with systems that have up to 10-12 drives... not 80-200 drives... that is a lot of points of failure. Because we all know that spinning rust fails.
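For reference, a minimal sketch of where that ~84-drive figure comes from (the 12-drive RAID 6 group size, 10 data + 2 parity, is an assumption chosen to match the numbers above):

```python
# Back-of-the-envelope version of the RAID 6 math: 1000 TB usable on
# 16 TB drives, split into 12-drive RAID 6 groups (10 data + 2 parity).
USABLE_TB = 1000
DRIVE_TB = 16
GROUP = 12                   # drives per RAID 6 group
DATA_PER_GROUP = GROUP - 2   # two parity drives per group

usable_per_group = DATA_PER_GROUP * DRIVE_TB      # 160 TB per group
groups = -(-USABLE_TB // usable_per_group)        # ceiling division -> 7
print(f"{groups} groups x {GROUP} drives = {groups * GROUP} drives")
print(f"usable: {groups * usable_per_group} TB")
```

Seven 12-drive groups gives 84 drives and about 1120 TB usable.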
-
has anyone used Free RAID Calculator before?
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
has anyone used Free RAID Calculator before?
Yeah, but the math is very straightforward, not much call for it.
-
@scottalanmiller said in Designing for tech startup: Network, AD, Backup etc:
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
has anyone used Free RAID Calculator before?
Yeah, but the math is very straightforward, not much call for it.
While I would agree... when you're dealing with a petabyte, it's nice to know your math is right.
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
has anyone used Free RAID Calculator before?
Problems with it include....
The parts that it gets right are insanely simple and can be done in your head faster than you can type the details in. The other parts are just wrong. The performance part is wrong, as is the risk part. The only part it gets right is the capacity.
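To illustrate how simple the capacity part is, the standard usable-capacity formulas fit in a few lines (a sketch; it assumes identical drives and ignores hot spares and filesystem overhead):

```python
# Usable capacity for common RAID levels; n = drive count, size = TB per drive.
def raid0(n, size):  return n * size
def raid1(n, size):  return size             # n-way mirror of a single drive
def raid5(n, size):  return (n - 1) * size   # one drive of parity
def raid6(n, size):  return (n - 2) * size   # two drives of parity
def raid10(n, size): return (n // 2) * size  # mirrored pairs

print(raid6(12, 16))    # 160 TB usable from twelve 16 TB drives
print(raid10(168, 12))  # 1008 TB usable from 168 x 12 TB drives
```

The performance and risk sides are where the real analysis is, and those are exactly the parts the calculator gets wrong.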
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
@scottalanmiller said in Designing for tech startup: Network, AD, Backup etc:
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
has anyone used Free RAID Calculator before?
Yeah, but the math is very straightforward, not much call for it.
While I would agree... when you're dealing with a petabyte, it's nice to know your math is right.
Can't really consider a petabyte on RAID. So not useful for storage at that scale.
-
@scottalanmiller said in Designing for tech startup: Network, AD, Backup etc:
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
has anyone used Free RAID Calculator before?
Problems with it include....
The parts that it gets right are insanely simple and can be done in your head faster than you can type the details in. The other parts are just wrong. The performance part is wrong, as is the risk part. The only part it gets right is the capacity.
Good thing I am ignoring that aspect. For any type of performance, I would go with a full hybrid system.
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
@scottalanmiller said in Designing for tech startup: Network, AD, Backup etc:
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
has anyone used Free RAID Calculator before?
Problems with it include....
The parts that it gets right are insanely simple and can be done in your head faster than you can type the details in. The other parts are just wrong. The performance part is wrong, as is the risk part. The only part it gets right is the capacity.
Good thing I am ignoring that aspect. For any type of performance, I would go with a full hybrid system.
Hybrid isn't going to fix the fundamental issue of RAID not being viable at even a fraction of this size.
-
As RAID arrays get large, you have to move more and more towards RAID 10. Using roughly the largest drives broadly available on the market (12TB), a single petabyte would be around 180 drives in a single RAID array. This is way, way larger than is practical to have in a single array, from both a spindle count and a storage volume size standpoint.
-
@scottalanmiller said in Designing for tech startup: Network, AD, Backup etc:
As RAID arrays get large, you have to move more and more towards RAID 10. Using roughly the largest drives broadly available on the market (12TB), a single petabyte would be around 180 drives in a single RAID array. This is way, way larger than is practical to have in a single array, from both a spindle count and a storage volume size standpoint.
Considering that 16TB drives are so new, I wouldn't recommend them.
I need to go back and re-re-read that IPOD of yours...
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
Considering that 16TB drives are so new, I wouldn't recommend them.
16TB is an SSD.
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
I need to go back and re-re-read that IPOD of yours...
That's a separate issue, but also huge. But as this is purely a storage question, we need to focus there. RAID essentially stops being viable around 100TB - 200TB. If you are dealing with really slow, low-priority archival storage, maybe slightly larger.
For a large, petabyte-scale storage system, RAIN is really your only option.
-
To get into the petabyte range you are into "specialty everything" no matter how you slice it. If you are going to build it yourself, you are pretty much stuck with CEPH or Gluster. And those aren't fast, and you'll need a storage expert managing them. It's not like buying a hardware RAID controller and just letting it handle everything for you. This is a significant engineering effort.
Realistically, even if you are looking for 100TB of production storage, you are going to want to bring in vendors like EMC or Nimble who build and manage systems like this specifically.
-
@scottalanmiller said in Designing for tech startup: Network, AD, Backup etc:
To get into the petabyte range you are into "specialty everything" no matter how you slice it. If you are going to build it yourself, you are pretty much stuck with CEPH or Gluster. And those aren't fast, and you'll need a storage expert managing them. It's not like buying a hardware RAID controller and just letting it handle everything for you. This is a significant engineering effort.
Realistically, even if you are looking for 100TB of production storage, you are going to want to bring in vendors like EMC or Nimble who build and manage systems like this specifically.
Is this sort of scale something @scale could deal with? (Had to drop the pun.)
-
@travisdh1 said in Designing for tech startup: Network, AD, Backup etc:
Is this sort of scale something @scale could deal with? (Had to drop the pun.)
They haven't made storage systems in a long time.
-
It sounds like they probably don't know how much space they need. Somebody probably told them storage is cheap and they threw out some insane number. They are likely doing something very wrong to need that much storage.
Can you get some clarity on why they think they need that much storage? How much storage are they currently using?
-
@scottalanmiller said in Designing for tech startup: Network, AD, Backup etc:
To get into the petabyte range you are into "specialty everything" no matter how you slice it. If you are going to build it yourself, you are pretty much stuck with CEPH or Gluster. And those aren't fast, and you'll need a storage expert managing them. It's not like buying a hardware RAID controller and just letting it handle everything for you. This is a significant engineering effort.
Realistically, even if you are looking for 100TB of production storage, you are going to want to bring in vendors like EMC or Nimble who build and manage systems like this specifically.
It seems like many use Lustre with ZFS for huge, high-performance storage (tens of GB/s).
Supermicro, Dell EMC, HPE and others have reference solutions for it. "... is ideal for organizations that are able to self-support, such as universities, National Labs, and others with such capabilities."
Supermicro solution allows you to expand with 0.5PB or 1PB at a time.
From what I can see you need 18U rack space for a 1 PB solution.
If you fill up an entire rack you have 3PB usable storage.
https://supermicro.com/en/solutions/lustre
-
@IRJ said in Designing for tech startup: Network, AD, Backup etc:
It sounds like they probably don't know how much space they need. Somebody probably told them storage is cheap and they threw out some insane number. They are likely doing something very wrong to need that much storage.
That matches all the rest of the setup... all needs, vendors, etc. that don't make much sense and sound like a non-technical person throwing out words overheard in an airport.
-
@IRJ said in Designing for tech startup: Network, AD, Backup etc:
Can you get some clarity on why they think they need that much storage? How much storage are they currently using?
The background is that the system design is all politically motivated and not rooted in business or technical needs. We spent some time on that. There's not even a known use case for the storage (it's unclear if it will be object, block/SAN, file/NAS, etc.) The need is to have "petabyte storage" without any other specification.
-
Hi @gjacobse, consider something like a tiered approach to the problem. 1 PB is a lot of data.
Maybe 5-10TB of fast SSD for caching, 50-100TB of spinning disks for caching/capacity, and the rest will go to the cloud.
For instance, a single AWS Storage Gateway appliance could be the solution if you have a good internet uplink.
Another solution could be Azure Stack.
Feel free to contact me if you need advice about that kind of setup.
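As a rough sketch of how that tiering splits out (using the figures from this post; the actual tier sizes would depend on the working set):

```python
# Rough split for the tiered approach above: small fast SSD cache,
# local spinning-disk tier, everything else in the cloud.
TOTAL_TB = 1000       # the stated 1 PB requirement
SSD_TIER_TB = 10      # fast SSD cache (upper end of the 5-10TB suggestion)
HDD_TIER_TB = 100     # local capacity tier (upper end of 50-100TB)

cloud_tb = TOTAL_TB - SSD_TIER_TB - HDD_TIER_TB
print(f"cloud tier: {cloud_tb} TB ({cloud_tb / TOTAL_TB:.0%} of the total)")
```

Even with the larger local tiers, roughly 90% of the data would live in the cloud, so the internet uplink really is the key constraint, as noted above.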