Best Dedup Software
- 
 I have a customer with 5 different desktop-style NAS boxes on their network, all different sizes and from different vendors. They are out of space and want to move to something rack-based that is large enough to hold everything. I'm thinking of using Synology as it's simple enough. Their previous IT person had some of the boxes backing up to each other, so there are a lot of copies of the same data strewn all over the place between the 5 boxes. They want 1 clean copy of the existing data on the new NAS. There are 2 NAS boxes that hold the bulk of the recent data, so I was thinking of moving that to the new NAS and running a dedup to remove any copies / old file backups, and then I want to compare what's on the new NAS to the other boxes, delete copies, and move over any data that I don't have on the new NAS. In the end, they want to get rid of all the old NAS boxes and just have 1 NAS box that is backed up to the cloud and to tape to be taken offsite. They also want a search solution to run against the NAS; the Synology built-in search sucks and just wears out the drives as it never stops indexing. They have the old Google physical search appliance, which needs to come out once a new search solution is found. Any opinions on the dedupe software and the search software or appliance?
- 
 @eleceng said in Best Dedup Software: "I was thinking of moving that to the new NAS and run a dedup to remove any copies / old file backups and then I want to compare what's on the new NAS to the other boxes and delete copies and move data that I don't have on the new NAS."
 If the storage has dedupe enabled, it will still store every duplicated file, but only the first copy will actually take up any space on the drive. Technically speaking, deduplication works at the block level, not the file level. To be able to compare blocks, the storage server needs to compute a checksum for every block, so there is a cost to deduplication: you need more CPU and RAM, and the storage will not perform as well. But what you are talking about is removing duplicate files, which is a different thing.
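 To make the block-level idea concrete, here is a minimal Python sketch, not how any particular NAS or filesystem implements it. The fixed 4 KiB block size, the in-memory `store` dict, and the function names `dedupe_blocks` and `rebuild` are all assumptions made for illustration. It shows why every block has to be checksummed (the CPU cost) and why the checksum table has to be kept somewhere (the RAM cost).

```python
from __future__ import annotations

import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; real systems differ


def dedupe_blocks(data: bytes, store: dict[str, bytes]) -> list[str]:
    """Split data into fixed-size blocks and keep each unique block once.

    Returns the ordered list of block checksums (the "recipe") needed
    to rebuild the original data from the shared store.
    """
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()  # one checksum per block
        store.setdefault(digest, block)  # only the first copy uses space
        recipe.append(digest)
    return recipe


def rebuild(recipe: list[str], store: dict[str, bytes]) -> bytes:
    """Reassemble the original data from its block checksums."""
    return b"".join(store[digest] for digest in recipe)


# Two "files" with identical content share all of their blocks.
store: dict[str, bytes] = {}
file_a = b"A" * 8192 + b"B" * 4096
file_b = b"A" * 8192 + b"B" * 4096
recipe_a = dedupe_blocks(file_a, store)
recipe_b = dedupe_blocks(file_b, store)
```

 Both files still "exist" as recipes, so nothing is removed from the directory listing. Only the space used by the repeated blocks is saved, which is why this does not give you one clean copy of the data.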
- 
 @pete-s said in Best Dedup Software:
 @eleceng said in Best Dedup Software: "I was thinking of moving that to the new NAS and run a dedup to remove any copies / old file backups and then I want to compare what's on the new NAS to the other boxes and delete copies and move data that I don't have on the new NAS."
 "If the storage has dedupe enabled, it will still store every duplicated file, but only the first copy will actually take up any space on the drive. Technically speaking, deduplication works at the block level, not the file level. To be able to compare blocks, the storage server needs to compute a checksum for every block, so there is a cost to deduplication: you need more CPU and RAM, and the storage will not perform as well. But what you are talking about is removing duplicate files, which is a different thing."
 Some dedupe works at the file level. That's how it always used to be done. But that's not what he's describing here at all, as you said.
- 
 @eleceng said in Best Dedup Software: "Any opinions on the dedupe software and the search software or appliance?"
 For this kind of thing, you don't need software, for a couple of reasons. First, you need to manually maintain the master file system for them to share anyway. Second, you can just write a script that compares checksums and produces the master list of files for you to store. No need for extra software, as the OS can already do this.
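 A checksum-comparison script along those lines could look roughly like the Python sketch below. It is only a starting point under stated assumptions: the function names (`file_checksum`, `index_tree`, `duplicates`, `missing_from_master`) are made up for this example, the mount points in the usage comments are hypothetical, and it assumes every NAS share is mounted on the machine running it. It only reports; it never deletes or moves anything.

```python
import hashlib
import os
from collections import defaultdict


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root):
    """Map checksum -> sorted list of regular files under root."""
    index = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                index[file_checksum(path)].append(path)
    return {digest: sorted(paths) for digest, paths in index.items()}


def duplicates(index):
    """Checksums that appear more than once within a single tree."""
    return {d: paths for d, paths in index.items() if len(paths) > 1}


def missing_from_master(master_index, other_index):
    """One path per unique content in the other tree that the master lacks."""
    return sorted(
        paths[0]
        for digest, paths in other_index.items()
        if digest not in master_index
    )


# Example usage (hypothetical mount points, adjust to the real shares):
# master = index_tree("/mnt/new_nas")
# old_box = index_tree("/mnt/old_nas_3")
# print(duplicates(master))                    # extra copies on the new NAS
# print(missing_from_master(master, old_box))  # files still to copy over
```

 Hashing every file is slow on large shares, so a common refinement is to group files by size first and only checksum files whose sizes match. Review the output by hand before deleting anything, since identical content under different names or folders may be intentional.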
