
    Checking multiple Directories to confirm all files are identical

    IT Discussion
    Tags: windows, comparison, file management, powershell
    • DustinB3403 @dafyre

      @dafyre said in Checking multiple Directories to confirm all files are identical:

      @dustinb3403 That's what I was thinking.

      You'll still be in the shape of how do you compare two stupidly large files, though.

      Yeah, while that is certainly a part of the challenge, the larger portion is just checking to see if the bulk is all aligned and matching...

      If any tooling had some way to "skip large files" and just jot down their names then a simple stare and compare might work in that case.

      • flaxking @DustinB3403

        @flaxking said in Checking multiple Directories to confirm all files are identical:

        I would think you would be able to use robocopy to do a diff

        @dustinb3403 said in Checking multiple Directories to confirm all files are identical:

        Probably, but the issue still comes down to system resources.

        Anything that is storing in memory will quickly consume the available resources.

        Maybe if I pipe the output to a file it won't be so bad..

        @flaxking said in Checking multiple Directories to confirm all files are identical:

        It's bound to be a lot more efficient than your powershell.

        @dustinb3403 said in Checking multiple Directories to confirm all files are identical:

        It's still going to consume more RAM than any host in the environment has to process the job. Just between any 2 directories there's over 20 million files.

        I don't know how it's implemented, so I can't say. Just create a new PowerShell script that doesn't store as much in memory. I think if you pipe to ForEach-Object, it starts operating before Get-ChildItem has returned all the objects, as long as you don't store those objects in a variable. So garbage collection may start before you are done.
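A minimal sketch of that streaming idea, for illustration only. The function name `Write-FileList` and its parameters are made up here, not something from the thread. Nothing is assigned to a variable, so each file object should be free for collection once its line has been written; how much memory this actually saves at 20 million files is untested.

```powershell
function Write-FileList {
    param(
        [Parameter(Mandatory)] [string] $Root,     # directory tree to walk
        [Parameter(Mandatory)] [string] $OutFile   # text file to write, ideally outside $Root
    )

    # Objects move down the pipeline one at a time instead of being
    # collected into a variable first.
    Get-ChildItem -LiteralPath $Root -Recurse -File -ErrorAction Continue |
        ForEach-Object { $_.FullName } |
        Set-Content -LiteralPath $OutFile
}
```

Hypothetical usage: `Write-FileList -Root 'D:\Files' -OutFile 'C:\D-Files-list.txt'`. The thing to avoid is `$files = Get-ChildItem ...`, which holds the whole listing in memory before anything else runs.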

        • dafyre @DustinB3403

          @dustinb3403 said in Checking multiple Directories to confirm all files are identical:

          Yeah, while that is certainly a part of the challenge, the larger portion is just checking to see if the bulk is all aligned and matching...

          If any tooling had some way to "skip large files" and just jot down their names then a simple stare and compare might work in that case.

          So are you looking to compare bit for bit, or just file name and size?

          • DustinB3403 @dafyre

            @dafyre said in Checking multiple Directories to confirm all files are identical:

            So are you looking to compare bit for bit, or just file name and size?

            Name, size, and date, ideally. Bit for bit is overkill, and I can't imagine the client would want to wait for who knows how long to get an answer for this.
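A rough sketch of a name, size, and date listing that could be produced on each server and compared afterwards. `Export-MetadataManifest`, the column names, and the CSV layout are assumptions for illustration. Paths are written relative to the root so that listings taken from different servers or shares can line up.

```powershell
function Export-MetadataManifest {
    param(
        [Parameter(Mandatory)] [string] $Root,     # directory tree to describe
        [Parameter(Mandatory)] [string] $OutFile   # CSV to write, ideally outside $Root
    )

    # Resolve once; the trimmed length is only used to cut the root prefix off each path.
    $rootPath = (Resolve-Path -LiteralPath $Root).ProviderPath.TrimEnd([char[]]'\/')

    Get-ChildItem -LiteralPath $Root -Recurse -File -ErrorAction Continue |
        ForEach-Object {
            [pscustomobject]@{
                RelativePath = $_.FullName.Substring($rootPath.Length + 1)
                Length       = $_.Length
                LastWriteUtc = $_.LastWriteTimeUtc.ToString('o')   # round-trip ISO 8601 format
            }
        } |
        Export-Csv -LiteralPath $OutFile -NoTypeInformation
}
```

No file contents are read, so this should be far cheaper than hashing. The trade-off is that two files with the same name, size, and timestamp but different contents would go unnoticed.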

            • Dashrender

              One thought: run an MD5 hash and output the filename, date, and hash to a file, then compare the contents of the files between the servers.

              You could run the job individually on each server so all three-plus devices could run at once, assuming they are not all on the same VM host.

              • DustinB3403 @Dashrender

                @dashrender said in Checking multiple Directories to confirm all files are identical:

                One thought: run an MD5 hash and output the filename, date, and hash to a file, then compare the contents of the files between the servers.

                You could run the job individually on each server so all three-plus devices could run at once, assuming they are not all on the same VM host.

                That actually isn't a bad idea. It's still time consuming, but it would probably be way more lightweight than trying to perform a live comparison between the systems.

                Just use something like Meld to compare the text files after the fact.

                • DustinB3403 @Dashrender

                  @dashrender said in Checking multiple Directories to confirm all files are identical:

                  One thought: run an MD5 hash and output the filename, date, and hash to a file, then compare the contents of the files between the servers.

                  You could run the job individually on each server so all three-plus devices could run at once, assuming they are not all on the same VM host.

                  This appears to be working, thanks for the idea!

                  dir D:\Files -Recurse | Get-FileHash -ea Continue > C:\D-Files.txt
                  

                  This dumps out one large file. I could then use a file comparison tool to quickly check these outputs.
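A hedged variation on the one-liner above, not the command that was actually run. Two things worth knowing: `Get-FileHash` uses SHA256 unless `-Algorithm` is given, and redirecting with `>` writes the formatted table view, which may cut off long paths. The sketch below writes CSV instead and notes oversized files by name rather than hashing them, as wished for earlier in the thread. `Export-HashManifest`, its parameters, and the 1GB default are illustrative assumptions.

```powershell
function Export-HashManifest {
    param(
        [Parameter(Mandatory)] [string] $Root,         # directory tree to hash
        [Parameter(Mandatory)] [string] $OutFile,      # CSV of Path and Hash, ideally outside $Root
        [Parameter(Mandatory)] [string] $SkippedFile,  # names of files that were too large to hash
        [long] $MaxBytes = 1GB                         # arbitrary threshold chosen for illustration
    )

    Get-ChildItem -LiteralPath $Root -Recurse -File -ErrorAction Continue |
        ForEach-Object {
            if ($_.Length -gt $MaxBytes) {
                # Too large to hash cheaply: jot the name down for a manual check.
                Add-Content -LiteralPath $SkippedFile -Value $_.FullName
            }
            else {
                Get-FileHash -LiteralPath $_.FullName -Algorithm MD5 -ErrorAction Continue
            }
        } |
        Select-Object Path, Hash |
        Export-Csv -LiteralPath $OutFile -NoTypeInformation
}
```

The `Path` column holds full paths, so the outputs only line up directly when every server uses the same root, such as `D:\Files`.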

                  • Dashrender @DustinB3403

                    @dustinb3403 You'll likely have to do some type of line fixup, i.e. if a file is missing, then every line after that would be a mismatch...

                    • DustinB3403 @Dashrender

                      @dashrender said in Checking multiple Directories to confirm all files are identical:

                      @dustinb3403 You'll likely have to do some type of line fixup, i.e. if a file is missing, then every line after that would be a mismatch...

                      Maybe. Meld has actually been pretty good about that type of "issue" and can find the record anyway.
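If a line-based diff does get thrown off by missing files, one alternative is to match records by path instead of by line position. A minimal sketch, assuming both outputs are CSV files with `Path` and `Hash` columns (the thread's actual output was a plain text dump, so that layout is an assumption). `Compare-Manifest` and the status labels are hypothetical names.

```powershell
function Compare-Manifest {
    param(
        [Parameter(Mandatory)] [string] $ReferenceCsv,   # manifest treated as the baseline
        [Parameter(Mandatory)] [string] $DifferenceCsv   # manifest checked against it
    )

    # Index the reference side by path. This side is held in memory.
    # PowerShell hashtable literals compare string keys case-insensitively.
    $reference = @{}
    Import-Csv -LiteralPath $ReferenceCsv |
        ForEach-Object { $reference[$_.Path] = $_.Hash }

    # Stream the other side and look each path up, so a missing file
    # cannot shift every later record out of alignment.
    Import-Csv -LiteralPath $DifferenceCsv | ForEach-Object {
        if (-not $reference.ContainsKey($_.Path)) {
            [pscustomobject]@{ Path = $_.Path; Status = 'OnlyInDifference' }
        }
        elseif ($reference[$_.Path] -ne $_.Hash) {
            [pscustomobject]@{ Path = $_.Path; Status = 'HashMismatch' }
        }
        # Drop every path that was seen; leftovers exist only in the reference.
        $reference.Remove($_.Path)
    }

    $reference.GetEnumerator() | ForEach-Object {
        [pscustomobject]@{ Path = $_.Key; Status = 'OnlyInReference' }
    }
}
```

With 20 million entries the in-memory index is itself large, so this may need to be run per top-level folder rather than on the whole tree at once.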

                      • Obsolesce @DustinB3403

                        @dustinb3403 said in Checking multiple Directories to confirm all files are identical:

                        In a Windows environment, if you wanted to check multiple network directories, with millions of files ranging in size from tiny (a few KB) to large (4 GB+), how would you do it?

                        If you need to search files on a server over a network share, you should install the Windows Search feature on the server and ensure the shared files and locations are indexed. That will greatly speed up searching from Win10 computers.

                        • Obsolesce @DustinB3403

                          @dustinb3403 said in Checking multiple Directories to confirm all files are identical:

                          Ideally I'd like to compare them all at once, but setting the "golden standard" here may be difficult.

                          Wait, what? Why do you have multiple directories that should have identical data? That makes absolutely no sense. There are methods to use in file server administration to avoid this...

                          • Obsolesce @DustinB3403

                            @dustinb3403 said in Checking multiple Directories to confirm all files are identical:

                            I know I could use a tool like Create-Synchronicity to force 1 other directory to match the source, but I would prefer to find and list the differences in the directories.
                            Maybe powershell can help?

                            Yeah, PowerShell can help with this in the same way closing the front door of a house will fix a fire inside of it.
