    Data archive is not backup! What do you use?

    IT Discussion
    • Francesco Provino

      I'm set for backups with Veeam (VM level) and rsnapshot/attic (file level), but now I'm facing the problem of archiving very old, seldom-read data that we must preserve for at least 5 years…

      I'm looking into zbackup, obnam, and also plain tar with encryption… I absolutely want to use only FLOSS tools for this.
      After archiving, I'll upload the data to something like Glacier/B2, and to local cold storage of course.

      What do you think about that? What do you use for long-term archiving?

      • JaredBusch

        I would just pick a compression method and zip them up and upload.

        This could easily be scripted with PowerShell or Bash, depending on your OS.
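As a rough illustration of the "compress and upload" approach, here is a minimal Python sketch that packs a directory into a `.tar.bz2` and writes a SHA-256 checksum next to it, so the archive can be verified years later. The function name `archive_directory` and the `.sha256` sidecar file are assumptions made for this example, not something from the thread; the upload step is left out.

```python
import hashlib
import tarfile
from pathlib import Path


def archive_directory(source_dir: str, archive_path: str) -> str:
    """Pack source_dir into a bzip2-compressed tarball and return its SHA-256."""
    source = Path(source_dir)
    # "w:bz2" makes tarfile write a bzip2-compressed archive.
    with tarfile.open(archive_path, "w:bz2") as tar:
        # arcname stores paths relative to the directory's own name.
        tar.add(source, arcname=source.name)

    # Hash the finished archive in 1 MiB blocks so large files do not fill memory.
    digest = hashlib.sha256()
    with open(archive_path, "rb") as fh:
        for block in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(block)

    # Hypothetical convention: keep the checksum in a sidecar file,
    # in the "<hash>  <filename>" layout that sha256sum uses.
    sidecar = Path(archive_path + ".sha256")
    sidecar.write_text(f"{digest.hexdigest()}  {Path(archive_path).name}\n")
    return digest.hexdigest()
```

Calling `archive_directory("old_project", "old_project.tar.bz2")` would leave two files to upload: the archive and its checksum.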

        • matteo nunziati

          +1 for tar.bz2, you can encrypt it if you want.

          • Francesco Provino @matteo nunziati

            @matteo-nunziati said in Data archive is not backup! What do you use?:

            +1 for tar.bz2, you can encrypt it if you want.

            I agree, it's fairly common and almost every distro ships it by default. I've read on many sites that xz and other LZMA-based compressors compress even more, but there are some doubts about recoverability: the format is more complex, and it's at least 10x slower…
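One way to check the size and speed trade-off on your own data, rather than relying on figures from other sites, is to time both compressors on a sample. This is a minimal sketch using Python's standard `bz2` and `lzma` modules with their default settings; the function name `compare_compressors` is made up for the example, and results will vary a lot with the data and the compression level.

```python
import bz2
import lzma
import time


def compare_compressors(data: bytes) -> dict:
    """Return compressed size and elapsed seconds for bzip2 and xz on data."""
    results = {}
    for name, compress, decompress in (
        ("bzip2", bz2.compress, bz2.decompress),
        ("xz", lzma.compress, lzma.decompress),  # lzma defaults to the .xz format
    ):
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Round-trip check: the archive must restore the input exactly.
        assert decompress(packed) == data
        results[name] = {"bytes": len(packed), "seconds": elapsed}
    return results
```

Feeding it a representative sample, for example `compare_compressors(open("sample.bin", "rb").read())` with a file of your own, gives numbers that apply to your dataset.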

            • scottalanmiller

              We use B2 for our long-term cold storage as well.

              • scottalanmiller @Francesco Provino

                @Francesco-Provino said in Data archive is not backup! What do you use?:

                I agree, it's fairly common and almost every distro ships it by default. I've read on many sites that xz and other LZMA-based compressors compress even more, but there are some doubts about recoverability: the format is more complex, and it's at least 10x slower…

                BZ is so good. Probably not worth pushing it farther in most cases.

                • scottalanmiller

                  I think that the biggest question will be around how you select what to archive and manage it.
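As one possible starting point for the selection question, here is a sketch that lists files not modified for a given number of days. Using modification time as the criterion is an assumption for the example (the thread does not say how candidates should be chosen), and the function name `find_archive_candidates` is hypothetical.

```python
import os
import time
from pathlib import Path


def find_archive_candidates(root: str, min_age_days: float) -> list:
    """Return files under root last modified more than min_age_days ago."""
    cutoff = time.time() - min_age_days * 86400  # 86400 seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # skip symlinks; their targets may be missing
            if path.stat().st_mtime < cutoff:
                candidates.append(path)
    return sorted(candidates)
```

For example, `find_archive_candidates("/srv/share", 5 * 365)` would list files untouched for roughly five years; the path is only a placeholder.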

                  • Francesco Provino @scottalanmiller

                    @scottalanmiller said in Data archive is not backup! What do you use?:

                    BZ is so good. Probably not worth pushing it farther in most cases.

                    Now experimenting with ZPAQ...

                    • StrongBad @Francesco Provino

                      @Francesco-Provino said in Data archive is not backup! What do you use?:

                      Now experimenting with ZPAQ...

                      How is it?

                      • Francesco Provino @StrongBad

                        @StrongBad said in Data archive is not backup! What do you use?:

                        How is it?

                        This one. It's included by default in most Linux distros.
                        It does deduplication… but it really takes forever: 11 hours to compress 43 GB to 21 GB, using 100% CPU on a 4-core Xeon E3.

                        I think I'll stick with bzip2 or LZMA (xz) for now; I don't think higher compression ratios are really worth the price.
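To put the ZPAQ run in perspective, the figures above (43 GB in, 21 GB out, 11 hours) work out as follows. The helper name `compression_run_stats` is made up for this example, and decimal units (1 GB = 1000 MB) are assumed.

```python
def compression_run_stats(original_gb: float, compressed_gb: float, hours: float):
    """Return (compressed/original size ratio, input throughput in MB/s)."""
    ratio = compressed_gb / original_gb
    # Assumes decimal units: 1 GB = 1000 MB.
    throughput_mb_s = original_gb * 1000 / (hours * 3600)
    return ratio, throughput_mb_s


ratio, throughput = compression_run_stats(43, 21, 11)
print(f"ratio: {ratio:.2f}, throughput: {throughput:.2f} MB/s")
# prints "ratio: 0.49, throughput: 1.09 MB/s"
```

So the archive ends up at about half the original size, at roughly 1 MB/s of input across all four cores.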

                        • scottalanmiller

                          Wow, that is a lot of CPU!!

                          • Francesco Provino @scottalanmiller

                            @scottalanmiller said in Data archive is not backup! What do you use?:

                            Wow, that is a lot of CPU!!

                            Yes, maybe because I also set the chunk compression to the maximum ratio.

                            • Francesco Provino

                              I think deduplication is not worth the CPU/RAM cost in most cases (thinking of ZFS, for example).

                              • scottalanmiller @Francesco Provino

                                @Francesco-Provino said in Data archive is not backup! What do you use?:

                                I think deduplication is not worth the CPU/RAM cost in most cases (thinking of ZFS, for example).

                                That's generally true. Most storage vendors agree with you when the engineers are talking. Salespeople, of course, love selling deduplication.

                                • scottalanmiller

                                  Deduplication tends to be good for archival data or as an offline process that runs only during idle times directly on the storage. Inline dedupe is rarely worth it.

                                  • Francesco Provino @scottalanmiller

                                    @scottalanmiller said in Data archive is not backup! What do you use?:

                                    Deduplication tends to be good for archival data or as an offline process that runs only during idle times directly on the storage. Inline dedupe is rarely worth it.

                                    Deduplication makes the archives much more fragile. A bit flip in the right chunk can potentially blow the whole archive.

                                    What percentage of gained space is worth the loss of recoverability?
                                    With B2 at 0.005 and Glacier at 0.004, and magnetic and tape storage still getting cheaper, why add complexity and risk for a small saving? For my dataset, which is a typical SMB one, the space gained is ~10% or less compared with LZMA compression.
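The cost argument can be made concrete with a little arithmetic. This sketch reads the quoted prices as US dollars per GB per month, which is an assumption since the post gives no unit, and uses a hypothetical 1000 GB archive kept for the 5 years mentioned at the top of the thread. Retrieval and request fees are ignored.

```python
def storage_cost(size_gb: float, price_per_gb_month: float, months: int) -> float:
    """Total storage cost for keeping size_gb for the given number of months."""
    return size_gb * price_per_gb_month * months


RETENTION_MONTHS = 5 * 12  # 5-year retention, from the first post
ARCHIVE_GB = 1000          # hypothetical archive size, not from the thread

for name, price in (("B2", 0.005), ("Glacier", 0.004)):
    full = storage_cost(ARCHIVE_GB, price, RETENTION_MONTHS)
    # What a 10% smaller archive would save over the whole retention period.
    saved = full - storage_cost(ARCHIVE_GB * 0.9, price, RETENTION_MONTHS)
    print(f"{name}: {full:.2f} total, {saved:.2f} saved by a 10% smaller archive")
```

Under those assumptions a terabyte costs about 300 on B2 or 240 on Glacier over five years, and the extra 10% from deduplication saves about 30 or 24 respectively.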

                                    • matteo nunziati @Francesco Provino

                                      @Francesco-Provino I've never used those (B2, Glacier). How do you access them? REST API? A client? Anything special required?

                                      • scottalanmiller @Francesco Provino

                                        @Francesco-Provino said in Data archive is not backup! What do you use?:

                                        Deduplication makes the archives much more fragile. A bit flip in the right chunk can potentially blow the whole archive.

                                        Not really, it would only impact deduped data. For data that is stored many, many times, yes, each copy would be affected, but only data that was all the same.
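A toy model may help picture this. The sketch below stores files as lists of chunk hashes in a shared chunk store, then corrupts one shared chunk and reports which files are hit. It is only an illustration with a made-up fixed chunk size, not how ZPAQ or any particular tool lays out its data, and it does not model damage to the archive's own index or metadata.

```python
import hashlib

CHUNK_SIZE = 4  # tiny, hypothetical chunk size so the example stays readable


def store_file(data: bytes, chunk_store: dict) -> list:
    """Split data into fixed-size chunks, keep each unique chunk once, return the recipe."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(key, chunk)  # identical chunks share one stored copy
        recipe.append(key)
    return recipe


def damaged_files(recipes: dict, chunk_store: dict) -> list:
    """Return names of files that reference at least one chunk failing its hash check."""
    bad = {k for k, c in chunk_store.items() if hashlib.sha256(c).hexdigest() != k}
    return sorted(name for name, recipe in recipes.items() if bad.intersection(recipe))


store = {}
recipes = {
    "a.txt": store_file(b"AAAABBBB", store),
    "b.txt": store_file(b"AAAACCCC", store),
    "c.txt": store_file(b"DDDDEEEE", store),
}
shared_key = hashlib.sha256(b"AAAA").hexdigest()
store[shared_key] = b"AAAB"  # simulate corruption of the chunk shared by a.txt and b.txt
print(damaged_files(recipes, store))  # prints ['a.txt', 'b.txt']
```

Both files that shared the damaged chunk are affected, while `c.txt`, which shares nothing with them, is untouched.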

                                        • scottalanmiller @matteo nunziati

                                          @matteo-nunziati said in Data archive is not backup! What do you use?:

                                          @Francesco-Provino I've never used those (B2, Glacier). How do you access them? REST API? A client? Anything special required?

                                          They are basically the same as S3. We use B2 and we access it via the API. There is a toolkit for Linux which is super easy to use.

                                          • scottalanmiller

                                            https://mangolassi.it/topic/9210/getting-started-with-backblaze-b2-cli
