Linux: File Compression Utilities
-
File compression is a common task on any platform. On Linux this is almost always handled by a set of small, discrete utilities. Each utility typically handles a single compression algorithm, which keeps each one as small as possible. The common utilities include gzip, bzip2 and xz.
Using any of these commands is very simple. They require only the command name and the file that we want to compress. Let's look at a file that we have; the du -b command will tell us exactly how big the original file is in bytes.
$ du -b hostedrpm.out
19815 hostedrpm.out
Now let's use gzip to compress this file:
$ gzip hostedrpm.out
$ du -b hostedrpm.out.gz
6558 hostedrpm.out.gz
What you should notice in doing this is that the gzip command was very simple, and that the file we had was compressed and had .gz appended to the end of its name. If you look in the directory where you try this, you will notice that the original file has disappeared. This is an "in place" compression process.
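If the original file needs to stay around, one common approach is to have gzip write to standard output and redirect that into a new file. The following is a minimal sketch, not part of the original walkthrough; the scratch directory and the file name sample.txt are made up for illustration.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)

# sample.txt is a hypothetical test file, filled with the numbers 1 to 5000.
seq 1 5000 > "$workdir/sample.txt"

# -c sends the compressed data to standard output instead of replacing the file,
# so the original is left in place next to the new .gz file.
gzip -c "$workdir/sample.txt" > "$workdir/sample.txt.gz"

ls -l "$workdir"
```

Recent versions of gzip, as well as bzip2 and xz, also accept a -k option that keeps the input file; check the man page on your system before relying on it.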
Uncompressing is simple and uses the gunzip command.
$ gunzip hostedrpm.out.gz
And voila, we have our original file back again, just as it was before. Now we can try this same process with bzip2.
$ bzip2 hostedrpm.out
$ du -b hostedrpm.out.bz2
5571 hostedrpm.out.bz2
Here we can see the same process has happened. In this case the file extension appended is .bz2. And to decompress our file again...
$ bunzip2 hostedrpm.out.bz2
The big difference between gzip and bzip2 is that gzip focuses on being fast, with compression levels that are normally not that impressive, while bzip2 focuses on heavy compression and takes more time. Of course, compression and speed vary by file type, but gzip is almost universally faster (read: uses less CPU time) and bzip2 nearly always compresses a bit better.
The newcomer is xz, which utilises the LZMA compression algorithm, the same as 7Zip. (We can use 7Zip itself on UNIX, but 7Zip uses its own packaging format that does not fully support UNIX file metadata such as ownership, so it is generally only used to open files made on Windows.) The xz utility works identically to its compression siblings.
$ xz hostedrpm.out
$ du -b hostedrpm.out.xz
6144 hostedrpm.out.xz
$ unxz hostedrpm.out.xz
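To compare the three utilities on the same input without compressing and decompressing by hand each time, a small loop can help. This is only a sketch under stated assumptions: sample.txt is a made-up test file, and any tool that is not installed is skipped. The sizes it reports depend entirely on the input, so they will not match the hostedrpm.out numbers above.

```shell
# Scratch directory and a hypothetical sample file (the numbers 1 to 20000).
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
seq 1 20000 > "$sample"

orig_size=$(wc -c < "$sample")
echo "original: $orig_size bytes"

# -c writes compressed data to standard output, so the sample file is never
# replaced and wc -c can count the compressed bytes directly.
for tool in gzip bzip2 xz; do
    if command -v "$tool" > /dev/null 2>&1; then
        size=$("$tool" -c "$sample" | wc -c)
        echo "$tool: $size bytes"
    else
        echo "$tool: not installed, skipped"
    fi
done
```

Prefixing the compression command with the shell's time keyword is one way to see the speed side of the trade-off as well.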
All of these compression utilities are lossless and can be safely used with any file. By and large, gzip is used for nearly all tasks and has the greatest compatibility across different UNIX variants.
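The lossless claim is easy to check for yourself. The sketch below, which uses a made-up sample.txt rather than the file from the article, keeps a reference copy, runs a gzip and gunzip round trip, and then compares the result byte for byte with cmp.

```shell
# Scratch directory with a hypothetical sample file and a reference copy of it.
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
seq 1 5000 > "$sample"
cp "$sample" "$workdir/reference.txt"

gzip "$sample"          # replaces sample.txt with sample.txt.gz
gunzip "$sample.gz"     # restores sample.txt

# cmp -s is silent and exits 0 only when the two files are byte-for-byte equal.
if cmp -s "$sample" "$workdir/reference.txt"; then
    result="identical"
else
    result="different"
fi
echo "round trip: $result"
```

The same check should work with bzip2/bunzip2 and xz/unxz by swapping the commands and the file extension.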
On Windows, we are used to compression and archiving tools being combined into a single package. This is a great example of how the Windows mindset favours monolithic utilities and UNIX favours modular ones. These utilities only handle compression and decompression of individual files. To get Windows Zip-like functionality, we combine these utilities with an archiving utility, which we will explore separately. UNIX also has all of the same utilities and functions as Windows, which we will cover as well.
Part of a series on Linux Systems Administration by Scott Alan Miller
-
Any particular reason you left tar out of this set?
-
@dafyre said in Linux: File Compression Utilities:
Any particular reason you left tar out of this set?
-
@DustinB3403 said in Linux: File Compression Utilities:
@dafyre said in Linux: File Compression Utilities:
Any particular reason you left tar out of this set?
I just saw that go up, ha ha ha.
-
@dafyre said in Linux: File Compression Utilities:
Any particular reason you left tar out of this set?
It's not compression
-
@dafyre said in Linux: File Compression Utilities:
@DustinB3403 said in Linux: File Compression Utilities:
@dafyre said in Linux: File Compression Utilities:
Any particular reason you left tar out of this set?
I just saw that go up, ha ha ha.
I put the placeholder into the main list so that people would see it, as well.
-
@scottalanmiller said in Linux: File Compression Utilities:
@dafyre said in Linux: File Compression Utilities:
@DustinB3403 said in Linux: File Compression Utilities:
@dafyre said in Linux: File Compression Utilities:
Any particular reason you left tar out of this set?
I just saw that go up, ha ha ha.
I put the placeholder into the main list so that people would see it, as well.
Odd, I don't remember seeing it. I've been up since 5:30am and just got done replacing a UPS, so my brain is still not in gear yet.
-
It doesn't update as "new" when I just update the list. Only if I comment on the thread.