How to Manage Compressed tar Archives in Linux

NBH Support

Archiving files is very important, especially when creating backups and transferring files across networks. It lets you bundle multiple files into a single file rather than having to send them one by one.

Let me take you back to Windows, which almost everyone is conversant with. On Windows there is a well-known compression tool called WinRAR, which compresses files to the rar, rar4, and zip formats. Linux systems work in a similar way, but with the tar tool, which bundles files into an archive and can compress that archive with gzip, bzip2, or xz. The tar command can also list the contents of an archive or extract them to a specified location.

In this article, we will discuss how to create new compressed archive files and how to extract an existing compressed archive to a location.

HOW TO ARCHIVE FILES AND DIRECTORIES WITH THE TAR COMMAND

Before getting to know how the tar command can be used, let’s understand the different options which are required to use the tar command.

To use the tar command, you combine two or more of the following options:

c: creates an archive

t: lists the contents of an archive

x: extracts the contents of an archive

f: stands for file. It must be followed by the name of the archive file that you want tar to operate on

v: stands for verbose. It prints the name of each file as it is added to or extracted from the archive, so you can follow the progress. It is optional, but it is a useful option to know.

Note that the tar command overwrites an existing archive without any warning. Before you create an archive, make sure that no file with the same name already exists in the directory where the new archive will be created.

The first option when creating an archive is the c option, followed by the f option and the archive name, and finally the list of files you want to add to the archive. If you don't want the archive to be saved in the current directory, you can give the archive name as an absolute or relative path that points to the directory where it should be stored.

Please note that when you archive a file by its absolute path, tar removes the leading forward slash (/) from the stored file name by default, so the path is saved as a relative path. This prevents the archive from overwriting essential system files when it is extracted later.

You can see from the command below how the forward slash is removed when you run the tar command
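The original listing is not available here, so the following is only a minimal sketch of the behaviour. It assumes GNU tar and works inside a scratch directory created with mktemp; the names workdir, data, and notes.txt are made up for illustration.

```shell
# Work in a throwaway directory so nothing on the real system is touched.
workdir=$(mktemp -d)
mkdir "$workdir/data"
echo "sample text" > "$workdir/data/notes.txt"

# Archive the file by its absolute path. GNU tar usually prints a notice
# on standard error saying that it is removing the leading slash.
tar -cf "$workdir/notes.tar" "$workdir/data/notes.txt"

# List the archive: the stored name no longer starts with "/".
tar -tf "$workdir/notes.tar"
```

Because the stored path is relative, extracting this archive recreates the file underneath whatever directory you are in at the time, not at its original absolute location.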

Now let’s try our first simple step in creating an archive of multiple files
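As a minimal sketch (the file names file1.txt, file2.txt, file3.txt and the archive name archive.tar are placeholders, not taken from the original article), creating an archive of several files could look like this:

```shell
# Create three small sample files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
echo "one" > file1.txt
echo "two" > file2.txt
echo "three" > file3.txt

# c = create, f = the next argument is the archive name,
# followed by the files to put into the archive.
tar -cf archive.tar file1.txt file2.txt file3.txt

# Confirm that the archive was written.
ls -l archive.tar
```

The original files stay in place; tar only copies them into archive.tar.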

You can also archive an entire directory and its contents
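A small sketch of archiving a directory follows. The directory name human comes from the text; its contents and the archive name human.tar are assumptions made for the example.

```shell
# Build a sample directory called "human" in a scratch location.
workdir=$(mktemp -d)
cd "$workdir"
mkdir human
echo "Ada" > human/names.txt
echo "36" > human/ages.txt

# Passing a directory name archives the directory and everything inside it.
tar -cf human.tar human

# Show what ended up in the archive.
tar -tf human.tar
```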

The command above archives only the directory "human". You could archive all the configuration files in the /etc directory using the same method. However, you should note that tar can only archive files that the user running the command is able to read. For example, some files under /etc can only be read by the root user, so archiving the whole directory requires root privileges. Without them, tar skips the files that the user cannot read and the directories for which the user lacks read and execute permissions.

By default, the tar command retains file ownership and permissions. However, extended attributes such as SELinux contexts are not stored by default. To store them, include the --xattrs option when creating the archive.

HOW TO LIST THE CONTENT OF A TAR ARCHIVE

To list the contents of an archive, use the t option followed by the f option, a single space, and then the name of the archive you want to inspect.

The command below shows a typical example of how it is done.
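The original command is not shown here, so this is a hedged sketch rather than the author's exact example. It first builds a small archive (docs.tar and its contents are placeholder names) so that there is something to list.

```shell
# Prepare a small archive to inspect.
workdir=$(mktemp -d)
cd "$workdir"
mkdir docs
echo "first" > docs/a.txt
echo "second" > docs/b.txt
tar -cf docs.tar docs

# t = list, f = the next argument is the archive name.
tar -tf docs.tar

# Adding v prints a long listing with permissions, owner, size, and date.
tar -tvf docs.tar
```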

HOW TO EXTRACT THE FILES CREATED BY TAR COMMAND

It is best to extract tar archives into an empty directory so that the extracted files cannot overwrite existing files with the same names. So before you extract a tar archive, create a new directory and change into it. Also note that when the root user extracts an archive, tar preserves the original owner and group owner of each file. If a regular user runs the command, the extracted files are owned by that user.

Let's put this into practice for a clearer understanding of the concept
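Here is a minimal sketch of the workflow, assuming a sample archive named project.tar and an empty target directory named restore (both names are invented for this example).

```shell
# Build a sample archive first.
workdir=$(mktemp -d)
cd "$workdir"
mkdir project
echo "hello" > project/readme.txt
tar -cf project.tar project

# Create an empty directory, change into it, and extract there.
mkdir restore
cd restore
# x = extract, f = the next argument is the archive name.
tar -xf ../project.tar

# The directory tree from the archive now exists under restore/.
ls -R
```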

The command above shows how a tar archive can be extracted. Please note that when a regular user extracts an archive, the user's umask is subtracted from the permissions stored in the archive, so the extracted files can end up with fewer permissions than the originals. This is a safety measure that keeps extracted files from being more accessible than the user's defaults allow. To keep the permissions exactly as they are stored in the archive, include the p option when extracting your files. When root extracts an archive, this is already the default.

Let’s take a practical example of how this is done
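The sketch below compares extraction with and without the p option. The file script.sh, the umask value 027, and the directory names plain and with_p are assumptions chosen for the demonstration, and the comments describe what a regular (non-root) user would typically see.

```shell
# Create a file with very open permissions and archive it.
workdir=$(mktemp -d)
cd "$workdir"
echo 'echo hi' > script.sh
chmod 777 script.sh
tar -cf script.tar script.sh

# Run the extractions in a subshell so the umask change stays local.
(
    umask 027
    mkdir plain with_p

    # Without p: a regular user usually gets 750 here (777 minus umask 027).
    (cd plain && tar -xf ../script.tar)

    # With p: the permissions stored in the archive (777) are restored as-is.
    (cd with_p && tar -xpf ../script.tar)
)

# Compare the two results.
ls -l plain/script.sh with_p/script.sh
```

If you run this as root, both copies are expected to keep mode 777, because preserving permissions is the default for root.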

HOW TO CREATE A COMPRESSED TAR ARCHIVE

As mentioned earlier, the tar command supports several compression algorithms. The most widely used is gzip. It is the oldest of the three and generally the fastest. bzip2 is less widely used and usually produces smaller archive files than gzip, at the cost of speed. xz is the newest of the three and typically offers the best compression ratio. All three are lossless, so extracted files are identical to the originals; files that are already compressed, such as most image formats, simply shrink very little.

To create a compressed archive, one of the following options should be specified in your command

z: selects the gzip compression algorithm. By convention the archive name ends with the .gz extension. Example: filename.tar.gz

j: selects the bzip2 compression algorithm. By convention the archive name ends with the .bz2 extension. Example: filename.tar.bz2

J: selects the xz compression algorithm. By convention the archive name ends with the .xz extension. Example: filename.tar.xz

Now let's learn how to create tar archives with each of these compression algorithms.

First, let's start with how to create a gzip-compressed tar archive. To create one, you need the c option for create, the z option to select the gzip algorithm, and the f option for the filename. By convention, the filename should end with the .tar.gz extension so that the compression type is obvious.

Let's create a gzip-compressed tar archive of the /var/log/messages file in our current directory
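Reading /var/log/messages normally requires root, so this sketch uses a small stand-in file named messages in a scratch directory. That substitution is an assumption made to keep the example runnable anywhere; on a real system you would pass /var/log/messages instead.

```shell
# Stand-in for /var/log/messages so the example runs without root.
workdir=$(mktemp -d)
cd "$workdir"
printf 'Jan  1 00:00:01 host kernel: sample log line\n' > messages

# c = create, z = compress with gzip, f = archive name.
tar -czf messages.tar.gz messages

# List the compressed archive to confirm its contents.
tar -tzf messages.tar.gz
```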

To create a bzip2 file, you need the j option alongside the other options as specified in the creation of an archive file.
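A comparable sketch for bzip2 is shown below. As before, the messages file is a stand-in for /var/log/messages, and the check for the bzip2 program is an addition of this example because GNU tar typically relies on it being installed.

```shell
# Stand-in log file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
printf 'Jan  1 00:00:01 host kernel: sample log line\n' > messages

# Only proceed if the bzip2 program is available on this system.
if command -v bzip2 >/dev/null 2>&1; then
    # c = create, j = compress with bzip2, f = archive name.
    tar -cjf messages.tar.bz2 messages
    tar -tjf messages.tar.bz2
else
    echo "bzip2 is not installed; skipping the example" >&2
fi
```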

To create the xz compressed file, you use the J option as seen below
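The original listing is missing, so the following is a sketch under the same assumptions as the earlier examples: a stand-in messages file, a scratch directory, and a check that the xz program is installed.

```shell
# Stand-in log file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
printf 'Jan  1 00:00:01 host kernel: sample log line\n' > messages

# Only proceed if the xz program is available on this system.
if command -v xz >/dev/null 2>&1; then
    # c = create, J (capital) = compress with xz, f = archive name.
    tar -cJf messages.tar.xz messages
    tar -tJf messages.tar.xz
else
    echo "xz is not installed; skipping the example" >&2
fi
```

Note the capital J: a lowercase j selects bzip2 instead.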

HOW TO EXCLUDE DIRECTORIES AND FILES WHILE USING THE TAR COMMAND

In some cases, you might want to exclude some files and directories while using the tar command. To do this, add the --exclude option to your command; it leaves out every file that matches the given pattern.

For example, let's say you are compressing the /etc directory and you need to exclude the cron.weekly and cron.monthly directories. You can do that as follows
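Archiving the real /etc needs root, so this sketch builds a small mock directory named etc in a scratch location. The mock contents and the archive name etc.tar.gz are assumptions; only the cron.weekly and cron.monthly names come from the text. It assumes GNU tar, where the --exclude options are placed before the directory they apply to.

```shell
# Build a mock "etc" tree so the example runs without root.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p etc/cron.weekly etc/cron.monthly etc/cron.daily
echo "weekly job" > etc/cron.weekly/job
echo "monthly job" > etc/cron.monthly/job
echo "daily job" > etc/cron.daily/job
echo "127.0.0.1 localhost" > etc/hosts

# c = create, z = gzip, v = verbose, f = archive name.
# Each --exclude takes a pattern; matching files and directories are skipped.
tar -czvf etc.tar.gz --exclude='cron.weekly' --exclude='cron.monthly' etc

# Verify: the two excluded directories should not appear in the listing.
tar -tzf etc.tar.gz
```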

The command excludes the specified directories while creating the compressed archive. You can try it on your system and go through the output to see the result. The name of each file is printed as it is processed because of the v option specified with the tar command.

HOW TO EXTRACT A COMPRESSED TAR ARCHIVE

Before extracting a compressed tar file, decide which directory you want to extract the files into. As said before, it is not recommended to extract an archive in the directory where it was created, to avoid overwriting existing files. Create a new directory, cd into it, and extract your files there.

Please note that when extracting a compressed tar file, GNU tar detects the compression algorithm automatically, so it is not strictly necessary to pass the same compression option that was used to create the file. Passing the matching option (z, j, or J) is still valid and makes the command explicit.

Now, let's decompress our already compressed /var/log/messages file using each method

First, let’s decompress the gzip-compressed file
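A minimal sketch follows. It recreates a gzip-compressed archive of a stand-in messages file so that the example is self-contained; the directory name gzip_extract is hypothetical.

```shell
# Recreate a small gzip-compressed archive to extract.
workdir=$(mktemp -d)
cd "$workdir"
printf 'sample log line\n' > messages
tar -czf messages.tar.gz messages

# Extract into a new, empty directory.
mkdir gzip_extract
cd gzip_extract
# x = extract, z = gzip, f = archive name.
tar -xzf ../messages.tar.gz

ls -l
```

An archive made from the real /var/log/messages would extract to var/log/messages beneath the current directory, because the leading slash is stripped when the archive is created.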

Let's extract the contents of our bzip2-compressed file to our newly created directory
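The same idea for bzip2, sketched under the same assumptions (stand-in messages file, hypothetical directory name bzip2_extract, and a check that bzip2 is installed):

```shell
# Stand-in file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
printf 'sample log line\n' > messages

if command -v bzip2 >/dev/null 2>&1; then
    # Recreate a bzip2-compressed archive to extract.
    tar -cjf messages.tar.bz2 messages

    # Extract into a new, empty directory.
    mkdir bzip2_extract
    cd bzip2_extract
    # x = extract, j = bzip2, f = archive name.
    tar -xjf ../messages.tar.bz2
    ls -l
else
    echo "bzip2 is not installed; skipping the example" >&2
fi
```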

Finally, let's do the same for the xz-compressed file. Please note that this is just an example. You are not expected to extract every archive into the same directory; it is preferable to extract each archive into its own directory.
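A sketch of the xz case is given below. It uses its own directory, xz_extract (a hypothetical name), in line with the advice to keep each extraction separate, and it assumes the xz program is installed.

```shell
# Stand-in file in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
printf 'sample log line\n' > messages

if command -v xz >/dev/null 2>&1; then
    # Recreate an xz-compressed archive to extract.
    tar -cJf messages.tar.xz messages

    # Extract into a new, empty directory.
    mkdir xz_extract
    cd xz_extract
    # x = extract, J (capital) = xz, f = archive name.
    tar -xJf ../messages.tar.xz
    ls -l
else
    echo "xz is not installed; skipping the example" >&2
fi
```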

CONCLUSION

Now you can bundle all your documents into one compressed archive and send it across the network rather than having to send the files bit by bit. Things are made easy when you learn your commands.

Besides what has been mentioned, you can also use gzip, bzip2, and xz on their own to compress single files. For example, running gzip myfile.tar on the command line produces myfile.tar.gz, bzip2 myfile.tar produces myfile.tar.bz2, and xz myfile.tar produces myfile.tar.xz. Note that these tools work on one file at a time; to compress multiple files or a directory, bundle them with tar first.

You can also decompress these files on the command line by running gunzip for a gzip file, bunzip2 for a bzip2 file, and unxz for an xz file. For example, bunzip2 myfile.tar.bz2 results in the decompressed file myfile.tar.
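As a short sketch of the round trip with gzip (the file names myfile.txt and myfile.tar are placeholders), the standalone tools can be used like this; bzip2 with bunzip2 and xz with unxz follow the same pattern.

```shell
# Create an uncompressed tar archive to work with.
workdir=$(mktemp -d)
cd "$workdir"
echo "some data" > myfile.txt
tar -cf myfile.tar myfile.txt

# gzip replaces myfile.tar with myfile.tar.gz.
gzip myfile.tar
ls

# gunzip reverses the step and restores myfile.tar.
gunzip myfile.tar.gz
ls
```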

Keep practicing until you have mastered how all these commands work.
