tar (packing program)

from Wikipedia, the free encyclopedia
tar
Screenshot of the GNU tar help display


File extension: .tar
MIME type: application/x-tar
Magic number: at offset 257, ustar\0 for POSIX formats or ustar\040\040\0 for the GNU tar format; tar\0 at offset 508 for the star and xstar formats (ASCII C notation)

Type: Data archiving



tar is a very common archiving program in the Unix environment. The file format used by the program is also called tar.

The name is derived from tape archiver, because the program was originally used to back up data to tape drives. The name also coincides with the English word tar, which fits: the program "glues" files together, uncompressed, into a single file.

tar can write files, directories and other file system objects sequentially into a single file and restore them from it. The resulting file has the extension .tar and is also called a tarball.

The MIME type for tar files is application/x-tar.

Compression

Figure: files (circles) are first packed with tar, then the resulting archive is compressed with gzip.

Random access to individual files is not possible with tar, because the archive has no directory holding the file offsets for quick access, as Zip does, for example. This does not mean that individual files cannot be extracted from an archive; it only means the archive must be read sequentially to find them. Dispensing with this additional structure also makes it easy to extend archives and, above all, to extract files from incomplete or damaged archives.
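The sequential layout can be illustrated with a short Python sketch. It assumes the classic header layout (512-byte blocks, member name in the first 100 bytes, size as an octal string at offset 124) and a ustar archive without extended headers; the function names `build_archive` and `scan_headers` are made up for this illustration.

```python
import io
import tarfile

BLOCK = 512  # tar archives are organised in 512-byte blocks


def build_archive(files):
    """Build a small ustar archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def scan_headers(archive):
    """Walk the archive header by header; there is no index to consult."""
    offset = 0
    entries = []
    while offset + BLOCK <= len(archive):
        header = archive[offset:offset + BLOCK]
        if header == bytes(BLOCK):  # an all-zero block marks the end
            break
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8")
        size_field = header[124:136].rstrip(b"\0 ").decode("ascii")
        size = int(size_field or "0", 8)  # size is stored as octal text
        entries.append((name, offset, size))
        # Skip the header plus the data, rounded up to whole blocks.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return entries
```

Finding the second member requires reading the first header to learn how far to skip, which is why a reader cannot jump straight to a given file.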

Today, tar archives are stored as files more often than on tapes. These archive files are usually compressed to reduce their size, typically with Unix compression programs such as compress, gzip, bzip2, xz or lzma. The approach of first concatenating all files uncompressed and then compressing the result is known as solid compression and is now also used by other archive formats such as RAR or 7-Zip. Depending on the compression program used, the file extension of a tarball is usually .tar.Z (compress), .tar.gz or .tgz (gzip), .tar.bz2, .tbz2 or .tbz (bzip2), .tar.xz or .txz (xz), or .tar.lzma (lzma).
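The two-step approach can be sketched in Python with the standard `tarfile` and `gzip` modules. This is only an illustration, not how tar itself is implemented; the helper names `make_tar`, `solid_compress` and `list_names` are assumptions introduced here.

```python
import gzip
import io
import tarfile


def make_tar(files):
    """Step 1: pack a {name: bytes} mapping into an uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def solid_compress(files):
    """Step 2: compress the whole archive as one stream (a .tar.gz)."""
    return gzip.compress(make_tar(files))


def list_names(tgz_bytes):
    """Read the result back as an ordinary gzip-compressed tarball."""
    with tarfile.open(fileobj=io.BytesIO(tgz_bytes), mode="r:gz") as tar:
        return tar.getnames()
```

Because the compressor only ever sees one byte stream, it knows nothing about file boundaries; the tar structure reappears only after decompression.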

If solid compression is not desired, the individual files can be compressed first and then added to the tarball. The intact members of an incomplete or damaged tar archive can then still be unpacked, which is not guaranteed under solid compression when the chosen algorithm cannot resume after a defective block. However, the disadvantages usually outweigh this benefit, so this approach is rarely chosen: the file size is limited by the temporary space needed to compress each file, and archiving can fail completely if files change while it runs. The compression ratio is also usually lower than with solid compression, which compresses the file attributes as well. Finally, unpacking individual files is only slightly faster, since the archive must be searched sequentially anyway.
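The difference in compression ratio can be tried out with a small Python sketch. The sample data (50 similar log-like files) and the helper names are invented for illustration, and gzip stands in for whichever compressor is used; real results depend heavily on the data.

```python
import gzip
import io
import tarfile


def make_tar(files):
    """Pack a {name: bytes} mapping into an uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def solid_size(files):
    """Solid compression: tar everything, then compress the archive once."""
    return len(gzip.compress(make_tar(files)))


def per_file_size(files):
    """Per-file compression: compress each file, then tar the results."""
    compressed = {name + ".gz": gzip.compress(data) for name, data in files.items()}
    return len(make_tar(compressed))


# Hypothetical sample data: many small, similar files.
SAMPLE = {f"log{i:02d}.txt": b"status=ok host=example\n" * 40 for i in range(50)}
```

For this sample, `solid_size(SAMPLE)` comes out smaller than `per_file_size(SAMPLE)`, because the tar headers and padding stay uncompressed in the per-file variant and redundancy between files cannot be exploited.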

Problems and alternatives

tar archives are very popular on Unix-like operating systems because they handle many of the properties of these systems seamlessly. Many software update and backup programs use tar archives behind the scenes, for example apt-get and duplicity. However, tar archives have disadvantages:

In contrast to Zip archives, a tar file does not contain a table of contents. Software that processes a tar archive must read the entire file to find out what is in it; only then can it extract the required part. With the update option, new or changed files are appended to the end of the tar archive, while old or deleted files stay where they are. This is technically the simplest solution, but it makes the problem of the missing table of contents even worse. These disadvantages stem from the fact that tar was originally designed for backing up data to tape drives.
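The effect of appending can be sketched in Python, using the append mode of the standard `tarfile` module as a stand-in for what the update option does with a changed file. The helper names and the file name `config.txt` are made up for this illustration.

```python
import io
import os
import tarfile
import tempfile


def add_bytes(path, mode, name, data):
    """Write one member to the archive; mode "w" creates, "a" appends."""
    with tarfile.open(path, mode) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))


def demo_append():
    """Store a file, append a changed version, then inspect the archive."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "test.tar")
        add_bytes(path, "w", "config.txt", b"version 1")
        add_bytes(path, "a", "config.txt", b"version 2")  # old copy is kept
        with tarfile.open(path, "r") as tar:
            names = tar.getnames()
            # Looking a member up by name yields its last occurrence.
            latest = tar.extractfile("config.txt").read()
    return names, latest
```

The archive ends up containing `config.txt` twice, and a reader only learns which copy is the newest after scanning to the end.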

The tar format appeared in 1979 with Version 7 Unix; the ustar and pax formats are specified in the POSIX standard. GNU tar, which is common on Linux, does not fully conform to the POSIX standard. In particular, the often inadequate ability to store access control lists makes tar and GNU tar of limited use as backup programs for some users. Insufficient support for sparse files in some implementations can also cause problems when an archive is restored. star and bsdtar try to avoid these disadvantages.
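The POSIX and GNU variants can be told apart by the magic number at offset 257 given at the top of this article. The following Python sketch checks only those two magic values; the function names `first_header` and `detect_format` are assumptions for illustration, and the star/xstar marker at offset 508 is not handled.

```python
import io
import tarfile


def first_header(fmt):
    """Create a one-file archive in the given tarfile format and
    return its first 512-byte header block."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo("hello.txt")
        info.size = 5
        tar.addfile(info, io.BytesIO(b"hello"))
    return buf.getvalue()[:512]


def detect_format(header):
    """Classify a header block by the magic bytes at offset 257."""
    magic = header[257:265]
    if magic == b"ustar  \0":    # ustar\040\040\0: GNU tar format
        return "gnu"
    if magic[:6] == b"ustar\0":  # ustar\0: POSIX formats
        return "posix"
    return "unknown"
```

For example, `detect_format(first_header(tarfile.GNU_FORMAT))` returns `"gnu"`, while the same call with `tarfile.USTAR_FORMAT` returns `"posix"`.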

Another inherent disadvantage is the way compression is applied. With solid compression, the loss of a single block can mean the loss of the entire tape archive if the decompression program cannot resynchronize after the damaged spot. Attempts to address this include afio, which compresses file by file but is based on a private variant of the cpio format that POSIX has since declared obsolete, and compression algorithms that work block by block, among which bzip2 can be counted to some extent.

A Unix command very similar in function to tar is cpio. The pax command specified by POSIX was intended to unify the tar and cpio commands and is a result of the so-called tar wars waged around 1992.

Unlike jar archives, tar archives (like cpio and zip archives) do not contain any information about the character set of the file names. As a rule, file systems use UTF-8, as jar does.

Sample calls

Create archives with the contents of /etc and /home:

tar cvf test.tar /etc/ /home/             # Creates a new archive containing the directories /etc and /home
tar cvf - /etc /home | gzip > test.tar.gz # Same, but the data is piped directly into a gzip-compressed file
tar czvf test.tar.gz /etc/ /home/         # GNU tar short form: same, but without a pipe
tar -czvf test.tar.gz /etc/ /home/        # GNU tar alternative: the leading minus can be omitted
tar --create --gzip --verbose --file test.tar.gz /etc/ /home/ # this long-option style is also possible

Update archive, e.g. for backup purposes:

tar uvf test.tar /etc/ /home/             # u for "update": new and changed files are added to the archive; deleted files remain in it
tar --update --verbose --file test.tar /etc/ /home/ # long form

The update option does not work with compressed archives.

Extract archive:

tar xvf test.tar
gunzip < test.tar.gz | tar xvf -
tar xzvf test.tar.gz                      # GNU tar short form
tar -xzvf test.tar.gz                     # GNU tar alternative
tar -xzvf test.tar.gz --no-anchored singlefile.txt # extract a single file

View archive content:

tar tvf test.tar
gunzip < test.tar.gz | tar tf -
tar tzvf test.tar.gz                      # GNU tar short form
tar -tzvf test.tar.gz                     # GNU tar alternative

The notation without a leading minus is the traditional, compatible UNIX syntax and should be preferred.
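Where tar is driven from a script rather than the command line, the create, list and extract calls above have rough counterparts in Python's standard `tarfile` module. The following is a sketch under that assumption; the directory name `src`, the function name `roundtrip` and the temporary paths are hypothetical and chosen to keep the example self-contained.

```python
import os
import tarfile
import tempfile


def roundtrip():
    """Create, list and read back a small gzip-compressed tarball."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.mkdir(src)
        with open(os.path.join(src, "singlefile.txt"), "w") as fh:
            fh.write("hello")

        archive = os.path.join(tmp, "test.tar.gz")

        # Roughly: tar czf test.tar.gz src
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(src, arcname="src")

        with tarfile.open(archive, "r:gz") as tar:
            # Roughly: tar tzf test.tar.gz
            names = tar.getnames()
            # Roughly: extracting a single member, here read into memory
            member = tar.extractfile("src/singlefile.txt")
            content = member.read().decode("utf-8")
    return names, content
```

The single file is read into memory here instead of being written to disk, which sidesteps questions about where extracted files should land.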

tar and Windows

Since Windows 10 version 1803, tar has been included with Windows. On older Windows versions, tar archives cannot be unpacked or opened directly; an additional program is required. Archive programs such as 7-Zip, TUGZip or IZArc can unpack tar archives under Windows, as can other common archive programs.
