ZIP file format
ZIP | |
---|---|
File extension : |
.zip
|
MIME type : | application / zip |
Magic number : |
504B.0304 hex PK \ x03 \ x04 ( ASCII-C notation ) |
Developed by: | Phil Katz |
Type: | Data compression |
Container for: | any files |
Standard (s) : | PKWARE / IANA |
The ZIP file format (of English zipper , zipper ') is a format for lossless compressed files, on the one hand reduces the space required for archiving and secondly as a container file can be functions, summarized in several related files or entire directory trees. The file extension for zip-archived files is .zip . The MIME type is application / zip .
history
The ZIP format was originally developed in 1989 with the programs PKZIP (compress) and PKUNZIP (decompress) by the American Phil Katz and has since been expanded a few times . Katz originally used a different file format ( ARC ). This format was developed by Software Enhancements Associates (SEA) and was distributed as shareware . Katz wrote his own, much faster version of this software and distributed it as PKARC . When SEA sued him, he withdrew PKARC and instead developed PKZIP, which used a more efficient algorithm . The rapid spread of PKZIP made SEA and ARC meaningless.
features
Container
The ZIP format is initially a data container in which several files can be stored compressed or uncompressed and also individually decompressed (extracted). The format also enables the associated storage location path to be saved. It is also possible to encrypt the otherwise only compressed files with a password .
No progressive compression
The ZIP format does not support progressive compression (also English solid called), the files are compressed individually. On the one hand, this enables flexible handling (deleting / adding files from the archive without having to recompress everything; extraction of individual files without having to decompress previous files), but has the disadvantage that redundancies between the files are not taken into account during compression can. This disadvantage can be circumvented by first archiving the files uncompressed and then saving the zip file created in this way in a compressed format (usually only useful if there are a large number of similar files).
Non-sequential format
The files are available as file entries ( english file entries stored) in any order. The file entries all begin with a local header ( English local header ) that describes the file entry, and initiates the data portion of the effective content. In order to handle these randomly arranged entries to ensure the zip file is located at the end in each case a central directory ( English central directory ), which references all file entries from the local headers. The order of the file entries and the corresponding references in the central directory can differ from one another. It is a non-sequential structure that best with the concept of random access ( English random access ) can be described.
On the other hand, this non-sequential format also has the effect that, in contrast to the tar format that has been used since 1977 and standardized since 1988 , incomplete archives or archives that are defective in the rear cannot be unpacked at all.
Multivolume
It is still possible to divide the archive over several files (for example, to split large files into pieces that each fit on a CD or DVD).
Packing algorithms
In addition to the deflate method, which is the best method up to PKZip version 2.x , ZIP supports a number of other compression algorithms :
method | Short text | comment |
---|---|---|
0 | Store | The file is saved without compression. |
1 | Unshrinking | Dynamic Lempel-Ziv-Welch algorithm |
2 | Expanding - compression level 1 | |
3 | Expanding - compression level 2 | |
4th | Expanding - compression level 3 | |
5 | Expanding - compression level 4 | |
6th | Imploding | |
7th | Tokenization | |
8th | Deflating | LZSS and Huffman - entropy coding |
9 | Enhanced Deflating ( DEFLATE64 ) | |
10 | PKWARE Data Compression Library Imploding (formerly IBM TERSE) | |
11 | reserved | |
12 | Bzip2 | |
13 | reserved | |
14th | LZMA | Lempel-Ziv-Markow algorithm |
15th | reserved | |
16 | reserved | |
17th | reserved | |
18th | IBM TERSE (new) | |
19th | IBM LZ77 z Architecture (PFS) | |
95 | Xz (LZMA2) 1.0.4 | Extension by WinZIP 18.0 (November 2013) |
96 | JPEG compression | Extension by WinZIP 12.0 (September 2008) |
97 | WavPack | Extension by WinZIP 11.0 Beta (October 2006) |
98 | PPMd version 1, rev 1 | Extension by WinZIP 10.0 Beta (August 2005) |
99 | AES encrypted | Extension through WinZIP |
Extensions
There are now extensions that have been introduced subsequently, such as the Zip128 extension.
Spread, meaning
The file format and the compression method Deflate are public domain and thus achieved worldwide distribution and importance.
The deflate method can be found as a quasi-standard in numerous other formats, such as the image file formats Portable Network Graphics (PNG) and Tagged Image File Format (TIFF), the OpenDocument and the Office Open XML format of the ISO .
Programs
In addition to PKZIP, there are numerous other programs that can process this file format. These include free programs such as Info-ZIP , PeaZip , Xarchiver or 7-Zip , whose optimized deflate algorithm can also generate slightly smaller PKZIP-2.xx-compatible files. There are also commercial programs such as WinRAR or WinZip .
Program and class libraries for accessing zip files are available for numerous programming languages. For example, the Java Platform, Standard Edition (Java SE) since 1997 (Version 1.1) has included the “java.util.zip” package with the corresponding classes for compression and decompression. Next there is the class library Zip64File which zip files as so-called random access files ( English random access files can handle). Zip64File is available to the public in full, free of charge and including the source code.
The BOMArchiveHelper program integrated in the macOS system also generates and decompresses in Zip format. The Windows file explorer is also able to pack and unzip zip files, so that no additional software usually has to be installed here.
Name, confusion of names
According to the company PKWare, the name zip (English for zip fastener ) refers to the packing of many individual files in a larger container and not to the compression function of the program.
Not every compression program with a name containing the string "ZIP" works with the ZIP file format. The most important examples are gzip from the GNU project and bzip2 , which each compress only a single file in a separate format. In this case, to archive several files, another program must be used before compression (usually tar in connection with gzip and bzip2 ). Also with 7-Zip , although the ZIP file format is fully supported, but their own archive format 7z is not compatible with ZIP.
With version 12.1, WinZip introduced the zipx extension of the ZIP format, which marks the use of newer compression methods than DEFLATE, in particular BZip, LZMA, PPMd, Jpeg and Wavpack.
The word “zip” is occasionally used as a synonym for “compressed archiving”, but this does not necessarily mean the packing as a zip file.
ZIP compression in other file formats
The following file formats are zip files, but they must contain certain files:
- Java Archive (JAR) - a Zip-based format forJavaprogram data
- Android Package (APK) - Similar to JAR files but for Android
- OpenDocument (ODF) - thefile format usedbyApache OpenOffice,among others,is a format based on severalXMLfiles, which iscompressed into individual filesusingZip
- Office Open XML fromMicrosoft Officealso contains a ZIP compressed XML files
- EPUB - e-book file format
See also
Web links
- PKWARE: .ZIP File Format Specification , October 1, 2014 (English)
- Specification for AES encryption in zip formats
- zlib Technical details (English)
- RFC 1950 - ZLIB Compressed Data Format Specification version 3.3 (English)
- Year 2038 year 2038 problem
Individual evidence
- ↑ iana.org IANA
- ↑ ZIP File Format Specification ( en ) PKWARE Inc. October 1, 2014. Retrieved August 18, 2017.
- ↑ pkware.cachefly.net
- ↑ winzip.com
- ↑ kb.winzip.com
- ↑ winzip.com
- ↑ imagewz.winzip.com (PDF)
- ↑ winzip.com
- ↑ winzip.com
- ↑ winzip.com
- ↑ winzip.com
- ↑ Compressing (zipping) and unzipping (unzipping) files. In: Windows Support. Microsoft Corporation, accessed May 8, 2020 .
- ↑ APK: What is it actually? - Giga , on April 28, 2014
- ↑ Android package - entry in the Android wiki ; u. a. with "The programming language used is mostly Java, [...]", in the section "Create APK file" (last changed on November 5, 2017)