LHa (compression program)

LHa

Screenshot: LHarc help display in the command line

Basic data
Maintainer: Koji Arai ("LHa for UNIX")
Developer: Haruyasu Yoshizaki et al.
Current version: 1.14i-ac20081023, git rev 7c3cd95 ("LHa for UNIX"; October 5, 2019)
Operating system: cross-platform
Programming language: C
Category: Data compression
License: Depends on version and implementation. The implementation "LHa for UNIX" is open source.
Website: github.com/jca02266/lha

LZH
File extension: .lzh, .lha
MIME type: application/x-lzh-compressed
Type: Data compression
Container for: any files


LHa is a family of compression programs for file archiving. The associated file format, LZH, is based on the LZHUFF method: repeated sequences in the data stream are first replaced by back-references using the Lempel-Ziv-Storer-Szymanski algorithm (LZSS), and the result is then compressed further with Huffman entropy coding. The widely used Deflate algorithm was derived from the LHA source code.

LZH file format and LZHUFF algorithm

History

The LZH format was designed in 1988 by the physician Haruyasu Yoshizaki (吉崎栄泰, Yoshizaki Haruyasu), with the support of Professor Haruhiko Okumura (奥村晴彦) of Matsusaka University (today: Mie Chūkyō University), for his compression program LHarc.

Filename extensions and MIME type

In addition to the cross-platform file extension .lzh, the extension .lha is used on the Commodore Amiga. Historically, .pma (PMarc) and .lzs (LArc) were used as well. The MIME type is application/x-lzh-compressed.

Byte order

The LZH format uses little-endian byte order.
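
To illustrate what little-endian means for multi-byte fields, the following minimal sketch decodes a 32-bit value whose least significant byte comes first. The helper name read_u32_le and the sample bytes are chosen for illustration only and are not taken from any LHa implementation.

```python
import struct


def read_u32_le(buf: bytes, offset: int = 0) -> int:
    """Decode an unsigned 32-bit little-endian integer from buf at offset."""
    (value,) = struct.unpack_from("<I", buf, offset)  # "<" selects little-endian
    return value


# Hand-made sample: the least significant byte (0x78) is stored first.
raw = bytes([0x78, 0x56, 0x34, 0x12])
print(hex(read_u32_le(raw)))
```

The same result can be obtained with int.from_bytes(raw, "little").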

Header format

In LZH archives, each file contained therein is preceded by a header that contains information on the respective file. The LZH format can contain three types of headers, namely level-0, level-1 or level-2 headers. The internal structure of the LZH format is shown schematically in the following two tables.

Level-0:
  LZH header
  Compressed data
  LZH header
  Compressed data
  ...

Level-1, level-2:
  LZH header
  Extension header
  Extension header
  ...
  Compressed data
  LZH header
  Extension header
  Extension header
  ...
  Compressed data
  ...
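
As a rough sketch of how a reader might tell the header types apart, the following code peeks at the start of a header. It assumes a field layout that is not given in this article: the five-character method ID in bytes 2 to 6 and the header level in byte 20. The function name peek_lzh_header and the synthetic sample are hypothetical, and the layout should be checked against the documentation of a real implementation before relying on it.

```python
def peek_lzh_header(buf: bytes) -> dict:
    """Return method ID and header level of the LZH header at the start of buf.

    Assumed layout (not stated in the article): method ID in bytes 2..6,
    header level in byte 20.
    """
    if len(buf) < 21:
        raise ValueError("buffer too short for an LZH header")
    method = buf[2:7].decode("ascii", errors="replace")
    level = buf[20]
    if level not in (0, 1, 2):
        raise ValueError(f"unsupported header level: {level}")
    return {
        "method": method,
        "level": level,
        # Only level-1 and level-2 headers are followed by extension headers.
        "may_have_extension_headers": level in (1, 2),
    }


# Synthetic 21-byte header prefix, built by hand for illustration only.
sample = bytearray(21)
sample[2:7] = b"-lh5-"
sample[20] = 1
print(peek_lzh_header(bytes(sample)))
```

A full parser would go on to read the size fields, the file name and any extension headers. This sketch stops at the two fields needed to choose between the layouts.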

Compression methods

The LH methods combine a string replacement method based on the Lempel-Ziv-Storer-Szymanski algorithm (LZSS) with Huffman entropy coding.
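
The string replacement stage can be sketched as follows. This is a simplified greedy LZSS tokenizer written for illustration, not the code used by LHa: the function names, the brute-force match search and the token representation are assumptions, and the Huffman stage that would encode the tokens afterwards is omitted. The default window and match length mirror the values listed for -lh1- in this article.

```python
def lzss_encode(data: bytes, window: int = 4096, max_len: int = 60,
                min_match: int = 3) -> list:
    """Greedy LZSS: emit literal byte values or (distance, length) pairs."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Brute-force search of the sliding window for the longest match.
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_len and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append((best_dist, best_len))  # back-reference
            i += best_len
        else:
            tokens.append(data[i])  # literal byte
            i += 1
    return tokens


def lzss_decode(tokens: list) -> bytes:
    """Rebuild the original bytes from literals and back-references."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])  # byte-wise copy allows overlapping matches
        else:
            out.append(tok)
    return bytes(out)


text = b"abracadabra abracadabra"
tokens = lzss_encode(text)
print(tokens)
print(lzss_decode(tokens) == text)
```

Matches shorter than min_match are emitted as literals because a back-reference would not save space. This is the role of the limit value mentioned in the method parameters.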

The file format allows different packing methods, usually variants of the LH algorithm that differ in

  • window length (up to 4 KiB with LArc, up to 64 KiB with LHa),
  • maximum word length, i.e. the longest match (LArc: 17; LHa: 60 or 256),
  • the limit value (match-length threshold) of the LZSS algorithm (2 or 3), and
  • static or dynamic Huffman coding.

Canonical LZH methods:

Method   Sliding dictionary   Max. word length   Huffman
-lh0-    (uncompressed)       -                  -
-lh1-    4 KiB                60 bytes           dynamic
-lh2-    8 KiB                256 bytes          dynamic
-lh3-    8 KiB                256 bytes          static
-lh4-    4 KiB                256 bytes          static
-lh5-    8 KiB                256 bytes          static
-lh6-    32 KiB               256 bytes          static
-lh7-    64 KiB               256 bytes          static
-lhd-    (empty folder)       -                  -
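
The canonical methods in the table above can be expressed as a small lookup table, for example to report the parameters of a method ID found in an archive. The names CANONICAL_METHODS and describe_method are hypothetical; the values are copied from the table.

```python
KIB = 1024

# Method ID -> (sliding dictionary in bytes, max. word length, Huffman variant).
# None marks entries that carry no compression parameters.
CANONICAL_METHODS = {
    "-lh0-": None,                       # stored uncompressed
    "-lh1-": (4 * KIB, 60, "dynamic"),
    "-lh2-": (8 * KIB, 256, "dynamic"),
    "-lh3-": (8 * KIB, 256, "static"),
    "-lh4-": (4 * KIB, 256, "static"),
    "-lh5-": (8 * KIB, 256, "static"),
    "-lh6-": (32 * KIB, 256, "static"),
    "-lh7-": (64 * KIB, 256, "static"),
    "-lhd-": None,                       # empty folder, no data
}


def describe_method(method_id: str) -> str:
    """Return a one-line description of a canonical LZH method ID."""
    if method_id not in CANONICAL_METHODS:
        return f"{method_id}: not a canonical method"
    params = CANONICAL_METHODS[method_id]
    if params is None:
        return f"{method_id}: no compression parameters"
    dict_size, max_len, huffman = params
    return (f"{method_id}: {dict_size // KIB} KiB dictionary, "
            f"max. word length {max_len} bytes, {huffman} Huffman")


print(describe_method("-lh5-"))
```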

Historical and non-canonical methods:

LArc methods: -lzs-, -lz2-, -lz3-, -lz4-, -lz5-, -lz7-, -lz8-
LHa extensions by Joe Jared: -lh8-, -lh9-, -lha-, -lhb-, -lhc-, -lhe-, -lhx-
PMarc methods: -pm0-, -pm1-, -pm2-, -pms-

Implementations of LHa

The starting point was the compression program LArc, written by another author. The program was originally called LHarc. A completely rewritten version was temporarily called LHx and eventually released as LH. To avoid a conflict with the identically named "load high" command of MS-DOS 5.0, which was new at the time, it was renamed LHa.

Use and distribution

The first to become widely popular was not LHarc itself but a modified version called LHice or ICE, with version number 1.14, which circulated on mailbox (BBS) networks around 1989. It was practically identical to LHarc, but the generated files had the extension “.ice”, and the progress display showed “freezing” and “melting” instead of “packing” and “unpacking”. In LHarc 2.0, which followed soon afterwards, such modifications were made more difficult by encrypting the program's internal text strings. Nonetheless, a hacked version of LHarc 2.0 also appeared; it was called “FOOBAR” (“Florian Orjanov's and Olga Bachetzka's ARchiver”) and created archives with the file extension “.foo”.

The format was used by id Software to compress the installation files of its early computer games, such as Doom. LHa has been ported to many operating systems and is the most widely used archive format on the Amiga, especially on Aminet.

The LZH algorithm was used by companies such as AMI in their BIOS to make efficient use of the limited space in the memory chips on the motherboard.

The LZH format is rarely used in Europe and the USA today, but it remains very popular in Japan. In Japan, Microsoft released an extension for its Windows XP operating system that supports compressed folders in LZH format.

y2k11 bug

Timestamps from 2011 onward are set to 1980. Fixing this error requires an updated version of the packing program. The assembly-language overflow test used cmpi.l #2010,d6, apparently with transposed digits where 2100 was intended.
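
The effect of the wrong constant can be sketched as follows. This is a hypothetical reconstruction in Python, not the original assembly code: the function name clamp_year and the exact comparison (greater than the limit) are assumptions chosen to match the behaviour described above.

```python
BUGGY_LIMIT = 2010     # constant that appears in the faulty comparison
INTENDED_LIMIT = 2100  # constant that was apparently intended


def clamp_year(year: int, limit: int) -> int:
    """Fall back to 1980 when the year exceeds the overflow limit (assumed logic)."""
    return year if year <= limit else 1980


for year in (2010, 2011, 2024):
    print(year, clamp_year(year, BUGGY_LIMIT), clamp_year(year, INTENDED_LIMIT))
```

With the limit of 2010, every year from 2011 onward is treated as an overflow and replaced by 1980. With 2100, the same years pass through unchanged.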
