diff

from Wikipedia, the free encyclopedia

diff is a Unix program that compares two text files and reports the differences between them line by line or section by section.

Basics

diff is a basic command-line tool on Unix systems, and a wide range of programs build on it. Its output is plain text and is itself often called a diff (file extension .diff). The output format is designed to be machine-processable.

diff has many uses in version management. For example, its output can serve as input to the Unix program patch, which applies the changes that diff detected to another text file. A number of programs also display the differences reported by diff in a clear graphical form. In addition, diff forms the basis of merge functions.

Program function

The first versions of the program were written to compare text files. Since 1980, however, diff has also supported binary files.

Invocation

diff is run on the command line with the names of two text files as arguments:

$ diff telefonliste2007.txt telefonliste2008.txt

The two text files are compared line by line. Even if a line differs between the two files in only a single character, diff treats the whole line as different: the old line counts as deleted and the new line as inserted.
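The line-by-line behaviour can be illustrated with a minimal sketch. It uses Python's standard difflib module as a stand-in for the diff program (difflib is a separate implementation with its own matching algorithm, but it also compares whole lines). The one-digit change to the phone entry is a hypothetical input chosen for illustration.

```python
import difflib

# Hypothetical input: the same entry, differing in a single character.
old = ["Schmidt, Eberhard, Vertrieb, -479"]
new = ["Schmidt, Eberhard, Vertrieb, -476"]

# lineterm="" because the lines carry no trailing newline; n=0 omits context.
delta = list(difflib.unified_diff(old, new, lineterm="", n=0))

# Skip the two file-header lines and the one hunk header.
changes = delta[3:]
for line in changes:
    print(line)
```

The whole old line is reported as removed and the whole new line as added, even though only the last digit differs.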

Output

The following two files are compared using diff.

telefonliste2007.txt:
Mayer, Susanne, Lager, -212
Schmid, Carola, Geschäftsleitung, -435
Schmidt, Eberhard, Vertrieb, -479
Schmitt, Marie, Labor, -804

telefonliste2008.txt:
Mayer, Susanne, Lager, -212
Schmid, Carola, Geschäftsleitung, -435
Schmitt, Marie, Labor, -804
Waldmann, Ernst, Labor, -805
Zundel, Walter, Vertrieb, -476

The output of diff then looks like this:

3d2
< Schmidt, Eberhard, Vertrieb, -479
4a4,5
> Waldmann, Ernst, Labor, -805
> Zundel, Walter, Vertrieb, -476

There are several output formats. The example above is the normal format (normal diff), produced without any options. Lines that begin with a less-than sign (<) occur only in the first file; lines that begin with a greater-than sign (>) occur only in the second file. Lines that are identical in both files are not output. The individual blocks are introduced by so-called change commands, which specify which action (add lines: a, change: c, or remove: d) is to be carried out on which lines. The number or range before the letter refers to lines in the first file, the one after it to lines in the second file.
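How the change commands drive a reconstruction of the second file can be sketched in a few lines of Python. This is a simplified illustration in the spirit of patch, not its actual implementation: the names COMMAND and apply_normal_diff are hypothetical, and the sketch does not verify that the "<" lines match the first file.

```python
import re

# Change command: old range, action letter, new range, e.g. "3d2" or "4a4,5".
COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

def apply_normal_diff(old_lines, diff_lines):
    """Rebuild the second file from the first file and a normal-format diff."""
    result = []
    pos = 0  # number of lines of old_lines already consumed
    i = 0
    while i < len(diff_lines):
        m = COMMAND.match(diff_lines[i])
        if m is None:
            raise ValueError(f"not a change command: {diff_lines[i]!r}")
        start = int(m.group(1))
        end = int(m.group(2) or start)
        action = m.group(3)
        i += 1
        # For "a" the old range names the line after which text is added;
        # for "c" and "d" it names the lines being replaced or removed.
        keep_until = start if action == "a" else start - 1
        result.extend(old_lines[pos:keep_until])
        pos = keep_until if action == "a" else end
        # Body of the block: "<" lines, an optional "---" separator, ">" lines.
        while i < len(diff_lines) and not COMMAND.match(diff_lines[i]):
            if diff_lines[i].startswith("> "):
                result.append(diff_lines[i][2:])
            i += 1
    result.extend(old_lines[pos:])
    return result

old = [
    "Mayer, Susanne, Lager, -212",
    "Schmid, Carola, Geschäftsleitung, -435",
    "Schmidt, Eberhard, Vertrieb, -479",
    "Schmitt, Marie, Labor, -804",
]
normal_diff = [
    "3d2",
    "< Schmidt, Eberhard, Vertrieb, -479",
    "4a4,5",
    "> Waldmann, Ernst, Labor, -805",
    "> Zundel, Walter, Vertrieb, -476",
]
rebuilt = apply_normal_diff(old, normal_diff)
print("\n".join(rebuilt))
```

Applied to the 2007 list and the diff output shown above, the function yields the five lines of the 2008 list.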

The so-called unified format (unified diff) is obtained with the -u option. Each line that appears only in the first file is marked with a minus sign, and each line that appears only in the second file with a plus sign. Lines common to both files are marked with a leading space.

Usually not all lines are output, only blocks of lines close to a difference. The output begins with two lines marked with three minus signs and three plus signs respectively; they name the files to which the diff applies. Each block begins with a line enclosed in double at signs (@@). It gives, for each of the two files, the line number at which the block begins and, separated by a comma, the number of lines the block spans in that file.

Output of the above example in unified diff format:

--- telefonliste2007.txt    2007-12-28 13:12:34.000000000 +0100
+++ telefonliste2008.txt    2008-07-28 14:16:26.000000000 +0100
@@ -1,4 +1,5 @@
 Mayer, Susanne, Lager, -212
 Schmid, Carola, Geschäftsleitung, -435
-Schmidt, Eberhard, Vertrieb, -479
 Schmitt, Marie, Labor, -804
+Waldmann, Ernst, Labor, -805
+Zundel, Walter, Vertrieb, -476
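The structure of this output can be reproduced and inspected with a short Python sketch. It again relies on the standard difflib module rather than on diff itself, so the file timestamps are omitted, and the helper parse_hunk_header and the pattern HUNK are hypothetical names introduced here for illustration.

```python
import difflib
import re

old = [
    "Mayer, Susanne, Lager, -212",
    "Schmid, Carola, Geschäftsleitung, -435",
    "Schmidt, Eberhard, Vertrieb, -479",
    "Schmitt, Marie, Labor, -804",
]
new = [
    "Mayer, Susanne, Lager, -212",
    "Schmid, Carola, Geschäftsleitung, -435",
    "Schmitt, Marie, Labor, -804",
    "Waldmann, Ernst, Labor, -805",
    "Zundel, Walter, Vertrieb, -476",
]

# lineterm="" because the input lines carry no trailing newline.
delta = list(difflib.unified_diff(
    old, new,
    fromfile="telefonliste2007.txt",
    tofile="telefonliste2008.txt",
    lineterm="",
))

# Block header: "@@ -start,length +start,length @@"; a length may be omitted.
HUNK = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def parse_hunk_header(line):
    """Return (old_start, old_length, new_start, new_length) of a block."""
    m = HUNK.match(line)
    if m is None:
        raise ValueError(f"not a block header: {line!r}")
    old_start, old_len, new_start, new_len = m.groups()
    # An omitted length means the block is exactly one line long.
    return (int(old_start), int(old_len or 1),
            int(new_start), int(new_len or 1))

print("\n".join(delta))
print(parse_hunk_header(delta[2]))
```

For the two phone lists the header parses to (1, 4, 1, 5): the block starts at line 1 in both files and spans four lines in the first file and five in the second.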

History

The diff program was developed in the early 1970s on the Unix operating system at AT&T Bell Labs in Murray Hill, New Jersey, United States. The final version belonging to this very early Unix system was written entirely by Douglas McIlroy. This research was published in 1976 in a paper co-authored with James W. Hunt, who also wrote one of the initial versions of diff.

McIlroy's work was influenced by Steve Johnson's comparison program on GECOS and by Mike Lesk's proof program, which, like diff, originated on Unix. Like diff, proof produced line-by-line changes and used angle brackets (">" and "<") to represent line insertions and line deletions in its output. The heuristic methods these programs used were considered unreliable. The potential usefulness of a diff tool inspired McIlroy to develop a new, more robust program that could be used for a variety of tasks yet still performed well within the processor and memory limits of the PDP-11 hardware. Its success was a result of his collaboration with colleagues at Bell Labs, including Alfred V. Aho, Elliot Pinson, Jeffrey Ullman, and Harold S. Stone.

Free software implementations

The GNU Project provides an implementation of diff (and of diff3, which compares three files) in the diffutils package.

Several tools that run on different platforms are based on the diffutils engine of the GNU Project and provide a graphical front end for the same information. Some of these programs can also edit and merge files.

See also

  • Kompare - a graphical user interface for diff
  • Meld - a very comprehensive graphical user interface for diff
  • WinMerge - an open-source diff tool for Windows
  • patch - the counterpart of diff, used to reconstruct files from their differences
  • Delta coding - a data compression method that, for similar files, stores only the differences from the original file, for example in version management

Literature

  • James W. Hunt, M. Douglas McIlroy: An Algorithm for Differential File Comparison. In: Computing Science Technical Report. No. 41. Bell Laboratories, June 1976 (English).
  • David MacKenzie, Paul Eggert, Richard Stallman: Comparing and Merging Files with GNU Diff and Patch. 2002, ISBN 0-9541617-5-0 (English).
  • E. Myers: An O(ND) Difference Algorithm and Its Variations. In: Algorithmica. Vol. 1, no. 2, 1986, pp. 251-266 (English).
