Fletcher's checksum
Fletcher's Checksum (translated as "Fletcher's checksum", also Fletcher checksum or Fletcher algorithm ) is an error detection method in the form of a checksum introduced in 1982 by John G. Fletcher (1934–2012) . The method can be used to detect errors, for example in digital data transmission . It is based on the fact that the individual bits of the message are not only added up, but also weighted depending on their position. The algorithm thus represents a compromise between the slower but more sensitive cyclical redundancy check (CRC) and error-prone procedures such as a simple checksum or vertical redundancy check . Fletcher's checksum is used in various network protocols and file systems.
The algorithm
The binary transmitted message is divided into sections of the length , each bit thus forming a block . The last block is filled up for the duration of the algorithm and each block is interpreted as an unsigned integer . Let the number of blocks be . The check value consists of two different sums: The first ( ) sums up all the blocks step by step, the second ( ) is the sum of all intermediate results of the first sum. The first block is entered several times , the last block only once. If bits are inverted or two blocks with different values are swapped due to a disturbance during transmission, the test value will normally change: The error is detected.
Since the test value should be able to be transmitted in a fixed number of bits (in general ) regardless of the length of the message , both sums are calculated modulo last and then integrated into the message. Usually or is chosen. The latter is preferable, despite slightly more elaborate calculation because is comparable in size, in contrast to but not through 2 - the basis of the binary system - divisible . It is irrelevant for the result whether the modulo operation is used after each addition or at the end. In order to avoid overflow , it takes place in one implementation during the calculation.
The number of possible test values from Fletcher's Checksum is . Assuming that the message is long enough, the probability that a corrupt message will have the same check value is reduced . In principle, any natural number can be chosen; the algorithm would accordingly be called Fletcher- . Fletcher-16 and Fletcher-32 are common, in some cases also Fletcher-8.
Sample calculation
If the message "Wikipedia" is transmitted in binary ASCII , the calculation shown in the table takes place for Fletcher-16 . If an intermediate result of a total is greater than or equal to 255, modulo is used, shown here as .
message | W. | i | k | i | p | e | d | i | a | Result |
---|---|---|---|---|---|---|---|---|---|---|
binary representation | 01010111 | 01101001 | 01101011 | 01101001 | 01110000 | 01100101 | 01100100 | 01101001 | 01100001 | |
decimal value | 87 | 105 | 107 | 105 | 112 | 101 | 100 | 105 | 97 | |
87 | 192 | 299 44 | 149 | 261 6 | 107 | 207 | 312 57 | 154 | 154 | |
87 | 279 24 | 68 | 217 | 223 | 330 75 | 282 27 | 84 | 238 | 238 |
Pseudocode
Below is a pseudocode for Fletcher-16 with without any optimization. Both sums are returned consecutively as a test value.
Fletcher_16 (data) Input: Array data von 8-Bit-Integern Output: 16-Bit-Prüfsumme zu data sum1 := 0 sum2 := 0 for i := 0 to length (data) do sum1 := (sum1 + data[i]) modulo 255 sum2 := (sum2 + sum1) modulo 255 endfor checksum := sum1 gefolgt von sum2 return checksum
variants
In a variant, which is closer to Fletcher's original algorithm, two test blocks of the length are appended to the end of the message, which are selected such that both sums of the test value for the new, extended message are zero. To do this, the two sums and are initially calculated as before. The value is then assigned to the first test block and the second . Shortly after Fletcher's publication, Betty Salzberg showed that the position of the two successive test blocks can be selected as desired within the message without impairing the test properties. If the blocks are to be in place or , their value is to be selected as and , in each case modulo . The number of blocks corresponds to the original message. In fact, the two blocks can be in freely selectable positions and as long as the amount and are relatively prime .
Keith Sklower developed a recursion that processes two blocks at once and at the end produces the same check value as Fletcher's checksum. Furthermore, the can in certain fault models be advantageous to check value in the trailer instead, as in most network protocols in the header to accommodate.
optimization
If the sums are saved in more than bits, the modulo operation does not have to be carried out after each addition, but only when the variable threatens to overflow. The worst case is assumed, i.e. a message in which each block has the highest possible value in order to determine an upper bound for the number of additions without overflow. If both sums have the same data type , this is done first . When implementing Fletcher-16 with and and using unsigned 16-bit integers .
In addition, the modulo operation can be replaced by bit-wise operators . If , instead of (noted in the programming language C ) , is sufficient , the one's complement arithmetic could be used, in which the addition of the carry (end-around carry) required in this arithmetic produces the appropriate result. However, today's hardware uses almost exclusively two's complement arithmetic; the addition of the carry can in this case be imitated by, provided that the data type used can accommodate more than bits. In the literature on checksums in general and also by Fletcher, the ones and two's complement versions are accordingly often referred to.
x & (M-1)
x % M
(x & (M-1)) + (x >> K)
Fletcher's checksum can be parallelized as well as vectorized . The algorithm can also be accelerated by general optimization methods , in particular loop unrolling .
history
John Fletcher developed the algorithm at the Lawrence Livermore National Laboratory in the late 1970s . In January 1982 the IEEE Communications Society published Fletcher's article on the checksum and its properties in IEEE Transactions on Communications . In this he justifies his development of an arithmetic checksum with the fact that computers are designed for arithmetic operations rather than for polynomial division , as required by the cyclic redundancy check. Fletcher described the test value not as the composition of two, but of any number of sums, whereby the first is calculated as and each additional sum adds up the intermediate results of the previous one. However, using more than two sums does not improve the algorithm enough to warrant the slowdown.
Adapted variants of Fletcher's algorithm are used, among other things, in the fourth layer (transport) of the ISO network protocol standard and in the IS-IS protocol. J. Zweig and C. Partridge suggested in 1990 that the TCP should be expanded so that Fletcher's checksum could be used. This extension has had the status of Historic since 2011 . The checksum is used both in the ZFS file system introduced in 2006 and in the Apple File System introduced in 2016 .
In 1995, Mark Adler to Adler-32 is a variation of Fletcher-32 before, modulo 65,521 (the largest in the prime of less than 2 ^{16} ) is expected.
Properties and comparison with other methods
If the number of bytes in the check value is the same, Fletcher's checksum is subject to a well-implemented cyclical redundancy check (CRC) in error detection. The calculation can start before the message has been completed because there are no direct dependencies . If individual blocks are changed after the checksum has already been calculated, it can be updated without accessing the unchanged blocks.
The following properties always relate to the algorithm . Fletcher's Checksum detects all bundle errors up to the length ; it is susceptible to burst errors that invert a block of only ones to one of only zeros or vice versa, because modulo these two blocks are equivalent. If this special type of error is left out of consideration, all bundle errors up to the length are recognized. For messages up to a length of bits, the method has a Hamming distance of , for longer messages . Fletcher-16 no longer recognizes all two-bit errors from a message length of 2,056 bits, while a distance between the two errors of up to 65,535 bits is sufficient for a suitable CRC with a check value of the same length. A weakness of the implementation of Fletcher-16 is revealed here, since the distance here must be less than or equal to 16 bits.
Fletcher's Checksum does not recognize any false zeros attached to a message, as this means that the sums no longer change. This can be prevented by either knowing the length of the message in advance or informing the recipient within it.
Although with Adler-32 the occupation of a prime number distributes the possible test values better, there are fewer of them overall, so that the more complex calculation almost always tests worse than Fletcher-32.
With the same number of bits in the check value, Fletcher's checksum is therefore superior to simple algorithms such as the vertical redundancy check , a simple sum or the XOR checksum and the Adler checksum. If a CRC can be implemented, this should be preferred.
literature
- John G. Fletcher: An Arithmetic Checksum for Serial Transmissions . In: IEEE Transactions on Communications . Volume 30, Issue 1, 1982, doi: 10.1109 / TCOM.1982.1095369 , pp. 247-252.
- Anastase Nakassis: Fletcher's Error Detection Algorithm: How to implement it efficiently and how to avoid the most common pitfalls. In: Computer Communication Review . Volume 18, Issue 5, 1988, doi: 10.1145 / 53644.53648 , pp. 63-88.
- John Kodis: Fletcher's Checksum: Error correction at a fraction of the cost . In: Dr. Dobb's Journal . Volume 17, Issue 5, 1992, p. 32.
- Theresa C. Maxino, Philip J. Koopman: The Effectiveness of Checksums for Embedded Control Networks . In: IEEE Transactions on Dependable and Secure Computing . Volume 6, Issue 1, 2009, doi: 10.1109 / TDSC.2007.70216 , especially chapter 3.4 (PDF; 676 kB).
Web links
- J. Zweig, C. Partridge: TCP Alternate Checksum Options . RFC 1146, 1990.
- Implementation of Fletcher-16 in various programming languages (English)
- Peter Kankowski: Hash functions: An empirical comparison . 2009.
Individual evidence
- ↑ In Memoriam John Fletcher . Lawrence Livermore National Laboratory, accessed March 16, 2017.
- ↑ ^{a } ^{b } ^{c } ^{d } ^{e} John G. Fletcher: An Arithmetic Checksum for Serial Transmissions . 1982, p. 248.
- ↑ Anastase Nakassis: Fletcher's Error Detection Algorithm: How to implement it efficiently and how to avoid the most common pitfalls. 1988, pp. 71f.
- ↑ Anastase Nakassis: Fletcher's Error Detection Algorithm: How to implement it efficiently and how to avoid the most common pitfalls. 1988, p. 76f.
- ^ ^{A } ^{b } ^{c} Theresa C. Maxino, Philip J. Koopman: The Effectiveness of Checksums for Embedded Control Networks . 2009, p. 69.
- ^ Betty Salzberg: A Modification of Fletcher's Checksum . In: IEEE Transactions on Communications . Volume 31, Issue 2, 1983, doi: 10.1109 / TCOM.1983.1095789 , p. 296.
- ↑ ^{a } ^{b} Anastase Nakassis: Fletcher's Error Detection Algorithm: How to implement it efficiently and how to avoid the most common pitfalls. 1988, p. 65.
- ^ Keith Sklower: Improving the Efficiency of the OSI Checksum Calculation . In: Computer Communication Review . Volume 19, Issue 5, 1989, doi: 10.1145 / 74681.74684 , pp. 32–43 (PDF; 524 kB).
- ↑ Jonathan Stone, Michael Greenwald, Craig Partridge, James Hughes: Performance of Checksums and CRC's over Real Data . In: IEEE / ACM Transactions on Networking . Volume 6, Issue 5, 1998, doi: 10.1109 / 90.731187 .
- ↑ Anastase Nakassis: Fletcher's Error Detection Algorithm: How to implement it efficiently and how to avoid the most common pitfalls. 1988, p. 68, for the derivation see Appendix A, p. 76ff.
- ^ Douglas W. Jones: Modulus without Division, a tutorial . 2001.
- ↑ Anastase Nakassis: Fletcher's Error Detection Algorithm: How to implement it efficiently and how to avoid the most common pitfalls. 1988, Appendix D, pp. 84ff.
- ↑ John Kodis: Fletcher's checksum: Error correction at a fraction of the cost . 1992, Enter Fletcher chapter.
- ↑ Anastase Nakassis: Fletcher's Error Detection Algorithm: How to implement it efficiently and how to avoid the most common pitfalls. 1988, p. 70, p. 72.
- ↑ For the code used, see Gerard J. Holzmann: Design and Validation of Computer Protocols . Prentice Hall, 1991, ISBN 0-13-539925-4 , p. 63.
- ^ Adrian Farrel: The Internet and Its Protocols: A Comparative Approach . Morgan Kaufmann, San Francisco 2004, ISBN 155860913X , p. 181 ( limited preview in Google Book Search).
- ^ J. Zweig, C. Partridge: TCP Alternate Checksum Options . RFC 1146, 1990.
- ^ L. Eggert: Moving the Undeployed TCP Extensions RFC 1072, RFC 1106, RFC 1110, RFC 1145, RFC 1146, RFC 1379, RFC 1644, and RFC 1693 to Historic Status . RFC 6247, 2011.
- ↑ James Guilford, Vinodh Gopal: Fast Computation of Fletcher Checksums . Intel Data Center Group, January 14, 2016, Introduction chapter.
- ↑ For the code used, see Matthias Luft: ERNW Whitepaper 65 - APFS Internals for Forensic Analysis . April 16, 2018, p. 7f.
- ↑ Theresa C. Maxino, Philip J. Koopman: The Effectiveness of checksums for embedded control networks . 2009, p. 67.
- ↑ Anastase Nakassis: Fletcher's Error Detection Algorithm: How to implement it efficiently and how to avoid the most common pitfalls. 1988, p. 65, for formulas and their derivation see Appendix B, p. 80ff.
- ↑ Theresa C. Maxino, Philip J. Koopman: The Effectiveness of checksums for embedded control networks . 2009, p. 64f.
- ↑ Theresa C. Maxino, Philip J. Koopman: The Effectiveness of checksums for embedded control networks . 2009, p. 65.
- ^ John G. Fletcher: An Arithmetic Checksum for Serial Transmissions . 1982, p. 251.
- ↑ Anastase Nakassis: Fletcher's Error Detection Algorithm: How to implement it efficiently and how to avoid the most common pitfalls. 1988, p. 73.
- ↑ Theresa C. Maxino, Philip J. Koopman: The Effectiveness of checksums for embedded control networks . 2009, p. 70.