Magic number (computer science)
In programming, the term magic number has three meanings:
- Originating in the Unix world, it is a special value at the beginning of a file that identifies the file format (as evaluated, for example, by the Unix file command).
- A conspicuous value used to mark a register or memory area that is later to be examined for errors with a debugger. Such marker values are mostly chosen from the following domains:
  - ASCII (most common)
  - Hexadecimal representations of numbers (for example 0x12345678 = 305419896)
  - Sometimes Hexspeak
- A numerical value that appears in the source code of a program (also called a "hard-coded value") and whose meaning is not immediately recognizable; its meaning is therefore "magical". Such magic numbers should be avoided and replaced by well-named constant definitions whose names clearly indicate meaning and origin.
Magic numbers to identify file types
An early convention in Unix-like operating systems was for binaries to start with two bytes containing a "magic number" that indicates the file type. Initially this was used to identify object files for different platforms. Gradually the concept was carried over to other files, and today almost every binary file contains a magic number.
Many other types of files also have content that identifies the file type. For example, XML begins with the special string "<?xml", which identifies the file as XML. If this beginning of the file is interpreted as a number, the file type can be determined quickly with a simple comparison, without needing to know much about the format.
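The comparison described above can be sketched in Python as follows. This is only an illustration: the names XML_PREFIX, XML_MAGIC and starts_with_magic are assumptions made for this example and are not taken from any particular tool.

```python
# Sketch: interpret the first bytes of a file as one big-endian integer
# and compare it with a known magic number.
XML_PREFIX = b"<?xml"
XML_MAGIC = int.from_bytes(XML_PREFIX, byteorder="big")  # 0x3C3F786D6C


def starts_with_magic(data: bytes, magic: int, length: int) -> bool:
    """Return True if the first `length` bytes of `data`, read as an
    unsigned big-endian integer, equal `magic`."""
    if len(data) < length:
        return False  # too short to contain the magic number
    return int.from_bytes(data[:length], byteorder="big") == magic
```

Passing the length explicitly keeps the comparison correct for magic numbers that start with zero bytes.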
Some examples:
- The field carrying important network parameters in the BOOTP/DHCP protocol begins with the (hexadecimal) magic cookie 0x63825363.
- Compiled Java class files (bytecode) begin with 0xCAFEBABE.
- GIF files begin with the ASCII code for "GIF89a" (0x474946383961) or "GIF87a" (0x474946383761).
- JPEG/JFIF files begin with 0xFFD8FF and additionally contain the ASCII code for "JFIF" (0x4A464946).
- PNG files begin with an 8-byte signature that identifies the file as PNG and allows file transfer problems to be detected: \211 P N G \r \n \032 \n (0x89504e470d0a1a0a).
- Standard MIDI files contain the ASCII string "MThd" (0x4D546864), followed by metadata.
- Unix scripts of all kinds normally start with a shebang, "#!" (0x23 0x21), followed by the path to the interpreter (e.g. "#!/usr/bin/perl" for Perl).
- EXE files for MS-DOS as well as Microsoft Windows EXE and DLL files start with the ASCII characters "MZ" (0x4D5A) or, rarely, "ZM" (0x5A4D), the initials of the designer of the format, Mark Zbikowski; see MZ file.
- The superblock of the Berkeley Fast File System is identified by 0x19540119 or 0x011954, depending on the version; both encode the date of birth of its designer, Marshall Kirk McKusick.
- Game Boy and Game Boy Advance programs have a 48-byte or 156-byte magic number, respectively. This number encodes a bitmap of the Nintendo logo.
- Old fat binaries (containing code for both 68K and PowerPC processors) on Mac OS 9 start with the ASCII string "Joy!" (hexadecimal 0x4A6F7921).
- TIFF files begin with "II" or "MM", depending on the endianness ("II" stands for Intel and "MM" for Motorola), followed by 0x2A00 or 0x002A (decimal 42).
- ZIP files begin with "PK", the initials of the format's inventor, Phil Katz. This also applies to formats built on the ZIP container, such as the Microsoft Office formats DOCX, XLSX and PPTX.
The Unix command file reads and interprets magic numbers from files. The Linux kernel module binfmt_misc also uses magic numbers to identify the file type of an application. The real "magic" is in the file /usr/share/misc/magic.mgc (under Debian /usr/share/file/magic.mgc).
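As a rough illustration of signature-based detection, the following Python sketch checks the beginning of a file against a small table built from some of the examples listed above. It is not how the file command is implemented, which relies on its own, far richer magic database. The names SIGNATURES, guess_file_type and guess_file_type_from_path are hypothetical.

```python
from typing import Optional

# (prefix, description) pairs taken from the examples in this article.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF89a", "GIF image"),
    (b"GIF87a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\xca\xfe\xba\xbe", "Java class file"),
    (b"MThd", "Standard MIDI file"),
    (b"II\x2a\x00", "TIFF image (little-endian)"),
    (b"MM\x00\x2a", "TIFF image (big-endian)"),
    (b"PK", "ZIP archive or ZIP-based format"),
    (b"MZ", "MS-DOS/Windows executable"),
    (b"#!", "script with shebang"),
]


def guess_file_type(header: bytes) -> Optional[str]:
    """Return a description for the first matching signature, or None."""
    for prefix, description in SIGNATURES:
        if header.startswith(prefix):
            return description
    return None


def guess_file_type_from_path(path: str) -> Optional[str]:
    """Read only the first bytes of the file and classify them."""
    with open(path, "rb") as f:
        return guess_file_type(f.read(16))
```

A prefix table of this kind can only suggest a type: a ZIP match, for instance, does not distinguish a plain archive from a DOCX file.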
Magic numbers as markers in programming
Hexadecimal numbers are often used to represent values on data carriers or in other storage. Most of these numbers look quite "uninteresting" and "random". Sometimes, however, it is advantageous to have a value that is immediately noticeable (for example when troubleshooting).
0xDEADBEEF (decimal: 3,735,928,559) is a number in hexadecimal notation that reads as "dead beef".
A value like 0xDEADBEEF rarely occurs by chance and is therefore used to indicate special values. The number itself has no special meaning and can just as easily be replaced by other "readable" values such as 0xABABABAB, 0x00C0FFEE or 0x0BADF00D ("bad food").
Since such a value rarely occurs (with uniformly distributed 32-bit numbers, with a probability of 1:2^32 = 1:4,294,967,296, and even less often according to Benford's law), software developers often use it to detect errors such as buffer overflows or to find and examine uninitialized variables. If the value appears in memory, the programmer should take a closer look at that location. Memory areas that the program should not write to are also filled with 0xDEADBEEF for debugging purposes. If the program writes to such an area, it is noticed immediately.
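The idea of filling a protected area with a marker value can be simulated in Python. This is a minimal sketch under stated assumptions: a bytearray stands in for raw memory, the guard zone follows directly after the payload, and the names FILL_PATTERN, make_guarded_buffer and guard_intact are invented for this illustration.

```python
import struct

# The marker value as 4 bytes in little-endian order (EF BE AD DE).
FILL_PATTERN = struct.pack("<I", 0xDEADBEEF)


def make_guarded_buffer(payload_size: int, guard_words: int = 4) -> bytearray:
    """Return `payload_size` zero bytes followed by a guard zone
    filled with repeated copies of the marker value."""
    return bytearray(payload_size) + bytearray(FILL_PATTERN * guard_words)


def guard_intact(buf: bytearray, payload_size: int) -> bool:
    """Return True if the guard zone still contains only the marker value."""
    guard = bytes(buf[payload_size:])
    return len(guard) % 4 == 0 and guard == FILL_PATTERN * (len(guard) // 4)
```

A write that runs past the payload changes at least one guard byte, so a later call to guard_intact reveals the overflow.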
Many versions of the PowerPC processor initialize their registers with 0xDEADBEEF after a hardware reset. 0xDEADBEEF was also used for diagnostic purposes in the original Mac OS operating system and in the RS/6000 servers that IBM introduced in 1990.
Decimal numbers are also used, for example to "give a face" to numbers in concepts and presentations: they serve as placeholders while making it clear to everyone that the specific value is completely irrelevant. Programmers like to choose the value 42, which in the science fiction novel The Hitchhiker's Guide to the Galaxy is proclaimed by an omniscient computer as the solution to all problems. Other examples, common in German usage, are well-known "commonplace numbers" such as "08/15" (a German machine gun from the world wars) or "4711" (a well-known perfume brand).
Example: "Customer '4711' orders article '08/15'. Another customer, '42', also orders this article. What should happen if only one article '08/15' is in stock?"
Magic numbers in code
The term magic number (often also "hard-coded value") also describes the poor programming style of writing values directly into the statements of the source code. In many cases this makes program code harder to read and understand. It is usually better to define numbers that carry meaning as constants and thus give them a meaningful name. In addition, such a number is then easier to change throughout the code, since other numbers often depend on it.
An example in Pascal-like pseudocode that shuffles 52 numbers in an array:
for i from 1 to 52 {
    j := i + randomInt(53 - i) - 1
    swapEntries(i, j)
}
The function randomInt(x) generates a number between 1 and x, and swapEntries(i, j) swaps the entries i and j in the array. 52 is a magic number. The following program shows better style:
constant int cardGame_deckSize := 52
for i from 1 to cardGame_deckSize {
    j := i + randomInt(cardGame_deckSize + 1 - i) - 1
    swapEntries(i, j)
}
The advantages here are:
- Easier to understand. A programmer who reads the first program will wonder what the 52 means and may search for a long time before discovering it.
- Easier to change. If, in the first example, the magic number is later to be changed throughout the program, the dependent 53 must be changed as well. In larger programs, such a procedure can be very confusing and time-consuming. Errors can arise that are hard to identify and laborious to eliminate later. In contrast, in the second example only the definition of the constant in the first line needs to be changed.
- All significant numbers are defined at the beginning of the program, so the overview is not lost.
- Simplified parameterization. If the above program is to shuffle arrays of any size, cardGame_deckSize can simply be turned into a parameter. Example:
  function shuffle(int cardGame_deckSize) {
      for i from 1 to cardGame_deckSize {
          j := i + randomInt(cardGame_deckSize + 1 - i) - 1
          swapEntries(i, j)
      }
  }
- Typos are avoided. The compiler will not complain if you type 42 instead of 52, but the program will not work properly. If, on the other hand, you type cardGame_dekcSize, the compiler detects the error.
Disadvantages are:
- The code becomes longer. If many constants are used in one line, line breaks must be inserted.
- It makes debugging difficult on systems that do not display the values of constants. (However, it makes debugging a lot easier when using a symbolic debugger.)
- If the constant has not been introduced sensibly, the reader may have to take another look at its definition.
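For comparison, the parameterized pseudocode above could be written as runnable Python roughly as follows. This is a sketch, not a reference implementation: the names DECK_SIZE and shuffle are chosen for this example, indices are 0-based instead of 1-based, and random.randint stands in for randomInt. The loop is a form of the Fisher-Yates shuffle.

```python
import random

DECK_SIZE = 52  # named constant instead of a bare 52 scattered in the code


def shuffle(entries: list) -> None:
    """Shuffle `entries` in place.

    For each position i, pick a random position j in i..last
    (inclusive) and swap the two entries.
    """
    size = len(entries)
    for i in range(size):
        j = random.randint(i, size - 1)  # randint is inclusive on both ends
        entries[i], entries[j] = entries[j], entries[i]


deck = list(range(1, DECK_SIZE + 1))
shuffle(deck)
```

Because the size is derived from the list passed in, changing DECK_SIZE in one place is enough, and no dependent value such as 53 needs to be kept in sync.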