Magic number (computer science)

from Wikipedia, the free encyclopedia

In programming, a magic number has one of three meanings:

  1. Originally from the Unix world: a special value at the beginning of a file that identifies its file format (as evaluated, for example, by the Unix file command).
  2. A conspicuous value that marks a register or memory area so that it can later be examined for errors with a debugger. Such marker values are chosen to be easy to recognize, for example 0xDEADBEEF.
  3. A numerical value that appears in the source code of a program (also called a hard-coded value) and whose meaning is not immediately recognizable, so that it appears “magical”. Such magic numbers should be avoided and replaced by well-named constants whose names clearly indicate meaning and origin.

Magic numbers to identify file types

An early convention in Unix-like operating systems was for binaries to start with two bytes containing a "magic number" that indicates the file type. Initially, this was used to identify object files for different platforms. Gradually, the concept was carried over to other files, and today almost every binary file contains a magic number.

Many other file types have content that identifies them. For example, an XML file begins with the special string “<?xml”, which identifies the file as XML. If this beginning of the file is interpreted as a number, the file type can be determined quickly with a simple comparison, without knowing much about the format.
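
The comparison described above can be sketched in Python roughly as follows. This is only an illustration under simple assumptions: the names XML_PREFIX, XML_MAGIC and starts_with_xml_magic are hypothetical, and the sketch ignores that real XML files may start with a byte order mark or whitespace.

```python
# Hypothetical sketch: read the first five bytes of a file as one integer
# and compare it with the integer formed by the ASCII string "<?xml".
XML_PREFIX = b"<?xml"
XML_MAGIC = int.from_bytes(XML_PREFIX, "big")  # 0x3C3F786D6C


def starts_with_xml_magic(data: bytes) -> bool:
    """Return True if the leading bytes, read big-endian, equal XML_MAGIC."""
    head = data[:len(XML_PREFIX)]
    # An input shorter than the prefix cannot contain the magic number.
    if len(head) < len(XML_PREFIX):
        return False
    return int.from_bytes(head, "big") == XML_MAGIC
```

The check is a single integer comparison and needs no knowledge of the rest of the format.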

Some examples:

  • The field carrying important network parameters (options) in the BOOTP/DHCP protocol begins with the (hexadecimal) magic cookie 0x63825363.
  • Compiled Java class files (bytecode) begin with 0xCAFEBABE.
  • GIF files begin with the ASCII code for “GIF89a” (0x474946383961) or “GIF87a” (0x474946383761).
  • JPEG/JFIF files begin with 0xFFD8FF and also contain the ASCII code for “JFIF” (0x4A464946).
  • PNG files begin with an 8-byte signature that identifies the file as PNG and allows file transfer problems to be detected: \211 P N G \r \n \032 \n (0x89504e470d0a1a0a).
  • Standard MIDI files begin with the ASCII string “MThd” (0x4D546864), followed by metadata.
  • Unix scripts of all kinds normally begin with a shebang, “#!” (0x23 0x21), followed by the path to the interpreter (e.g. #!/usr/bin/perl for Perl).
  • EXE files for MS-DOS as well as Microsoft Windows EXE and DLL files begin with the ASCII characters “MZ” (0x4D5A) or, rarely, “ZM” (0x5A4D); MZ are the initials of the inventor of the format, Mark Zbikowski (see MZ file).
  • The superblock of the Berkeley Fast File System is identified by 0x19540119 or 0x011954, depending on the version; both encode the date of birth of its designer, Marshall Kirk McKusick.
  • Game Boy and Game Boy Advance programs have a 48-byte or 156-byte magic number, respectively, which encodes a bitmap of the Nintendo logo.
  • Old fat binaries (containing code for both 68K and PowerPC processors) on Mac OS 9 begin with the ASCII string “Joy!” (hexadecimal 0x4A6F7921).
  • TIFF files begin with II or MM, depending on the endianness (“II” stands for Intel and “MM” for Motorola), followed by 0x2A00 or 0x002A respectively (42 in decimal).
  • ZIP files begin with PK, the initials of their inventor, Phil Katz. This also applies to formats built on ZIP, such as the Microsoft Office formats DOCX, XLSX and PPTX.

The Unix command file reads and interprets magic numbers from files. The Linux kernel module binfmt_misc also uses magic numbers to identify the file type of an application. The actual "magic" resides in the file /usr/share/misc/magic.mgc (under Debian, /usr/share/file/magic.mgc).
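
A much simplified version of this kind of lookup might look like the following Python sketch. It is not how the file command is implemented. The table SIGNATURES and the functions identify and identify_file are hypothetical names, and the table covers only some of the examples listed above.

```python
# Hypothetical signature table built from some of the examples above.
# Each entry pairs a leading byte sequence with a human-readable label.
SIGNATURES = [
    (bytes.fromhex("89504e470d0a1a0a"), "PNG image"),
    (b"GIF89a", "GIF image"),
    (b"GIF87a", "GIF image"),
    (bytes.fromhex("ffd8ff"), "JPEG image"),
    (bytes.fromhex("cafebabe"), "Java class file"),
    (b"MThd", "Standard MIDI file"),
    (b"II\x2a\x00", "TIFF image (little-endian)"),
    (b"MM\x00\x2a", "TIFF image (big-endian)"),
    (b"PK", "ZIP archive"),
    (b"MZ", "MS-DOS/Windows executable"),
    (b"#!", "script with shebang"),
]


def identify(data: bytes) -> str:
    """Return the label of the first signature that prefixes data."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read only as many bytes as the longest signature and classify them."""
    longest = max(len(magic) for magic, _ in SIGNATURES)
    with open(path, "rb") as handle:
        return identify(handle.read(longest))
```

Only the first few bytes are read, so the check stays cheap even for large files. A short prefix such as PK or MZ can also occur by chance, which is one reason real tools apply more detailed rules.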

Magic numbers as marking in programming

Hexadecimal numbers are often used to represent values on data carriers or in other storage. Most of these numbers look quite “uninteresting” and “random”. Sometimes, however, an immediately noticeable value is advantageous (for example when troubleshooting).

0xDEADBEEF (decimal: 3,735,928,559) is a number in hexadecimal notation that reads as “dead beef”.

A value like 0xDEADBEEF rarely occurs by chance and is therefore used to mark special values. The number itself has no special meaning and can just as well be replaced by other “readable” values such as 0xABABABAB, 0x00C0FFEE or 0x0BADF00D (“bad food”).

Since such a value rarely occurs by chance (with a probability of 1 : 2^32 = 1 : 4,294,967,296 for uniformly distributed 32-bit numbers, and even less often according to Benford's law), software developers often use it to detect or examine errors such as buffer overflows or uninitialized variables. If the value appears in memory, the programmer should take a closer look at that location. Memory areas that the program should not write to are also filled with 0xDEADBEEF for debugging purposes. If the program writes to such an area, this is noticed immediately.
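
The idea of filling a protected area with a marker and later checking it can be illustrated with a small Python sketch. Real implementations work on raw memory inside an allocator or debugger. Here a bytearray stands in for the memory area, and the names MARKER, WORD, fill_with_marker and overwritten_words are hypothetical.

```python
import struct

MARKER = 0xDEADBEEF          # conspicuous fill value from the text
WORD = struct.Struct(">I")   # one 32-bit big-endian word


def fill_with_marker(word_count: int) -> bytearray:
    """Create a buffer in which every 32-bit word holds MARKER."""
    return bytearray(WORD.pack(MARKER) * word_count)


def overwritten_words(buffer: bytearray) -> list[int]:
    """Return the indices of all words that no longer hold MARKER."""
    return [
        index
        for index, (value,) in enumerate(WORD.iter_unpack(buffer))
        if value != MARKER
    ]


guard = fill_with_marker(4)
# Simulate a stray write into the third word (byte offset 8).
WORD.pack_into(guard, 8, 0x12345678)
damaged = overwritten_words(guard)  # indices of modified words
```

Any word that differs from the marker shows that something wrote to the guarded area.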

Many versions of the PowerPC processor initialize their registers with 0xDEADBEEF after a hardware reset. 0xDEADBEEF was used for diagnostic purposes in the original Mac OS operating system and in the RS/6000 servers that IBM introduced in 1990.

Decimal numbers are also used, e.g. to “give a face” to numbers in concepts and presentations: they serve as placeholders while making clear to everyone that the specific value is completely irrelevant. Programmers like to choose the value 42, which an omniscient computer in the science-fiction novel The Hitchhiker's Guide to the Galaxy proclaims as the solution to all problems. Other examples are well-known “everyday numbers” such as “08/15” (a German machine gun from the world wars) or “4711” (a well-known perfume brand).
Example: “Customer ‘4711’ orders article ‘08/15’. Another customer, ‘42’, also orders this article. What should happen if only one item ‘08/15’ is in stock?”

Magic numbers in code

The term magic number (often also hard-coded value) also describes the poor programming style of writing values directly between the statements of the source code. In many cases, this makes program code harder to read and understand. It is usually better to define meaningful numbers as constants and thus give them a descriptive name. Such a number is then also easier to change throughout the code, since other numbers often depend on it.

An example in Pascal-like pseudocode that shuffles the 52 entries of an array:

for i from 1 to 52
{
  j:= i + randomInt(53 - i) - 1
  swapEntries(i, j)
}

The function randomInt(x) generates a number between 1 and x, and swapEntries(i, j) swaps the entries i and j in the array. 52 is a magic number. The following program is better style:

constant int cardGame_deckSize:= 52
for i from 1 to cardGame_deckSize
{
  j:= i + randomInt(cardGame_deckSize + 1 - i) - 1
  swapEntries(i, j)
}

The advantages here are:

  • Easier to understand. A programmer reading the first program will wonder what the 52 means and may search for a long time before recognizing its significance.
  • Easier to change. If the magic number in the first example is later to be changed throughout the program, the dependent 53 must be changed as well. In larger programs, this can be very confusing and time-consuming, and it can introduce errors that are hard to identify and laborious to eliminate later. In the second example, by contrast, only the definition of the constant in the first line needs to be changed.
  • All significant numbers are at the beginning of the program, so the overview is not lost.
  • Simplified parameterization. If the above program is to shuffle arrays of any size, cardGame_deckSize can simply be turned into a parameter. Example:
function shuffle (int cardGame_deckSize)
{
  for i from 1 to cardGame_deckSize
  {
    j:= i + randomInt(cardGame_deckSize + 1 - i) - 1
    swapEntries(i, j)
  }
}
  • Typos are avoided. The compiler will not complain if you type 42 instead of 52, but the program will not work properly. If, on the other hand, you type cardGame_dekcSize, the compiler detects the error.
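
As a runnable counterpart to the pseudocode, the parameterized shuffle might be written in Python as sketched below. The names DECK_SIZE, shuffle and deck are chosen for illustration, the indices are 0-based instead of 1-based, and Python's standard library already provides random.shuffle for practical use.

```python
import random

DECK_SIZE = 52  # named constant instead of a bare 52 in the loop


def shuffle(entries: list) -> None:
    """Shuffle entries in place, mirroring the pseudocode with 0-based indices."""
    size = len(entries)
    for i in range(size):
        # Pick j uniformly from i..size-1; this corresponds to
        # i + randomInt(size + 1 - i) - 1 in the 1-based pseudocode.
        j = random.randrange(i, size)
        entries[i], entries[j] = entries[j], entries[i]


deck = list(range(1, DECK_SIZE + 1))
shuffle(deck)
```

Because the size is taken from the list itself, no dependent value such as 53 appears anywhere in the code.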

Disadvantages are:

  • The code becomes longer. If many constants are used in one line, line breaks must be inserted.
  • Debugging becomes harder on systems that do not display the values of constants. (With a symbolic debugger, however, debugging becomes much easier.)
  • If the constant has not been introduced sensibly, the reader may have to look up its definition again.
