Buffer overflow

from Wikipedia, the free encyclopedia

Buffer overflows, and stack overflows in particular, are among the most frequent vulnerabilities in current software and can be exploited over the Internet, among other ways. In a buffer overflow, a programming error causes more data to be written to a reserved memory area (the buffer or the stack) than it can hold, so that memory locations following the target area are overwritten.

If what is affected is not an entire block of data but the destination address of a single record, which indicates where in the buffer the record is to be written, this is also called a pointer overflow.
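The pointer overflow case can be illustrated with a minimal sketch in which a single record is stored at a caller-supplied position. The names record_table, TABLE_SIZE, and store_record are hypothetical and chosen only for this illustration; the point is the check of the destination before the write.

```c
#include <stddef.h>

#define TABLE_SIZE 16            /* hypothetical capacity of the buffer */

static int record_table[TABLE_SIZE];

/* Stores value at position index.
 * Returns 0 on success and -1 if the destination would lie outside the
 * table. Without this check, an attacker-controlled index would let a
 * single write land at an arbitrary offset from the table. */
int store_record(size_t index, int value)
{
    if (index >= TABLE_SIZE)
        return -1;
    record_table[index] = value;
    return 0;
}
```

Because index is unsigned, a single comparison also rejects values that would have been negative in a signed type.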

Danger from buffer overflows

A buffer overflow can cause the affected program to crash, corrupt data, or damage data structures of the program's runtime environment. In the last case, the return address of a subroutine can be overwritten with arbitrary data, so that an attacker who supplies suitable machine code can execute arbitrary commands with the privileges of the vulnerable process. The aim of this code is usually to give the attacker more convenient access to the system, which can then be used for the attacker's own purposes. Buffer overflows in widely used server and client software are also exploited by Internet worms.

A particularly popular target on Unix systems is root access, which gives the attacker all access rights. However, this does not mean, as is often misunderstood, that a buffer overflow which "only" leads to the privileges of a "normal" user is not dangerous. Achieving the coveted root access is often much easier if you already have user rights (privilege escalation).

Buffer overflow attacks are an important issue in computer security and network security. They can be attempted not only over any kind of network but also locally on the system. As a rule, they can only be remedied by fixes (patches) that the manufacturer provides at short notice.

Besides negligent programming, buffer overflows are made possible mainly by computer systems based on the Von Neumann architecture, in which data and program reside in the same memory. Because of this closeness to the hardware, they are a problem only in assembled or compiled programming languages. Apart from errors in the interpreter itself, interpreted languages are usually not susceptible, since the memory areas for data are always under the full control of the interpreter.

With protected mode, which was introduced with the 80286, program, data, and stack memory can be physically separated from one another by segmenting the linear memory. Access protection is provided by the memory management unit of the CPU. The operating system only has to ensure that no more memory is made available than the linear address space holds. OS/2 was the only widely used operating system to make use of memory segmentation.

Programming languages

The main cause of buffer overflows is the use of programming languages that offer no way of automatically monitoring the bounds of memory areas so that they cannot be exceeded. This includes in particular the C language, which puts the main emphasis on performance (and originally on the simplicity of the compiler) and dispenses with such monitoring, as well as its successor C++. Here the programmer has to write the checking code by hand, and it is often omitted, either deliberately or through negligence. Where the check exists, it is often implemented incorrectly, since these parts of the program are usually tested inadequately or not at all. In addition, the complex language (in the case of C++) and the standard library provide a large number of error-prone constructs, for which there is in many cases hardly any alternative.

The frequently used programming language C++ offers only limited possibilities for the automatic checking of array bounds. As a further development of the C programming language, it inherits all the properties of C, but the risk of buffer overflows can be largely avoided by using modern language facilities (including automatic memory management). Out of habit, for compatibility with existing C code, because of system calls that follow the C convention, and for performance reasons, these facilities are not always used. In contrast to languages such as Pascal or Ada, runtime checks are not part of the language, but they can be retrofitted in some use cases (for example with smart pointers).

Since most programming languages also define standard libraries, choosing a language usually means using the corresponding standard library as well. In the case of C and C++, the standard library contains a number of dangerous functions, some of which cannot be used safely at all and for some of which no alternative exists.

At the programming language level, the risk of buffer overflows can be reduced or eliminated by using programming languages that are conceptually safer than C++ or C. The risk is much lower, for example, in languages of the Pascal family such as Modula, Object Pascal, or Ada.

In the Java programming language, for example, buffer overflows are almost impossible because execution of the bytecode is monitored. Nevertheless, there have been buffer overflows in Java whose cause lay in the runtime system and which affected several JRE versions. Separately, the Java runtime environment throws a java.lang.StackOverflowError if the method call stack overflows because of a faulty endless recursion. This is a logic error by the application programmer, not an error in the runtime system.

Processors and programming style

Other peculiarities of C and C++ and of the most commonly used processors make buffer overflows likely. Programs in these languages consist in part of subroutines, and these have local variables.

On modern processors it is common to place the return address of a subroutine and its local variables in an area called the stack. When the subroutine is called, first the return address and then the local variables are placed on the stack. On modern processors such as the Intel Pentium, the stack is managed by built-in processor instructions and grows downwards. If arrays or character strings are used as local variables, they are usually written upwards. If the array bound is not checked, the return address on the stack can be reached by writing past the end of the array and, if desired, deliberately modified.

The following program section in C, which is often used in a similar form, shows such a buffer overflow:

#include <stdio.h>

// Deliberately unsafe example; gets was removed from the C standard in C11.
void input_line(void)
{
    char line[1000];
    if (gets(line))     // gets receives a pointer to the array, no length information
        puts(line);     // puts writes the contents of line to stdout
}

On processors whose stack grows downwards, the stack looks like this before gets (a function of the C standard library) is called, disregarding a base pointer that may also be present:

Return address
1000th character
...
3rd character
2nd character
1st character          ← stack pointer
The stack grows downwards; the variable is written upwards

gets reads a line from the input and writes the characters to the stack starting at line[0]. It does not check the length of the line. According to the semantics of C, gets receives only the memory address as a pointer, with no information about the available length. If 1004 characters are entered, the last 4 bytes overwrite the return address (assuming an address is 4 bytes in size), which can thereby be pointed at code located within the stack. If required, a suitable program can be supplied in the first 1000 characters.

00@45eA/%A@4 ... ... ... ... ... ... ... ... ... ... ... ... ... 0A&%
Input written to the stack by gets (1004 characters)

modified return address
line, 1000th character
...
line, 5th character    third byte of the code
line, 4th character    second byte of the code
line, 3rd character    target of the return address, start of the program code
line, 2nd character
line, 1st character    ← stack pointer
Overwriting the return address and placing program code on the stack

If the program runs with higher privileges than the user, the user can obtain these privileges through the buffer overflow by supplying a specially crafted input.
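The arithmetic of the 1004-character input can be reproduced without actually corrupting a real stack. The following sketch simulates the frame as one flat byte array, with the line buffer at offset 0 and a 4-byte return address directly behind it. The type sim_frame and the functions unchecked_fill, set_return, and saved_return are hypothetical names for this illustration, and the layout is an assumption that mirrors the diagram rather than any particular compiler's output.

```c
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 1000
#define ADDR_SIZE 4              /* assumed size of an address, as in the text */

/* Simulated stack frame: line buffer followed directly by the return
 * address. A single flat array keeps the demonstration well defined. */
typedef struct {
    unsigned char bytes[LINE_SIZE + ADDR_SIZE];
} sim_frame;

/* Stores the simulated return address behind the buffer. */
void set_return(sim_frame *f, uint32_t addr)
{
    memcpy(f->bytes + LINE_SIZE, &addr, sizeof addr);
}

/* Reads the simulated return address back. */
uint32_t saved_return(const sim_frame *f)
{
    uint32_t addr;
    memcpy(&addr, f->bytes + LINE_SIZE, sizeof addr);
    return addr;
}

/* Mimics an unchecked copy such as gets: writes len bytes starting at
 * line[0] and never compares len with LINE_SIZE. The caller keeps
 * len <= LINE_SIZE + ADDR_SIZE so the simulation itself stays in bounds. */
void unchecked_fill(sim_frame *f, const unsigned char *input, size_t len)
{
    memcpy(f->bytes, input, len);
}
```

Filling the frame with 1000 bytes leaves the stored address untouched; filling it with 1004 bytes replaces the address with the last four input bytes.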

Countermeasures

Program creation

A very durable countermeasure is the use of type-safe programming languages and tools, such as Java or C#, in which adherence to the assigned memory areas is checked by the compiler during translation into machine language where possible, and is otherwise enforced by corresponding program code at runtime at the latest. It is essential here that pointer variables can be changed only according to strict, restrictive rules. In this context it also helps if memory is released only by the runtime system, through automatic garbage collection.

When writing programs, it is therefore important to check all array bounds. In outdated programming languages that are not type-safe, this is the responsibility of the programmer. Where possible, languages that monitor array bounds automatically should be preferred, although switching is not always easy. When using C++, C-style arrays should be avoided as far as possible.

#include <stdio.h>

void input_line(void)
{
    char line[1000];
    if (fgets(line, sizeof(line), stdin))   // fgets checks the length
        puts(line);  // puts writes the contents of line to stdout
}
Countermeasure: fgets checks the input length
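fgets limits the number of characters stored, but it does not report whether a line was cut off. The sketch below is one possible wrapper that strips the newline, reports truncation, and discards the remainder of an over-long line. The function name read_line and its return convention are assumptions made for this example, not part of the standard library.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from in into buf, which has room for cap bytes
 * (cap >= 2 is assumed).
 * Returns  1 if a complete line was read (the newline is stripped),
 *          0 if the line was too long and has been truncated; the rest
 *            of the line is discarded so the next call starts cleanly,
 *         -1 on end of file or read error. */
int read_line(FILE *in, char *buf, size_t cap)
{
    if (!fgets(buf, (int)cap, in))
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }

    /* No newline stored: either the input ended here or the buffer is
     * full. Discard whatever is left of the current line. */
    size_t discarded = 0;
    int c;
    while ((c = getc(in)) != EOF && c != '\n')
        discarded++;

    return discarded == 0 ? 1 : 0;
}
```

A caller would typically treat a return value of 0 as an input error instead of silently processing the shortened line.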

Checking the program code

Special verification tools can analyze the code and uncover possible weak points. However, the bounds-checking code itself can be faulty, and it is often left untested.

Compiler support

A very large body of existing programs is written in C and C++. Modern compilers, such as recent versions of the GNU C compiler, allow the generation of checking code to be enabled at compile time.

Because of their design, languages like C do not always allow array bounds to be checked (example: gets). Compilers therefore have to take a different route: they reserve space for a random number (also called a "canary") between the return address and the local variables. This number is determined when the program starts and takes a different value on each run. On every subroutine call, the random number is written into the space reserved for it. The code required for this is generated automatically by the compiler. Before the subroutine returns via the return address, code inserted by the compiler checks whether the random number still has the expected value. If it has been changed, the return address cannot be trusted either, and the program is terminated with a corresponding message.

Return address
Random number barrier
line, 1000th character
...
line, 3rd character
line, 2nd character
line, 1st character    ← stack pointer
Countermeasure: random number barrier
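The random number barrier can be sketched as a simulation in the same spirit as the diagram above. The frame is a flat byte array laid out as line buffer, canary, return address, and the functions guarded_enter, guarded_fill, and guarded_leave are hypothetical stand-ins for the prologue, the unchecked write, and the epilogue that a compiler would generate. In real toolchains this code is emitted automatically (GCC's -fstack-protector option is one example); the sketch only shows the logic, and it takes the canary as a parameter instead of drawing it at program start.

```c
#include <stdint.h>
#include <string.h>

#define CANARY_LINE_SIZE 16      /* small buffer for the demonstration */

/* Simulated frame, lowest address first:
 * line buffer | canary | return address. */
typedef struct {
    unsigned char bytes[CANARY_LINE_SIZE + 2 * sizeof(uint32_t)];
} guarded_frame;

/* Prologue: store the canary and the return address above the buffer. */
void guarded_enter(guarded_frame *f, uint32_t canary, uint32_t return_addr)
{
    memcpy(f->bytes + CANARY_LINE_SIZE, &canary, sizeof canary);
    memcpy(f->bytes + CANARY_LINE_SIZE + sizeof canary,
           &return_addr, sizeof return_addr);
}

/* Unchecked write starting at line[0]. The caller keeps
 * len <= sizeof f->bytes so the simulation itself stays in bounds. */
void guarded_fill(guarded_frame *f, const unsigned char *input, size_t len)
{
    memcpy(f->bytes, input, len);
}

/* Epilogue: returns 0 and delivers the return address if the canary is
 * intact, or -1 if it was modified (a real program would terminate). */
int guarded_leave(const guarded_frame *f, uint32_t canary,
                  uint32_t *return_addr)
{
    uint32_t stored;
    memcpy(&stored, f->bytes + CANARY_LINE_SIZE, sizeof stored);
    if (stored != canary)
        return -1;
    memcpy(return_addr, f->bytes + CANARY_LINE_SIZE + sizeof stored,
           sizeof *return_addr);
    return 0;
}
```

Because a linear overflow has to pass over the canary before it reaches the return address, any write that changes the return address also changes the canary, unless the attacker knows its value.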

In addition, some compilers can be instructed to place a copy of the return address below the local arrays when the subroutine is called. This copy is used when returning, which makes it much more difficult to exploit buffer overflows:

Return address
line, 1000th character
...
line, 3rd character
line, 2nd character
line, 1st character
Copy of the return address    ← stack pointer
Countermeasure: copy of the return address
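The effect of the copy can again be shown with a simulated frame. In this sketch the layout, lowest address first, is copy of the return address, line buffer, return address. All names (shadowed_frame, shadowed_enter, shadowed_fill, shadowed_leave, shadowed_original) are hypothetical, and the code illustrates the principle only, not the code any specific compiler generates.

```c
#include <stdint.h>
#include <string.h>

#define SHADOW_LINE_SIZE 16      /* small buffer for the demonstration */

/* Simulated frame, lowest address first:
 * copy of return address | line buffer | return address. */
typedef struct {
    unsigned char bytes[2 * sizeof(uint32_t) + SHADOW_LINE_SIZE];
} shadowed_frame;

/* Prologue: store the return address above the buffer and a copy below it. */
void shadowed_enter(shadowed_frame *f, uint32_t return_addr)
{
    memcpy(f->bytes, &return_addr, sizeof return_addr);
    memcpy(f->bytes + sizeof(uint32_t) + SHADOW_LINE_SIZE,
           &return_addr, sizeof return_addr);
}

/* Unchecked write starting at line[0] and proceeding upwards. The caller
 * keeps len <= SHADOW_LINE_SIZE + sizeof(uint32_t) so the simulation
 * stays in bounds. */
void shadowed_fill(shadowed_frame *f, const unsigned char *input, size_t len)
{
    memcpy(f->bytes + sizeof(uint32_t), input, len);
}

/* Epilogue: return via the copy stored below the buffer. */
uint32_t shadowed_leave(const shadowed_frame *f)
{
    uint32_t addr;
    memcpy(&addr, f->bytes, sizeof addr);
    return addr;
}

/* For comparison: the original slot above the buffer. */
uint32_t shadowed_original(const shadowed_frame *f)
{
    uint32_t addr;
    memcpy(&addr, f->bytes + sizeof(uint32_t) + SHADOW_LINE_SIZE, sizeof addr);
    return addr;
}
```

An overflow that runs upwards from line[0] destroys the original slot but never touches the copy, which lies below the start of the buffer.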

Compiler and Compiler Extensions

For the GNU Compiler Collection, for example, there are two common extensions that implement measures such as those described above.

Heap overflow

A heap overflow is a buffer overflow that occurs on the heap. Memory on the heap is allocated when programs request dynamic memory, for example via malloc() or the new operator in C++. If data is written into a buffer on the heap without checking its length, and the amount of data is larger than the buffer, data is written past the end of the buffer and a memory overflow occurs.

Heap overflows can usually be used to execute arbitrary code on the computer by overwriting pointers to functions, especially when the heap is executable. FreeBSD, for example, has heap protection that prevents this. Heap overflows can occur only in programming languages in which buffer accesses are not length-checked. C, C++, and assembler are vulnerable; Java and Perl are not.
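Why function pointers are an attractive target becomes clear when a buffer and a function pointer share one heap object. The following sketch uses a hypothetical session structure in which the callback on_event lies directly behind the name buffer; session_new and session_set_name are likewise invented for this example. The setter shows the length check that keeps an over-long name from reaching the pointer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NAME_SIZE 32

/* Hypothetical heap object: the function pointer lies behind the buffer. */
typedef struct {
    char name[NAME_SIZE];
    void (*on_event)(void);
} session;

static void default_handler(void) { }

/* Allocates a session with an empty name and the default callback. */
session *session_new(void)
{
    session *s = malloc(sizeof *s);
    if (s) {
        s->name[0] = '\0';
        s->on_event = default_handler;
    }
    return s;
}

/* Copies name into the session only if it fits, terminator included.
 * Returns 0 on success and -1 if the name is too long. An unchecked
 * strcpy at this point could overwrite on_event with input bytes. */
int session_set_name(session *s, const char *name)
{
    size_t len = strlen(name);
    if (len >= NAME_SIZE)
        return -1;
    memcpy(s->name, name, len + 1);
    return 0;
}
```

Rejecting the input outright, instead of truncating it, leaves both the previous name and the callback unchanged.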

For example, on June 23, 2015, Adobe announced that such a buffer overflow could allow arbitrary malicious code to be executed, and control to be taken, on systems on which the Flash Player was installed.

Example

#include <stdlib.h>
#include <string.h>

#define BUFSIZE 128

char * copy_string(const char *s)
{
    char * buf = malloc(BUFSIZE); // assumption: longer strings never occur

    if (buf)
        strcpy(buf, s); // heap overflow if strlen(s) > 127

    return buf;
}

Since strcpy() does not check the sizes of source and destination but expects a zero-terminated ('\0') memory area as its source, the following variant is also unsafe. It will not write past the end of "buf", but it may read beyond the end of the memory area allocated to "s".

char * buf;

buf = malloc(1 + strlen(s)); // plus 1 for the terminating NUL character
if (buf)
    strcpy(buf, s);
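The read past the end of "s" can be avoided if the caller knows how large the source area is. The sketch below assumes such a size is available and uses memchr to look for the terminator within it; the function name copy_string_bounded and the src_size parameter are hypothetical additions for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Duplicates the string stored in src, a memory area of src_size bytes.
 * Returns NULL if no terminating NUL is found within src_size bytes or
 * if the allocation fails. The caller frees the result. */
char *copy_string_bounded(const char *src, size_t src_size)
{
    const char *end = memchr(src, '\0', src_size);
    if (!end)
        return NULL;             /* unterminated source: refuse to copy */

    size_t len = (size_t)(end - src);
    char *buf = malloc(len + 1);
    if (buf)
        memcpy(buf, src, len + 1);   /* includes the terminator */
    return buf;
}
```

Unlike strlen, memchr never examines more than src_size bytes, so an unterminated source is detected instead of being read past its end.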

The strncpy function, on the other hand, copies at most n characters from the source to the destination. It therefore does not overflow buf even if s is longer than BUFSIZE or is not null-terminated (provided the first n bytes of s are readable).

char *buf;

if ((buf = malloc(BUFSIZE)) != NULL) { // check the pointer
    strncpy(buf, s, BUFSIZE - 1);
    buf[BUFSIZE - 1] = '\0';  // drawback: the string must be terminated manually
}
return buf;

Some operating systems, e.g. OpenBSD, offer the function strlcpy, which ensures that the target string is null-terminated and makes it easier to detect a truncated target string.
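Where strlcpy is not available, its behavior can be approximated in portable C. The following sketch uses the hypothetical name copy_truncating to avoid clashing with a system-provided strlcpy. It follows the same convention of returning the length of the source, so that a result greater than or equal to the destination size signals truncation, and it assumes that src is null-terminated.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst, which has room for dst_size bytes. The result is
 * always null-terminated if dst_size > 0. Returns strlen(src); a return
 * value >= dst_size means the copy was truncated. */
size_t copy_truncating(char *dst, const char *src, size_t dst_size)
{
    size_t src_len = strlen(src);

    if (dst_size > 0) {
        size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

A caller can then write if (copy_truncating(buf, s, BUFSIZE) >= BUFSIZE) to handle truncated input explicitly.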
