Linker (computer program)

from Wikipedia, the free encyclopedia
Libraries (lib) and/or object files (obj) are combined (linked) by the linker to form libraries, dynamic libraries (dll) or executable files (exe).

A linker or binder (also: "binding loader") is a computer program that combines (links) individual program modules into an executable program. On IBM mainframe systems, the linker is called the linkage editor.

Most programs contain parts or modules that can be used in other programs. Several compiled modules with functions (so-called object files) can be combined into function libraries (program libraries). The linker adds this code to the main program when the corresponding function is required.
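How a linker pulls only the required modules out of a library can be sketched as follows. This is a simplified model and not the algorithm of any particular linker; the type ObjectFile and the function selectModules are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a compiled module: the symbols it defines
// and the symbols it needs from elsewhere.
struct ObjectFile {
    std::string name;
    std::set<std::string> defines;
    std::set<std::string> needs;
};

// Returns the names of the library members that must be added to the
// program so that every needed symbol has a definition.
// Simplification: symbols that no member defines are silently skipped;
// a real linker would report them as undefined references.
std::vector<std::string> selectModules(const ObjectFile& mainProgram,
                                       const std::vector<ObjectFile>& library) {
    std::set<std::string> defined = mainProgram.defines;
    std::vector<std::string> pending(mainProgram.needs.begin(),
                                     mainProgram.needs.end());
    std::vector<std::string> selected;

    while (!pending.empty()) {
        std::string symbol = pending.back();
        pending.pop_back();
        if (defined.count(symbol)) continue;  // already resolved

        for (const ObjectFile& member : library) {
            if (!member.defines.count(symbol)) continue;
            // Pull in the whole member and queue its own requirements.
            selected.push_back(member.name);
            defined.insert(member.defines.begin(), member.defines.end());
            pending.insert(pending.end(), member.needs.begin(), member.needs.end());
            break;
        }
    }
    return selected;
}
```

In this model, a library member that defines only symbols nobody asks for is never selected, so it adds nothing to the size of the finished program.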

In order to be able to use a program module in another program, the symbolic addresses of the functions and variables of the module must be converted into memory addresses. The linker takes on this task. The linking process takes place after compilation and is usually the last step in creating a program. A general distinction is made between static and dynamic linking.
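The conversion of symbolic addresses into memory addresses can be illustrated with a minimal sketch. It assumes that the modules are simply placed one after another in memory; the type Module, the function resolveSymbols and the base address used in the note below are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of a compiled module: its size in bytes and the
// offsets of its symbols relative to the start of the module.
struct Module {
    std::string name;
    std::uint32_t size;
    std::map<std::string, std::uint32_t> symbolOffsets;
};

// Places the modules one after another, starting at baseAddress, and
// returns the final memory address of every symbol.
std::map<std::string, std::uint32_t> resolveSymbols(const std::vector<Module>& modules,
                                                    std::uint32_t baseAddress) {
    std::map<std::string, std::uint32_t> addresses;
    std::uint32_t moduleStart = baseAddress;
    for (const Module& m : modules) {
        for (const auto& entry : m.symbolOffsets) {
            // Two definitions of the same name cannot both be linked.
            bool inserted = addresses.emplace(entry.first, moduleStart + entry.second).second;
            if (!inserted) {
                throw std::runtime_error("duplicate symbol: " + entry.first);
            }
        }
        moduleStart += m.size;
    }
    return addresses;
}
```

If a module of 0x40 bytes is followed by a second module that defines a symbol at offset 0x08, and the base address is 0x1000, the symbol ends up at address 0x1048.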

Static linking

Static linking typically takes place at the end of program development. The result is a fully assembled program; if the program is linked completely statically, it consists of a single file. With static linking, the program modules of the application are resolved once, at development time, whereas with dynamic linking this happens at every run. One advantage of static linking is greater portability: the application does not depend on program modules being provided by, for example, the operating system, because it carries them itself. The program therefore does not need to be installed. The disadvantages are a potentially higher memory requirement, since the program modules cannot be shared with other programs, and the need to recompile and relink the entire application whenever an improved version of a submodule is published.

Because of these disadvantages, some C libraries on Unix-like operating systems no longer fully support static linking. For example, glibc enforces dynamic linking for modules that affect user authentication. Programs that use these modules always depend on a suitable run-time version of glibc being present.

Dynamic linking

It is also possible to postpone resolving function and variable names until the program is actually executed. This is called dynamic linking. Depending on the operating system, it is done by loading complete dynamic libraries (also known as dynamically linked libraries (DLLs) or shared libraries) or by loading a specific subprogram from a program library. The advantages are that libraries or programs can easily be replaced later, the calling programs become smaller, and memory is needed only once when several programs use the same components. The disadvantage is that the correct library must be installed in the correct version (see, for example, DLL conflict). Libraries loaded on demand are often referred to as plug-ins.
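On POSIX systems, a program can also request a library explicitly at run time through the dlopen interface. The following sketch assumes such a system. The function name callCosineFromLibrary is hypothetical, and the library name passed to it (for example "libm.so.6" on Linux with glibc) is an assumption that differs between platforms. On older glibc versions the program must additionally be linked with -ldl.

```cpp
#include <cassert>
#include <dlfcn.h>  // POSIX interface for loading libraries at run time

// Loads the named shared library, looks up the symbol "cos" in it and
// calls it with the argument 0.0. Returns false if the library or the
// symbol cannot be found on this system.
bool callCosineFromLibrary(const char* libraryName, double& result) {
    void* handle = dlopen(libraryName, RTLD_LAZY);
    if (handle == nullptr) return false;

    // dlsym returns a void*; casting it to a function pointer is the
    // usual idiom on POSIX systems.
    using CosineFunction = double (*)(double);
    CosineFunction cosine = reinterpret_cast<CosineFunction>(dlsym(handle, "cos"));
    bool found = (cosine != nullptr);
    if (found) result = cosine(0.0);

    dlclose(handle);
    return found;
}
```

The name "cos" is resolved only when the function is called, so the program itself starts even on a system where the library is missing; the call then simply returns false.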

Mixed forms of static and dynamic link types are the norm. Certain subroutines are statically linked to the calling program, others are dynamically reloaded.

Language-specific variants when loading

Overloading

“Overloading” means defining a subroutine several times under the same identifier, with the variants distinguished by their parameters; in the object file each variant receives its own name (name mangling). The following examples are possible only in C++ or Java, not in pure C, which does not provide function overloading and where an attempt to use it triggers a compile error.

The function void function(int x); is completely different from void function(float x);. The two functions have different implementations and different names in the object file, and they have nothing in common apart from their name in the source code. Only the function name is overloaded.
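That the two overloads are separate functions can be shown with a short sketch. The return type char is an assumption made here so that the variants can be told apart; the mangled names in the comments apply to the Itanium C++ ABI used by GCC and Clang on most Unix-like systems and differ with other compilers.

```cpp
#include <cassert>

// Same name in the source code, two unrelated functions for the linker.
// Under the Itanium C++ ABI they are emitted as the symbols
// _Z8functioni and _Z8functionf.
char function(int)   { return 'i'; }
char function(float) { return 'f'; }

// Taking the address with an explicit pointer type selects one overload each.
char (*const intVariant)(int) = &function;
char (*const floatVariant)(float) = &function;
```

A call such as function(1) reaches the first variant and function(1.0f) the second; the linker never sees the shared source name, only the two mangled symbols.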

The following kind of call is problematic, both for the reader and for the compiler:

short y;
function(y);

Here the compiler has to decide whether to convert (cast) the argument to int or to float and call the corresponding variant of the function. The first choice is the obvious one, and standard C++ agrees: converting short to int is an integral promotion, which ranks above the conversion to float, so the int variant is called. The choice becomes genuinely ambiguous when both conversions rank equally, for example with an argument of type long; in that case compilers such as GNU's report an error so that the programmer has to make the decision explicit, with a cast as in function((float)(y));.
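The behaviour for a short argument under the standard C++ overload resolution rules can be checked with a small sketch. The helper functions chosenForShort and chosenWithCast and the string return values are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <string>

std::string function(int)   { return "int"; }
std::string function(float) { return "float"; }

// Reports which overload is chosen for an argument of type short.
std::string chosenForShort() {
    short y = 0;
    return function(y);  // short -> int is an integral promotion and wins
}

// Reports which overload is chosen when the caller casts explicitly.
std::string chosenWithCast() {
    short y = 0;
    return function(static_cast<float>(y));
}

// By contrast, the following would not compile, because long -> int and
// long -> float are both ordinary conversions of equal rank:
//     long z = 0;
//     function(z);  // error: call of overloaded function is ambiguous
```

Under these rules chosenForShort() yields "int" and chosenWithCast() yields "float".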

In general, it is better not to use overloading too freely, but only for clear distinctions such as variants of a subroutine with different numbers of parameters. Even then, combining it with default arguments leads to confusion. A parameter-sensitive function call with pointers of different types that are not related through base classes (inheritance) can be considered safe. In any case, the compiler checks the pointer types and either reports an error or calls exactly the right subroutine:

class ClassA;
class ClassB;
void function(ClassA*);  // is clearly distinguished from
void function(ClassB*);

if ClassA and ClassB are in no way derived (inherited) from each other.
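A complete version of this case might look as follows. The empty class bodies and the string return values are assumptions added so that the sketch compiles and the selected variant can be observed.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class ClassA {};
class ClassB {};  // not derived from ClassA, and vice versa

std::string function(ClassA*) { return "ClassA variant"; }
std::string function(ClassB*) { return "ClassB variant"; }

// No implicit conversion exists between the two pointer types, so every
// call matches exactly one overload.
static_assert(!std::is_convertible<ClassA*, ClassB*>::value, "unrelated pointer types");
static_assert(!std::is_convertible<ClassB*, ClassA*>::value, "unrelated pointer types");
```

Passing the address of a ClassA object can only reach the first variant; passing a pointer of some third, unrelated type is rejected at compile time.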

Overriding

“Overriding”, which is to be distinguished from “overloading”, is a form of dynamic binding in which a method (a subroutine) of a base class is hidden by a method with the same name and the same parameters in the derived class. The method that matches the actual type of the object is called at run time. This is implemented through the virtual method table, a basic concept of object-oriented programming.
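A minimal sketch of overriding in C++ is given below. The class names Base and Derived and the function describeThroughBase are hypothetical and serve only to show the run-time selection.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string describe() const { return "Base"; }
};

class Derived : public Base {
public:
    // Same name and parameters as in Base: this method overrides it.
    std::string describe() const override { return "Derived"; }
};

// The static type of the parameter is Base, but the call is dispatched
// at run time according to the actual type of the object.
std::string describeThroughBase(const Base& object) {
    return object.describe();
}
```

Calling describeThroughBase with a Derived object yields "Derived", although the function itself only knows the type Base; without the keyword virtual the Base method would be called instead.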

Naming conflicts

Linking creates a single, large, non-hierarchical, shared namespace. In large or very complex projects this often leads to name conflicts. In such cases weak linking is commonly used, where the link order decides which module is used where. Programming languages such as C++ solve the problem by addressing module contents through hierarchically structured names. The problem of a library being present in several versions, however, remains unsolved. At link time it can only be solved by giving the linker different search paths depending on the library required: the libraries in question differ in name, but a linker cannot tell them apart by content, because they contain the same symbols. After the first static link, however, this is no longer a problem, because from then on the library used can be referred to by its name.
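Hierarchically structured names in C++ can be sketched with namespaces. The namespace names audio and network and the function init are hypothetical examples.

```cpp
#include <cassert>
#include <string>

// Two modules that both offer a function called init. Without the
// namespaces, the two definitions would clash in the flat namespace
// of the linker.
namespace audio {
std::string init() { return "audio::init"; }
}

namespace network {
std::string init() { return "network::init"; }
}
```

The caller selects a function through its qualified name, audio::init() or network::init(). The namespace becomes part of the symbol name in the object file, so the linker sees two different symbols.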

Literature

  • Leon Presser, John R. White: Linkers and Loaders. In: ACM Computing Surveys. Vol. 4, No. 3 (Sept.), 1972, ISSN 0360-0300, pp. 149–167 (berkeley.edu [PDF; 1.3 MB]).
  • John R. Levine: Linkers and Loaders. Morgan Kaufmann, San Francisco 1999, ISBN 1-55860-496-0.
  • Stanley B. Lippman: C++ Primer. Addison-Wesley Longman, Amsterdam 2005, ISBN 0-201-72148-1.
  • Randal E. Bryant, David R. O'Hallaron: Computer Systems: A Programmer's Perspective (3rd edition, especially chapter 7). Pearson, Boston 2016, ISBN 978-0-13-409266-9.

References

  1. IBM Corporation (ed.): Operating System 360, Linkage Editor, Program Logic Manual. New York 1967.
  2. Operating Systems (lecture in the main course), chapter on main memory, slide 6 (PDF). informatik.uni-ulm.de.
  3. Ulrich Drepper: Static Linking Considered Harmful (English). redhat.com. Archived from the original on December 21, 2004. Retrieved January 13, 2012: “There are still too many people out there who think (or even insist) that static linking has benefits. This has never been the case and never will be the case. [...]”