Variable (programming)

In programming, a variable is an abstract container for a value that occurs in the course of a computation. Normally, a variable is identified in the source code by a name and has an address in the machine's memory.

In contrast to a constant, the value represented by a variable, and possibly also its size, can change while the program runs.

Kinds of variables

A basic distinction is made between value variables and reference variables. A value variable stores its value directly, while a reference variable contains the memory address of the actual value, a function, or an object. This is why reference variables are also known as pointers.

Examples in the programming language C#:

const int i = 3;         // constant; not a variable
int j = 3;               // value variable
object k = (object)3;    // reference variable referring to a (boxed) value
object o = new object(); // reference variable referring to an object
object n = null;         // reference variable holding the null reference (pointer to memory address 0)
Func<int> f = () => 3;   // reference variable referring to a function

Use of variables

Different ways of using variables can be distinguished:

  • Input variables receive values that are passed into the program, function, or method from outside (see parameters).
  • Output variables later hold the results of the calculation.
  • Reference variables (pointers) serve as both input and output variables; the value can be changed during the calculation.
  • Auxiliary variables hold intermediate values that are needed in the course of the calculation.
  • Environment variables represent the external boundary conditions of a program.
  • Metasyntactic variables serve simply as placeholder names for entities or sections of program code.
  • Runtime variables
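
Several of these roles can be pointed out in a short Python sketch. The function name mean and the environment variable name APP_LOCALE are made up for illustration; nothing here comes from a particular program.

```python
import os


def mean(values):                          # values: input variable (parameter)
    running_total = 0                      # auxiliary variable
    for v in values:
        running_total += v
    result = running_total / len(values)   # output variable
    return result


# Environment variable: an external condition of the program.
# "APP_LOCALE" is a hypothetical name; "en" is used if it is not set.
locale = os.environ.get("APP_LOCALE", "en")

print(mean([2, 4]))  # prints: 3.0
```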

Views of variables

Programming languages define the concept of a variable quite differently. For Java, the definition reads:

“A variable is a storage location.”

The language definition of Scheme, by contrast, says:

“Scheme allows identifiers to stand for locations containing values. These identifiers are called variables.”

In general, four aspects of a variable have to be distinguished in an imperative programming language:

  • the storage location itself, as a container for data,
  • the data stored in the storage location,
  • the address of the storage location, and
  • the identifier under which the storage location can be addressed.

The situation is further complicated by the fact that one identifier can denote different storage locations at different points in the program or at different times during execution, and that there are also anonymous (nameless) variables.

L-value and R-value of variables

It is typical of imperative programming languages that an identifier on the left side of an assignment has a different meaning (its "L-value") than on the right side (its "R-value"). The statement

x := x + 1

means: "Take the value of the variable named x, increase it by one, and store the result at the address of x." The L-value of a variable is thus its address, and the R-value is its content.
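
The distinction can be made concrete with a toy model of a store, sketched here in Python. The names memory, symbols, lvalue, rvalue and assign, as well as the address 3, are assumptions chosen for illustration; real language implementations are organized differently.

```python
# Toy model of an imperative store (hypothetical, for illustration only).
memory = [0] * 8      # storage locations, indexed by address
symbols = {"x": 3}    # identifier -> address of its storage location


def lvalue(name):
    """Return the address bound to the identifier (its L-value)."""
    return symbols[name]


def rvalue(name):
    """Return the content stored at the identifier's address (its R-value)."""
    return memory[lvalue(name)]


def assign(name, value):
    """Store a value at the address denoted by the identifier."""
    memory[lvalue(name)] = value


assign("x", 41)                  # x := 41
assign("x", rvalue("x") + 1)     # x := x + 1
print(lvalue("x"), rvalue("x"))  # prints: 3 42
```

The assignment reads x through rvalue on the right-hand side and writes through lvalue on the left-hand side; the address stays the same while the content changes.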

Variables as parameters of functions

The parameters of a function are likewise represented in its declaration by variables, which are then called formal parameters. When the function is called, the expressions supplied as actual parameters are assigned to the formal parameters. There are different mechanisms for passing the actual parameters to the function; passing by value (call by value) and passing by reference (call by reference) are the most widespread.
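
The observable difference between the two mechanisms can be approximated in Python. Note the assumption: Python itself passes object references by value, so the sketch only imitates the effects. Rebinding a formal parameter behaves like call by value for the caller, while mutating a shared object behaves like call by reference. The function names are hypothetical.

```python
def rebind(number):
    # Assigning to the formal parameter changes only the local binding;
    # the caller's variable keeps its value, as with call by value.
    number = number + 1
    return number


def append_in_place(items):
    # Mutating the shared object is visible to the caller,
    # similar in effect to call by reference.
    items.append(99)


count = 1
rebind(count)
values = [1, 2]
append_in_place(values)
print(count, values)  # prints: 1 [1, 2, 99]
```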

Data types of variables

Each variable in a program is necessarily associated with a particular data type (in short: type). This is necessary because only the data type determines which operations on and with the variable are meaningful and permissible. The data type can also determine how much memory the variable occupies. As a rule, the programmer can specify this type in a declaration, and in many programming languages such an explicit declaration is mandatory. Other programming languages offer implicit declarations that apply when no explicit declaration is given. Fortran, for example, had the convention that variables whose names begin with a letter from I to N are of type INTEGER and all others are of type REAL, unless specified otherwise. Still other programming languages use what is known as latent typing: declarations are not necessary; the system recognizes the type of a variable from its content when the variable is first used and silently keeps track of that type from then on. In some programming languages, a type that is not given explicitly can, under certain conditions, be inferred by the compiler through type inference from the other types to which the variable is related.
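
The Fortran convention mentioned above amounts to a simple rule on the first letter of a name. The following Python sketch restates that rule; the function name implicit_fortran_type is made up, and the sketch ignores explicit declarations and IMPLICIT statements.

```python
def implicit_fortran_type(name):
    """Return the type the classic implicit rule gives an undeclared name."""
    first = name[0].upper()
    # Names starting with I, J, K, L, M or N are INTEGER, all others REAL.
    return "INTEGER" if "I" <= first <= "N" else "REAL"


for name in ("INDEX", "N", "X", "total"):
    print(name, implicit_fortran_type(name))
```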

In dynamically typed programming languages, the type of a variable may be determined only at runtime and can also change during program execution. The following example in JavaScript illustrates this:

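function show(value) {
    // Appending a string converts the number into a string
    if (typeof value === 'number')
        value += " is a number";

    console.log(value);
}

show('Hello World!');  // prints: Hello World!
show(42);              // prints: 42 is a number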

Variables in a block structure

An important concept in programming languages is the subroutine, whether it is called a procedure, function, method, or something else. The most general form of this concept is the block, first introduced in the programming language ALGOL 60. Virtually all programming languages that offer this concept in some form allow blocks to have their own variables, clearly distinguished from the variables of other blocks. Such variables are called local variables. A variable that is available to all blocks throughout the program is called a global variable. The programming language PHP even has the concept of superglobal variables, built-in variables that are available in every scope of a script without further declaration.

Global variables seem convenient because they are visible throughout the program and need not be passed as parameters when a function is called. But they easily become a source of errors, for example when a global variable is used for different purposes, whether accidentally or deliberately.

It can also happen that a programmer introduces a supposedly local variable under a name that is assumed to be unused in the program. If a global variable with that name and a compatible type already exists (insofar as the compiler or runtime system checks the type at all), its value is overwritten in an uncontrolled manner, and vice versa. The result is often a hard-to-find bug.
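
The kind of accidental overwrite described above can be reproduced in a few lines of Python. In Python the function has to name the variable in a global statement for the assignment to reach it; in languages where an undeclared assignment targets the global variable directly, the same mistake needs no such statement. All names are hypothetical.

```python
total = 100  # global variable, meant to hold some unrelated balance


def sum_items(items):
    global total   # the name was assumed to be unused elsewhere; it is not
    total = 0
    for item in items:
        total += item
    return total


result = sum_items([1, 2, 3])
print(result, total)  # prints: 6 6
```

The function returns the expected sum, but the previous content of the global variable is silently lost.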

Experienced developers use global variables only at module level, and only when they cannot be avoided.

Visibility of variables (scope)

The visibility (scope) of a variable is the section of the program in which the variable is usable and visible. Since a local variable can have the same name as a global variable in most programming languages, scopes are not necessarily contiguous: declaring a local variable "hides" the global variable of the same name for a particular block, i.e. the global variable is not visible within that block.
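
A minimal Python sketch of such hiding, with made-up function names: inside show_hidden the local x hides the global x, while show_global has no local x and therefore sees the global one.

```python
x = "global"


def show_hidden():
    x = "local"   # hides the global x inside this function only
    return x


def show_global():
    return x      # no local x here, so the global x is visible


print(show_hidden(), show_global(), x)  # prints: local global global
```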

The visibility rules can be defined in two different, mutually exclusive ways. The concept of binding is important here: binding means the association of a particular name with the corresponding variable.

  • lexical (or static), that is, the surrounding source text determines the binding,
  • dynamic, that is, the execution context at program runtime determines the binding.

With lexical binding, the bindings for an entire block are always the same because they are dictated by the block structure alone. The behavior of a function can therefore be understood simply by analyzing its source code. This concept thus supports modular programming.

With dynamic binding rules, new bindings can be introduced at any time at runtime. They remain in effect until they are explicitly cancelled or hidden by a new dynamic binding. Analyzing the source text of a function is then not necessarily enough to understand how it behaves; even analyzing the entire source text of the program does not help, because the behavior of the function cannot be determined statically. It depends on the functions that call it, directly and indirectly. This runs counter to the concept of modular programming.

Most modern programming languages support only lexical binding, for example C++, C, ML, Haskell, Python, and Pascal. A few traditionally support only dynamic binding, for example Logo and Emacs Lisp (which gained optional lexical binding only with Emacs 24). Some, for example Perl and Common Lisp, let the programmer specify for each local variable whether it is bound according to lexical (static) or dynamic rules.

Different scopes of variables

Example: lexical as opposed to dynamic

C++ and C use lexical (static) scopes:

int x = 0;                           // global x
int f() { return x; }                // refers to the global x
int g() { int x = 1; return f(); }   // local x is not visible inside f()

In the program fragment above, g() always returns 0 (the value of the global variable x, not the value of the local variable x in g()). This is because only the global x is visible in f(). If, on the other hand, dynamic binding rules were used, g() would always return 1.

In Perl, local variables can be declared lexically (statically) with the keyword my or dynamically with the misleadingly named keyword local. The following example illustrates the difference:

$x = 0;
sub f { return $x; }
sub g { my $x = 1; return f(); }
print g()."\n";

This example uses my to give $x in g() a lexical (static) scope. Therefore g() always returns 0: f() cannot see g()'s $x, even though that variable still exists at the time f() is called.

$x = 0;
sub f { return $x; }
sub g { local $x = 1; return f(); }
print g()."\n";

This example uses local to give $x in g() a dynamic scope. Therefore g() always returns 1, because f() was called by g().

In practice, local variables in Perl should always be declared with "my".
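
The two binding rules can also be contrasted in Python, which itself resolves names lexically. The dynamic variant below is only an emulation: dynamic_env and lookup are hypothetical helpers that keep a stack of binding tables, not a feature of the language.

```python
x = 0  # global binding


def f_lexical():
    return x  # resolved lexically: always the global x


def g_lexical():
    x = 1  # local to g_lexical, invisible to f_lexical
    return f_lexical()


# Emulated dynamic binding: a stack of binding tables that is searched
# from the most recent entry downwards.
dynamic_env = [{"x": 0}]


def lookup(name):
    for bindings in reversed(dynamic_env):
        if name in bindings:
            return bindings[name]
    raise NameError(name)


def f_dynamic():
    return lookup("x")


def g_dynamic():
    dynamic_env.append({"x": 1})   # comparable to Perl's "local $x = 1"
    try:
        return f_dynamic()
    finally:
        dynamic_env.pop()          # the binding ends when g_dynamic returns


print(g_lexical(), g_dynamic())  # prints: 0 1
```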

Lifetime of variables

The lifetime of a variable is the period during which storage is reserved for the variable. When the storage is released for other purposes, the variable "dies" and can no longer be used. Local variables are created each time the function is called, and as a rule their storage is released again when the function is exited. In classic block-structured programming languages, scope and lifetime are coordinated so that storage for a variable needs to be allocated only while code from its scope is being executed. As a consequence, a particularly simple kind of memory management can be used: local variables are created automatically on a stack at runtime as soon as a block is entered. For this reason, these variables are sometimes called automatic variables. Technically speaking, an activation record is created on a runtime stack and removed again when the block is exited.
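
That every call gets its own set of local variables can be observed with a recursive function. The Python sketch below uses a made-up function depth_log; how Python stores its call frames internally differs from a classic hardware stack, but the behavior visible to the program is analogous.

```python
def depth_log(n, log):
    local_copy = n            # a fresh local variable for every call
    if n > 0:
        depth_log(n - 1, log)
    # The inner calls have finished and their locals are gone;
    # this activation's own local_copy is unchanged.
    log.append(local_copy)
    return log


print(depth_log(2, []))  # prints: [0, 1, 2]
```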

In some programming languages, for example C, a declaration can carry the modifier static, which restricts only the visibility of a variable to the scope of the function, but not its lifetime. The visibility of such a variable is that of a local variable, whereas its lifetime is that of a global variable: on entry to the enclosing function it has exactly the value it had at the end of the previous call. No special provisions are needed for this in the implementation: conceptually, the variable is simply created in the activation record of the main program.
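
Python has no static modifier, but a comparable combination of restricted visibility and long lifetime can be sketched with a closure. This is an emulation under that assumption, not the mechanism C uses; make_counter and next_id are hypothetical names.

```python
def make_counter():
    calls = 0  # visible only to counter(), but it survives between calls

    def counter():
        nonlocal calls
        calls += 1
        return calls

    return counter


next_id = make_counter()
first = next_id()
second = next_id()
print(first, second)  # prints: 1 2
```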

Memory allocation

The specifics of variable allocation and the representation of values vary widely, both between programming languages and between implementations of a given language. Many language implementations allocate space for local variables, whose extent lasts for a single function call, on the call stack; this memory is reclaimed automatically when the function returns. In general, name binding binds the name of a variable to the address of a particular block in memory, and operations on the variable manipulate that block. Referencing is more common for variables whose values are large or whose size is unknown when the code is compiled. Such reference variables (pointers) refer to the location of the value instead of storing the value itself; the value is allocated from an area of memory called the heap.

Bound variables have values. A value, however, is an abstraction. In the implementation, a value is represented by a data object that is stored somewhere. The program or the runtime environment must reserve memory for each data object and, since memory is finite, ensure that this memory is made available for reuse when the object is no longer needed to represent the value of a variable.

Objects allocated from the heap must be reclaimed once they are no longer needed. In a programming language with a garbage collector, for example C#, Java, or Python, the runtime environment automatically reclaims objects when no existing variable can refer to them any longer. In programming languages without a garbage collector, such as C or C++, the program (and thus the programmer) must explicitly allocate memory and later free it. Otherwise memory leaks occur, and the heap is gradually used up while the program is running.
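
Automatic reclamation can be observed in Python with a weak reference, which watches an object without keeping it alive. The sketch assumes the standard CPython interpreter; the class name Node and the variable names are made up.

```python
import gc
import weakref


class Node:
    """Placeholder for a heap-allocated object (hypothetical name)."""


node = Node()
alias = node                 # two variables refer to the same object
probe = weakref.ref(node)    # observes the object without keeping it alive

del node
alive_with_alias = probe() is not None     # alias still refers to it

del alias
gc.collect()                 # ask the collector to run now
alive_without_refs = probe() is not None   # no variable refers to it

print(alive_with_alias, alive_without_refs)  # prints: True False
```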

When a variable refers to a dynamically created data structure, some of the structure's components may be accessible only indirectly through the variable. In such circumstances, garbage collectors must handle the case in which only part of the memory reachable from the variable needs to be reclaimed.

Choice of names

The choice of variable names is mostly left to the programmer. To make the source code easier to follow, it is advisable to use variable names that are as self-explanatory as possible (so-called descriptive variable names), even if this leads to rather long identifiers. Naming conventions for variables improve the readability and maintainability of the source code. Modern editors offer features that reduce the typing effort even for long variable names. After compilation, the program code is largely or even completely independent of the variable names used.
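
As a small illustration in Python, the two hypothetical functions below compute the same result; only the second one tells the reader what the values mean.

```python
def f(a, b):
    return a * b / 100


def percentage_of(amount, percent):
    """Return the given percentage of an amount."""
    return amount * percent / 100


print(f(200, 15), percentage_of(200, 15))  # prints: 30.0 30.0
```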

Initialization

Variables should be initialized before they are used, that is, they should be assigned a defined value. This can be done either by the runtime system of the programming language assigning default values or by explicitly assigning a value to the variable.

In programming languages that do not automatically initialize all variables, uninitialized variables are a source of hard-to-find errors: since a variable without initialization contains an arbitrary value, the behavior of the program becomes unpredictable, so that it sometimes delivers incorrect results or even crashes. If the memory area reserved for the variable happens to contain plausible-looking content left over from a previous program run, this is also referred to as memory interference.
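
Python does not expose arbitrary memory contents; reading a local variable before it has been assigned raises an UnboundLocalError instead, whereas in C such a read yields an indeterminate value. The sketch below, with made-up function names, shows the missing initialization and its fix.

```python
def average_uninitialized(values):
    for v in values:
        total += v          # total was never given an initial value
    return total / len(values)


def average(values):
    total = 0               # explicit initialization before first use
    for v in values:
        total += v
    return total / len(values)


try:
    average_uninitialized([1, 2, 3])
except UnboundLocalError as error:
    print("rejected:", type(error).__name__)

print(average([1, 2, 3]))  # prints: 2.0
```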

References

  1. Setting and clearing runtime variables. Post in the C++ community forum (page last modified on July 10, 2008), quoting the Visual Studio Analyzer: “Runtime variables are name/value pairs that are set by event subscribers and read by event sources. Using run-time variables helps to collect precise details about certain events.”
  2. Tim Lindholm, Frank Yellin: The Java Virtual Machine Specification. Addison-Wesley, 1996: “A variable is a storage location.”
  3. Michael Sperber, William Clinger, R. K. Dybvig, Matthew Flatt, Anton van Straaten: Report on the Algorithmic Language Scheme. Revised (5.97): “Scheme allows identifiers to stand for locations containing values. These identifiers are called variables.”
  4. What's the difference between dynamic and static (lexical) scoping? Perl FAQ 4.3
  5. N. Gorla, A. C. Benander, B. A. Benander: Debugging Effort Estimation Using Software Metrics. In: IEEE Transactions on Software Engineering. Vol. 16, No. 2, February 1990, pp. 223-231, doi:10.1109/32.44385.