Data encapsulation (programming)

Data encapsulation (also known as information hiding, a term introduced by David Parnas) is the practice in programming of hiding data or information from outside access. Direct access to the internal data structure is prevented; access takes place instead through defined interfaces (the black box model).

Derivation

Data encapsulation is a well-known principle in structured and modular programming. The central model is the abstract data type, in which data is grouped into a data structure that can be accessed only through defined access functions (procedures). In practice, abstract data types are implemented in various ways.
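As an illustration of an abstract data type, the following Java sketch hides an array behind a small set of access functions. The names IntStack and IntStackDemo are hypothetical and chosen only for this example; it is one possible implementation, not a canonical one.

```java
import java.util.Arrays;

// Hypothetical demo class: clients use the stack only through its access functions.
class IntStackDemo {
    public static void main(String[] args) {
        IntStack stack = new IntStack();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop());   // prints 2 (last in, first out)
        System.out.println(stack.size());  // prints 1
    }
}

// Abstract data type: a stack of ints with a hidden representation.
final class IntStack {
    // Internal data structure; not reachable from outside this class.
    private int[] items = new int[4];
    private int count = 0;

    public void push(int value) {
        if (count == items.length) {
            // Grow the hidden array; callers never notice.
            items = Arrays.copyOf(items, count * 2);
        }
        items[count++] = value;
    }

    public int pop() {
        if (count == 0) {
            throw new IllegalStateException("stack is empty");
        }
        return items[--count];
    }

    public int size() {
        return count;
    }

    public boolean isEmpty() {
        return count == 0;
    }
}
```

Because the array and the counter are private, the array could later be replaced by a linked list without any change to client code.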

Another example in modern programming languages is the hiding of data within scopes. Each substructure of a program (main block, procedures, functions, subroutines, ...) defines such a scope, so that a hierarchy of scopes is created. Declared data is visible and valid only within the enclosing scope and in all scopes nested inside it. It remains hidden from the outer scopes.
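The scope hierarchy can be sketched in Java as follows. The class ScopeDemo and its members are hypothetical names used only for illustration; the comments mark where each name is visible.

```java
// Hypothetical example of nested scopes; all names are illustrative.
class ScopeDemo {
    // Class scope: visible in every method of this class.
    private static int iterations = 0;

    static int sumUpTo(int n) {
        int total = 0;                    // method scope: visible only inside sumUpTo
        for (int i = 1; i <= n; i++) {    // loop scope: i is visible only inside the loop
            total += i;                   // names from enclosing scopes are visible here
            iterations++;
        }
        // i is out of scope here; referring to it would not compile.
        return total;
    }

    static int iterationCount() {
        // total and i from sumUpTo are hidden here; only class-level names are visible.
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(sumUpTo(4));   // prints 10
    }
}
```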

Data encapsulation in the object-oriented paradigm

Encapsulation is also an important principle in object-oriented programming. Here it means controlled access to the methods and attributes of classes. A class cannot unexpectedly read or change the internal state of another class. Each class has an interface that determines how it can be interacted with. This prevents the program's invariants from being bypassed.
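A minimal Java sketch of how an interface can guard an invariant, assuming a hypothetical Account class whose balance must never become negative. The class, its methods, and the use of cents as the unit are assumptions made for this example.

```java
// Hypothetical demo: the invariant "balance >= 0" can only be affected through methods.
class AccountDemo {
    public static void main(String[] args) {
        Account account = new Account(50_000);
        account.withdraw(20_000);
        System.out.println(account.getBalanceInCents());  // prints 30000
        try {
            account.withdraw(1_000_000);                   // would break the invariant
        } catch (IllegalArgumentException e) {
            System.out.println("rejected");                // prints rejected
        }
    }
}

final class Account {
    // Invariant: never negative. Private, so no other class can assign to it.
    private long balanceInCents;

    public Account(long openingBalanceInCents) {
        if (openingBalanceInCents < 0) {
            throw new IllegalArgumentException("opening balance must not be negative");
        }
        this.balanceInCents = openingBalanceInCents;
    }

    public void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceInCents += cents;
    }

    public void withdraw(long cents) {
        // Every state change passes through this check.
        if (cents <= 0 || cents > balanceInCents) {
            throw new IllegalArgumentException("invalid withdrawal: " + cents);
        }
        balanceInCents -= cents;
    }

    public long getBalanceInCents() {
        return balanceInCents;
    }
}
```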

The user, meaning both the algorithms that work with the class and the programmer who develops them, should need to know as little as possible about the inner workings of a class (the principle of information hiding). Because of encapsulation, only the "what" (the functionality) of a class is visible from outside, not the "how" (the internal representation). This both defines and documents an external interface.

Types of access used

The Unified Modeling Language (UML), the de facto standard notation, allows the following access types to be modeled (UML shorthand in parentheses):

public (+): accessible to all objects,
private (-): accessible only to objects of its own class,
protected (#): accessible only to objects of its own class and of classes derived from it,
package (~): accessible to all elements within its own package.

Note: The keyword package is handled differently in the various programming languages. Its counterpart in each language:

C#: internal
Visual Basic .NET: Friend
Java: no modifier means package access (the default).

The options for specifying accessibility differ depending on the programming language.
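The four access types map onto Java as in the following sketch. Sensor and CalibratedSensor are hypothetical classes invented for this example. Because all classes in a single file share one package, the difference between protected and package access cannot be demonstrated here and is described only in the comments.

```java
// Hypothetical demo of Java's access levels; all names are illustrative.
class AccessDemo {
    public static void main(String[] args) {
        Sensor sensor = new CalibratedSensor();
        System.out.println(sensor.name);    // public: accessible from anywhere
        System.out.println(sensor.read());  // prints 42
        // sensor.raw = 0;                  // would not compile: raw is private in Sensor
    }
}

class Sensor {
    public String name = "sensor";  // UML +  accessible to all classes
    private int raw = 40;           // UML -  accessible only inside Sensor
    protected int offset = 0;       // UML #  Sensor, its subclasses, and (in Java) the same package
    int samples = 0;                // UML ~  no modifier: same package only

    public int read() {
        samples++;
        return raw + offset;
    }
}

class CalibratedSensor extends Sensor {
    public CalibratedSensor() {
        offset = 2;                 // allowed: protected member of the superclass
        // raw = 0;                 // would not compile: private members are not inherited
    }
}
```

Note that Java's protected is slightly wider than the UML definition above, since it also grants access within the same package.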

Advantages

Since the implementation of a class is not known to other classes, the implementation can be changed without affecting the interaction with other classes.
The result is greater clarity, since only the public interface of a class has to be considered.
When data is accessed through an access function, it does not matter to the caller whether the value exists one-to-one inside the class, is the result of a calculation, or comes from another source such as a file or database.
Significantly improved testability, stability and changeability of the software and its modules.
Fewer possible undesired interactions between program parts. If a program contains N variables and M functions, there are $\mathcal{O}(N \cdot M)$ possible interactions, but as a rule only $\mathcal{O}(N + M)$ of them are actually intended. This matters in troubleshooting, because errors usually show up as a variable holding an incorrect value, and isolating the cause requires knowing which functions have access to that variable. Data encapsulation restricts the part of the program to be examined to very few functions from the start.
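The point about access functions can be sketched in Java: the caller cannot tell whether a value is stored or calculated. The interface Shape and the two rectangle classes below are hypothetical names introduced only for this illustration.

```java
// Hypothetical demo: two implementations behind the same access function.
class UniformAccessDemo {
    // Client code depends only on the interface, not on the representation.
    static int areaOf(Shape shape) {
        return shape.getArea();
    }

    public static void main(String[] args) {
        System.out.println(areaOf(new ComputedRectangle(3, 4)));  // prints 12
        System.out.println(areaOf(new StoredRectangle(3, 4)));    // prints 12
    }
}

interface Shape {
    int getArea();
}

// Variant 1: the area is calculated on every call.
final class ComputedRectangle implements Shape {
    private final int width;
    private final int height;

    public ComputedRectangle(int width, int height) {
        this.width = width;
        this.height = height;
    }

    @Override
    public int getArea() {
        return width * height;
    }
}

// Variant 2: the area is calculated once and stored as a field.
final class StoredRectangle implements Shape {
    private final int area;

    public StoredRectangle(int width, int height) {
        this.area = width * height;
    }

    @Override
    public int getArea() {
        return area;
    }
}
```

Switching from one variant to the other changes nothing in areaOf, which is the sense in which the implementation can change without affecting other classes.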

Disadvantages

Depending on the application, calling access functions can cost speed; direct access to the data elements would be faster.
Additional programming effort is needed to write the access functions.

The internal representation of an object is generally hidden outside of the object's definition. Usually only the object's own methods can directly examine or manipulate its fields. Hiding the object's internal data protects its integrity by preventing users from putting that data into an invalid or inconsistent state. A supposed advantage of encapsulation is that it can reduce system complexity and thus increase robustness, because it lets the developer limit the mutual dependencies between software components.

Some object-oriented programming languages such as Ruby only allow access through object methods, but most, such as C#, C++ and Java, give the programmer some control over what is hidden, usually through keywords like public and private. Information hiding can also be achieved by providing a compiled version of the source code that clients use through a header file.

Examples

The following example in the C# programming language shows how access to an attribute can be restricted with the keyword private:

class Program
{
	public class Konto
	{
		private decimal kontostand = 500.00m;

		public decimal gibKontostand()
		{
			return kontostand;
		}
	}

	static void Main()
	{
		Konto meinKonto = new Konto();
		decimal meinKontostand = meinKonto.gibKontostand();

		/* This Main method can query the balance through the public method "gibKontostand" provided by the class "Konto", but it cannot change the value of the attribute "kontostand". */
	}
}

The following example is implemented in the Java programming language:

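import java.math.BigDecimal;

public class Angestellter
{
    // Private: other classes cannot read or modify this field directly.
    // The String constructor avoids the rounding of a double literal.
    private BigDecimal salary = new BigDecimal("50000.00");

    // Public access function: the way for other classes to read the salary.
    public BigDecimal gibLohn()
    {
        return salary;
    }

    public static void main(String[] args)
    {
        Angestellter angestellter = new Angestellter();
        BigDecimal lohn = angestellter.gibLohn();
    }
}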

Encapsulation is also possible in non-object-oriented programming languages. In C, for example, a header file can declare, as part of the public programming interface, a structure together with a set of functions that operate on it. The members of the structure are not accessible to clients of the interface, because the header declares only the name of the structure (an opaque type). The functions are declared with the keyword extern.

// Header file "api.h"

struct Entity;          // Opaque structure with hidden members

// API functions that operate on 'Entity' objects
extern struct Entity *  open_entity(int id);
extern int              process_entity(struct Entity *info);
extern void             close_entity(struct Entity *info);
// The extern keywords here are redundant but harmless: functions have
// external linkage by default, so they can be called from other files
// even without the keyword.

References

Benjamin Pierce: Types and Programming Languages. MIT Press, 2002.
K. N. King: C Programming: A Modern Approach. 2nd edition, W. W. Norton & Company, 2008, ISBN 978-0393979503, p. 464 (retrieved November 1, 2019).
