Object-relational impedance mismatch

from Wikipedia, the free encyclopedia

Object-relational impedance mismatch, often shortened to impedance mismatch, is a problem in application development that occurs when objects from an object-oriented program are stored in a relational database.

Background

The problem arises from the use of object-oriented programming languages in conjunction with data that is stored in relational databases. Object-oriented applications represent their data by means of objects. If the data is to be saved, it makes sense to save the objects themselves in a database. It turns out, however, that the relational database model differs fundamentally from the object-oriented model. This incompatibility has been known as impedance mismatch since the early 1980s.

Problem description

The problem lies in the different paradigms of the two systems. An object can be characterized by four basic properties:

  • Identity
  • State
  • Behavior
  • Encapsulation

A relational system, on the other hand, is derived from relational algebra and stores truth statements in so-called relations. A relation could, for example, look like this: {Name, Company}. This relation corresponds to an assertion of the form: "There is a person with the name NAME who works for a company COMPANY". A tuple is a truth statement within the relation, for example {John Doe, ACME} (there is a John Doe who works at ACME). A tuple is made up of attributes (here, name and company). By combining relations, new relations can be formed and thus new truth statements can be derived, such as the answer to "How many people are there who work at ACME?"
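The relation described above can be sketched in Python as a set of tuples. This is only an illustration: the sample rows other than {John Doe, ACME}, the second relation Location, and all variable names are assumptions made for the example and do not come from the article.

```python
from typing import NamedTuple


class Employment(NamedTuple):
    """One tuple of the relation {Name, Company}."""
    name: str
    company: str


class Location(NamedTuple):
    """One tuple of a hypothetical second relation {Company, City}."""
    company: str
    city: str


# A relation is a set of tuples; each tuple is one truth statement.
employment = {
    Employment("John Doe", "ACME"),
    Employment("Jane Roe", "ACME"),
    Employment("Max Mustermann", "Initech"),
}
location = {
    Location("ACME", "Springfield"),
    Location("Initech", "Austin"),
}

# Selection: keep only the statements about ACME.
acme = {t for t in employment if t.company == "ACME"}

# Derived statement: "How many people are there who work at ACME?"
acme_headcount = len(acme)

# Join: combining two relations yields a new relation {Name, City}.
works_in = {
    (e.name, l.city)
    for e in employment
    for l in location
    if e.company == l.company
}

# Asserting a statement twice changes nothing, because a tuple is
# identified purely by its data.
employment.add(Employment("John Doe", "ACME"))
relation_size = len(employment)
```

Note that every operation here works on whole sets rather than on individual tuples, which mirrors the set-based character of relational operations.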

A closer look at the two paradigms shows that there are some differences.

  • Structure. An object contains both data and behavior. The corresponding class can be part of a class hierarchy. The relational model does not support object-oriented concepts such as inheritance (generalization and specialization). A tuple in the sense of the relational model only represents a truth statement. In a class-subclass relationship, a single object is enough to represent the data in the object-oriented model, whereas a redundancy-free representation in the relational model requires two tuples.
  • Identity. An object has an identity that is independent of its state (data). If an object-oriented application is executed twice, the same object (in terms of its state) has a different identity in each run. Within a single program run, two objects with the same data are also distinguished by their identities. In contrast, the identity of a tuple is determined by its data (or by the primary key that is derived from the data of the tuple). A tuple can therefore be uniquely identified at any time based on its data, which is not true of an object.
  • Data encapsulation. An object protects its data from changes, or uses methods (behavior) to limit the ways in which the data can be changed. An object therefore allows its data to be changed only in well-defined ways. In contrast, there are no such safeguards in the relational model (many database vendors extend the SQL standard to provide them, but this is not a fundamental part of the relational model).
  • Mode of operation. The data in a relational database is modified through transactions issued by a connected application. This is strongly reminiscent of procedural programming, whose characteristic feature is the separation of data and behavior. The object-oriented model groups logically related behavior together with the data relevant to that behavior in objects. An object-oriented application can be seen as a network of interacting objects. The operations that can be carried out on a relational database are set-based, whereas objects communicate with each other individually (message passing).
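The identity and encapsulation points in the list above can be made concrete with a short Python sketch. The classes Person and Account and their fields are hypothetical names chosen for illustration. Python enforces encapsulation only by convention (the leading underscore), so this shows the idea rather than a strict guarantee.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    company: str


# Two objects with the same state are still two different objects.
a = Person("John Doe", "ACME")
b = Person("John Doe", "ACME")
same_state = (a == b)      # dataclass equality compares field values
same_identity = (a is b)   # identity comparison ignores the state

# Two tuples with the same data are the same statement, so a set
# (standing in for a relation) keeps only one of them.
rows = {("John Doe", "ACME"), ("John Doe", "ACME")}
distinct_rows = len(rows)


class Account:
    """Encapsulation: the balance changes only through deposit()."""

    def __init__(self) -> None:
        self._balance = 0

    @property
    def balance(self) -> int:
        # Read-only view; there is deliberately no setter.
        return self._balance

    def deposit(self, amount: int) -> None:
        # The method decides which changes are allowed.
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount
```

A plain row in a table has no counterpart to deposit(): any application with write access can set the stored value directly unless the database adds its own constraints.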

Possible solutions

There are various possible solutions, but each of them is only a more or less elegant workaround for the problem. Regardless of which solution is chosen, as long as the two systems differ, every developer will sooner or later reach the point where the solution no longer satisfies one or more of the following criteria:

  • Maintainability
  • Performance
  • Comprehensibility

Object-oriented database

The most obvious solution is to replace the relational database with an object-oriented database, which makes any mapping unnecessary. This makes programmatic handling easier, but complex queries can become very complicated. In addition, management and database administrators often reject this approach because the data is tightly bound to the objects and cannot be made visible without the associated application.

Object-relational database

Many well-known vendors have extended their relational database products with object-oriented features to create an object-relational database management system (ORDBMS). In doing so, they respond to the demand for object-oriented databases. With these extensions, existing architectures built on relational databases can be retained while the developer is offered an object-oriented view of the data. The impedance mismatch is largely avoided, but depending on the database system, mapping may still be required.

Extending the programming language with relational features

This approaches the problem from the opposite direction. Because the language itself supports relational operations (e.g. embedded SQL), mapping is no longer necessary. However, many OOP developers are reluctant to use this solution, as it usually restricts the use of objects.

O/R mapper

An object-relational mapper is a layer between the application and the database. It takes care of the complete mapping between objects and tables. This process is invisible to the developer. Today's mappers perform very well; as complexity increases, however, a multitude of other problems arise. The more specialized the solution, the more often the developer has to specify how the mapping between the two worlds should be done. This can be extremely complicated at times.
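The idea of a mapping layer can be illustrated with a deliberately small sketch using Python's standard sqlite3 module. TinyMapper is a hypothetical class written for this example, not the API of any real O/R mapper. It assumes one dataclass per table, text-only columns, and a surrogate integer key, all of which are simplifications.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Person:
    name: str
    company: str


class TinyMapper:
    """Hypothetical mapper: one dataclass maps to one table of the same name."""

    def __init__(self, conn: sqlite3.Connection, cls: type) -> None:
        self.conn = conn
        self.cls = cls
        self.table = cls.__name__.lower()
        self.columns = [f.name for f in fields(cls)]
        column_defs = ", ".join(f"{c} TEXT" for c in self.columns)
        # The surrogate key gives each stored object a database identity.
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {self.table} "
            f"(id INTEGER PRIMARY KEY, {column_defs})"
        )

    def save(self, obj) -> int:
        """Object -> tuple: insert the object's state and return its key."""
        placeholders = ", ".join("?" for _ in self.columns)
        cursor = self.conn.execute(
            f"INSERT INTO {self.table} ({', '.join(self.columns)}) "
            f"VALUES ({placeholders})",
            astuple(obj),
        )
        return cursor.lastrowid

    def load(self, row_id: int):
        """Tuple -> object: rebuild an object from the stored row, if any."""
        row = self.conn.execute(
            f"SELECT {', '.join(self.columns)} FROM {self.table} WHERE id = ?",
            (row_id,),
        ).fetchone()
        return self.cls(*row) if row else None


conn = sqlite3.connect(":memory:")
people = TinyMapper(conn, Person)
key = people.save(Person("John Doe", "ACME"))
loaded = people.load(key)
```

Even this toy version has to invent an id column to bridge object identity and tuple identity, and it does not handle nested objects, inheritance, or non-text types. Those are the cases in which the developer has to start specifying the mapping by hand.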

An O/R mapper has to solve problems on different levels. One approach describes four levels:

  • Paradigm. A paradigm in this context is a concept for representing data. Object-relational mapping must be able to overcome the differences between the paradigms. These have already been explained in the previous section.
  • Language. A language is used to express a model in terms of a paradigm. Commonly used object-oriented programming languages include Java and C#. Relational databases are addressed using SQL. A major difference between these languages is their type system. The SQL standard defines certain simple data types that can be used to store data. Separate tables are required to store complex data. In contrast, the ability to define new data types through custom classes is an integral part of object-oriented languages.
  • Schema. A schema is a model expressed in a concrete language. Source code and database scripts can be seen as the schema of an object-relational application. Most O/R mappers require some kind of configuration (often a mapping file) to overcome the schema differences. It should be noted that the developers of the object-oriented schema usually make sure that complex business logic can be represented easily with the objects, while a database designer is usually concerned with avoiding redundancy and optimizing performance (e.g. by normalizing the database schema).
  • Instance. Instance in this context means concrete data. This level mainly deals with problems such as accessing and modifying data, as well as converting between different data types.
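The language and instance levels from the list above can be illustrated with a hand-written mapping in Python and sqlite3. The classes Customer and Address, the table layout, and the choice to store dates as ISO 8601 text are all assumptions made for this sketch. It shows a complex attribute being moved into a separate table and a data type being converted on the way in and out.

```python
import sqlite3
from dataclasses import dataclass
from datetime import date


@dataclass
class Address:
    street: str
    city: str


@dataclass
class Customer:
    name: str
    joined: date       # no direct counterpart among SQLite's storage types
    address: Address   # complex type: needs its own table


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE address (id INTEGER PRIMARY KEY, street TEXT, city TEXT);
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        name TEXT,
        joined TEXT,  -- date stored as an ISO 8601 string
        address_id INTEGER REFERENCES address(id)
    );
    """
)


def save_customer(customer: Customer) -> int:
    """Split one object graph into two rows and convert the date to text."""
    address_id = conn.execute(
        "INSERT INTO address (street, city) VALUES (?, ?)",
        (customer.address.street, customer.address.city),
    ).lastrowid
    return conn.execute(
        "INSERT INTO customer (name, joined, address_id) VALUES (?, ?, ?)",
        (customer.name, customer.joined.isoformat(), address_id),
    ).lastrowid


def load_customer(customer_id: int) -> Customer:
    """Join the two rows back together and convert the text to a date."""
    name, joined, street, city = conn.execute(
        "SELECT c.name, c.joined, a.street, a.city "
        "FROM customer c JOIN address a ON a.id = c.address_id "
        "WHERE c.id = ?",
        (customer_id,),
    ).fetchone()
    return Customer(name, date.fromisoformat(joined), Address(street, city))
```

A usage example: save_customer(Customer("John Doe", date(2010, 6, 2), Address("1 Main St", "Springfield"))) writes one row to each table, and load_customer with the returned key rebuilds an equal Customer object. The mapping file that most O/R mappers require plays the role of these two hand-written functions.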

References

  1. C. Copeland, D. Maier: Making Smalltalk a database system. In: ACM SIGMOD Record. Vol. 14, no. 2, 1984, pp. 316–325.
  2. Christopher Ireland, David Bowers, Michael Newton, Kevin Waugh: A Classification of Object-Relational Impedance Mismatch. In: Advances in Databases, First International Conference on. IEEE Computer Society, 2009, ISBN 978-0-7695-3550-0, pp. 36–43, doi:10.1109/DBKDA.2009.11.
  3. Deborah J. Armstrong: The quarks of object-oriented development. In: Commun. ACM. Vol. 49, no. 2, February 2006, ISSN 0001-0782, pp. 123–128, doi:10.1145/1113034.1113040.
  4. Craig Russell: Bridging the object-relational divide. In: Queue. Vol. 6, no. 3. ACM, July 28, 2008, ISSN 1542-7730, pp. 18–28, doi:10.1145/1394127.1394139 (acm.org [PDF]).
  5. Edgar F. Codd: The relational model for database management: version 2. Addison-Wesley Longman Publishing, Boston, MA, USA 1990, ISBN 0-201-14192-2 (acm.org [PDF]).
  6. Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides: Design patterns: elements of reusable object-oriented software. Addison-Wesley Professional, 1995.
  7. Ted Neward: The Vietnam of Computer Science. Interoperability Happens, June 26, 2006, archived from the original on July 4, 2016; retrieved June 2, 2010.

Web links