Referential Integrity

from Wikipedia, the free encyclopedia

Referential integrity ( RI ) is a term from computer science . This is understood to mean conditions that can help ensure data integrity when using relational databases . According to the RI rule, data records (via their foreign keys ) can only refer to existing data records.

According to this, the RI basically consists of two parts:

  1. A new data record with a foreign key can only be inserted in a table if a data record with a corresponding value in the primary key or a unique alternative key exists in the referenced table .
  2. A data record deletion or a change of the key in a primary data record is only possible if no dependent data records are related to this data record.

Definitions

"Referential integrity (also relationship integrity ) means that attribute values ​​of a foreign key must also be available as attribute value of the primary key."

"The relationships between data objects in a DBMS are controlled via the referential integrity ."

Terms and their meaning

Origin / Background: According to the theory of relations , data to be saved are i. d. Usually divided into several tables. The data records in these tables usually have logical connections (relationships) with one another. See example “library” : Book X is borrowed from library user Y. This resulted in the requirement to be able to protect the consistency of these “references” if necessary by a special and secure concept (the “RI”).

According to the literal meaning, "RI" denotes a given or intended quality status of data: Integer ( flawless, intact, "whole" ) with regard to the mutual references contained therein. At the same time, however, “RI” also means the integrity rule or the functional support through which a DBMS ensures this quality.

The RI rule is an extension of the specification of relationship types - most of which have a cardinality of 1: n . The entity types (= tables ) involved in it are referred to - role-specific for exactly one relationship type, not generally applicable to the table - as master table and detail table . These stipulations are effective for the data records that are specifically related. Master and detail tables can also be the same table (recursive relationships).

Other names for master tables are primary, parent, parent or referenced table. The detail table is also called a linked *, child / child, dependent, related *, referencing, referring table or "table with the foreign key".
(*) = conceptually rather unsuitable because the relationship of dependency remains unclear.

A classic case for RI specifications including deletion forwarding are so-called relationship tables , which often only contain the foreign keys of the data records involved in an n: m relationship . References to non-existent primary data records are not allowed here; If one of the two primary data records is deleted, the data record in the relationship table can / must also be deleted.

The (somewhat cumbersome) term “referential integrity” is often abbreviated with RI or RI , and the new German spelling (referential with “z”) is widely used.

Delimitation:

  • In addition to referential integrity (as partial aspects of data quality and consistency), there are other integrity conditions such as value range integrity (valid values ​​at data field level) and the uniqueness of key terms . In addition, database systems ensure the consistency of data at the transaction level (all or no updates, e.g. in the event of a technical interruption) and against updates from competing users / transactions , especially in multi-user operation .
  • RI-like consistency conditions in non-relationally stored databases do not fall under the term 'referential integrity', but are checked by other means, e.g. B. individually in the IT application. Example: data in configuration and registry files , hyperlinks in wikis, etc.

Extensions / special features

While the RI generally protects against inconsistent data actions, many database systems offer additional functions that can be useful when updating master data records:

Change transfer (ÄW)
If the unique key of a record is changed, the DBMS can adjust the foreign keys in all dependent records - instead of rejecting the change. Change propagation is used in particular when natural keys (which can change; family name in the event of marriage) are used; because artificial keys are i. d. Usually unchangeable and no change propagation required.
Extinguishing transmission (LW)
In certain cases it makes sense to also delete dependent data records when deleting the master data record.

These functions can optionally be set in the RI specification and (depending on the DBMS ) extended / specified by additional conditions (see example ). They only work when master data records are updated; detailed data can be deleted at any time or assigned to other (existing) master records.

Other special features in connection with the RI are:

Recursive relationships
The RI can also refer to data in only one table, for example if subdivisions are assigned to their common main department in the DEPARTMENT table.
Cascading
If the dependent data records from an RI relationship are themselves primary data records, the RI rule can also refer to their dependent records. A deletion or change transfer can therefore have a multilevel effect.
Relationship to yourself
In certain situations, a detail record can also refer to itself. Example: The relationship “place belongs to district town” in the table ORT: The data record of the place, which is the district town, refers to itself with the foreign key “district town”. In such cases, however, a zero value is often used - which means “ is itself a district town ”can be interpreted.

Levels of action and effect

RI setting in MS Access (2003)

In the course of data modeling, the RI is recognized as relevant, defined and specified in the respective syntax . This happens for each relationship type (often simply called “relationship”), in which several entity types (= tables) are involved. The specifications are part of the one-time created database schema .

Based on this information, the DBMS checks in the execution of changing compliance with the RI Rules data operations on the fly. Such operations are triggered by IT applications (possibly after an entry or recording of users ) and, if the rules are complied with, lead to changes in the database, otherwise to error messages if the inventory remains unchanged.

example

Examples of RI settings

Using the example of a simple ER diagram , the graphic below shows which considerations can be made in connection with the definition of RI rules:

  • If it is possible that the primary key of "CUSTOMER entries" changes, the foreign keys contained in ORDER should also be changed automatically: RI with change transfer.
  • If ADDRESSES only ever belong to exactly one CUSTOMER, it makes sense to automatically delete them when the CUSTOMER (i.e. the data about him) is deleted: RI with deletion transfer.
  • Since ORDERS always refer to ARTICLES (via "order item"), an ARTICLE must be prevented from being deleted in the database if ORDERS (with items) are still available. Conversely, only those order items may be created that relate to an ARTICLE (in the database): Normal RI without LW / ÄW.
  • If deletion transfer is not defined in the relationship “CUSTOMER: ORDER”, the deletion of a CUSTOMER would be rejected if there were ORDERS (this customer). If deletion were specified, a "cascading deletion" (including order item) would occur in the case of customer deletion.
  • Without any RI specification, the application / user would have to take care of the consistency of the data relationships; otherwise inconsistent data could arise, which would result in the automatic processing of this data e.g. B. no shipping ADDRESS and no invoice recipient (customer) would be known.

DBMS-dependent differences

The level of support for RI that database systems can provide can vary. In addition to the basic function of protecting the RI at all, this can, for example, affect the following additional aspects:

  • Deletion transfer, change transfer
  • cascading deletion or change propagation
  • Additional conditions under which the deletion and change propagation should take place
  • RI across multiple databases.

RI representation in data models / diagrams

There are hardly any uniform rules used to represent the referential integrity in data model graphics. In some model tools, the relationship arrow at RI is shown in bold. Additional functions such as deletion are mostly only visible in text form in the specified relationship types or in the source code of the database schema.

Technical implementation

Technically, the referential integrity is implemented using a so-called foreign key. The relations involved (= tables) require similar attributes, which are used as foreign keys in the dependent table and as primary keys in the other relation. Both attributes must be of the same or a compatible data type. The data records refer (“reference”) with the foreign key to data records with an identical value in their primary key. The DBMS ensures that only references to existing data records are possible and checks this when creating or deleting data records or when changing key fields.

In systems that work according to the transaction principle, if RI rules are violated, all updates made within the transaction are reset ( rollback ).

Primary keys and foreign keys can also consist of several attributes / table columns.

The definition of sensible RI rules, in particular the deletion transfer, is an important task in data design in order to prevent unwanted deletion quantities or inapplicable deletion transfers.

disadvantage

However, the advantages of referential integrity also come at a price, because every check carried out by an RDBMS costs computer resources, especially time.

So it could e.g. For example, if larger amounts of data are imported regularly, it may be useful to temporarily override the RI rules in the receiving system, especially if the consistency of the data is ensured in the delivery system.

Web links

Wikibooks: Determine insert and delete order  - learning and teaching materials

Introduction to SQL

Individual evidence

  1. Albrecht, Nicol: Programming Access 2002 . ISBN 3-8273-1942-0
  2. economic computer science-24 wirtschaftsinformatik-24.de ( Memento of the original dated 5 April 2010 at the Internet Archive ) Info: The archive link is automatically inserted and not yet tested. Please check the original and archive link according to the instructions and then remove this notice. @1@ 2Template: Webachiv / IABot / www.wirtschaftsinformatik-24.de
  3. University of Rostock geoinformatik.uni-rostock.de
  4. Duden dictionary of origin
  5. University of Frankfurt dbis.informatik.uni-frankfurt.de (PDF; 85 kB)
  6. msdn.microsoft.com Microsoft MSDN