Anomaly (computer science)

from Wikipedia, the free encyclopedia

In computer science , anomalies in relational databases denote malfunction of the database through violation of the rule "every information once". This means that the underlying data model allows tables with columns of the same meaning and also with different (anomalous) content, so that it is no longer possible to identify which table or column contains the correct content (data inconsistency). A distinction is made between anomalies in single-user operation and multi-user operation.

In single-user operation, anomalies can arise from non- normalized or denormalized data structures and lead to inconsistencies . A distinction is made between insert , change and delete anomalies .

In multi-user operation of a database, anomalies occur due to inadmissible parallel database access.

Anomalies in single user operation

Insert anomaly

When data is inserted into a database, one speaks of an insertion anomaly if a new tuple cannot be entered in the relation or can only be entered with difficulty because not all attributes (column heading) of the primary key have values ​​(which is a prerequisite to be able to enter a data record). For example, information cannot be included because other information that is uninteresting in this context or is unknown at the time is missing.

Example:

In this table the respective driver is specified for vehicles. The attributes (license number, surname) are identification keys . Insertion anomalies occur here when a new vehicle is to be inserted but no driver has yet been determined.
Inserting records without the key (or part of the key) is impossible.

Mark Manufacturer First name Surname
K-KJ 321 VW Peter Schmidt
H-CH 333 Audi Fritz cutter
B-MD 321 BMW Max Maier
B-MD 321 BMW Tom Lehmann
A-BC 123 Škoda ? ?
A-BC 123 Škoda ? ?

Change anomaly

When changing data in a database , one speaks of a change anomaly (update anomaly) if not all ( redundant ) occurrences of an attribute value are changed at the same time. This leads to inconsistent data.

Example:

Mark Manufacturer colour First name Surname
K-KJ 321 VW blue Peter Schmidt
H-CH 333 Opel red Fritz cutter
B-MD 321 BMW black Max Maier
B-MM 473 Peugeot green Max Maier

It is assumed in this table that the mentions of “Max Maier” apply to one and the same person. If the name “Maier” is changed to “Meier”, this must be done in two places. If this does not happen, one speaks of an update anomaly. The table now contains inconsistent data.

To prevent this problem, the table should be converted to the 3rd normal form so that the driver data can be viewed separately from the vehicle data.

Example in 3rd normal form:

vehicle
Mark Manufacturer colour Driver_ID
K-KJ 321 VW blue 318
H-CH 333 Opel red 37
B-MD 321 BMW black 93
B-MM 473 Peugeot green 93
driver
Driver_ID First name Surname
318 Peter Schmidt
37 Fritz cutter
93 Max Maier

Because Driver_ID is used in the "Vehicle" table as a foreign key from the "Driver" table, the update anomaly no longer occurs. The data is now stored in a central location and no longer redundantly.

Deletion anomaly

A deletion anomaly ( Delete anomaly ) occurs when deleting a data record are lost than desired more information. It arises when a data set contains several independent pieces of information. By deleting one piece of information, the other is also deleted, although it is still needed.

Example:

Mark Manufacturer colour First name Surname
K-KJ 321 VW blue Peter Schmidt
H-CH 333 Opel red Fritz cutter
B-MD 321 BMW black Max Maier

The vehicle B-MD 321 cannot be deleted here without deleting the driver as well.

To avoid the problem, the table must be converted to the 3rd normal form .

Example in 3rd normal form:

vehicle
Mark Manufacturer colour Driver_ID
K-KJ 321 VW blue 318
H-CH 333 Opel red 37
B-MD 321 BMW black 93
driver
Driver_ID First name Surname
318 Peter Schmidt
37 Fritz cutter
93 Max Maier

Multi-user anomalies

In multi-user operation of a database, anomalies occur due to inadmissible parallel database access. There are roughly four basic problems: lost update, write-read conflict, non-repeatable reading and phantom problem. However, further finer distinctions and specifications are possible.

Lost update

A lost update describes a problem that occurs when multiple parallel write accesses to shared information can occur. If two transactions change the same information, the changes in the first can be immediately overwritten by the changes in the second.

Write-read conflict (dirty read)

A read-write conflict (dirty read) describes a problem that occurs when one of two transactions running at the same time reads data that is written by the other, but is not yet committed.

Non-Repeatable Read

A non-repeatable read (English. Non-Repeatable Read) indicates a problem that occurs when returns different results in succession within a transaction the same read operation.

Phantom Problem (Inconsistent Read)

A phantom problem (inconsistent read) describes a problem that can occur with multiple parallel database accesses. If new data records with this property are inserted in a transaction that is running at the same time during a transaction that relates to several data records with a specified property, this can result in inconsistent data in the first transaction.

Individual evidence

  1. ^ Theo Härder and Erhard Rahm: Database systems, concepts and techniques of implementation , 2nd edition (2001), page 408ff, part V (transaction management), chap. 14 (synchronization), section 14.1. (Anomalies in multi-user operation)

literature

  • Theo Härder, Erhard Rahm: Database systems, concepts and implementation techniques . Springer , Berlin 2001, ISBN 3-540-42133-5 .