Anomaly (computer science)
In computer science , anomalies in relational databases denote malfunction of the database through violation of the rule "every information once". This means that the underlying data model allows tables with columns of the same meaning and also with different (anomalous) content, so that it is no longer possible to identify which table or column contains the correct content (data inconsistency). A distinction is made between anomalies in single-user operation and multi-user operation.
In single-user operation, anomalies can arise from non- normalized or denormalized data structures and lead to inconsistencies . A distinction is made between insert , change and delete anomalies .
In multi-user operation of a database, anomalies occur due to inadmissible parallel database access.
Anomalies in single user operation
Insert anomaly
When data is inserted into a database, one speaks of an insertion anomaly if a new tuple cannot be entered in the relation or can only be entered with difficulty because not all attributes (column heading) of the primary key have values (which is a prerequisite to be able to enter a data record). For example, information cannot be included because other information that is uninteresting in this context or is unknown at the time is missing.
Example:
In this table the respective driver is specified for vehicles. The attributes (license number, surname) are identification keys . Insertion anomalies occur here when a new vehicle is to be inserted but no driver has yet been determined.
Inserting records without the key (or part of the key) is impossible.
Mark | Manufacturer | First name | Surname |
---|---|---|---|
K-KJ 321 | VW | Peter | Schmidt |
H-CH 333 | Audi | Fritz | cutter |
B-MD 321 | BMW | Max | Maier |
B-MD 321 | BMW | Tom | Lehmann |
A-BC 123 | Škoda | ? | ? |
A-BC 123 | Škoda | ? | ? |
Change anomaly
When changing data in a database , one speaks of a change anomaly (update anomaly) if not all ( redundant ) occurrences of an attribute value are changed at the same time. This leads to inconsistent data.
Example:
Mark | Manufacturer | colour | First name | Surname |
---|---|---|---|---|
K-KJ 321 | VW | blue | Peter | Schmidt |
H-CH 333 | Opel | red | Fritz | cutter |
B-MD 321 | BMW | black | Max | Maier |
B-MM 473 | Peugeot | green | Max | Maier |
It is assumed in this table that the mentions of “Max Maier” apply to one and the same person. If the name “Maier” is changed to “Meier”, this must be done in two places. If this does not happen, one speaks of an update anomaly. The table now contains inconsistent data.
To prevent this problem, the table should be converted to the 3rd normal form so that the driver data can be viewed separately from the vehicle data.
Example in 3rd normal form:
|
|
Because Driver_ID is used in the "Vehicle" table as a foreign key from the "Driver" table, the update anomaly no longer occurs. The data is now stored in a central location and no longer redundantly.
Deletion anomaly
A deletion anomaly ( Delete anomaly ) occurs when deleting a data record are lost than desired more information. It arises when a data set contains several independent pieces of information. By deleting one piece of information, the other is also deleted, although it is still needed.
Example:
Mark | Manufacturer | colour | First name | Surname |
---|---|---|---|---|
K-KJ 321 | VW | blue | Peter | Schmidt |
H-CH 333 | Opel | red | Fritz | cutter |
B-MD 321 | BMW | black | Max | Maier |
The vehicle B-MD 321 cannot be deleted here without deleting the driver as well.
To avoid the problem, the table must be converted to the 3rd normal form .
Example in 3rd normal form:
|
|
Multi-user anomalies
In multi-user operation of a database, anomalies occur due to inadmissible parallel database access. There are roughly four basic problems: lost update, write-read conflict, non-repeatable reading and phantom problem. However, further finer distinctions and specifications are possible.
Lost update
A lost update describes a problem that occurs when multiple parallel write accesses to shared information can occur. If two transactions change the same information, the changes in the first can be immediately overwritten by the changes in the second.
Write-read conflict (dirty read)
A read-write conflict (dirty read) describes a problem that occurs when one of two transactions running at the same time reads data that is written by the other, but is not yet committed.
Non-Repeatable Read
A non-repeatable read (English. Non-Repeatable Read) indicates a problem that occurs when returns different results in succession within a transaction the same read operation.
Phantom Problem (Inconsistent Read)
A phantom problem (inconsistent read) describes a problem that can occur with multiple parallel database accesses. If new data records with this property are inserted in a transaction that is running at the same time during a transaction that relates to several data records with a specified property, this can result in inconsistent data in the first transaction.
Individual evidence
- ^ Theo Härder and Erhard Rahm: Database systems, concepts and techniques of implementation , 2nd edition (2001), page 408ff, part V (transaction management), chap. 14 (synchronization), section 14.1. (Anomalies in multi-user operation)
literature
- Theo Härder, Erhard Rahm: Database systems, concepts and implementation techniques . Springer , Berlin 2001, ISBN 3-540-42133-5 .