Anomaly (computer science)

In computer science , anomalies in relational databases denote malfunction of the database through violation of the rule "every information once". This means that the underlying data model allows tables with columns of the same meaning and also with different (anomalous) content, so that it is no longer possible to identify which table or column contains the correct content (data inconsistency). A distinction is made between anomalies in single-user operation and multi-user operation.

In single-user operation, anomalies can arise from non- normalized or denormalized data structures and lead to inconsistencies . A distinction is made between insert , change and delete anomalies .

In multi-user operation of a database, anomalies occur due to inadmissible parallel database access.

Anomalies in single user operation

Insert anomaly

When data is inserted into a database, one speaks of an insertion anomaly if a new tuple cannot be entered in the relation or can only be entered with difficulty because not all attributes (column heading) of the primary key have values (which is a prerequisite to be able to enter a data record). For example, information cannot be included because other information that is uninteresting in this context or is unknown at the time is missing.

Example:

In this table the respective driver is specified for vehicles. The attributes (license number, surname) are identification keys . Insertion anomalies occur here when a new vehicle is to be inserted but no driver has yet been determined.
Inserting records without the key (or part of the key) is impossible.

Mark	Manufacturer	First name	Surname
K-KJ 321	VW	Peter	Schmidt
H-CH 333	Audi	Fritz	cutter
B-MD 321	BMW	Max	Maier
B-MD 321	BMW	Tom	Lehmann
A-BC 123	Škoda	?	?
A-BC 123	Škoda	?	?

Change anomaly

When changing data in a database , one speaks of a change anomaly (update anomaly) if not all ( redundant ) occurrences of an attribute value are changed at the same time. This leads to inconsistent data.

Example:

Mark	Manufacturer	colour	First name	Surname
K-KJ 321	VW	blue	Peter	Schmidt
H-CH 333	Opel	red	Fritz	cutter
B-MD 321	BMW	black	Max	Maier
B-MM 473	Peugeot	green	Max	Maier

It is assumed in this table that the mentions of “Max Maier” apply to one and the same person. If the name “Maier” is changed to “Meier”, this must be done in two places. If this does not happen, one speaks of an update anomaly. The table now contains inconsistent data.

To prevent this problem, the table should be converted to the 3rd normal form so that the driver data can be viewed separately from the vehicle data.

Example in 3rd normal form:

**vehicle**
Mark	Manufacturer	colour	Driver_ID
K-KJ 321	VW	blue	318
H-CH 333	Opel	red	37
B-MD 321	BMW	black	93
B-MM 473	Peugeot	green	93

**driver**
Driver_ID	First name	Surname
318	Peter	Schmidt
37	Fritz	cutter
93	Max	Maier

Because Driver_ID is used in the "Vehicle" table as a foreign key from the "Driver" table, the update anomaly no longer occurs. The data is now stored in a central location and no longer redundantly.

Deletion anomaly

A deletion anomaly ( Delete anomaly ) occurs when deleting a data record are lost than desired more information. It arises when a data set contains several independent pieces of information. By deleting one piece of information, the other is also deleted, although it is still needed.

Example:

Mark	Manufacturer	colour	First name	Surname
K-KJ 321	VW	blue	Peter	Schmidt
H-CH 333	Opel	red	Fritz	cutter
B-MD 321	BMW	black	Max	Maier

The vehicle B-MD 321 cannot be deleted here without deleting the driver as well.

To avoid the problem, the table must be converted to the 3rd normal form .

Example in 3rd normal form:

**vehicle**
Mark	Manufacturer	colour	Driver_ID
K-KJ 321	VW	blue	318
H-CH 333	Opel	red	37
B-MD 321	BMW	black	93

**driver**
Driver_ID	First name	Surname
318	Peter	Schmidt
37	Fritz	cutter
93	Max	Maier

Multi-user anomalies

In multi-user operation of a database, anomalies occur due to inadmissible parallel database access. There are roughly four basic problems: lost update, write-read conflict, non-repeatable reading and phantom problem. However, further finer distinctions and specifications are possible.

Lost update

A lost update describes a problem that occurs when multiple parallel write accesses to shared information can occur. If two transactions change the same information, the changes in the first can be immediately overwritten by the changes in the second.

Write-read conflict (dirty read)

A read-write conflict (dirty read) describes a problem that occurs when one of two transactions running at the same time reads data that is written by the other, but is not yet committed.

Non-Repeatable Read

A non-repeatable read (English. Non-Repeatable Read) indicates a problem that occurs when returns different results in succession within a transaction the same read operation.

Phantom Problem (Inconsistent Read)

A phantom problem (inconsistent read) describes a problem that can occur with multiple parallel database accesses. If new data records with this property are inserted in a transaction that is running at the same time during a transaction that relates to several data records with a specified property, this can result in inconsistent data in the first transaction.

Individual evidence

^ Theo Härder and Erhard Rahm: Database systems, concepts and techniques of implementation , 2nd edition (2001), page 408ff, part V (transaction management), chap. 14 (synchronization), section 14.1. (Anomalies in multi-user operation)

literature

Theo Härder, Erhard Rahm: Database systems, concepts and implementation techniques . Springer , Berlin 2001, ISBN 3-540-42133-5 .

[1] Theo Härder and Erhard Rahm: Database systems, concepts and techniques of implementation , 2nd edition (2001), page 408ff, part V (transaction management), chap. 14 (synchronization), section 14.1. (Anomalies in multi-user operation)