Functional dependency

Functional dependencies (FD) are a concept of relational design theory and form the basis for the normalization of relation schemas.

A relation is defined by attributes. If some of these attributes uniquely determine the values of other attributes, one speaks of a functional dependency. For example, one could imagine a customer database in which a customer's address and telephone number are uniquely determined by the customer's name and date of birth. Here the address and telephone number would functionally depend on the name and date of birth.

The term key can also be defined with the help of functional dependencies:

If some attributes of a relation uniquely determine the values of all attributes of the relation, one speaks of a superkey; that is, every tuple of the relation is uniquely determined by the values of these attributes. For example, one could introduce a customer number that identifies each customer. A candidate key is a minimal superkey, that is, no proper subset of its attributes determines the values of all other attributes of the relation. One of the candidate keys of a relation is selected as the so-called primary key.

Example:

A	B	C
1	1	3
1	1	3
1	2	4

In this example, C is functionally dependent on A and B, written A, B → C. The arrow can be read as "uniquely determines": the first two attributes together uniquely determine the value of attribute C. In other words, once the values of the first two attributes are known, the value of the last attribute is fixed as well. C is not functionally dependent on A alone, since the tuples with A = 1 have different values for C; in this particular table, however, B alone already determines C.

Formal definition

Let $r(R)$ be a relation with the relation schema $R$, and let $\alpha$ and $\beta$ be subsets of the attributes of $R$. Let $t \in r$ be a tuple of $r$. Then $t[\alpha]$ denotes the restriction of $t$ to the attributes in $\alpha$. The functional dependency $\alpha \to \beta$ ($\beta$ is functionally dependent on $\alpha$) holds on $R$ if the following holds for every admissible relation $r(R)$:

$\forall t_{1}, t_{2} \in r \colon\ t_{1}[\alpha] = t_{2}[\alpha] \implies t_{1}[\beta] = t_{2}[\beta]$

This means that for all tuples $t_{1}, t_{2} \in r$ with the same $\alpha$-attributes ($t_{1}[\alpha] = t_{2}[\alpha]$), their $\beta$-attributes are also the same ($t_{1}[\beta] = t_{2}[\beta]$). The values of the attributes in the attribute set $\alpha$ uniquely determine the values of the attributes in the attribute set $\beta$; $\alpha$ is also referred to in the literature as the determinant of $\beta$. For attribute sets one usually writes $\alpha\beta$ as shorthand for $\alpha \cup \beta$.
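The condition can be checked mechanically for a single relation instance. The following Python sketch is only an illustration: the function name `fd_holds` and the representation of tuples as dictionaries are assumptions made here, and a positive result for one instance does not prove that the dependency holds for every admissible relation, as the definition requires.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Row = Dict[str, Hashable]  # one tuple t, keyed by attribute name


def fd_holds(rows: Iterable[Row], alpha: List[str], beta: List[str]) -> bool:
    """Return True if alpha -> beta holds in this one relation instance."""
    seen: Dict[Tuple, Tuple] = {}
    for t in rows:
        key = tuple(t[a] for a in alpha)    # t[alpha]
        value = tuple(t[b] for b in beta)   # t[beta]
        # setdefault stores the first beta value seen for this alpha value
        # and returns the stored one on every later occurrence
        if seen.setdefault(key, value) != value:
            return False
    return True


# The example table with the columns A, B, C from the introduction
r = [
    {"A": 1, "B": 1, "C": 3},
    {"A": 1, "B": 1, "C": 3},
    {"A": 1, "B": 2, "C": 4},
]

print(fd_holds(r, ["A", "B"], ["C"]))  # True
print(fd_holds(r, ["A"], ["C"]))       # False
print(fd_holds(r, ["B"], ["C"]))       # True
```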

$\beta$ is fully functionally dependent on $\alpha$ if no attribute can be removed from $\alpha$ without the dependency ceasing to hold.

In the example above, the attribute $A$ plays no role in determining $C$:

The full functional dependency $B \to C$ can be obtained from the functional dependency $AB \to C$.
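Whether a dependency is full can be tested on a relation instance by trying to drop each attribute of the left side in turn. The sketch below is illustrative only; `fd_holds` and `is_full` are names chosen here, and the test again refers to one instance rather than to all admissible relations. Dropping single attributes is sufficient, because if any smaller left side works, so does every larger one.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Row = Dict[str, Hashable]


def fd_holds(rows: Iterable[Row], alpha: List[str], beta: List[str]) -> bool:
    """Return True if alpha -> beta holds in this relation instance."""
    seen: Dict[Tuple, Tuple] = {}
    for t in rows:
        key = tuple(t[a] for a in alpha)
        value = tuple(t[b] for b in beta)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_full(rows: List[Row], alpha: List[str], beta: List[str]) -> bool:
    """True if alpha -> beta holds and no attribute of alpha can be removed."""
    if not fd_holds(rows, alpha, beta):
        return False
    # The dependency is full if every reduced left side fails
    return not any(
        fd_holds(rows, [a for a in alpha if a != dropped], beta)
        for dropped in alpha
    )


r = [
    {"A": 1, "B": 1, "C": 3},
    {"A": 1, "B": 1, "C": 3},
    {"A": 1, "B": 2, "C": 4},
]

print(is_full(r, ["A", "B"], ["C"]))  # False: A can be removed
print(is_full(r, ["B"], ["C"]))       # True
```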

Armstrong's axioms

With the help of Armstrong's axioms, further functional dependencies can be derived from a set of functional dependencies that hold on a relation. The following three rules are sufficient to derive all functional dependencies:

1. A set of attributes uniquely determines the values of any subset of these attributes (trivial dependency), i.e. $\beta \subseteq \alpha \Rightarrow \alpha \rightarrow \beta$. (Reflexivity)

2. If $\alpha \rightarrow \beta$ holds, then $\alpha\gamma \rightarrow \beta\gamma$ also holds for any set of attributes $\gamma$ of the relation. (Augmentation rule)

3. If $\alpha \rightarrow \beta$ and $\beta \rightarrow \gamma$ hold, then $\alpha \rightarrow \gamma$ also holds. (Transitivity rule)

In order to make derivations easier, the following (derived) rules can also be used:

4. If $\alpha \rightarrow \beta$ and $\alpha \rightarrow \gamma$ hold, then $\alpha \rightarrow \beta\gamma$ also holds. (Union rule)

5. If $\alpha \rightarrow \beta\gamma$ holds, then $\alpha \rightarrow \beta$ and $\alpha \rightarrow \gamma$ hold. (Decomposition rule)

6. If $\alpha \rightarrow \beta$ and $\beta\gamma \rightarrow \delta$ hold, then $\alpha\gamma \rightarrow \delta$ also holds. (Pseudotransitivity rule)
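As an illustration of how a derived rule follows from the three axioms, the union rule (rule 4) can be obtained as sketched below. This is the usual textbook derivation, added here for illustration; it assumes $\alpha \rightarrow \beta$ and $\alpha \rightarrow \gamma$ and uses the fact that $\alpha\alpha = \alpha$ for attribute sets.

```latex
\begin{align*}
\alpha \rightarrow \beta
  &\;\Longrightarrow\; \alpha \rightarrow \alpha\beta
  && \text{(augmentation with } \alpha\text{)} \\
\alpha \rightarrow \gamma
  &\;\Longrightarrow\; \alpha\beta \rightarrow \beta\gamma
  && \text{(augmentation with } \beta\text{)} \\
\alpha \rightarrow \alpha\beta,\ \alpha\beta \rightarrow \beta\gamma
  &\;\Longrightarrow\; \alpha \rightarrow \beta\gamma
  && \text{(transitivity)}
\end{align*}
```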

Normalization with functional dependencies

Relation schemas can be normalized with the help of functional dependencies. A relation schema $R$ is, for example, in Boyce-Codd normal form if the following holds for every functional dependency $\alpha \to \beta$ that holds on $R$: the functional dependency is trivial, or $\alpha$ is a superkey of $R$. The third normal form is somewhat weaker: either one of the conditions above must hold, or every attribute in $\beta - \alpha$ must be contained in at least one candidate key of $R$.
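The Boyce-Codd condition can be tested programmatically. The following Python sketch is an assumption-laden illustration, not a reference implementation: the names `closure` and `bcnf_violations` and the address schema are hypothetical, the superkey test relies on the attribute closure $\alpha^{+}$ introduced later in this article, and only the dependencies given explicitly are inspected, which is the usual shortcut for testing a single schema.

```python
from typing import FrozenSet, List, Set, Tuple

FD = Tuple[FrozenSet[str], FrozenSet[str]]  # (left side, right side)


def closure(attrs: FrozenSet[str], fds: List[FD]) -> Set[str]:
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema: FrozenSet[str], fds: List[FD]) -> List[FD]:
    """Given dependencies that are neither trivial nor have a superkey on the left."""
    violations = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) >= schema
        if not (trivial or superkey):
            violations.append((lhs, rhs))
    return violations


# Hypothetical schema: a street within a city has one zip code,
# and a zip code belongs to exactly one city.
R = frozenset({"City", "Street", "Zip"})
fds = [
    (frozenset({"City", "Street"}), frozenset({"Zip"})),
    (frozenset({"Zip"}), frozenset({"City"})),
]

for lhs, rhs in bcnf_violations(R, fds):
    print(sorted(lhs), "->", sorted(rhs))  # ['Zip'] -> ['City']
```

In this hypothetical schema the dependency Zip → City violates the Boyce-Codd condition because Zip is not a superkey. The schema still satisfies the third normal form, since City is part of the candidate key {City, Street}.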

There are algorithms that decompose a relation schema into normalized schemas on the basis of functional dependencies. The aim of such a decomposition is that it be lossless and dependency-preserving. Dependency preservation means that all functional dependencies that hold on the original relation still hold on the decomposition. One such algorithm, which produces schemas in third normal form, is the synthesis algorithm. The losslessness of a decomposition into two sub-relations can be demonstrated with the help of Delobel's theorem.

Attribute closure

The attribute closure $\alpha^{+}$ of a set of attributes $\alpha$ is the set of all attributes that functionally depend on $\alpha$. In the smallest case, the attribute closure consists only of $\alpha$ itself, since no other attributes depend on it. To determine the attribute closure of $\alpha$ for a given set of functional dependencies F, one can apply a simple algorithm that determines the set of dependent attributes by repeatedly applying the transitivity rule. The algorithm is defined as follows:

Input:

a set of functional dependencies $F$
a set of attributes $\alpha$

Output:

the complete set of attributes $\alpha^{+}$ that can be derived from $\alpha$ under the dependencies $F$. It holds that $\alpha \rightarrow \alpha^{+}$.

Closure $(F, \alpha)$
    $\alpha^{+} = \alpha$
    while (change to $\alpha^{+}$) do
        foreach (dependency $\beta \rightarrow \gamma \in F$) do
            if ($\beta \subseteq \alpha^{+}$) then $\alpha^{+} = \alpha^{+} \cup \gamma$

Applied to a concrete set of functional dependencies F:

F has the dependencies:
- $A, B \rightarrow C$
- $D \rightarrow E, F$
- $A \rightarrow G, H$
- $G \rightarrow B$
We want to determine the attribute closure of $A$.

1. $A^{+} = \{A\}$

2. Run through the functional dependencies from top to bottom:

- $\{A, B\}$ is not a subset of $\{A\}$
- $\{D\}$ is not a subset of $\{A\}$
- $\{A\}$ is a subset of $\{A\}$
  - $A^{+} = A^{+} \cup \{G, H\}$
- $\{G\}$ is a subset of $\{A, G, H\}$
  - $A^{+} = A^{+} \cup \{B\}$
- New run:
- $\{A, B\}$ is a subset of $\{A, G, H, B\}$
  - $A^{+} = A^{+} \cup \{C\}$
- ... after that the set no longer changes: $A^{+} = \{A, G, H, B, C\}$
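The algorithm and the worked example can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `attribute_closure` is chosen here, and attribute sets are written as strings of single-character attribute names, which only works because every attribute in the example is one letter long.

```python
from typing import Iterable, List, Set, Tuple


def attribute_closure(fds: List[Tuple[str, str]], alpha: Iterable[str]) -> Set[str]:
    """Compute alpha+ for dependencies given as (left side, right side) pairs."""
    result = set(alpha)                   # alpha+ = alpha
    changed = True
    while changed:                        # while (change to alpha+)
        changed = False
        for beta, gamma in fds:           # foreach dependency beta -> gamma in F
            # apply the dependency only if it adds something new
            if set(beta) <= result and not set(gamma) <= result:
                result |= set(gamma)      # alpha+ = alpha+ united with gamma
                changed = True
    return result


# The dependencies of the example; "AB" stands for the attribute set {A, B}
F = [
    ("AB", "C"),
    ("D", "EF"),
    ("A", "GH"),
    ("G", "B"),
]

print(sorted(attribute_closure(F, "A")))  # ['A', 'B', 'C', 'G', 'H']
```

The loop ends after a pass in which no dependency adds a new attribute, which corresponds to the final step of the example.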

Closure

Intuitively speaking, the closure of a set of functional dependencies is the set of attributes that is determined by the "left sides" of the dependencies.

If $F = \{\alpha_{1} \to \beta_{1}, \ldots, \alpha_{n} \to \beta_{n}\}$ is a set of functional dependencies, then the closure or attribute closure is the set

$\bigcup_{\alpha_{1} \ldots \alpha_{n} \to \gamma} \gamma$

and is denoted by $F^{+}$. The following holds for the closure:

$\forall \beta \subseteq R \colon\ \alpha_{1} \ldots \alpha_{n} \to \beta \implies \beta \subseteq F^{+}$

Advanced concepts

Multivalued dependencies extend functional dependencies and make it possible to uncover additional anomalies in a relational schema.
Conditional functional dependencies extend functional dependencies with tables of specific values. A dependency such as PLZ → Ort (postal code → city) is extended by an additional table with specific values such as 80001 → München, which assigns the postal code 80001 directly to Munich. With the help of these conditional dependencies, the quality of data can be measured, or measures to improve data quality can be derived.
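
How such a value table might be used for a data quality check is sketched below in Python. Everything apart from the pair 80001 → München is an assumption made for illustration: the function name `cfd_violations`, the dictionary representation of the value table, and the sample records are hypothetical.

```python
from typing import Dict, List


def cfd_violations(
    rows: List[Dict[str, str]], lhs: str, rhs: str, pattern: Dict[str, str]
) -> List[Dict[str, str]]:
    """Return the rows whose lhs value is listed in the pattern table
    but whose rhs value differs from the value required there."""
    return [
        t for t in rows
        if t[lhs] in pattern and t[rhs] != pattern[t[lhs]]
    ]


# Value table of the conditional dependency PLZ -> Ort
pattern = {"80001": "München"}

# Hypothetical sample records; the second one is deliberately inconsistent
rows = [
    {"PLZ": "80001", "Ort": "München"},
    {"PLZ": "80001", "Ort": "Muenchen"},
    {"PLZ": "04109", "Ort": "Leipzig"},
]

bad = cfd_violations(rows, "PLZ", "Ort", pattern)
print(len(bad), "of", len(rows), "records violate the pattern")
```

The share of violating records is one simple way to express data quality as a number; the third record is not checked because its postal code does not appear in the value table.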
