Relational algebra

In theory, the database is meant by relational algebra or relational algebra a set of operations for manipulating relations . It enables relations to be filtered, linked, aggregated or otherwise modified in order to formulate queries to a database.

Usually, queries and programs are not formulated directly in a relational algebra, but in a declarative language such as SQL , XQuery , SPARQL or Datalog . These programs and queries are usually first translated into an (generally extended) relational algebra. The resulting operator tree is then transformed with the help of relational laws in order to enable the most efficient possible evaluation of the requests.

History and meaning

In 1941 Alfred Tarski presented ideas of a relational algebra for the first time in his paper “On the calculus of relations”. In particular, he introduced the relational operations “union”, “average” and “join”, although he limited himself to two-digit relations.

At the end of his article, he mentions that his aim was not so much to present new results as to arouse interest in a particular logical theory that had not been considered before:

"The aim of this paper has been, not so much to present new results, as to awaken interest in a certain neglected logical theory, and to formulate some new problems concerning this theory."

- Tarski

In the late 1960s, Edgar F. Codd developed the foundations of today's relational algebra at the IBM Research Laboratory in San Jose. It is not known whether the work of Tarski inspired him. At the beginning of his paper from 1969 he asserted that the relational model was superior in many aspects to the graph model and the network model , which at the time were "en vogue" (French: "in fashion").

“The first part of this paper is concerned with an explanation of a relational view of data. This view (or model) of data appears to be superior in several respects to the graph or network model [1, 2] presently in vogue. "

- Codd

He is referring to the fact that the time it takes to answer queries depends very much on the structure of the respective network. If data is to be retrieved that is adjacent in the network, the user only has to wait very briefly for a response. However, if the desired data is widely scattered in the network, the waiting time can be unreasonably long. When creating a network model, the database developers had to consider all conceivable requests from the outset, as subsequent changes to the data model were very difficult to implement. To solve this problem, Codd had the idea of no longer storing the data in a network, but in relations (tables) that can be linked to one another in different ways depending on the request:

"Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation)."

- Codd

He dared the following prophetic prognosis that databases would in future contain many relationships in stored form:

"The large, integrated data banks of the future will contain many relations of various degrees in stored form."

- Codd

Late 1970, d. H. In the same year in which Codd's work became public, Rudolf Bayer and Ed McCreight presented the B-tree , a data structure that enables relations with a large number of tuples to be stored on a hard drive in such a way that tuples can be read and the modification of tuples can be done highly efficiently.

In the 1970s, the success story of the relational databases, including the associated SQL language , began on the basis of these two works . At Codd's place of work, d. H. at the IBM Research Laboratory in San Jose, the SEQUEL language and the experimental System R database system were developed. SEQUEL was later renamed to SQL. At the beginning of the 1980s there were the first commercial relational database systems for the query language SQL: Db2 from IBM and Oracle from Relational Software Inc. Today, the world of databases is indispensable for SQL (see, for example, category: Relational database management system ). But also various other languages, such as initially QBE or QUEL and later Datalog , XQuery or SPARQL , are ultimately based on Codds' idea of using relations to store data.

General

Relational algebra defines operations that can be applied to a set of relations . This means that relations can be filtered, linked or aggregated, for example. The results of all operations are also relations. For this reason the relational algebra is called closed .

Relational algebra is important as a theoretical basis for query languages in relational databases . Here the operations of relational algebra are implemented in so-called database operators. If every operation of relational algebra in the query language can be implemented by (at least) one expression, it is said to be relationally complete; the expression can link several database operators. If every operation can also be implemented by (exactly) one database operator, it is called strictly relationally complete; so only one database operator can ever be contained in one and the same converting expression. If the condition of strict relational completeness also applies in the other direction, i.e. there is a corresponding operation of relational algebra for each database operator, then the query language is called equivalent to relational algebra, in short: relational equivalent.

Since there are (several) minimal sets of operations for relational algebra from which all further operations can be put together, it is sufficient for (strictly) relational completeness to compare the query language with these "basic operations". This follows from the fact that relational algebra is trivially self-equivalent and is already completely described (with regard to operations) by a minimal system of operations. A common minimal system of operations consists of the six operations: projection, selection, cross product, union, difference and renaming.

Because of its theoretical clarity, relational algebra is often used as a yardstick for evaluating the power or expressiveness of query languages. a. by means of the comparative terms just described. However, one should not infer from the closer proximity of a query language to relational algebra to its greater power. Query languages that are relationally complete or even strictly relationally complete often have a significantly larger range of functions than would be possible by implementing the relations-algebra operations alone. For example, in relational algebra there is no possibility of forming the transitive envelope of a relation, which is interesting in the case of back-referencing relations. The strict relational completeness of a query language tends to indicate a minimum functionality, and the relational equivalence rather a maximum functionality, while the non-strict relational completeness provides the least concrete information about the query language.

In contrast to calculi , relational algebra is safe ; that is, it gives a finite result in finite time. Relational algebra is also an example of a procedural language ; in contrast to calculi, which are mostly formalized as descriptive languages .

Operations

Set operations

In order to be able to perform set operations on the relations R and S, both must be compatible with each other. The type compatibility of two relations is given if

R and S have the same degree (number of attribute elements)
the value range of the attributes of R and S is identical

The type compatibility is also called union compatibility.

Union

With the union R ∪ S, all tuples of relation R are combined with all tuples of relation S to a single relation. The prerequisite for this is that R and S have the same relationship scheme. That is, they have the same attributes and attribute types. Duplicates will be deleted upon merging.

definition

{\ displaystyle R \ cup S: = \ {t | t \ in R \ lor t \ in S \}}

example

R:
A.	B.	C.
1	2	3
4th	5	6th

S:
A.	B.	C.
7th	8th	9
4th	5	6th

R ∪ S:
A.	B.	C.
7th	8th	9
4th	5	6th
1	2	3

requirement

Unification compatibility of R and S

Intersection

cut

The result of the average operation R ∩ S are all the tuples that can be found in both R and S. The average amount can also by the amount of difference expressed: R ∩ S = R \ (R \ S)

definition

{\ displaystyle R \ cap S: = \ {t | t \ in R \ land t \ in S \}}

example

R:
A.	B.	C.
1	2	3
4th	5	6th

S:
A.	B.	C.
7th	8th	9
4th	5	6th

R ∩ S:
A.	B.	C.
4th	5	6th

requirement

Unification compatibility of R and S

difference

With the operation R \ S or R - S, all tuples that are also present in the second relation S are removed from the first relation R. The difference (as well as the symmetrical difference) is not a monotonic operation, therefore relational algebra is not monotonic either in comparison to other declarative query languages (e.g. Datalog ).

definition

{\ displaystyle R {-} S: = R {\ setminus} S: = \ {t | t \ in R \ land t \ notin S \}}

example

R:
A.	B.	C.
1	2	3
4th	5	6th

S:
A.	B.	C.
7th	8th	9
4th	5	6th

R \ S:
A.	B.	C.
1	2	3

requirement

Unification compatibility of R and S

Symmetrical difference

The symmetrical difference R △ S is the set of all tuples that are contained in either R or S, but not in both at the same time.

{\ displaystyle R \ bigtriangleup S: = \ {t | (t \ in R \ lor t \ in S) \ land t \ notin R \ cap S \}}

The operation can be derived from the basic operations:

{\ displaystyle R \ bigtriangleup S: = (R \ setminus S) \ cup (S \ setminus R) = (R \ cup S) \ setminus (S \ cap R)}

example

R:
A.	B.	C.
1	2	3
4th	5	6th

S:
A.	B.	C.
7th	8th	9
4th	5	6th

R △ S:
A.	B.	C.
1	2	3
7th	8th	9

requirement

Unification compatibility of R and S

Cartesian product (cross product)

Cartesian product

The Cartesian product R × S is an operation that is similar to the Cartesian product from set theory .

The result of the Cartesian product is the set of all combinations of the tuples from R and S, i.e. that is, each row of one table is combined with each row of the other table. If all characteristics (columns) are different, the result table includes the sum of the characteristics of the source tables. Features of the same name in the two tables are referenced by placing the table name in front. The number of tuples (lines) in the result table is the result of multiplying the number of lines in the output table.

definition

Any two relations and are given. The Cartesian product is defined by ${\ displaystyle R}$ ${\ displaystyle S}$

{\ displaystyle R \ times S: = \ {(a_ {1}, a_ {2}, ..., a_ {n}, b_ {1}, b_ {2}, ..., b_ {m}) | (a_ {1}, a_ {2}, ..., a_ {n}) \ in R \ wedge (b_ {1}, b_ {2}, ..., b_ {m}) \ in S \ }}

example

R:
A.	B.	C.	D.
1	2	3	4th
4th	5	6th	7th
7th	8th	9	0

S:
E.	F.	G
1	2	3
7th	8th	9

R × S:
A.	B.	C.	D.	E.	F.	G
1	2	3	4th	1	2	3
4th	5	6th	7th	1	2	3
7th	8th	9	0	1	2	3
1	2	3	4th	7th	8th	9
4th	5	6th	7th	7th	8th	9
7th	8th	9	0	7th	8th	9

projection

The projection corresponds to the projection mapping from set theory and can also be called attribute restriction. It extracts individual attributes from the original set of attributes and is therefore to be understood as a kind of selection at column level, i.e. the projection hides columns. If β is the list of attributes, write π _β(R) or in the linear notation R [β]. β is also called the projection list . Duplicates in the result relation are eliminated.

definition

Let R be a relation over {A ₁ ,…, A _k } and β ⊆ {A ₁ ,…, A _k }.

{\ displaystyle \ pi _ {\ beta} (R): = \ {t _ {\ beta} | t \ in R \}}

The t _β : = (β), that is, the tuples only receive the attributes from the attribute list β.

example

R:
A.	B.	C.
1	2	3
4th	5	6th
1	3	8th

R [A, B]:
A.	B.
1	2
4th	5
1	3

R [A]:
A.
1
4th

requirement

The specified columns must be in R.

selection

During the selection, a comparison expression ( predicate ) can be used to determine which tuples are to be included in the result set. So tuples ("lines") are hidden. One writes or in the linear notation R [expression]. Expression is then called a selection condition . ${\ displaystyle \ sigma _ {\ text {expression}} (R)}$

definition

Let R be a relation.

{\ displaystyle \ sigma _ {\ text {expression}} (R): = \ {t | t \ in R \ wedge t {\ text {fulfills expression}} \}}

Expression denotes a formula . This can consist of:

Constant selections Attribute θ Constant , where θ is a common (suitable) comparison operator.
Attribute selections Attribute θ attribute
A combination of a formula with logical predicates ∧, ∨, ¬ (brackets as usual).

example

R:
A.	B.	C.
1	2	4th
4th	6th	7th
1	6th	7th
8th	6th	1

R [A = 1]:
A.	B.	C.
1	2	4th
1	6th	7th

R [C> 6]:
A.	B.	C.
4th	6th	7th
1	6th	7th

requirement

Each element of the specified column must be comparable with the comparison value via the conditional operator.

Join

A join designates the two consecutive operations Cartesian product and selection . The selection condition is usually a comparison of attributes A θ B, where θ is a suitable comparison operator. The general network is therefore also referred to as the θ network (theta network). The equi-join is a special case of the general join (see below).

definition

For two relations and is the result of the general combination with a formula expression as selection condition ${\ displaystyle R (A_ {1}, ..., A_ {n})}$ ${\ displaystyle S (B_ {1}, ..., B_ {m})}$

{\ displaystyle R \ bowtie _ {\ mathrm {expression}} S: = \ {r \ cup s | r \ in R \ land s \ in S \ land \ mathrm {expression} \}}

The derivation is:

{\ displaystyle R \ bowtie _ {\ mathrm {expression}} S: = \ sigma _ {\ mathrm {expression}} (R \ times S)}

example

R:
A.	B.	C.	D.
1	2	3	4th
4th	5	6th	7th
7th	8th	9	0

S:
E.	F.	G
1	2	3
7th	8th	9

R x S:
A.	B.	C.	D.	E.	F.	G
1	2	3	4th	1	2	3
4th	5	6th	7th	1	2	3
7th	8th	9	0	1	2	3
1	2	3	4th	7th	8th	9
4th	5	6th	7th	7th	8th	9
7th	8th	9	0	7th	8th	9

${\ displaystyle R \ bowtie _ {\ mathrm {RA \ neq SE}} S}$ ; JOIN (R, RA <> SE, S):
A.	B.	C.	D.	E.	F.	G
4th	5	6th	7th	1	2	3
7th	8th	9	0	1	2	3
1	2	3	4th	7th	8th	9
4th	5	6th	7th	7th	8th	9

Join corruption

In the event of a join corruption, the table is first split, except for one column . The 2 tables are then joined over the common column . ${\ displaystyle A_ {j}}$ ${\ displaystyle A_ {j}}$

Be and , then: ${\ displaystyle L_ {1} \ cup L_ {2} = \ {A_ {1}, ..., A_ {n} \}}$ ${\ displaystyle L_ {1} \ cap L_ {2} = A_ {j}}$

{\ displaystyle R \ subseteq \ Pi _ {L_ {1}} \ left (R \ right) \ bowtie _ {A_ {j}} \ Pi _ {L_ {2}} \ left (R \ right)}

example

${\ displaystyle {\ begin {aligned} L_ {1} & = \ {A, B \} \\ L_ {2} & = \ {B, C \} \\ L_ {1} \ cap L_ {2} & = B \ end {aligned}}}$

R:
A.	B.	C.
1	2	3
2	1	2
2	2	1

${\ displaystyle R \ subseteq \ Pi _ {L_ {1}} \ left (R \ right) \ bowtie _ {B} \ Pi _ {L_ {2}} \ left (R \ right)}$ :
A.	B.	C.
1	2	3
1	2	1
2	1	2
2	2	3
2	2	1

Equi-join

{\ displaystyle a_ {3} = b_ {1}}

With the equi-join, the Cartesian product is formed first. The selection then takes place with the condition that the content of certain columns must be identical. The equi-join is a general combination with a formula of the form A = B .

definition

The equi-join is used for the relations R, S and associated attributes A (is an attribute of R) and B (is an attribute of S)

{\ displaystyle R \ bowtie _ {A = B} S: = \ {(r, s) | r \ in R \ land s \ in S \ land r _ {[A]} = s _ {[B]} \} }

example

Here:

{\ displaystyle R \ bowtie _ {A = E} S: = \ {r \ cup s | r \ in R \ land s \ in S \ land r _ {[A]} = s _ {[E]} \}}

R:
A.	B.	C.	D.
1	2	3	4th
4th	5	6th	7th
7th	8th	9	0

S:
E.	F.	G
1	2	3
7th	8th	9

R x S:
A.	B.	C.	D.	E.	F.	G
1	2	3	4th	1	2	3
4th	5	6th	7th	1	2	3
7th	8th	9	0	1	2	3
1	2	3	4th	7th	8th	9
4th	5	6th	7th	7th	8th	9
7th	8th	9	0	7th	8th	9

JOIN (R, RA = SE, S):
A.	B.	C.	D.	E.	F.	G
1	2	3	4th	1	2	3
7th	8th	9	0	7th	8th	9

Natural join

The natural join consists of the equi-join and an additional hiding of the duplicated columns (projection). The join takes place via the attributes (columns) which have the same name in both relations. If there are no common attributes, the result of the natural compound is the Cartesian product. The natural compound is commutative and associative, that is, it applies as well as what plays a role in the optimization of queries. The number of attributes of the result relation is the sum of the numbers of the two output relations minus the number of compound attributes. ${\ displaystyle R \ bowtie S = S \ bowtie R}$ ${\ displaystyle (R \ bowtie S) \ bowtie T = R \ bowtie (S \ bowtie T)}$

definition

For two relations and is the result of the natural union ${\ displaystyle R (A_ {1}, ..., A_ {k}, B_ {1}, ..., B_ {n})}$ ${\ displaystyle S (B_ {1}, ..., B_ {n}, C_ {1}, ..., C_ {l})}$

{\ displaystyle R \ bowtie S: = \ {r \ cup s _ {[C_ {1}, ..., C_ {l}]} | r \ in R \ land s \ in S \ land r _ {[B_ { 1}, ..., B_ {n}]} = s _ {[B_ {1}, ..., B_ {n}]} \}}

example

R:
A.	B.	C.	D.
1	2	3	4th
4th	5	6th	7th
7th	8th	9	0

S:
A.	F.	G
1	2	3
7th	8th	9

NATURAL JOIN (R, S):
A.	B.	C.	D.	F.	G
1	2	3	4th	2	3
7th	8th	9	0	8th	9

Semi join

The semi join calculates the proportion of a natural join that remains after a reduction to the left relation.

definition

For two relations and is the result of half the natural union ${\ displaystyle R (A_ {1}, ..., A_ {k}, B_ {1}, ..., B_ {n})}$ ${\ displaystyle S (B_ {1}, ..., B_ {n}, C_ {1}, ..., C_ {l})}$

{\ displaystyle R \ ltimes S: = \ {r | r \ in R \ land s \ in S \ land r _ {[B_ {1}, ..., B_ {n}]} = s _ {[B_ {1 }, ..., B_ {n}]} \}}

example

R:
A.	B.	C.	D.
1	2	3	4th
4th	5	6th	7th
7th	8th	9	0

S:
A.	F.	G
1	2	3
7th	8th	9

SEMIJOIN (R, RA = SA, S):
A.	B.	C.	D.
1	2	3	4th
7th	8th	9	0

Outer join

Left outer join

Right outer join

Full outer join

In contrast to the equi-join, the outer join also includes the tuples of the left ( left outer join ) or right ( right outer join ) table in the result relation that cannot find a join partner. The non-existent attributes of the join relation are filled with null values.

The combination of the left and right outer join is called an outer join or a full outer join. All tuples are included in the result relation and those attributes of a tuple that have not found a join partner in the other relation are filled with zero values.

The outer join can be used with or without a ( natural outer join ) join condition.

example

R:
A.	B.	C.	D.
1	2	3	4th
4th	5	6th	7th
7th	8th	9	0

S:
A.	F.	G
1	2	3
7th	8th	9

LEFT OUTER JOIN (R, RA = SA, S):
A.	B.	C.	D.	F.	G
1	2	3	4th	2	3
4th	5	6th	7th	ZERO	ZERO
7th	8th	9	0	8th	9

renaming

With this operation attributes and relations can be renamed. This surgery is important to

Enable joins from different named relations,
To enable Cartesian products where there are the same attribute names, in particular with the same relation
To enable set operations between relations with different attributes.

The notation is , linear R [old → new]. ${\ displaystyle \ rho _ {[\ mathrm {new} \ leftarrow \ mathrm {old}]} (R)}$

definition

We construct a new tuple set t ' from the old one:

${\ displaystyle \ rho _ {[\ mathrm {new} \ leftarrow \ mathrm {old}]} (R): = \ {t '| t' (R- \ mathrm {old}) = t (R- \ mathrm {old}) \ land t '(\ mathrm {new}) = t (\ mathrm {old}) \}}$

example

R:
A.	B.	C.
1	2	3
4th	5	6th

R [B → X]:
A.	X	C.
1	2	3
4th	5	6th

division

definition

Since division is a derived operation, we define it with the help of the other operations of relational algebra. Let R, S be relations and the sets of attributes belonging to R and the sets of attributes belonging to S. . ${\ displaystyle \ beta}$ ${\ displaystyle \ gamma}$ ${\ displaystyle R ': = \ beta \ setminus \ gamma}$

The division is then defined by:

${\ displaystyle R \ div S: = \ pi _ {R '} (R) - \ pi _ {R'} ((\ pi _ {R '} (R) \ times S) -R)}$

In plain words containing ie those of attributes , which in any combination with the attributes of in occur. ${\ displaystyle R \ div S}$ ${\ displaystyle R '}$ ${\ displaystyle S}$ ${\ displaystyle R}$

example

Given is a relation R that contains fathers and mothers, their children and the age of these children. In addition, there is a relation S that contains some children and their ages: Maria (4) and Sabine (2). If you divide R by S, the result is a relation that only contains those married couples who have both a daughter Maria at age 4 and a daughter Sabine at age 2:

R:
father	mother	child	Age
Franz	Helga	Harald	5
Franz	Helga	Maria	4th
Franz	Ursula	Sabine	2
Moritz	Melanie	Gertrud	7th
Moritz	Melanie	Maria	4th
Moritz	Melanie	Sabine	2
Peter	Christina	Robert	9

S:
child	Age
Maria	4th
Sabine	2

R ÷ S
father	mother
Moritz	Melanie

Division is used when the question contains "for all". For our example the question is: "Select all parents (father, mother) who have a child with the name Maria and the age 4 and a child with the name Sabine and the age 2 (the relation S)."

Minimalism and completeness

A minimal set of operations, that is, a set of operations that is at least necessary to be able to form all expressions of relational algebra, includes

projection
selection
Cartesian product
Union
difference
renaming

All other operations ( e.g. joins ) can be simulated using these basic operations.

Any other set of operations is relationally complete if they have the same power as the operations above.

Relational Algebra Extensions

Relational algebra is not powerful enough to be able to completely map other query languages, especially SQL, into relational algebra. There are e.g. B. no way GROUP BY/HAVINGto translate the SQL operators , aggregate functions and null values into relational algebra. Here we consider some extensions (some of which have a similar effect) that enable a complete mapping into relational algebra and thus a complete theoretical consideration of these query languages.

Zero values

SQL allows the use of NULL values , which IS NULLcan be queried with the special predicate . This is particularly important when creating outer joins that create a relation that contain all values of one relation, as well as all values of the other for which the join condition is true, otherwise NULL values. This cannot be mapped with relational algebra.

One possibility is the definition of null values with a three-valued logic, as in SQL, that is, the Boolean operators are extended using truth tables in such a way that it is specified how to proceed when an operand is NULL.

Extended truth table for AND
∧	true	false	ZERO
true	true	false	ZERO
false	false	false	false
ZERO	ZERO	false	ZERO

Selection conditions or combinations that are applied to null values result in NULL. One difficulty with this (i.e., with the SQL-like treatment of null values) is that the results of queries with subqueries that return NULL do not necessarily match the user's intention. This type of zero-value treatment is also not orthogonal ; H. the behavior on one level (Boolean operators, 3-value logic) does not correspond to that on another (compound, 3-value logic is mapped to 2-value).

Another possibility is to differentiate between two different types of zero values, each of which means “any” or “not defined”.

Grouping operator and aggregate functions

Grouping and aggregation

The grouping applies functions to the same attributes in a relation. The operator γ receives a list of functions and a list of attributes. The functions are then applied to tuples for which the attributes of the attribute list are the same. The output is a new relation consisting of the attribute list and a new attribute that contains the results of the function list.

The functions are then the usual aggregate functions count, sum, max, avg ….

definition

Let R be a relation and A = {A ₁ ,…, A _n } attributes from R. Let F (X) be a function list f ₁ (x ₁ ),…, f _n (x _n ). The grouping is then

${\ displaystyle \ gamma _ {F (X); A} (R): = \ bigcup _ {t \ in R} \ gamma _ {F (X); \ emptyset} (\ sigma _ {A = tA} ( R))}$

For an empty set of attributes (i.e. γ _{F (X); {}} (…)) an additional attribute is generated that contains the value of the function application over the entire relation. This is used to break down the relation with the selection into partial relations with the same attributes, which are then reassembled with the function application.

Furthermore, a grouping with an empty function list has no effect.

NF²

Nesting

Denesting

The NF² model is an extension of the relational database model. The name stands for Non-first-normal-form (NFNF), which should indicate that the condition of atomic attribute values of the 1st normal form is broken. As a result, sets of attributes and sets of sets are allowed, which means that an attribute of a relation can again be a relation. The domain (range of values) of a combined attribute is the cross product of the attribute domains involved.

NF² extends relational algebra in such a way that, in addition to the usual (appropriately adapted) operations of relational algebra, two operations are added that nest a relation (nesting ν) and deinterleave a relation (denesting μ). Nesting combines a number of attributes in a sub-relation that is given a new attribute name. De-nesting removes nesting. These operations serve to transform NF² relations into the 1st normal form and vice versa. The operations are generally not bijective.

For the above reasons, NF² does not need any foreign keys.

Multi-set semantics

As a result of queries, SQL returns a multiset , i.e. a set that can contain multiple elements. This was done for performance reasons in order to save the additional step of removing duplicates. Strictly speaking, only queries DISTINCTthat are specified can be translated into relational algebra .

For relational algebra one can then additionally specify a function bag-to-set , which removes the duplicates from a multiset and thus creates a set, and the basic operations then simply as a multiset {{t | ...}} specify. However, care must be taken when defining derived operations.

Examples

As the relationship schemes for the examples, we take the classic sample database consisting of the schemes customer , supplier and goods . The schemes are:

CUSTOMER (customer number, name, place of residence, account balance)

SUPPLIER (supplier number, name, place, telephone)

GOODS (goods number, description, supplier number, price)

Basic operations of relational algebra are then used as follows:

The prices of all goods:

π _{name, price} (GOODS)

All suppliers from Bremen:

σ _{place = 'Bremen'} (SUPPLIER)

Customers with a negative account balance:

σ _{account balance <0} (CUSTOMER)

Rename the location of the SUPPLIER (for example to be able to perform quantity operations):

ρ _{place of residence ← place} (SUPPLIER)

Since the results of relational algebra are again relations (the RA is orthogonal), the operations can again be applied to the results of operations. This allows complex queries. For a simpler notation, we assume that the cross product implicitly renames the attributes so that the new attribute names are qualified with the relation name, i.e. H. the supplier number from the relation WARE becomes WARE.

The telephone numbers of all suppliers who deliver vegetables in Bremen:

π _Telephone (σ _{Name = 'Vegetables' ∧ Place = 'Bremen' ∧ SUPPLIER.supplier no = WARE.supplier no} (SUPPLIER × GOODS))

All locations that contain at least one supplier and at least one customer

π _place (ρ _{place ← place of residence} (CUSTOMER)) ∩ π _place (SUPPLIER)

literature

Edgar F. Codd : A Relational Model of Data for Large Shared Data Banks. In: Communications of the ACM. 6/13/1970, pp. 377-387. (The fundamental work with which Codd first presented the relational data model in 1970. ACM Classics or Peter Morcinek: FHTW Berlin PDF; 1.4 MB)
Alfons Kemper, André Eickler: Database Systems - An Introduction. ISBN 3-486-57690-9
Peter Kandzia, Hans-Joachim Klein: Theoretical foundations of relational database systems. ISBN 3-411-14891-8
Andreas Heuer, Gunter Saake : Databases: Concepts and Languages. MITP Verlag, ISBN 3-8266-0619-1 , p. 297 ff.
H. Buff Database Theory . Book-on-Demand , Norderstedt, ISBN 3-0344-0201-5 , 312 pages
H.-J. Schek, P. Pistor: Data Structures for an Integrated Data Base Management and Information Retrieval System (Non First Normal Form NF²), Proceedings of the 8th International Conference on Very Large Data Bases, 1982, ISBN 0-934613-14-1 , p 197-207
Dirk Leinders, Jerzy Tyskiewicz, Jan Van den Bussche: On the expressive power of semijoin queries : arxiv : cs.DB / 0308014

Web links

Wikibooks: Relational algebra and SQL - learning and teaching materials

Explanatory videos for relational algebra , Big Data Analytics Group, Saarland University
RAT, Software Rational Algebra Translator to SQL
SELECT2OBree: Conversion of SQL into relational algebra
Database / introduction to joins. In: SELFHTML . Retrieved July 4, 2017 (Illustrative Examples).
Bibliography server
Further examples

Individual evidence

^ Jeffery D. Ullman: Principles od Database and Knowledgebase Systems - Volume I: Classical Database Systems . Computer Science Press, 1988, ISBN 0-7167-8158-1 , pp. 53 .
↑ Ramez Elmasri, Shamkant B. Navathe: Fundamentals of Database Systems . 3rd, revised edition. Pearson Studium, 2002, ISBN 3-8273-7021-3 , pp. 242 .
^ Jeffery D. Ullman: Principles of Database and Knowledgebase Systems - Volume I: Classical Database Systems . Computer Science Press, 1988, ISBN 0-7167-8158-1 , pp. 210 .
↑ ^a ^b Torsten Grust, Jens Teubner: Relational Algebra: Mother Tongue - XQuery: Fluent . In: Proc. of the first Twente Data Management Workshop on XML Databases . Enschede, The Netherlands 2004, pp. 9–16 ( uni-konstanz.de ).
↑ ^a ^b Richard Cyganiak: A relational algebra for SPARQL . Ed .: Hewlett-Packard Development Company. Bristol 2005 ( https://www.semanticscholar.org/paper/A-relational-algebra-for-SPARQL-Cyganiak/129ee5496d9b54fdfdc11214765f90d047fe57a1 , http://www.hpl.hp.com/techreports/2005/HPL-2005-170 .html ).
↑ ^a ^b Werner Kießling, Gerhard Köstler: Multimedia course database systems . Springer-Verlag, Berlin / Heidelberg 1998, ISBN 3-540-63836-9 .
^ Jeffery D. Ullman: Principles od Database and Knowledge-base Systems - Volume II: The New Technologies . Computer Science Press, 1989, ISBN 0-7167-8162-X , pp. 633-733 . , Chapter 11
^ ^A ^b Alfred Tarski: On the calculus of relations . In: The Journal of Symbolic Logic . tape 6 , no. 3 . Association for Symbolic Logic, New York September 1941, p. 73-89 , JSTOR : 2268577 .
^ ^A ^b ^c Edgar F. Codd: Derivability, Redundancy and Consistency of Relations Stored in Large Data Banks . In: ACM SIGMOD Record . tape 38 , no. 1 . Association for Computing Machinery, New York 1969, p. 17-36 ( acm.org ).
^ ^A ^b Edgar F. Codd: A Relational Model of Data for Large Shared Data Banks . In: Communications of the ACM . tape 13 , no. 6 . Association for Computing Machinery, New York 1970, pp. 377-387 ( acm.org ).
^ Rudolf Bayer, Edward M. McCreight: Organization and Maintenance of Large Ordered Indexes . In: Proceedings of the 1970 ACM SIGFIDET (SIGFIDET '70) . Association for Computing Machinery, 1970, pp. 107-141 .
^ Rudolf Bayer, Edward M. McCreight: Organization and Maintenance of Large Ordered Indexes . In: Acta Informatica . tape 1 , no. 3 . Springer-Verlag, 1972, p. 173-189 .
↑ Alexis Leon and Mathews Leon: SQL - A Complete Reference . Tata McGraw-Hill, New Delhi 1999, ISBN 0-07-463708-8 , pp. 68–69 ( limited preview in Google Book Search).
^ Jeffery D. Ullman: Principles od Database and Knowledgebase Systems - Volume I: Classical Database Systems . Computer Science Press, 1988, ISBN 0-7167-8158-1 , pp. 195-210 .
^ Jeffery D. Ullman: Principles od Database and Knowledgebase Systems - Volume I: Classical Database Systems . Computer Science Press, 1988, ISBN 0-7167-8158-1 , pp. 185-195 .
↑ Günther Pernul, Rainer Unland: Databases in the company - analysis, modeling and use . In: Textbooks Business Information Systems . 2nd, corrected edition. Oldenbourg Wissenschaftsverlag, ISBN 3-486-27210-1 , p. 228, 248 ( limited preview in Google Book search).

[Ullman_Vol_I_S_53-1] Jeffery D. Ullman: Principles od Database and Knowledgebase Systems - Volume I: Classical Database Systems . Computer Science Press, 1988, ISBN 0-7167-8158-1 , pp. 53 .

[Elmasri_Navate_Kapitel_S_242-2] Ramez Elmasri, Shamkant B. Navathe: Fundamentals of Database Systems . 3rd, revised edition. Pearson Studium, 2002, ISBN 3-8273-7021-3 , pp. 242 .

[Ullman_Vol_I_S_210-3] Jeffery D. Ullman: Principles of Database and Knowledgebase Systems - Volume I: Classical Database Systems . Computer Science Press, 1988, ISBN 0-7167-8158-1 , pp. 210 .

[Grust_Teubner_TDM_2004-4] Torsten Grust, Jens Teubner: Relational Algebra: Mother Tongue - XQuery: Fluent . In: Proc. of the first Twente Data Management Workshop on XML Databases . Enschede, The Netherlands 2004, pp. 9–16 ( uni-konstanz.de ).

[Cyganiak_2005-5] Richard Cyganiak: A relational algebra for SPARQL . Ed .: Hewlett-Packard Development Company. Bristol 2005 ( https://www.semanticscholar.org/paper/A-relational-algebra-for-SPARQL-Cyganiak/129ee5496d9b54fdfdc11214765f90d047fe57a1 , http://www.hpl.hp.com/techreports/2005/HPL-2005-170 .html ).

[Kießling_Köstler_1998-6] Werner Kießling, Gerhard Köstler: Multimedia course database systems . Springer-Verlag, Berlin / Heidelberg 1998, ISBN 3-540-63836-9 .

[Ullman_Vol_II_Chapter_11-7] Jeffery D. Ullman: Principles od Database and Knowledge-base Systems - Volume II: The New Technologies . Computer Science Press, 1989, ISBN 0-7167-8162-X , pp. 633-733 . , Chapter 11

[Tarski_1941-8] A ^b Alfred Tarski: On the calculus of relations . In: The Journal of Symbolic Logic . tape 6 , no. 3 . Association for Symbolic Logic, New York September 1941, p. 73-89 , JSTOR : 2268577 .

[Codd_1969-9] A ^b ^c Edgar F. Codd: Derivability, Redundancy and Consistency of Relations Stored in Large Data Banks . In: ACM SIGMOD Record . tape 38 , no. 1 . Association for Computing Machinery, New York 1969, p. 17-36 ( acm.org ).

[Codd_1970-10] A ^b Edgar F. Codd: A Relational Model of Data for Large Shared Data Banks . In: Communications of the ACM . tape 13 , no. 6 . Association for Computing Machinery, New York 1970, pp. 377-387 ( acm.org ).

[Bayer_1970-11] Rudolf Bayer, Edward M. McCreight: Organization and Maintenance of Large Ordered Indexes . In: Proceedings of the 1970 ACM SIGFIDET (SIGFIDET '70) . Association for Computing Machinery, 1970, pp. 107-141 .

[Bayer_1972-12] Rudolf Bayer, Edward M. McCreight: Organization and Maintenance of Large Ordered Indexes . In: Acta Informatica . tape 1 , no. 3 . Springer-Verlag, 1972, p. 173-189 .

[13] Alexis Leon and Mathews Leon: SQL - A Complete Reference . Tata McGraw-Hill, New Delhi 1999, ISBN 0-07-463708-8 , pp. 68–69 ( limited preview in Google Book Search).

[Ullman_Vol_I_S_195-14] Jeffery D. Ullman: Principles od Database and Knowledgebase Systems - Volume I: Classical Database Systems . Computer Science Press, 1988, ISBN 0-7167-8158-1 , pp. 195-210 .

[Ullman_Vol_I_S_185-15] Jeffery D. Ullman: Principles od Database and Knowledgebase Systems - Volume I: Classical Database Systems . Computer Science Press, 1988, ISBN 0-7167-8158-1 , pp. 185-195 .

[db_in_unt_2_S228_S248-16] Günther Pernul, Rainer Unland: Databases in the company - analysis, modeling and use . In: Textbooks Business Information Systems . 2nd, corrected edition. Oldenbourg Wissenschaftsverlag, ISBN 3-486-27210-1 , p. 228, 248 ( limited preview in Google Book search).

Relational algebra

contents

History and meaning

General

Operations

Set operations

Union

Intersection

difference

Symmetrical difference

Cartesian product (cross product)

projection

selection

Join

Equi-join

Natural join

Semi join

Outer join

renaming

division

Minimalism and completeness

Relational Algebra Extensions

Zero values

Grouping operator and aggregate functions

NF²

Multi-set semantics

Examples

See also

literature

Web links

Individual evidence