Database

from Wikipedia, the free encyclopedia

A database, also known as a database system, is a system for electronic data management. Its main task is to store large amounts of data efficiently, consistently and permanently, and to provide the required subsets to users and application programs in forms of representation suited to their needs.

A database consists of two parts: the management software, called the database management system (DBMS), and the set of data to be managed, the database (DB) in the narrower sense, sometimes also called the "data base". The management software organizes the structured storage of the data internally and controls all read and write access to the database. A database system offers a database language for querying and managing the data.

The most common form of database is the relational database. The structure of the data is determined by a database model.

A distinction must be made between the term database as described here (consisting of DBMS and data) and database applications. The latter are computer programs (often part of the application software) that manage and store the data they need using a database system. Examples: order management, ordering, customer and address management, invoicing.

In common usage, data that is not managed by a database system, such as a set of thematically related files, is sometimes (and conceptually incorrectly) also referred to as a "database".

History

In response to problems with processing data in simple files, the concept of managing data through a separate software layer between the operating system (file management) and the application program was introduced in the 1960s. This concept countered an undesirable development: data stores in the form of files were usually designed for one particular application, and a significant part of day-to-day operations was taken up by copying, merging and restructuring files.

One of the first large DBMSs was IMS with the language DL/I (Data Language One). The databases managed with it were structured hierarchically. At the same time, CODASYL defined a model for network-like structured databases.

Edgar F. Codd made significant progress in the 1960s and 1970s with his research at IBM's San Jose Research Laboratory, the predecessor of the IBM Almaden Research Center. Codd developed the foundations for System R, the first experimental relational database system. The Berkeley group followed with Ingres and the query language QUEL.

Oracle (at that time still operating under the company names SDL and RSI) exploited the results of System R and made SQL a commercial success. IBM followed with SQL/DS and DB2. Relational database systems replaced hierarchical and network-like systems in the 1980s, and the majority of authorities, corporations, institutes and medium-sized companies switched their IT to database systems.

In the 1990s, a few commercial vendors and products dominated the database software market (namely IBM, Informix, dBASE, Microsoft SQL Server and Oracle). In the 2000s, open-source database management systems became increasingly important; above all, MySQL and PostgreSQL achieved significant market shares. In response, the leading commercial vendors began offering versions of their database software free of license fees. Since around 2001, NoSQL systems have grown in importance because of the limited scalability of relational databases.

A family tree of database systems is available as the Genealogy of Relational Database Management Systems at the Hasso Plattner Institute.

Significance

Database systems are a central component of corporate software today and a critical part of many companies and authorities. An organization's ability to act depends on the availability, completeness and correctness of its data. Data security is therefore an important and legally required responsibility of the IT department of a company or government agency.

Components of a database system

The database system is the running DBMS together with the data it manages. A database ensures the persistent storage and the consistency of an institution's user data. Through the DBMS, it offers database applications interfaces to query, evaluate, change and manage this data.

Database management system

The database management system (DBMS) is the software that is installed and configured for the database system. The DBMS defines the database model, must satisfy most of the requirements listed below, and largely determines the functionality and speed of the system. Database management systems are themselves highly complex software systems.

In German, the term Datenbankverwaltungssystem (DBVS) is (rarely) used instead of Datenbankmanagementsystem (DBMS).

The abbreviation RDBMS is commonly used for a relational database management system.

Database

In theory, a database is a logically coherent collection of data. This data is managed by a running DBMS and stored on non-volatile storage media, invisible to application systems and users. To ensure efficient access, the DBMS usually manages a storage hierarchy that includes, in particular, a fast intermediate store (buffer pool). To maintain the consistency of the data, all application systems must go through the DBMS in order to use the database. Only administrative activities, such as data backup, are permitted direct access to the storage. The logical structure of the data to be stored is developed and defined as a data model during data modeling and stored in its final form in the DBMS, following the DBMS's syntax rules. For this purpose, the DBMS creates, uses and manages a "system catalog" (data dictionary) with meta information about the database, for example its structure, its data fields (name, length, format, ...), access rules and integrity conditions.
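The role of the system catalog can be illustrated with a minimal sketch. It assumes SQLite accessed through Python's standard sqlite3 module; the table name customer is hypothetical, and other DBMSs expose their catalogs through different tables or views.

```python
import sqlite3

def demo_catalog():
    """Create one table, then read its description back from the catalog."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # sqlite_master is SQLite's catalog table: one row per schema object.
    objects = conn.execute("SELECT type, name FROM sqlite_master").fetchall()
    # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
    columns = [
        (row[1], row[2], bool(row[3]))
        for row in conn.execute("PRAGMA table_info(customer)")
    ]
    conn.close()
    return objects, columns
```

Here the catalog reports the table itself and, for each column, its name, declared type and whether empty values are forbidden.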

DBMS vendors use slightly different definitions of what exactly a database is: either all the data managed by a running DBMS (the instance), or only the data that belongs together in terms of content. In the case of distributed databases, the model also comprises several databases on different systems that are connected to one another.

Examples

  • All banks and insurance companies work with database systems, usually with relational DBMSs. All customer and account information, bookings and other data are stored in a structured manner in the database system. In this application environment, data protection and data security have high priority. Database systems are used here for day-to-day business (OLTP) as well as periodically or ad hoc for other purposes (for example in marketing, controlling, accounting and many other areas; see also OLAP).
  • Practically all medium-sized companies and large corporations use ERP systems for resource planning, whose data layer takes the form of database systems.
  • This article, in the version available on Wikipedia, is managed by a database system (see Wikipedia technology) along with all the other articles there.
  • Market research institutes compile their own and external data in data warehouses (data stores).

Functions of a DBMS

The main functions of today's database management systems are:

Data security

The RDBMS stores the relational data on a storage medium. In addition to the actual data, it also stores information about the data schemas and the access rights of users. The latter are important for guaranteeing data security, which includes both protection against data loss and protection against unauthorized access. The metadata of a DBMS is also known as the system's data dictionary or catalog.

Another important aspect is protecting the data through backups. In practice, this is often a performance problem that should not be underestimated, since data can be modified only to a very limited extent while a backup is running.

Transactions

Another important part of data security is the transaction concept, which protects data against race conditions caused by parallel access by several users. Without it, data could be changed by different users at the same time; the result of the changes would then depend on chance, or the data could become inconsistent. Put simply, a transaction temporarily locks data against access by other users until the transaction is ended by a commit or its changes are undone by a rollback. The data is then available again for other transactions.
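Commit and rollback can be sketched as follows, assuming SQLite via Python's standard sqlite3 module. The account table and the transfer function are hypothetical illustrations. The sketch shows only the all-or-nothing behavior of a single transaction, not locking between concurrent users.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; return True if the transaction committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # The CHECK constraint rejects an overdraft; the rollback then
            # also undoes the credit made by the statement above.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def demo_transaction():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
    conn.commit()
    committed = transfer(conn, 1, 2, 30)     # succeeds and is committed
    overdrawn = transfer(conn, 1, 2, 500)    # fails and is rolled back
    balances = [
        row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")
    ]
    conn.close()
    return committed, overdrawn, balances
```

After the failed second transfer, both balances are exactly as the first transfer left them, because the partial credit was rolled back together with the rejected debit.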

Data integrity

The integrity of the data can be ensured through constraints. These are rules in the management system that describe how data may be changed. The most important example in relational database systems is the foreign key constraint. It prevents the deletion of data that is still required by another table, i.e. that is referenced via a foreign key. See the main article on referential integrity.

Other integrity conditions regulate, for example, whether duplicates are allowed or what content individual data fields may contain ("domain integrity", including whether empty values are permitted).
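The constraints described above can be tried out in a short sketch, again assuming SQLite through Python's sqlite3 module. The tables customer and orders are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other systems enforce them by default.

```python
import sqlite3

def demo_constraints():
    """Return the labels of all statements rejected by a constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required in SQLite
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL REFERENCES customer(id))"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'a@example.org')")
    conn.execute("INSERT INTO orders VALUES (10, 1)")

    attempts = [
        # Foreign key: the customer is still referenced by an order.
        ("delete referenced row", "DELETE FROM customer WHERE id = 1"),
        # UNIQUE: duplicates of the email value are not allowed.
        ("duplicate value", "INSERT INTO customer VALUES (2, 'a@example.org')"),
        # NOT NULL: an empty email is not permitted.
        ("missing value", "INSERT INTO customer VALUES (3, NULL)"),
    ]
    rejected = []
    for label, sql in attempts:
        try:
            conn.execute(sql)
        except sqlite3.IntegrityError:
            rejected.append(label)
    conn.close()
    return rejected
```

All three attempts are refused by the DBMS itself, so no application can bypass the rules by accident.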

Query optimization

Figure: Evaluation plan in the form of an operator tree

The DBMS provides a database language so that data can be queried and changed. A query to the database system is first translated into the logical operations of relational algebra. Then so-called database operators are selected that actually perform the logical operations on the data. Choosing the operators and the order in which they are executed is the task of the query optimizer, which produces an execution plan. The optimizer is a particularly complex part of the database software and has a significant influence on the efficiency of the overall system.

Indexes play an important role in query optimization. They are used to find a specific record quickly. Which data is indexed is determined in the database schema but can be adjusted later by a database administrator.
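The effect of an index on the execution plan can be observed in a minimal sketch, assuming SQLite via Python's sqlite3 module and its EXPLAIN QUERY PLAN statement. The table person and the index idx_person_city are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

def plan_for(conn, sql):
    """Return the optimizer's plan for sql as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)

def demo_index():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    query = "SELECT id FROM person WHERE city = 'Bonn'"
    before = plan_for(conn, query)   # no index yet: full table scan
    conn.execute("CREATE INDEX idx_person_city ON person(city)")
    after = plan_for(conn, query)    # the optimizer can now use the index
    conn.close()
    return before, after
```

The query text is identical in both calls; only the plan chosen by the optimizer changes once the index exists.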

Application support

To support database applications, database systems offer triggers and stored procedures. A trigger starts an action in the database when a certain event occurs, often an insert or change operation. Stored procedures are used to execute scripts in the database. Since stored procedures are executed within the database system, they are often the most efficient way to manipulate data. Databases that support triggers and stored procedures are also called active databases.
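A trigger can be sketched as follows, assuming SQLite via Python's sqlite3 module. The tables product and price_log and the trigger log_price_change are hypothetical. SQLite offers triggers but no stored procedures, so only the former is shown.

```python
import sqlite3

def demo_trigger():
    """Update a price and return what the trigger wrote to the log table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product (id INTEGER PRIMARY KEY, price INTEGER NOT NULL);
        CREATE TABLE price_log (
            product_id INTEGER, old_price INTEGER, new_price INTEGER
        );
        -- Fires inside the database after every price change.
        CREATE TRIGGER log_price_change AFTER UPDATE OF price ON product
        BEGIN
            INSERT INTO price_log VALUES (OLD.id, OLD.price, NEW.price);
        END;
        INSERT INTO product VALUES (1, 100);
        """
    )
    # The application only updates the product; it never touches price_log.
    conn.execute("UPDATE product SET price = 120 WHERE id = 1")
    log = conn.execute(
        "SELECT product_id, old_price, new_price FROM price_log"
    ).fetchall()
    conn.close()
    return log
```

The log entry is created by the database system itself in reaction to the update event, which is the behavior the term "active database" refers to.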

Languages

A database provides a database language as an interface for the following purposes:

  • Data query and manipulation (DML)
  • Administration of the database and definition of the data structures (DDL)
  • Authorization control (DCL)

In relational DBMSs, these categories are combined in one language (SQL); in other systems they are separated into different languages.
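The three categories can be illustrated with one SQL statement each. This is a minimal sketch assuming SQLite via Python's sqlite3 module; the table book and the user reporting_user are hypothetical. SQLite has no user management, so the DCL statement is shown as text only and is not executed.

```python
import sqlite3

# (category, statement) pairs that SQLite can execute.
STATEMENTS = [
    ("DDL", "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"),
    ("DML", "INSERT INTO book (title) VALUES ('Database Systems')"),
    ("DML", "UPDATE book SET title = 'Database Systems, 2nd ed.' WHERE id = 1"),
]

# Standard SQL authorization statement; not supported by SQLite.
DCL_EXAMPLE = "GRANT SELECT ON book TO reporting_user"

def run_language_categories():
    """Execute the DDL and DML statements and return the stored titles."""
    conn = sqlite3.connect(":memory:")
    for _category, sql in STATEMENTS:
        conn.execute(sql)
    # A query also belongs to the DML category.
    titles = [row[0] for row in conn.execute("SELECT title FROM book")]
    conn.close()
    return titles
```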

Multi-user capability

Access to the data is governed by authorizations. Without the appropriate authorization, an operation cannot be performed.

For (pseudo-)simultaneous access by several applications or users, the DBMS regulates situations in which they compete for the same data:

  • Locks are managed.
  • System logs (log files) are kept.
  • The database is transaction-oriented.

This group of requirements distinguishes database systems in the narrower sense from functionally extended file systems.

Errors in a database that arise from impermissible parallel database access are called anomalies in multi-user operation.

Different forms of database systems

Database model

The basis for structuring the data and their relationships to one another is the database model that is defined by the DBMS manufacturer. Depending on the database model, the database schema must be adapted to certain structuring options:

  • hierarchical: The data objects can only have a parent-child relationship to one another.
  • network-like: The data objects are connected to one another in networks.
  • relational: The data is managed row by row in tables. Any relationships between data are possible; they are determined by the values of certain table columns.
  • object-oriented: The relationships between data objects are managed by the database system itself. Objects can inherit properties and data from other objects.
  • document-oriented: The objects to be stored are saved as documents with possibly different attributes, i.e. without the requirement that they share the same structure.

There are a number of mixed and subsidiary forms, such as the object-relational model.
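In the relational model listed above, relationships exist only through matching column values. A minimal sketch, assuming SQLite via Python's sqlite3 module and hypothetical tables author and book with made-up rows, shows how a join resolves such a relationship:

```python
import sqlite3

def demo_join():
    """Relate books to authors through the values in book.author_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO author VALUES (?, ?)", [(1, "Author A"), (2, "Author B")]
    )
    conn.executemany(
        "INSERT INTO book VALUES (?, ?, ?)",
        [(1, "First Book", 1), (2, "Second Book", 2), (3, "Third Book", 1)],
    )
    # No pointers or nesting: the relationship is the equality of two values.
    rows = conn.execute(
        "SELECT author.name, book.title "
        "FROM book JOIN author ON book.author_id = author.id "
        "ORDER BY book.id"
    ).fetchall()
    conn.close()
    return rows
```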

Orientation

A classic distinction is made between orienting the system toward many small queries (OLTP) or toward long-running analyses (OLAP). However, it is quite common for the same system to have to meet both requirements, for example by running in OLTP mode during the day and in OLAP mode at night. A database administrator then works out different configurations (main memory of the server, number of processes, optimization strategy for access, etc.).
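The difference between the two kinds of workload can be sketched with two queries against the same data. The example assumes SQLite via Python's sqlite3 module and a hypothetical table sale; it illustrates only the shape of the queries, not the tuning a real system would need.

```python
import sqlite3

def demo_workloads():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sale (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO sale (region, amount) VALUES (?, ?)",
        [("north", 10), ("south", 20), ("north", 5)],
    )
    # OLTP style: a small query that touches a single row by its key.
    single = conn.execute("SELECT amount FROM sale WHERE id = ?", (2,)).fetchone()[0]
    # OLAP style: an evaluation that reads and aggregates the whole table.
    totals = conn.execute(
        "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return single, totals
```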

Literature

  • Ramez Elmasri, Shamkant B. Navathe: Basics of database systems. 3rd edition, basic studies edition. Pearson Studium, Munich et al. 2005, ISBN 3-8273-7153-8.
  • Andreas Heuer, Gunter Saake: Databases. Concepts and languages. 2nd, updated and expanded edition. mitp-Verlag, Bonn 2000, ISBN 3-8266-0619-1.
  • Alfons Kemper, André Eickler: Database systems. An introduction. 7th, updated and expanded edition. Oldenbourg Verlag, Munich et al. 2009, ISBN 978-3-486-59018-0.
  • Thomas Kudraß (ed.): Pocket book databases. Fachbuchverlag Leipzig in Carl Hanser Verlag, Munich 2007, ISBN 978-3-446-40944-6.
  • T. William Olle: The Codasyl Approach to Data Base Management. Wiley, Chichester 1978, ISBN 0-471-99579-7.
  • Gottfried Vossen: Data models, database languages and database management systems. 5th, corrected and supplemented edition. Oldenbourg Verlag, Munich et al. 2008, ISBN 3-486-27574-7.
