Network database model

from Wikipedia, the free encyclopedia

The network database model was proposed by the Data Base Task Group (DBTG) of the Programming Language Committee (later the COBOL Committee) of the Conference on Data Systems Languages (CODASYL), the organization that was also responsible for defining the COBOL programming language. It is therefore also known as the "CODASYL database model" or "DBTG database model" and is strongly influenced by COBOL. The finished DBTG report was presented in 1971, around the same time as the first publications on the relational database model. It contained proposals for three database languages: a Schema Data Description Language, a Subschema Data Description Language, and a Data Manipulation Language.

The network model does not require a strict hierarchy and can therefore also represent m:n relationships, i.e. a record can have several predecessors. Several records can also sit at the top level. There are usually several search paths that lead to a given record. The model can be seen as a generalization of the hierarchical database model.

The logical data model

Database records

A network database consists of records, each of which consists of fields (data items). A field has a name and a value. Each record describes a person, an object or an event.

A network database management system (DBMS) processes records. A record, or more precisely a record occurrence, can be stored as a whole in the database (STORE), changed (MODIFY) and deleted again (DELETE).

A CODASYL record usually has an internal structure. Fields can be combined into group fields, and group fields can in turn contain not only individual fields but also other group fields.

A CODASYL record can contain so-called "tables", i.e. several values that are managed under one name. The individual values or elements of the table are addressed by subscripts, e.g. Customer.Mayer.MonthlySales[0]. Duplicate field names are also allowed, provided they occur in different record types. They must then be addressed in qualified form, e.g. Customer.Name and Supplier.Name.

All records of a record type, which must have a unique name, have the same internal structure. A record type is therefore a general description of many records, namely its occurrences (record occurrences). All record types are defined in the database schema.
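The record structure described above can be sketched in Python. This is only an illustration, not CODASYL syntax: the record types Customer and Supplier follow the example in the text, while the Address group field, the 12-element table and all values are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    """Group field: several data items managed together (hypothetical)."""
    street: str
    city: str


@dataclass
class Customer:
    """Record type; each instance plays the role of a record occurrence."""
    name: str                 # elementary data item
    address: Address          # group field
    # "Table": several values managed under one name.
    monthly_sales: List[float] = field(default_factory=lambda: [0.0] * 12)


@dataclass
class Supplier:
    """Second record type that reuses the field name 'name'."""
    name: str


mayer = Customer("Mayer", Address("Hauptstr. 1", "Bonn"))
mayer.monthly_sales[0] = 1250.0   # table element addressed by subscript

supplier = Supplier("Schmidt")
# Qualified access keeps the two 'name' fields apart.
names = (mayer.name, supplier.name)
```

The two `name` fields do not clash because each is reached only through its own record type, which mirrors the qualified addressing mentioned above.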

Database key

Records within a database can have the same values in all fields. The database key (DBK), by contrast, is an internal key that is unique within a database and is assigned when the record is stored for the first time.
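A minimal sketch of this idea follows. The class name Database, the store method and the use of a simple counter as key generator are assumptions for illustration; in a real system the database key is an internal value chosen by the DBMS.

```python
import itertools


class Database:
    """Toy store that hands out a fresh database key on every store()."""

    def __init__(self):
        self._next_key = itertools.count(1)   # hypothetical key generator
        self.records = {}                     # database key -> field values

    def store(self, fields: dict) -> int:
        dbk = next(self._next_key)            # assigned on first store
        self.records[dbk] = dict(fields)
        return dbk


db = Database()
k1 = db.store({"name": "Mayer", "city": "Bonn"})
k2 = db.store({"name": "Mayer", "city": "Bonn"})
# Both records have identical field values but different database keys.
```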

Set

Relationships between records are expressed by a special construct called a set. In the simplest case, a set type involves two different record types. Each set occurrence has exactly one record of the first record type, the owner of the set. It can have no records (empty set occurrence), one record or several records of the second record type, the members of the set. The members have a defined order. The uniquely named set type describes all set occurrences of the same relationship.
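The owner and member roles can be illustrated with a small Python sketch. The names Record, SetOccurrence, CUSTOMER-ORDERS and insert_last are hypothetical, and a plain list stands in for whatever linkage a real DBMS uses internally.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    record_type: str
    fields: dict


@dataclass
class SetOccurrence:
    set_type: str
    owner: Record                                          # exactly one owner
    members: List[Record] = field(default_factory=list)   # ordered members

    def insert_last(self, member: Record) -> None:
        # Corresponds to a set whose order places new members at the end.
        self.members.append(member)


customer = Record("CUSTOMER", {"name": "Mayer"})
orders = SetOccurrence("CUSTOMER-ORDERS", owner=customer)
empty_at_start = len(orders.members) == 0    # empty set occurrence

orders.insert_last(Record("ORDER", {"no": 1}))
orders.insert_last(Record("ORDER", {"no": 2}))
order_numbers = [m.fields["no"] for m in orders.members]
```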

The two structure types that describe the network database model are:

  • the record type; the individual records of a record type are called record occurrences
  • the set type; the individual sets of a set type are called set occurrences

All record types and all set types must be described in the database schema.

The schema of a database is represented graphically as a Bachman diagram (data structure diagram), in which each record type is shown as a rectangle (containing the record type name and the attribute names) and each set type as an arrow from the owner to the member.

Data manipulation

A data manipulation language consists of update functions and query functions. The update functions include storing new records of a record type, changing existing records, deleting existing records, inserting an existing record into a set as a member, and removing a record from a set.

An application program usually needs only parts of a database or of a database schema. For this reason, subschemas are defined for the programs or program groups that work with these parts of the database.

Data description language

The "Record" clause

The description of a record type includes a unique record name, the description of the attributes of all data fields and the specification of the so-called "location mode".

  • RECORD NAME IS <record name>
  • DATA ITEM sub-clause: It defines individual fields, group fields and "tables"
  • LOCATION MODE sub-clause: It defines the rule by which records (record occurrences) are assigned database keys (DBK), namely SYSTEM, CALC USING [data item, ...], VIA <set name> SET or DIRECT. SYSTEM means that the system is free to choose the fastest method currently available. CALC is short for calculation: the DBK is calculated from the concatenated values of the fields named in the USING clause. VIA ... is intended to ensure that the member record is stored as close as possible to the owner. DIRECT means that the application program supplies the database key itself.
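The CALC rule can be sketched as a hash over the concatenated field values. The choice of CRC-32, the page count NUM_PAGES and the function name calc_page are assumptions for illustration; the text only states that the key is calculated from the fields named in the USING clause.

```python
import zlib

NUM_PAGES = 97   # hypothetical number of storage pages


def calc_page(record: dict, using: list) -> int:
    """Derive a storage position from the fields named in USING."""
    # Concatenate the values of the CALC fields in the given order.
    key = "".join(str(record[name]) for name in using)
    # Hash the concatenation and map it onto the available pages.
    return zlib.crc32(key.encode("utf-8")) % NUM_PAGES


a = calc_page({"customer_no": 4711, "name": "Mayer"}, ["customer_no"])
b = calc_page({"customer_no": 4711, "name": "Schmidt"}, ["customer_no"])
# Only the CALC field matters, so both records map to the same page.
```

Because the position depends only on the CALC fields, a later lookup by the same field values can recompute it without scanning the database.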

The "Set" clause

The definition of a set includes a unique name for the set type, the name of the owner record type, the name of the member record type and the desired sort order.

  • SET NAME IS [set name]
  • OWNER IS [record name]
  • MEMBER IS [record name]
  • ORDER IS {FIRST / LAST / NEXT / PRIOR / SORTED BY ...}.

Additional options can be:

  • SET IS PRIOR PROCESSABLE. Each member should also hold a pointer to the previous member record.
  • SET IS LINKED TO OWNER. Each member should also hold a direct pointer to the owner.
  • INSERTION IS {AUTOMATIC / MANUAL}. With AUTOMATIC, every record of this record type that is stored in the database automatically becomes a member of the set. MANUAL means that records are not inserted automatically, but only when the program requests it with the INSERT command.
  • RETENTION IS {MANDATORY / OPTIONAL}. OPTIONAL means that the application program can remove a record from the set without deleting the record; with MANDATORY this is not possible.
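The effect of the INSERTION and RETENTION options can be sketched as follows. This is a simplified model with one occurrence per set type; the class SetType, its method names and the set names are hypothetical, and the reading of MANDATORY as "a member cannot be removed from the set" is an assumption of this sketch.

```python
class SetType:
    """One set type with a single occurrence, reduced to its membership rules."""

    def __init__(self, name: str, insertion: str, retention: str):
        self.name = name
        self.insertion = insertion    # "AUTOMATIC" or "MANUAL"
        self.retention = retention    # "MANDATORY" or "OPTIONAL"
        self.members = []

    def on_store(self, record: str) -> None:
        # Called when a new record of the member type is stored.
        if self.insertion == "AUTOMATIC":
            self.members.append(record)

    def insert(self, record: str) -> None:
        # Explicit INSERT requested by the program.
        self.members.append(record)

    def remove(self, record: str) -> None:
        # REMOVE takes the record out of the set but keeps the record itself.
        if self.retention == "MANDATORY":
            raise ValueError(f"{record} must stay a member of {self.name}")
        self.members.remove(record)


auto = SetType("CUSTOMER-ORDERS", insertion="AUTOMATIC", retention="MANDATORY")
manual = SetType("PROMOTION-ORDERS", insertion="MANUAL", retention="OPTIONAL")

for set_type in (auto, manual):
    set_type.on_store("order-1")      # only the AUTOMATIC set picks it up

manual.insert("order-1")              # explicit membership
manual.remove("order-1")              # allowed: retention is OPTIONAL

try:
    auto.remove("order-1")
    mandatory_blocked = False
except ValueError:
    mandatory_blocked = True          # rejected: retention is MANDATORY
```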

As the definition shows, a set is strictly hierarchical: one owner can have many members. How, then, is the many-to-many relationship that gives the network model its name represented? Take a bill of materials as an example: a part (or assembly) can consist of many different parts (or assemblies). Viewed on its own, the bill of materials is a tree structure. However, a part can also be used in many assemblies, and one would like to have a where-used list for it. A second tree structure for this purpose would introduce considerable redundancy. Instead, structure records (link records) are used, each of which is a member both of the parts-list set of one part master record and of the where-used set of another. The m:n relationship is thus realized.
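The bill-of-materials example can be sketched with two record types and two sets. The names Part, Structure, components, used_in and link are hypothetical; plain Python lists stand in for the two sets owned by each part master record.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Part:
    """Part master record; owner of two sets of structure records."""
    name: str
    components: List["Structure"] = field(default_factory=list)  # parts-list set
    used_in: List["Structure"] = field(default_factory=list)     # where-used set


@dataclass(eq=False)
class Structure:
    """Link record: member of one parts-list set and one where-used set."""
    assembly: Part
    component: Part
    quantity: int


def link(assembly: Part, component: Part, quantity: int) -> Structure:
    s = Structure(assembly, component, quantity)
    assembly.components.append(s)   # member of the assembly's parts list
    component.used_in.append(s)     # member of the component's where-used set
    return s


bike = Part("bicycle")
cart = Part("cart")
wheel = Part("wheel")
link(bike, wheel, 2)
link(cart, wheel, 4)

bom = [(s.component.name, s.quantity) for s in bike.components]
where_used = [s.assembly.name for s in wheel.used_in]
```

Each relationship is stored once, in a single structure record, yet it can be reached from either side, which is what avoids the redundant second tree.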

Data manipulation language

The user working area (UWA)

UWA stands for User Working Area. Each concurrently running program is given its own UWA. It contains pointers, called currency indicators, that refer to records in the database (record occurrences). It also contains templates for the different record types, and a variable called Error status that holds the result of the last DML command.

Currency indicators

Data manipulation in a CODASYL database is based on currency indicators and navigation. Currency indicators are variables whose values are database keys in their internal format.

  • At any time there is exactly one record (record occurrence) in the database that is available to the run-unit for processing; it is called the current of run-unit. In CODASYL terminology, a "run-unit" is a running application program. When a record is selected (with FIND) or stored (with STORE), it becomes the current record of the program (current record of run-unit).
  • The sequence of records that successively become the current record of a program is called navigation.
  • The record that was last accessed, regardless of its record type, is the current of run-unit.
  • The record of a particular record type that was last accessed is the current record of that record type. Example: the CUSTOMER record that was last accessed is the current of the CUSTOMER record type.
  • Correspondingly, there is a current of set type. For each set type (consisting of owner record type and member record type), the record that was last accessed is the current record of the set, regardless of whether it was an owner or a member.

The user working area therefore contains:

  • a currency indicator for the run-unit
  • a currency indicator for each record type
  • a currency indicator for each set type, referring to either the owner or a member, depending on which was last accessed.

Currency indicators are updated each time a DML operation is performed. When a record is accessed, it becomes the current of run-unit, the current of its record type and the current of all set types in which it participates as owner or member.
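The bookkeeping described here can be sketched as a small class. The name UserWorkingArea, the touch method and the example keys and set names are assumptions; the sketch only shows which indicators move when a record is accessed.

```python
class UserWorkingArea:
    """Holds the three kinds of currency indicators of one run-unit."""

    def __init__(self):
        self.current_of_run_unit = None
        self.current_of_record_type = {}   # record type name -> database key
        self.current_of_set_type = {}      # set type name -> database key

    def touch(self, dbk: int, record_type: str, set_types: list) -> None:
        """Update all indicators after a record has been accessed."""
        self.current_of_run_unit = dbk
        self.current_of_record_type[record_type] = dbk
        for set_type in set_types:         # sets in which the record occurs
            self.current_of_set_type[set_type] = dbk


uwa = UserWorkingArea()
# A CUSTOMER record (key 10) that owns a CUSTOMER-ORDERS set occurrence.
uwa.touch(10, "CUSTOMER", ["CUSTOMER-ORDERS"])
# An ORDER record (key 25): member of CUSTOMER-ORDERS, owner of ORDER-ITEMS.
uwa.touch(25, "ORDER", ["CUSTOMER-ORDERS", "ORDER-ITEMS"])
```

After the second access, the CUSTOMER record is still the current of its record type, but it is no longer the current of run-unit or of the CUSTOMER-ORDERS set.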

Record templates

Templates are "empty" areas in the format of the respective record type. The application program can address them as record name.field name (or by field name alone if it is unique).

A GET command reads the record into the corresponding template, where the program can process it. A record is also stored via the template: once the program has filled it in, a STORE command copies the template into the database.

Error status

After a DML command has been executed, the Error_Status variable in the UWA contains 0 if the operation was successful and a non-zero value otherwise. Non-zero values are not always errors; for example, reaching the end of a set while reading through it is merely an indication, not an error.

Read records

A data record is read in two steps:

  1. A FIND command is used to set the current of run-unit. FIND only changes the currency indicators, not the template in the UWA.
  2. The record is transferred to the UWA with a separate GET command. GET always transfers the current of run-unit to the corresponding template area. The format of the command is: GET [record name].

The FIND command has several variants, all of which serve to make a particular record (record occurrence) the current record of the run-unit, of its record type and of all sets in which it occurs. Variants of the FIND command include:

  • finding a record by its database key,
  • finding a record by the value of its CALC key,
  • finding all records in a set one after the other,
  • finding all records in a set that have a certain value in a certain field,
  • finding the owner of a set.
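The two-step read can be sketched as a loop over one set occurrence. Everything here is a simplified stand-in: the class RunUnit, the methods find_next and get, and the status value END_OF_SET are hypothetical and do not reproduce the DML of any particular system.

```python
END_OF_SET = 1   # hypothetical non-zero status: an indication, not an error


class RunUnit:
    """Toy run-unit that walks through the members of one set occurrence."""

    def __init__(self, records: dict, set_members: list):
        self.records = records            # database key -> field values
        self.set_members = set_members    # member keys in set order
        self.current = None               # current of run-unit
        self.position = -1                # position within the set
        self.template = {}                # record template in the UWA
        self.error_status = 0

    def find_next(self) -> None:
        """FIND: move the currency indicator, leave the template alone."""
        if self.position + 1 >= len(self.set_members):
            self.error_status = END_OF_SET
            return
        self.position += 1
        self.current = self.set_members[self.position]
        self.error_status = 0

    def get(self) -> None:
        """GET: copy the current of run-unit into the template."""
        self.template = dict(self.records[self.current])


run = RunUnit({7: {"no": 1}, 9: {"no": 2}}, [7, 9])
run.find_next()
template_after_find = dict(run.template)   # FIND alone transfers no data

seen = []
while run.error_status == 0:
    run.get()
    seen.append(run.template["no"])
    run.find_next()
```

The loop ends when find_next reports the end of the set through the status variable, which is the typical pattern for reading through all members.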

Add and change records

There are various commands for this: storing (STORE) a new record, inserting (INSERT) a record into a set, removing (REMOVE) a record from a set, deleting (DELETE) the current record and changing (MODIFY) the current record. All commands have many different options, the use of which depends heavily on the respective database structure (schema) and can be very complex in individual cases.

To make life easier for application programmers in practice, I/O modules placed between the program (e.g. a COBOL program) and the database have proven their worth.

The network database model today

After both database models (relational and network) were introduced at the beginning of the 1970s, highly efficient network database systems ran on medium-sized and large mainframes at the highest transaction rates for about twenty years before relational database systems could match their performance. It is not without reason that IBM's hierarchical database system from the late 1960s is still in use at many IBM customers today. Query languages for ad hoc queries were also available on network systems, for example QLP/1100 from Sperry Rand. Today the network database model is used mainly on mainframes.

Well-known implementations of the network database model are UDS (Universal Database System) from Siemens and DMS (Database Management System) from Sperry Univac. Hybrids of relational and network databases have also been developed, e.g. by Sperry Univac (RDBMS, Relational Database Management System) and Siemens (UDS/SQL), with the aim of combining the advantages of both models.

Since the 1990s, the network database model has increasingly been displaced by the relational database model. With the idea of the semantic web, the network database model is gaining importance again. There are now a number of graph databases that could be described as modern successors of network databases. A major difference, however, is their ability to change the schema dynamically.
