Db2

from Wikipedia, the free encyclopedia
Db2

IBM logo.svg
Basic data

developer IBM
Publishing year 1983
Current  version 12
(z / OS: October 25, 2013 / LUW: April 12, 2016)
operating system IBM mainframe ( z / OS ), Linux , Unix , Windows
programming language C , C ++ , assembly language
category Database management system
License proprietary
German speaking Yes
Db2 page at ibm.com

Db2 is a commercial relational database management system (RDBMS) from IBM , whose origins go back to System R and the foundations of Edgar F. Codd of IBM Research in 1970.

properties

The database management system DataBase 2 (Db2) is sold by IBM for various platforms:

There is the product line for IBM mainframes on which the system for z / OS has developed from the operating system VSE via MVS and OS / 390 .

Another line originally developed for the OS / 2 operating system . This software is written in the C programming language and forms the basis for the variants in the Linux , Unix and Windows (LUW) operating systems .

One variation is a solution for IBM midrange systems integrated into the IBM i operating system (today's machine designation System i ).

The fourth product line concerns the VSE and VM operating systems . There is still support for these, but no more new versions. IBM advises customers to switch to other platforms.

The current versions are:

  • Db2 for z / OS, Version 12 (former names: Db2 UDB for z / OS for Version 8 or Db2 UDB for z / OS and OS / 390 for Version 7)
  • Db2 UDB for Linux, UNIX and Windows, Version 11.1
    • Db2 Data Warehouse Edition for AIX, Linux, Windows
    • Db2 Everyplace IBM has withdrawn from sales of the IBM DB2 Everyplace products. In addition, the end of support for April 30, 2013 was announced.
  • Db2 for i, Version 7 Release 1 (formerly DB2 / 400)
  • Db2 Server for VSE & VM, Version 7.4

Db2 manages data in tables and stores it in tablespaces . In addition to the standard SQL data types, Db2 also supports binary data types (text, sounds, images, videos, XML data).

A free version for Windows and Linux has been available since February 2006 with the following restrictions compared to the commercial versions:

  • Use of max. 2 cores of a CPU (or 4 cores with an additional maintenance contract)
  • Use of max. 4 GB main memory (or 8 GB with additional maintenance contract)

This version has no restrictions on the size of the database and the number of users, but without an additional maintenance contract there is no replication, 24/7 support and convenient updates.

In order to achieve optimal performance when executing DB accesses, a so-called optimizer is used, a cost-based query optimizer , which defines the access to the relevant tables during program preparation . This is based, among other things, on the so-called table statistics, which can be updated periodically with the RUNSTATS tool, but also takes into account other parameters such as For example, the number of CPUs and auxiliary CPUs, the system status, the amount of memory available and the physical distribution of the data.

The data access of the application layer takes place with SQL, which largely corresponds to ANSI-SQL . Stored data can therefore be accessed from many programming languages with embedded SQL .

Db2 can also be used as an embedded database system.

In April 2007, IBM announced a cooperation with MySQL AB to make the Db2 UDB for iSeries available as a database engine for MySQL . This means that the open source MySQL database can also be used on the System i5 . IBM hopes to open up new areas of application for the i5 system for MySQL and PHP applications.

Properties (Db2 z / OS)

The system currently allows tablespaces with a maximum size of 128 terabytes .

Administration on mainframes is usually carried out using batch jobs , a distinction being made between DB2 utilities (RUNSTATS, COPY, REORG, etc.) and DBA jobs (SQL is carried out using DSNTIAD in a TSO background job). Smaller work is often carried out at the TSO terminal using SPUFI or QMF (Query Management Facility) under ISPF .

Larger mainframe environments use Db2 data sharing , whereby the functionality of the parallel sysplex of the zSeries computers is fully used.

Properties (Db2 LUW)

The Db2 UDB for Linux, Unix and Windows is often abbreviated as Db2 LUW.

It is administered with CLI commands in the command line or graphically via the control center (Control Center (Db2cc)).

The graphical user interface was replaced from version 10.1. Instead, the IBM Data Server Manager can be used.

The so-called Db2-EEE (pronounced “triple E” or “trippel-Ih”) - from version 8 called “DPF” (distributed partitioning feature) - is for larger environments where the database partitions are distributed over several computers (nodes) can.

Components

Components (Db2 z / OS)

List of system components of a Db2 system for z / OS and OS / 390 . The differences between OS / 390 and z / OS are relatively small, so there is no need to differentiate (with regard to Db2 components and functions) on which of these two operating systems Db2 is installed. In the following, z / OS will be used as a synonym for “z / OS or OS / 390”. The Db2 for z / OS is designed for optimal use of the operating system and hardware components. In order to clarify the relationships, this list also contains some definitions of terms for hardware components that are not part of the Db2 system in the narrower sense.

BM
Buffer manager. It is part of the Db2 subsystem and has the task of administering the BP and communicating with the GBP.
BP
Buffer pool. Refers to an area in the main memory that is managed by a Db2 subsystem. In a Db2 subsystem, several BPs are usually set up, which are assigned to the individual databases or, more precisely, to the tablespaces, whereby a buffer pool can be assigned to several tablespaces.
BSDS
Bootstrap data set. Files that store recovery information and communication information for DDF and Sysplex, respectively.
Data sharing function
A data sharing group combines several Db2 subsystems and several databases. Each Db2 subsystem can access any database contained in the data sharing group (Shared Everything Architecture). The individual Db2 subsystems can run on the same or on different computer nodes. An application has no direct influence on which of the Db2 subsystems it is served by. Further Db2 subsystems can be added or removed from a data sharing group during operation. This achieves scalability that is easy to administer.
Data sharing group
A data sharing group brings together several data sharing members. A data sharing group has a GBP and a Db2 catalog. Several data sharing members that are located on the same computer node can be linked together and data sharing members that are located on a computer node up to 40 km away can be integrated.
Data sharing member
A Db2 subsystem included in a data sharing group.
Database
A logical classification criterion for Db2 objects. If data sharing is used, then each database is assigned to exactly one data sharing group. A RACF group and a schema of the same name are usually set up for each database . A tablespace is assigned to exactly one database.
Db2 subsystem
Synonym for Db2 instance and for Db2 instance. It describes a program that was started by the operating system and is active in the main memory. Several Db2 subsystems can be installed on one computer node. If the data sharing function is used, a Db2 subsystem has access to all databases that are assigned to the entire data sharing group (Shared Everything Architecture). A Db2 subsystem can also be installed in such a way that it is not included in any data sharing group. Then each Db2 subsystem can only access its own databases (Shared Nothing Architecture). A Db2 subsystem consists of the RDS, the DM with the IRLM and the BM. It manages several BPs and an SQL statement cache. Each Db2 subsystem has its own log files. Therefore each Db2 subsystem is responsible for its own recovery.
Db2 catalog
Directory of all DB objects. In a data sharing environment, there is a single Db2 catalog that lists all of the objects that the Db2 subsystems can access. Objects that are recorded in the Db2 catalog include the databases, buffer pools, tablespaces, tables, views, indices and the access authorizations.
DDF
Distributed data facility. Component for accessing an RDBMS on a remote system. This means that access to other Db2 systems as well as to RDBMS from other database manufacturers such as B. Oracle or Microsoft SQL Server possible.
DM
Data manager. Component of a Db2 subsystem that accesses the buffer pool and executes the stage 1 predicates of the SQL queries. (The list of stage 1 and stage 2 predicates can be found in chapter 6.3.3.2 Summary of predicate processing of the Application Programming and SQL Guide) The DM is called by the RDS to evaluate an SQL query. The DM requests the necessary data from the BM.
DSNZPARM
System parameters. Configuration parameters of the Db2 subsystem are saved here.
GBP
Group buffer pool. The GBP is available to all Db2 subsystems of a data sharing group. The GBP is the logical name of the storage area managed by the Coupling Facility .
GLM
Global Lock Manager. With a parallel sysplex , the coupling facility takes on the role of the GLM. It takes over the coherence control for the common use of data by the individual computer nodes. The GLM communicates with the LLM of the computer nodes.
Hiper pool
Is an optional extension of the virtual pool and can be defined up to 8 GB in size for Db2 z / OS version 7. The Hiper pool is located in expanded storage. Since Db2-z / OS version 8, Hiper pools are no longer required, as the 64-bit addressing means that the entire main memory can be addressed directly. (A 64-bit address can directly address 16 Exa bytes ).
Instance
Synonymous for Db2 subsystem and for Db2 instance.
IRLM
Internal Resource Lock Manager. This is the lock manager of a Db2 subsystem. It communicates with the LLM of the computer node
LLM
Local Lock Manager. This is the lock manager that controls the locking within a computer node. If resources are requested from the hard drive that are already being accessed by the LLM of other computer nodes, the GLM coordinates the locks between the individual LLMs.
Partitioning
is a property of tablespaces , tables and indexes in Db2 z / OS . All partitions are recorded in the same catalog and can be accessed by the same Db2 subsystems. Partitioning is used to store and manage large amounts of data. A partitioned table allows individual partitions to be administered (REORG, RECOVER, COPY) while the application programs are active on the other partitions. Partitioning is an essential contribution to ensuring high availability of large data stocks.
RDS
Relational data system. Component of a Db2 subsystem that, among other things, executes the Stage 2 predicates. (The list of stage 1 and stage 2 predicates can be found in chapter 6.3.3.2 Summary of predicate processing of the Application Programming and SQL Guide) While the DM evaluates all "simple" predicates, the RDS performs the "complex" Links to determine the results of an SQL query.
SPAS
Stored Procedure Address Spaces. Was used in previous versions to define an address space for stored procedures . In newer versions, stored procedures are executed in WLM- managed address spaces.

Components (Db2 LUW)

The Db2 UDB for Linux, Unix and Windows is often abbreviated as Db2 LUW. In contrast to Db2 z / OS, IBM offers the most important manuals for Db2 V9 LUW in German. The error and help texts are also output in German - assuming an installation with a German language selection. The names of the Db2 system components are also given in German. This may make it easier to get started, as you only have to remember the German terms. However, if you want to understand the further details of the system, then you have to rely on the other manuals that are not translated into German and the other English-language literature e.g. B. instructed the Redbooks . On the other hand, understandably only the original language terms are used in the Db2 catalog. Consequently, there is no way around memorizing the English terms.

In the following definition of terms, the English terms are given first, then the German translation used by IBM, followed by an explanation.

Connect
Connection of an instance to a database. A Connect is a prerequisite for executing SQL commands. There is Connect Type 1, in which a client can only ever have a connection to one database. With Connect Type 2, a client can establish a connection to several databases within a transaction . In the second case, the two-phase commit is used.
THE
Database Administration Server. In German: Db2 management server. DAS is an administration service that supports the Db2 administration tools in local and remote administration, job management and notifications. There can only be one DAS on a computer. In a standard installation, the DAS is configured to start when the operating system starts. From version 8, the DAS no longer saves its parameters in configuration files, but in the tools DB.
Database
In German: database. A database can be thought of as a data container. It is stored on one (or more) specific hard drives. A database has its own Db2 catalog and is managed by a local instance. A database can only be accessed by those instances that have cataloged this database in their database directory. If a database is installed on a computer, there must also be an instance there that manages this database.
Database Directory
In German database directory. Each instance has its own database directory in which all databases are listed to which the instance can establish a connection. All local databases are cataloged in the directory. When an instance creates a new database, it enters this into its database directory. Remote databases can also be entered if they are located on a node that is cataloged in the node directory. The database directory and the node directory are the equivalent of the tnsnames.ora file at Oracle.
Db2 CAE
Db2 Client Application Enabler. Also called Db2 Connect. Db2 CAE can be installed on a client computer to provide network access to a Db2 server. The CAE contains a CLP and the DCS.
Db2 Connect
see Db2 CAE.
Db2 registry
Db2 profile registry for storing environment variables. All system parameters required to start up the instance are managed by the operating system. The other system parameters are managed by the database itself. There are four levels in the Db2 registry:
  • Item-level db2 profile registry. Parameters that only apply to the instance are saved here
  • Db2 global profile registry. Parameters affect all instances
  • Db2 profile registry at the instance node level. Parameters that are only valid for one node
  • Db2 Instance List. List of all the item names available on the system.
DCS
Database Connection Service. The DCS directory is used by Db2-Connect. Entries in the DCS can be made with the command: "CATALOG DCS DATABASE".
DDCS
Distributed Database Connection Service. Gateway component for access to other DRDA systems. Installation of Db2 Connect Enterprise is required for this. The DDCS component includes various BND and LST files that allow the packages required by the CLP to be linked to the remote RDBMS .
DPF
Data partitioning feature. System function for the administration of partitioned Db2 objects.
DMS
Database managed space. Parameters of a tablespace. This determines that the tablespace is managed by the Db2 instance.
copy
Synonym for instance. See definition of terms there.
Federated server
An instance can be set up as a federated server. This, together with the installation of a wrapper on DBMS from other manufacturers such as B. Oracle or Microsoft SQL Server as well as ODBC data sources can be accessed. See also federated information system .
Instance
In German Instance. Synonym for copy and the term for a Db2 subsystem in the mainframe world. From the point of view of the operating system, an instance is a program with a configuration environment that can be started by the operating system. Several instances can be installed on one computer. The instance is started by the operating system under a specific user. This user must be registered in the operating system. He is the owner of the instance. One instance can manage several local databases. It can also access databases from other entities via the Database Directory.
Node
In German knot or computer knot. This means a computer on which a Db2 installation is available.
Node Directory
In German: Node Directory. List of nodes on which other Db2 systems exist and which are to be accessed by the local Db2 system. If no connections to remote databases have been set up, then there is no node directory or the node directory is empty.
LDAP
Lightweight Directory Access Protocol. It is a standard access method for a directory service. For Db2 this means that the database directory and the node directory no longer have to be stored locally on each machine, but can be stored on a central LDAP server. To use an LDAP server, all participating servers must be registered with the LDAP server. The “CATALOG LDAP” commands are available in Db2 for setting up.
SMS
System managed space. Parameters of a tablespace that specifies that the tablespace is managed by the operating system.
Wrapper
A wrapper enables a connection to a remote data source. The prerequisite is that the server is parameterized as a "Federated Server". In version 8, wrappers can be set up for the following data sources: BioRS, BLAST, CTLIB (SYBASE), DRDA (Db2), Entrez, Excel, HMMER, Informix, NET8 (ORACLE), ODBC, OLE DB, table-structured files, Teradata, Web services, WebSphere Business Integration, XML.

Tools

Tools (Db2 z / OS)

Tools that run under the ISPF user interface:

  • Workload Manager : Component that controls access to the resources for work on az / OS.
  • SPUFI : program for executing SQL statements
  • QMF: Program for executing SQL statements. QMF also has some formatting commands. A complete redesign for the QMF software has been announced. In the future, the software should be a client tool with a graphical user interface.
  • InSync for Db2: Macro 4 tool, with which Db2 data can be processed and changed interactively.
  • File-AID for Db2: Tool for creating, deleting, saving, loading tables, tablespaces and data.

Tools that run on a client e.g. B. run under Windows and access the Db2 database server via Db2 Connect:

  • Visual Explain: Tool for displaying the access path of individual SQL statements. Visual Explain runs on Windows and is offered as a free download by IBM. However, the installation of Db2-Connect is required for use.
  • Performance Estimator: Tool for estimating the performance of SQL statements.
  • Control Center: Tool for the administration of the database via a graphical interface
  • Development Center: Development environment for creating PL / SQL and Java procedures
  • Warehouse Manager: Tool for configuring ETL components such as B. Materialized Views
  • Replication Center: Tool for the configuration and administration of data replications to other relational or non-relational databases.
  • QMF: Program for executing SQL statements. QMF is available on ISPF and Windows.

Tools (Db2 LUW)

  • CLP = command line processor. In German: command line processor. Environment for the execution of SQL statements and Db2 commands for the administration of the database. The CLP is called with the command: "Db2". You can switch back to the command level of the operating system at any time without losing the connection to the database with the command: "quit".
  • Db2 Connect: Database programming interface that enables a client to access a database server. Each installation needs its own node directory and its own database directory. This defines the DB servers to which a connect can be made and the databases managed there.
  • Control center: tool for administration of the database
  • Visual Explain can also be used with Db2 LUW.

Scalability (Db2 LUW)

The DBMS can be installed on different hardware configurations.

If a small volume of data is to be managed and only a few transactions place their demands on the DBMS, then an installation on a single processor machine is a cost-effective solution.

With a larger number of transactions to be processed, a multiprocessor machine would be the next largest unit for increasing performance. The system can process the evaluation of a single query on multiple processors in parallel. This speeds up both the handling of the entire transaction volume and the execution of individual, processing-intensive queries. However, not all queries are suitable for parallelization. The DBMS manages the distribution of the work steps to the individual processors ( symmetrical multiprocessor system ). This hardware configuration is also known as shared-everything architecture , because the individual processors have to share all other hardware components with one another.

If the use of the other hardware components (disk storage, controller, main memory) turns out to be a bottleneck, then the performance can be further increased by distributing the DBMS to several single-processor machines . This hardware configuration is also known as Shared Nothing Architecture , since each processor has its own main memory, controller and disk storage. If the DBMS is installed on several computer nodes, then database partitioning must be used. This means that one (or more) database partitions are set up on each computer node. A single DBMS can be installed on a maximum of 512 computer nodes.

In order to provide the maximum hardware performance for a DBMS, multi-processor machines can be selected for the individual computer nodes . Such a hardware architecture is also known as an SMP cluster .

Partitioning

Db2 offers various concepts to facilitate the administration of large amounts of data and to increase access performance through parallel processing. One of these concepts is partitioning.

Database partitioning

Database partitioning is only available for Db2 LUW. It must be used if a Db2 instance is installed on several computer nodes. At least one database partition must be set up on each computer node. It can also be used if the Db2 instance is installed on only one computer node. When installing on a computer node, database partitioning is particularly worthwhile if several hard drives are installed on it. A separate database partition can be set up on each hard disk. In this way, the data can be evenly distributed over several hard drives. The parallelization of disk access means that data can be accessed more quickly.

After the database partitions have been defined, they must be combined into partition groups. A partition group can consist of one or more database partitions. A database partition can belong to several partition groups. The partition groups can overlap.

Example:

There are database partitions p1 to p9

The partition groups g1 to g6 are set up:

g1 = (p1)

g2 = (p2)

g3 = (p3, p4, p5)

g4 = (p3, p4, p5, p6, p7)

g5 = (p6, p7, p8, p9)

g6 = (p9)

When creating a table space in this environment, it must always be specified in which partition group the table space is to be created. In the example, table spaces for smaller tables can be created in partition groups g1 and g2. Table spaces for larger tables can e.g. B. be created in the partition group g4. Any tables created in this table space are automatically created on all database partitions of the named group. In this way, the data in a table can be distributed over several computer nodes.

The distribution takes place implicitly by generating a partition key using a hash algorithm . The system itself selects the columns that are used to generate the hash key. In most cases, the primary key or part of it is used. If the table is defined without a primary key, the first table column is used. The distribution assignment can be checked and, if necessary, changed with the administration tool SQLUGTPI.

Table partitioning

Table partitioning is a function that can be used in Db2 z / OS from Version 8. With the Db2 LUW it is also available from version 8.

In the English literature, the terms 'table-controlled partitioned', 'table partitioned' or 'data partitioned' are used for this.

With table partitioning, when a table is created, it is determined how many partitions this table should have and according to which distribution criterion the data is to be distributed over the individual partitions. A different table space (tablespaces) can be specified for each table partition. If the table areas were created on different storage disks, then parallelization (and thus acceleration) of disk accesses can be achieved.

The assignment of the data to the individual partitions must be specified explicitly when defining the table.

Example 1:

create table t1 (verkaufsdatum date, umsatz decimal(10,2))
in ts1, ts2, ts3, ts4, ts5
partition by range(verkaufsdatum)
(starting from ('01.01.2007') ending at ('31.12.2007') every (3 month));

ts1 through ts5 are the table spaces that already exist.

Example 2:

create table t1 (verkaufsdatum date, umsatz decimal(10,2))
partition by range(verkaufsdatum)
( starting from ('01.01.2007') ending at ('31.01.2007') in ts1,
  ending at ('31.05.2007') in ts2,
  ending at ('01.06.2007') in ts3,
  ending at ('10.07.2007') in ts4,
  ending at ('31.12.2007') in ts5);

In the second variant, an uneven distribution of the data can be explicitly taken into account.

The partitioning limits can be changed later. Further partitions can be added and individual partitions can be deleted.

With Db2 z / OS version 8 and higher, all indices that are defined for this table can also be partitioned. If an index is also partitioned, then it is partitioned according to the same criteria as the table. The primary key, the partitioning and the sort order of the records in the table area (CLUSTER order) can be selected independently of one another. If all the indices of a partitioned table are also partitioned, the individual table partitions are completely independent. However, unique (UNIQUE) indices can only be partitioned if they contain all partitioning columns; this must be taken into account when choosing the primary key. A partition can then be administered (e.g. LOAD REPLACE, RECOVER, REORG, COPY) while programs are active on the other partitions. However, the prerequisite is that the programs can be parameterized in such a way that they only access the data of certain partitions. In the example above, the partition in ts1 can be loaded with LOAD REPLACE while a program is processing the data from December in ts5.

With Db2 LUW Version 9, an index cannot be partitioned. It can only be created in its own table area. A complete isolation of the individual table partitions (from the point of view of administration) as with Db2 z / OS is therefore not possible in Db2 LUW Version 9.

Index partitioning

This concept was the only way to partition data in Db2 z / OS up to version 7. For reasons of compatibility, this concept is still included in the current version, but the manufacturer recommends switching to table partitioning.

In the English literature the terms 'index partitioned' or - to put it even more clearly - 'index-controlled partitioned' are used for this.

With index partitioning, the number of partitions is specified when a table space is defined. Only one table can be created in this table area. When defining the index, the distribution key is determined according to which the data is distributed to the individual partitions. This index also determines the sorting order (cluster order) of the data and it is stored partitioned according to the same distribution key.

Example:

create tablespace ts1
numparts 5
in db01;

create table t1
( ID integer not null,
  verkaufsdatum date not null,
  umsatz decimal(11,2),
  primary key(ID) )
   in db01.ts1;

create index i1
on t1 (verkaufsdatum)
cluster
( part 1 values ('31.01.2007'),
  part 2 values ('31.05.2007'),
  part 3 values ('01.06.2007'),
  part 4 values ('10.07.2007'),
  part 5 values ('31.12.2007'));

In the example, db01 is the name of the database, ts1 is the name of the table space, t1 is the name of the table and i1 is the name of the index. The table space is defined for 5 partitions. The partition upper limits are set when the index is defined.

The disadvantage of index partitioning is that all other indexes cannot be partitioned. This severely restricts the separate administration of individual partitions. In addition, the partitioning key and the cluster order are inextricably linked. With table partitioning, these elements can be determined independently of one another.

Index types

Indices can be partitioned in Db2 z / OS and in Db2 LUW from version 9.7. Partitioning of indices is not possible in Db2 LUW up to and including V9.5. The following explanations about index partitioning therefore refer to Db2 z / OS and Db2 LUW from 9.7.

The introduction of table partitioning in Db2 z / OS V8 has resulted in a wide range of design options for the index design. The English terms used in the literature should be assigned to the German terms - if available - and briefly described.

Properties of indexes created on partitioned tables

partitioned

(partitioned)

nonpartitioned

(not partitioned)

partitioning

(partitioning)

*1 * 2
nonpartitioning

(not partitioning)

DPSI * 3

With the index partitioning of a table used in previous Db2 versions, only the cluster index could be partitioned. This is the type * 1. Type * 1 indices can of course also be created in the current version.

If further secondary indices were to be created for a partitioned table, then up to version 7 these could only be created without partitioning, i.e. as * 2 or * 3. As of version 8, all indices can also be partitioned to form a partitioned table. Unpartitioned indexes can still be created for a partitioned table. The partition independence is then limited.

  • Partitioned / Nonpartitioned. In German: partitioned / not partitioned. The abbreviation NPI stands for non-partitioned index. A distinction is made between whether the index itself is stored in several partitions or whether it is stored as a single contiguous structure. A non-partitioned index can be stored in a single dataset. If an index is partitioned, there is a separate index structure for each partition. An index can only be partitioned according to the same criteria as the table to which it refers. This means that if a table is not partitioned, no partitioned indexes can be created for this table.
  • Partitioning / nonpartitioning. In German: partitioning / not partitioning. An index is partitioning if it contains the columns of the table that are used to partition the table as first-most index columns. Example: a table is partitioned using the columns S1, S2. An index I1 over the columns S1, S2, S5 is partitioning, an index I2 over the columns S1, S3, S2 is not partitioning and I3 over the columns S7, S8, S9 is also not partitioning. This designation does not say anything about whether the index is stored partitioned or not.
  • Secondary index. In German: secondary index. Another name for nonpartitioning is non-partitioning. (Index types DPSI and * 3)
  • DPSI = data-partitioned secondary index. The index is partitioned, but not partitioning. So it is stored in n partitions, even though its topmost index columns do not match the columns that make up the partitioning of the table. A DPSI cannot be defined as UNIQUE. This is because a certain index key can occur in every partition, i.e. it can occur in every index structure. The uniqueness of a key could only be checked by accessing the index structures of all other partitions. If uniqueness is required, then a nonpartitioning index should be defined as nonpartitioned. (Index type * 3)
  • logical / physical index partition. Both terms refer to the set of all keys that the index stores that reference the same table partition. If the index itself is partitioned, then it is a physical index partition because each index partition is actually stored in a separate file (s). They are logical index partitions when the index is not partitioned. The keys from the individual logical index partitions are mixed and stored in a single index structure.

Further properties of indexes (concerns indexes for partitioned tables and also indexes for non-partitioned tables)

  • clustering. If several indexes are created for a table, then one of them can be defined as a cluster index. This means that when the table is reorganized, the data records are stored in the order given by this index. Since version 8 and higher the cluster sequence no longer necessarily has to match the primary key and the partitioning criteria, a non-partitioning index can also be selected as the cluster index. The sort order always refers to the records within a partition. A cluster index can be unique or nonunique.
  • unique / nonunique. In German: Unique / Not Unique Index. A key can only appear once in a unique index. If the index is not unique, a key can appear more than once.
  • padded / non padded. Specifies how VARCHAR data fields are stored in Index. With a padded index, VARCHAR data are expanded to their full length by padding them with blanks. The length information does not have to be saved for this. A non-padded index only saves VARCHAR data in the length in which it is actually available. The length information is also saved here. Up to Db2 z / OS V7, VARCHAR data fields could only be saved padded in the index. From version 8 this option is available.
  • ascending / descending. In German: ascending / descending sort order. The sorting order was an important parameter in earlier Db2 versions, as the leaf pages could only be read sequentially in the defined sorting order. Since Db2 z / OS V8, the leaf pages of an index can be read sequentially in both directions.

SQL PL

The stored procedure engine in Db2 implements a subset of ISO-standardized SQLPM (SQL Persistent Modules). It is similar to Oracle PL / SQL and PostgreSQL PL / pgSQL , but, for example, ROWTYPES, simply structured tables, are not supported.

Such as Oracle and PostgreSQL also offers Db2 the possibility of additional languages on the database side in stored procedures ( stored procedures ) to use. Db2 offers C ++ and Java here .

Utilities

statistics

RUNSTATS is a utility for generating statistical data on the contents of Db2 tables and their indices. For example, the minimum and maximum values ​​of a column and the cardinality of the columns and the table are determined.

This data is stored in the Db2 system catalog, a collection of tables in which Db2 stores information about all objects, such as tables, indices, columns, etc. ( self descriptive , ie "self-describing").

The database management system uses this statistical data in order to use the best possible access path for the access to the DB2 tables implemented by the user. For example, index access is usually pointless if the entire table contains only a few rows; simply reading all the data is then faster. In addition to the data for the access path ( optimizer ), data for administration are also determined, e.g. B. the quotient of the memory pages used or the degree of compression.

RUNSTATS should always be executed when the contents of tables have changed significantly or when new indices have been created. Afterwards, the referencing DB2 packages must also be re-linked with static SQL , since the access paths are stored in them.

The tool can be executed for table areas (tablespace) or indices and can also run embedded within the framework of other administrative utilities (REORG, LOAD).

JCL example for Db2 ( z / OS ):

//RSTAT    EXEC DSNUPROC,UID=’JCA.RUNSTA’,TIME=1440,
//         UTPROC=’’,
//         SYSTEM=’V71A’
//UTPRINT  DD  SYSOUT=*
//SYSIN    DD *
RUNSTATS TABLESPACE DSNXXX.DSNABCDE
   TABLE(ALL) SAMPLE 25
    INDEX(ALL)
  SHRLEVEL CHANGE

Here 25% of the rows are examined in the tablespace DSNXXX.DSNABCDE in all tables and indices (SAMPLE 25) . A parallel update by other processes is permitted (SHRLEVEL CHANGE) .

Under Unix a command could look like this:

RUNSTATS ON TABLE beispielschema.beispieltabelle
WITH DISTRIBUTION AND DETAILED INDEX

reorganization

The REORGCHK utility can be used to determine whether tables or indexes can be reorganized. In addition, the statistical data of a table can be updated in a simplified form in analogy to the RUNSTATS utility. REORG is a utility for reorganizing Db2 tables or indices (Unix and Windows versions) or tablespaces (mainframe). The data is stored in an optimal way on the permanent memory. The optimal way is determined by the cluster index. The memory pages are optimally filled and lines that are not on their original page (Far-Of-Page, Near-Of-Page) are moved back to the best possible position.

The utility can work offline or online. In the offline variant, the data is sorted according to the clustering index and saved in temporary files. The tablespace is then created again and the data - now sorted - is entered again in the table area. All indices are then rebuilt.

With the online variant, a new tablespace ("shadow copy") is created and the data is successively transferred to the new area. Changes made in the meantime are then incorporated from the recovery log, with a work table (mapping table) used to assign old to new internal record codes (record identifiers). If all changes have been taken into account, a "switch" takes place, after which Db2 accesses the new table area from now on. The old area is discarded.

In addition to reorganization, backup copies (COPY) and statistical data (RUNSTATS) can be determined.

JCL example for Db2 ( z / OS ):

//REORG    EXEC DSNUPROC,UID=’JCA.REORG’,TIME=1440,
//         UTPROC=’’,
//         SYSTEM=’V71A’
//UTPRINT  DD  SYSOUT=*
//SYSIN    DD *
REORG TABLESPACE DSNXXX.DSNABCDE

The tablespace DSNXXX.DSNABCDE is reorganized here.

Another utility can be used to determine the necessity: REORGCHK.

Index check

The Db2 utility checks whether the database index on the tablespace to be checked is consistent with the data to which the index is applied. However, the utility only logs inconsistent states, but does not eliminate them and does not set any pending status.

JCL example for Db2 ( z / OS )

//STEP1 EXEC DSNUPROC,UID=’IUIQU1UQ.CHK1’,
// UTPROC=’’,
// SYSTEM=’DSN’
//SYSUT1 DD DSN=IUIQU1UQ.CHK3.STEP1.SYSUT1,DISP=(MOD,DELETE,CATLG),
// UNIT=SYSDA,SPACE=(8000,(200,20),,,ROUND)
//SYSIN DD *
  CHECK INDEX (ALL) TABLESPACE DSN8D81A.DSN8S81E
//*

Index new building

REBUILD INDEX must be executed for all tablespaces after a point-in-time recovery , unless the index is also backed up with image copy. In this case, a RECOVER INDEX could take place at the same time instead .

JCL example for Db2 ( z / OS )

//STEP1 EXEC DSNUPROC,UID=’IUIQU1UQ.RBLD’,
// UTPROC=’’,
// SYSTEM=’DSN’
//SYSUT1 DD DSN=&SYSUT1,DISP=(,DELETE),
// UNIT=SYSDA,SPACE=(8000,(200,20),,,ROUND)
//SYSIN DD *
  REBUILD INDEX (ALL) TABLESPACE DSN8D81A.DSN8S81E
//*

Here all indices are rebuilt from the tablespace DSN8S81E of the database DSN8D81A .

Integrity test

The CHECK DATA utility checks for violations of referential integrity and constraints . If such a violation is detected, the corresponding tablespace is set to the "CHECK-pending" status. As an option, incorrect data can be deleted from the tablespaces and transferred to error tables (exception tables). Before a CHECK DATA is executed, a CHECK INDEX should be run on this tablespace to ensure that the index to which the utility refers is correct. Encrypted tables cannot be checked with this utility because the data is not decrypted before the CHECK. CHECK DATA should be executed after a conditional restart or a point-in-time recovery for all tablespaces in which interdependent tables may not have synchronized.

JCL example for Db2 ( z / OS )

//STEP11 EXEC DSNUPROC,UID=’IUIQU1UQ.CHK2’,
// UTPROC=’’,
// SYSTEM=’SSTR’
//SYSUT1 DD DSN=IUIQU1UQ.CHK2.STEP5.SYSUT1,DISP=(MOD,DELETE,CATLG),
// UNIT=SYSDA,SPACE=(4000,(20,20),,,ROUND)
//SORTOUT DD DSN=IUIQU1UQ.CHK2.STEP5.SORTOUT,DISP=(MOD,DELETE,CATLG),
// UNIT=SYSDA,SPACE=(4000,(20,20),,,ROUND)
//SYSERR DD DSN=IUIQU1UQ.CHK2.SYSERR,DISP=(MOD,DELETE,CATLG),
// UNIT=SYSDA,SPACE=(4000,(20,20),,,ROUND)
//SYSIN DD *
   CHECK DATA TABLESPACE DBIQUQ01.TPIQUQ01 SCOPE ALL
   AUXERROR INVALIDATE
/*

Monitoring of utilities (Db2 z / OS)

The command "Db2 DISPLAY UTILITY (* | $ UTILID $)" is used to monitor the current status of a started Db2 utility . A result is then structured as follows: JCL example for Db2 ( z / OS ):

DSNU100I – DSNUGDIS – USERID = SAMPID
   MEMBER = DB1G
   UTILID = RUNTS PROCESSING UTILITY STATEMENT 1
   UTILITY = RUNSTATS
   PHASE = RUNSTATS
   COUNT = 0
   NUMBER OF OBJECTS IN LIST = n
   LAST OBJECT STARTED = m
   STATUS = STOPPED
   DSN9022I – DSNUGCC ’-DISPLAY UTILITY’ NORMAL COMPLETION

Explanation of the STATUS field:

  • STOPPED = The utility was terminated abnormally and the objects that were accessed (tablespace and index) are still being accessed by the aborted utility. To make these objects accessible again, either the utility must be restarted or the utility terminated using the Db2 command "-TERM UTILITY ($ UTILID $)" .
  • TERMINATED = A termination request is being processed.
  • ACTIVE = The utility is in the active state and is currently running.

Extensions

The functionality of Db2 can be expanded with so-called extenders, which offer expanded functionality for specialized data types.

Net Search Extender

For efficient full-text search on columns with text content.

Spatial Extender

For geographic problems (flood zone analysis etc.)

Object Relational Extender

Extension for working with objects based on RDBMS

XML extender

With the Db2 LUW V9, the processing of XML-structured data has been fundamentally changed. The Db2 z / OS V8 has also been expanded to include a number of functions for processing XML data.

XML support in the SQL language is now also described by the ISO.

Functions for querying XML documents and for extracting information for storage in relational tables:

Functions for generating XML documents from data records that are stored in relational tables: (Examples)

  • XMLELEMENT
  • XMLFOREST
  • XMLAGG

At the end of 2008 there was still no standard definition for the update of individual elements of an XML document, which is why Db2 only allows the update of entire XML documents.

OLAP extenders

Support of the SQL extensions for OLAP applications. Examples:

  • RANK
  • DENSE_RANK
  • ROW_NUMBER

history

In the course of the development of the relational user interface “Structured Query Language” ( SQL ), the first prototype, the so-called System R, was developed internally at IBM (1975–1979), which was first installed in 1977 for an IBM customer. From this experience, SQL / DS was developed for DOS / VSE (today z / VSE ) and VM (today z / VM ).

At the same time, the independent product line of the mainframe MVS (today z / OS ) was promoted. Db2 was released in 1983 as a database system for MVS and was thus the second available relational DBMS on the market (after the Oracle market launch in 1979) .

In 1987, IBM announced a database management system for the OS / 2 operating system . In 1991 the DBMS 'OS / 2 Database' for OS / 2 was added to the product range. This DBMS is the forerunner of today's Db2 UDB. In 1993 the DBMS was offered under the name Db2 V1 for the operating systems OS / 2 and AIX . Since 1995 the DBMS can also be installed under Windows.

In the meantime, Db2 products exist on different system platforms such as Db2 Universal Database (UDB) for z / OS , Db2 / VM / VSE, Db2 Universal Database (UDB) for LUW ( AIX , HP-UX , Linux , Sun Solaris , Windows (XP / 2000/2003)) and Db2 Universal Database (UDB) for iSeries ( System i5 , formerly AS / 400 ).

In the mainframe area, Db2 has largely replaced the hierarchical database system IMS / DB from IBM.

In April 2009, EnterpriseDB and IBM signed a contract to make Oracle compatibility for Db2 available on the basis of an EnterpriseDB technology.

History (Db2 z / OS)

1983 Db2 was released as a database system for the mainframe operating system MVS .

Version 3 (available since December 1993, support until March 2001)

  • Index type 2. The partitioning already exists from version 1, but the processing of the partitions was not independent of one another. This was due to the index locks that were caused by INSERT, UPDATE and DELETE. As a result, the data from other partitions was also locked. When a program did a lot of writes to one partition, programs that accessed other partitions were affected. This problem was only resolved in version 4 through the introduction of index type 2.

Version 4 (available since November 1995, support until December 2001)

  • Stored procedures and user-defined functions can be used. The programs must be in a programming language such. B. COBOL or C can be written (with embedded SQL). In these programs, non-DBMS resources (e.g. VSAM files) can also be accessed.
  • Parallel processing possible when evaluating a single query.
  • the isolation level can be specified for a single SQL statement, which differs from the isolation level that the entire program (package) has.
  • the insulation level dirty read is supported
  • Locking of individual data lines possible. Up to now it was only possible to lock data pages (at least 4 kB).
  • Locking of index entries is no longer necessary with INSERT, UPDATE, DELETE. This will reduce the time it takes to wait for locked data. Programs that are active on different partitions no longer hinder each other.
  • Maximum size of a table: 64 GB. Maximum number of partitions for a table space: 64

Version 5 (available since June 1997, support until December 2002)

  • Storage of data in ASCII format possible. So far, data could only be saved in EBCDIC format. This brings a performance improvement for all client programs that process the data in ASCII format.
  • the scope of the SQL language is expanded to include the CASE statement
  • the online reorg enables the table data to be reorganized while programs access this data (read and write). This significantly reduces the downtime for the applications.
  • Maximum size of a table: 1 TB. Maximum number of partitions for a table space: 254

Version 6 (available since June 1999, support until June 2005)

  • Triggers can be used
  • LOB data types (= Large Objects ) can be saved. This allows pixel, audio, video data and text documents to be saved in tables.
  • Maximum size of a table: 16 TB. Maximum number of partitions for a table space: 254

Version 7 (available since March 2001, support until June 2008)

  • Storage of data in Unicode format possible
  • SQL / PL, the SQL3 extensions of the SQL language to include procedural elements. This means that stored procedures can be written in the SQL programming language.
  • Temporary tables can be declared. Each connection receives its own instance as soon as it accesses the table. At the end of the transaction, the instance is automatically removed again.
  • Static scroll cursor. Reading forwards and backwards possible. Changes to other transactions update the amount of data read by the cursor. (Exception: newly inserted records from other transactions are not visible)
  • Maximum size of a table: 16 TB. Maximum number of partitions for a table space: 254

Version 8 (available since March 2004)

  • Table partitioning. For details see section Partitioning. The administration of large databases is made easier and the downtime of the applications is further reduced.
  • Internal 64-bit addressing. This means that it is no longer necessary to swap the working memory areas. The previous direct addressability of a maximum of 2 GB is no longer applicable.
  • Many structural changes are possible during operation, which previously required deletion and recreation. Examples: Extending the table or index by additional columns, changing the data type of table columns even if indices exist for them, adding, deleting or changing partitions.
  • The maximum size of a single SQL statement has been increased from 32 KB to 2 MB.
  • Use of the new Assist Processor (= co-processor) zIIP
  • Dynamic scroll cursor. Extension of the static scroll cursor from V7, with which newly inserted records of other transactions can now be read.
  • Visual Explain is offered as a free download. It can be installed under Windows and can display the access paths of the packages.
  • Maximum size of a table: 128 TB. Maximum number of partitions for a table space: 4096

Version 9 (available since March 2007)

  • Extended data types (64 bit) BIGINT, DECFLOAT, VARBINARY
  • New SQL functions
    • OLAP support: RANK, DENSE_RANK, ROW_NUMBER
    • Set operations INTERSECT (intersection) and EXCEPT (difference set)
    • MERGE statement as a combination of INSERT or UPDATE
  • TRUNCATE Table for quick deletion of all data records of a table without destroying the data structures, the index structures and the triggers
  • Hidden last-change timestamp makes it easier to implement optimistic locking
  • INSTEAD OF trigger
  • Maximum size of a table: 128 TB. Maximum number of partitions for a table space: 4096

Version 10 (available since October 2010)

  • XML functions with which individual nodes within an XML data field can be changed (INSERT, REPLACE, DELETE)
  • improved access performance through improved parallel processing, among other things
  • various administrative work can now be carried out during ongoing operations that previously required downtime.

Others

Besides Db2 provides IBM with Informix to another commercial database. After acquiring Informix in 2001, there was a brief plan to merge Informix with Db2, which caused some uncertainty among Informix customers. However, this plan was soon abandoned. Since then, Informix and Db2 have been further developed in parallel and positioned for different areas of application. However, new technologies were often carried over into the other system.

See also

Web links

Individual evidence

  1. IBM Db2 system properties. Retrieved April 13, 2019 .
  2. IBM Db2 12 for z / OS - Overview - United States. September 30, 2019. Retrieved September 30, 2019 (American English).
  3. Express-C (free version)
  4. ^ Graig S. Mullins: DB2 Developer's Guide . 2004, p. 722.
  5. Cooperation with MySQL AB ( Memento of the original from October 14, 2007 in the Internet Archive ) Info: The archive link was inserted automatically and has not yet been checked. Please check the original and archive link according to the instructions and then remove this notice. @1@ 2Template: Webachiv / IABot / www-05.ibm.com
  6. a b 6.3.3.2 DB2 UDB for z / OS V8 Application Programming and SQL Guide . IBM Library Server
  7. DB2 for z / OS - DB2 for z / OS Version 8 books ( Memento of the original from January 27, 2007 in the Internet Archive ) Info: The archive link was inserted automatically and has not yet been checked. Please check the original and archive link according to the instructions and then remove this notice. @1@ 2Template: Webachiv / IABot / www-306.ibm.com
  8. DB2 for z / OS - DB2 for z / OS Version 8 books ( Memento of the original from January 27, 2007 in the Internet Archive ) Info: The archive link was inserted automatically and has not yet been checked. Please check the original and archive link according to the instructions and then remove this notice. @1@ 2Template: Webachiv / IABot / www-306.ibm.com
  9. ibm.com
  10. IBM Redbooks
  11. ^ IBM DB2 Administration Guide Planning
  12. ^ SQL & XML Working Together
  13. research.ibm.com
  14. ^ The Big Picture: IBM DB2 Information Management Software and DB2 Universal Database
  15. Collaborate to Integrate Technology in New Version of DB2. Retrieved September 27, 2011 .
  16. Sources: 1. Reference manuals for DB2 z / OS from IBM for the respective versions. 2. What's new documents from IBM on the individual versions. 3. Redbooks What's new in DB2 z / OS V8