Dataset (IBM mainframe)


A dataset is a file on an IBM mainframe system. A dataset name (DSN) can be at most 44 characters long and consists of several qualifiers (name parts) separated by periods. Each qualifier can be at most eight characters long. Example: MY.PRIVATE.TEST.DATASET.V1
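The naming rules above can be sketched as a small check. This is a minimal illustration, not a complete validator: the function name check_dsn is made up for this example, and it tests only the two length limits stated here. Real systems also restrict which characters a qualifier may contain, which is not checked.

```python
MAX_DSN_LENGTH = 44        # total length, periods included
MAX_QUALIFIER_LENGTH = 8   # length of each name part


def check_dsn(dsn: str) -> list[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if len(dsn) > MAX_DSN_LENGTH:
        problems.append(
            f"name is {len(dsn)} characters long (maximum {MAX_DSN_LENGTH})"
        )
    # Qualifiers are the parts between the periods.
    for position, qualifier in enumerate(dsn.split("."), start=1):
        if not qualifier:
            problems.append(f"qualifier {position} is empty")
        elif len(qualifier) > MAX_QUALIFIER_LENGTH:
            problems.append(
                f"qualifier {position} ({qualifier!r}) exceeds "
                f"{MAX_QUALIFIER_LENGTH} characters"
            )
    return problems


if __name__ == "__main__":
    for name in ("MY.PRIVATE.TEST.DATASET.V1", "MY.VERYLONGQUALIFIER.DATA"):
        print(name, "->", check_dsn(name) or "ok")
```

The example name from the text passes, while a name with a qualifier longer than eight characters is reported with the position of the offending part.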

In application programs, a dataset is usually not accessed directly via the dataset name. Instead, access takes place via a logical name (also known as a data definition (DD) name) that refers to a corresponding DD statement of a job that contains the DSN and, optionally, further information on processing.
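The indirection through a logical name can be pictured with a toy model. The sketch below is not a mainframe API: DDStatement, resolve, the disposition field, and the dataset name PROD.CUSTOMER.MASTER are all hypothetical names chosen for illustration. It only shows that the program refers to a DD name, while the job decides which DSN stands behind it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DDStatement:
    """Toy stand-in for one DD statement of a job."""
    dsn: str
    disposition: str = "SHR"   # hypothetical example of "further information"


def resolve(dd_statements: dict[str, DDStatement], dd_name: str) -> str:
    """Return the dataset name that the logical name refers to in this job."""
    try:
        return dd_statements[dd_name].dsn
    except KeyError:
        raise LookupError(f"no DD statement named {dd_name!r} in this job") from None


# The program only ever mentions the logical name "INFILE".
# Each job binds that name to a different dataset.
test_job = {"INFILE": DDStatement("MY.PRIVATE.TEST.DATASET.V1")}
prod_job = {"INFILE": DDStatement("PROD.CUSTOMER.MASTER", disposition="OLD")}

if __name__ == "__main__":
    print(resolve(test_job, "INFILE"))
    print(resolve(prod_job, "INFILE"))
```

The point of the design is that the same program can run unchanged against different datasets, because only the job's DD statements differ.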

Datasets can exist in different file organization forms:

  • Direct access dataset: The relative address of a record is calculated from its key using a hash function.
  • HFS dataset: A disk area that is intended to hold Unix files (in Unix terminology one would say: a loopback file system).
  • ISAM dataset (Indexed Sequential Access Method): Obsolete form of organization that has been almost completely replaced by VSAM. An ISAM dataset consists of three physical files (PRIME, INDEX and OVERFLOW).
  • Partitioned Data Set (PDS and PDSE): A file organization in which the data set contains a directory with member names, each member representing a single sequential file.
  • Sequential files: With this file format, the data is written or read sequentially from the beginning of the file to the end of the file.
  • VSAM organizational forms: The VSAM operating system component provides several organizational forms, the most powerful of which, KSDS (key-sequenced dataset), supports key-based access to individual records. VSAM-organized files are also called VSAM clusters; for the individual VSAM organizational forms, see the article VSAM.
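The hash-based addressing mentioned for direct access datasets can be sketched as follows. This is only an illustration of the idea under stated assumptions: the function name, the use of CRC-32, and the slot count of 97 are choices made for this example and do not describe the hash function of any actual access method.

```python
import zlib


def relative_record_number(key: str, slot_count: int) -> int:
    """Map a record key to a slot number in the range 0 .. slot_count - 1."""
    if slot_count <= 0:
        raise ValueError("slot_count must be positive")
    # CRC-32 is used only because it is deterministic and part of the
    # standard library; it is an assumption of this sketch.
    return zlib.crc32(key.encode("utf-8")) % slot_count


if __name__ == "__main__":
    for key in ("CUST0001", "CUST0002", "CUST0003"):
        print(key, "->", relative_record_number(key, 97))
```

Different keys can map to the same slot, so a practical scheme also needs a rule for handling such collisions; that part is not shown here.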

The term dataset is used only for files on the mainframe that were created under the MVS personality. Files created under Unix System Services (i.e. within an HFS dataset) are generally not referred to as datasets.

The metadata (file attributes) of a dataset is stored partly in the VTOC (for disk storage devices) or the tape label (for magnetic tapes), and partly in the catalog.

Sequential files can be versioned as Generation Data Groups (GDG).
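How generations are told apart is not described here. As an assumption about the conventional scheme, each generation's name is the group's base name plus a final qualifier of the form GnnnnVnn, and generations can be addressed relative to the newest one (0 for the current generation, -1 for the previous one, +1 for the next one to be created). The sketch below illustrates that convention; the function names and the base name MY.TEST.BACKUP are hypothetical, and wraparound of the generation number is ignored.

```python
def generation_name(base: str, generation: int, version: int = 0) -> str:
    """Build the absolute name of one generation, e.g. BASE.G0001V00."""
    if not 1 <= generation <= 9999:
        raise ValueError("generation must be between 1 and 9999")
    if not 0 <= version <= 99:
        raise ValueError("version must be between 0 and 99")
    return f"{base}.G{generation:04d}V{version:02d}"


def resolve_relative(base: str, newest_generation: int, relative: int) -> str:
    """Translate a relative reference into an absolute generation name.

    relative = 0 is the newest generation, -1 the one before it,
    +1 the generation that would be created next.
    """
    return generation_name(base, newest_generation + relative)


if __name__ == "__main__":
    base = "MY.TEST.BACKUP"   # hypothetical GDG base name
    for rel in (0, -1, +1):
        print(f"({rel:+d}) ->", resolve_relative(base, 12, rel))
```

With relative references, a job can always read the newest generation or write the next one without hard-coding the absolute name.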


  1. In everyday usage, the term "dataset" is often also used to mean "dataset member".