from Wikipedia, the free encyclopedia

A data record is (for example, according to Mertens) a group of data fields that are related in content and belong to one object, e.g. item number and item name. Data records correspond to a logical structure that was defined during software development (e.g. in the conceptual schema of data modeling).

In data processing, data that has been combined into data records is stored in databases or files. Data records are the objects processed by computer programs, which create, read, update and delete them (see CRUD). On input, the content of a data record is often presented as a form; on output or display it may also appear in list form, possibly showing only some of the data fields.
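The four CRUD operations on a single data record can be sketched as follows. This is only an illustration: the table name `item`, its columns and the sample values are assumptions chosen to match the item number / item name example above, and an in-memory SQLite database stands in for any database or file.

```python
import sqlite3

# In-memory database; table, columns and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (item_number INTEGER PRIMARY KEY, item_name TEXT)")

select_sql = "SELECT item_number, item_name FROM item WHERE item_number = ?"

# Create: store one data record.
conn.execute("INSERT INTO item (item_number, item_name) VALUES (?, ?)", (4711, "Hex bolt"))

# Read: fetch the record back as a tuple of its data fields.
created = conn.execute(select_sql, (4711,)).fetchone()

# Update: change one data field of the record.
conn.execute("UPDATE item SET item_name = ? WHERE item_number = ?", ("Hex bolt M8", 4711))
updated = conn.execute(select_sql, (4711,)).fetchone()

# Delete: remove the record again.
conn.execute("DELETE FROM item WHERE item_number = ?", (4711,))
remaining = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]

conn.close()
```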

Non-electronic data can also be combined into data records; an index card in a card file, for example, is a record.

In addition to its narrow meaning as a specific collection of data (e.g. the address data of customer Müller), the term 'data record' is also used in software development as a type designation in the declaration of data, for example the data record 'address data'. Depending on the methods, programming languages, etc. in use, terms such as record, entity type, class, tuple, structure, compound, etc. are also used instead of 'data record'. The logical structure of such a type is determined within the framework of the conceptual schema of data modeling.

Delimitation: Although data practically always appears as a sequence of several data elements, not every form of data is called a 'data record'. The term covers only groupings of data that belong to a particular object and that have an identical structure within one collection of data. The data fields 'Name', 'Address' and 'Date of birth' could therefore form a data record for a person. Not records in this sense are, for example, running text, printer or video streams, the content of executable files, photo data or the data of graphics software.

Different meaning in statistics

In summary, the data record in computer science - as described above - denotes a one-dimensional, structured sequence of attributes of an element of a superordinate set (e.g. an index card in a card index, an order from a database for orders, a line in an address list).

In contrast, a data set in statistics is the entirety of the data in a certain context. Here the term is synonymous with a body of data (a set or collection of data; also a translation of the English 'data set', a term formerly used for files at IBM), for example all the data gathered in a statistical survey or the "tax evader CD".

Variants in storage

In general, a data record is declared using the means of expression of the programming language in question, as a composite data type or record, possibly within an associative array. The mathematical model of a data record is a tuple.
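A minimal sketch of these three views in Python, assuming a hypothetical `Person` type built from the 'Name', 'Address' and 'Date of birth' fields mentioned earlier (the field values and the customer number 1001 are made up for illustration):

```python
from dataclasses import dataclass, astuple
from datetime import date


@dataclass(frozen=True)
class Person:
    """Record type: declares the structure shared by all person records."""
    name: str
    address: str
    date_of_birth: date


# One concrete data record (an instance of the record type).
mueller = Person("Müller", "Hauptstraße 1, Berlin", date(1970, 1, 1))

# The same record seen as a tuple, i.e. its mathematical model.
as_tuple = astuple(mueller)

# Records held in an associative array, keyed by a hypothetical customer number.
customers = {1001: mueller}
```

The class plays the role of the type ('data record address data'), while `mueller` is a record in the narrow sense.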

Numerous distinctions can be made with regard to the storage of data sets. For example:

  • Storage in ordinary files, with alternatives such as:
    • User-defined individual data formats and structures,
    • CSV files with field separators such as semicolons or similar,
    • XML format, in a form such as <fieldname>field content</fieldname>, with further structure-related textual information,
    • RDF format for Internet information.
  • Storage in databases:
    • In relational databases, records are stored in tabular form, with a record usually corresponding to one table row.
    • Column-oriented databases do not store all data fields of each data record one after the other; instead they store, for each data field, the contents of all data records, so that in storage a column (with all its rows) takes the place of the 'data record'.
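Two of the variants listed above can be sketched briefly: a CSV file with semicolons as field separators, and the difference between a row-oriented and a column-oriented layout. The field names and values are illustrative assumptions, and an in-memory text buffer stands in for a real file.

```python
import csv
import io

# Row-oriented view: one dict per data record (sample data is made up).
fields = ["item_number", "item_name"]
records = [
    {"item_number": "1", "item_name": "bolt"},
    {"item_number": "2", "item_name": "nut"},
]

# Write the records as CSV with ';' as field separator and '\n' as record end.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fields, delimiter=";", lineterminator="\n")
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()

# Read them back: each CSV line becomes one data record again.
rows = list(csv.DictReader(io.StringIO(csv_text), delimiter=";"))

# Column-oriented view: one list per data field, holding that field for all records.
columns = {field: [row[field] for row in rows] for field in fields}


def record_at(index):
    """Reassemble a single logical record from the column-oriented layout."""
    return {field: columns[field][index] for field in fields}
```

In the column-oriented layout a logical record no longer exists as one contiguous unit; it has to be reassembled from the same position in every column, as `record_at` does.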

Strictly speaking, however, 'data record' is a logical rather than a technical term, and it has numerous technical manifestations and implementations. Data records can be differentiated by the following characteristics:

  • General validity: Bindingly defined structures and formats (such as binary data or text, length, other rules), e.g. for certain software solutions (as in the DTA process), vs. formats individually defined by the user.
  • Record / data field length: Fixed and uniform length per field vs. fields of variable length (e.g. with field separators as in CSV or with field length information); this leads to data records of fixed or variable length accordingly.
  • Character encoding used: only text characters, e.g. in ASCII code, vs. other data types in a binary code.
  • Field attributes: only net data vs. further information per field (such as bold, underline, font, etc., mostly not visible).
  • Data record delimitation: end-of-record marker vs. fixed record length.
  • Homogeneity: Uniform vs. different types of data records in the same file, recognizable e.g. by a data field 'Record type'.
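The contrast between fixed-length binary records and variable-length text records with field separators and an end-of-record marker can be sketched as follows. The layout (a 4-byte item number followed by a 12-byte name) and the sample values are assumptions made purely for illustration.

```python
import struct

# Assumed fixed-length layout: 4-byte unsigned integer + 12-byte name, little-endian.
FIXED = struct.Struct("<I12s")


def pack_fixed(number, name):
    """Encode one record; struct pads short names with null bytes to 12 bytes."""
    return FIXED.pack(number, name.encode("ascii"))


def unpack_fixed(data, index):
    """Fixed length allows direct addressing: record i starts at i * record size."""
    number, raw = FIXED.unpack_from(data, index * FIXED.size)
    return number, raw.rstrip(b"\x00").decode("ascii")


def pack_variable(number, name):
    """Variable-length text record: ';' separates fields, '\\n' marks the record end."""
    return f"{number};{name}\n".encode("ascii")


def iter_variable(data):
    """Variable-length records must be scanned sequentially for their end markers."""
    for line in data.decode("ascii").splitlines():
        number, name = line.split(";")
        yield int(number), name


fixed_file = pack_fixed(1, "bolt") + pack_fixed(2, "nut")
variable_file = pack_variable(1, "bolt") + pack_variable(2, "nut")
```

The fixed layout spends 16 bytes on every record regardless of content but supports direct access by position; the variable layout is more compact here but requires scanning for separators.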

In spreadsheet applications, a data record is usually represented by a row, or alternatively by a column, depending on the arrangement. A classic example of a data record is a punch card.

The following applies to electronic data: it exists on its storage medium as bit or byte sequences of arbitrary length. Within this mass of data, the individual data records and data fields are identified and addressed by methods usually provided by the programming language and/or the system software, and are presented, for example, in rows and columns.

Logical / physical data records: As a rule, several data records are combined into larger storage units on electronic data carriers. These units have different names depending on the computer system, for example 'page' (in many database systems) or 'block' (in conventional storage). In a computer program, the processing of an individual data record is preceded by routines (mostly of the operating system or the DBMS) that, for reasons of optimization, perform the actual reading or writing on the data carrier in blocks or pages, iteratively locate the individual data record within the data block, and make it available in main memory for processing.
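The division of labour between physical blocks and logical records can be sketched as a small reader. The record layout, the block size of 64 bytes and the sample data are assumptions for illustration; the sketch also assumes that the block size is a whole multiple of the record size, so that no record spans two blocks.

```python
import io
import struct

RECORD = struct.Struct("<I12s")  # assumed 16-byte logical record
BLOCK_SIZE = 64                  # assumed physical block: 4 records per block


def read_records(stream):
    """Yield logical records while reading the stream one physical block at a time."""
    while True:
        block = stream.read(BLOCK_SIZE)  # one physical read
        if not block:
            break
        # Position each logical record inside the block just read.
        for offset in range(0, len(block) - RECORD.size + 1, RECORD.size):
            number, raw = RECORD.unpack_from(block, offset)
            yield number, raw.rstrip(b"\x00").decode("ascii")


# Six sample records (96 bytes), so the reader needs two block reads.
data = b"".join(RECORD.pack(i, f"item{i}".encode("ascii")) for i in range(1, 7))
records = list(read_records(io.BytesIO(data)))
```

The caller only ever sees individual records; how many physical reads were needed stays hidden inside `read_records`, much as it is hidden by the operating system or DBMS routines described above.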

References

  1. P. Mertens et al.: Fundamentals of Information Systems. 5th edition. Springer Verlag, Berlin 1998, p. 59.
  2. Data set. In: Gabler Wirtschaftslexikon.
  3. Henry Herper: Computer Modeling. (PDF) Uni Magdeburg, 2004, accessed on March 11, 2014.
  4. Data set.
  5. Henry Herper: Computer Modeling. 2004, p. 46, Data Modeling: Layer Model.
  6. TechTarget: Data set.
  7. Number of documents and data records in the Berlin data portal (PDF).
  8. Row and column oriented databases. Elite IT specialist.
  9. Sebastian Dworatschek: Basics of Data Processing. Chap. 1.2.1, Logical and physical records.