Original list

from Wikipedia, the free encyclopedia

The original list, also known as a series of observations, is the direct result of data collection in statistics, i.e. the original record of the observed or measured values. The values in the original list have not yet been processed in any way, apart from the translation of perceptions into numbers through measurement. For this reason each individual value is called an original value, and all original values together are called original data, primary data or raw data. In addition to these feature values, the list can also state which feature carrier each value belongs to.

If the values are listed in random order or in the chronological order of observation, the list is an unsorted original list. If the values are arranged according to some ordering criterion, the list is a sorted original list (also called a primary table). Possible criteria are the alphabetical order of the feature carriers or the size of the feature value for one of the recorded features.

The identification of the feature carrier can be omitted both in the original list and in the primary table. The following are examples with and without this identification of the feature carriers.

Example

Example of an unsorted original list:

Feature carrier      Number of children
John Doe             1
Frederik Pig         0
Bea example woman    2
Piggeldy pig         0

Without identification of the feature carriers: 1, 0, 2, 0

Example of a primary table:

Feature carrier      Number of children
Frederik Pig         0
Piggeldy pig         0
John Doe             1
Bea example woman    2

Without identification of the feature carriers: 0, 0, 1, 2
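The step from the unsorted original list to the primary table can be sketched in Python. This is only an illustration using the example data above; the function name `primary_table` is an assumption made for this sketch, and the sort key (number of children) is one of several possible ordering criteria.

```python
# Unsorted original list from the example: (feature carrier, number of children)
original_list = [
    ("John Doe", 1),
    ("Frederik Pig", 0),
    ("Bea example woman", 2),
    ("Piggeldy pig", 0),
]


def primary_table(records):
    """Order the records by feature value (hypothetical helper).

    Python's sort is stable, so records with equal values keep
    their original order of observation.
    """
    return sorted(records, key=lambda record: record[1])


sorted_list = primary_table(original_list)

# The same primary table without identification of the feature carriers
values_only = [value for _, value in sorted_list]
```

Sorting by name instead (`key=lambda record: record[0]`) would give the alphabetical ordering of the feature carriers mentioned earlier.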

The feature carriers are often identified by a code number. When the unsorted original list is sorted, these code numbers can be reassigned consecutively in the sorted list. It is suggested to mark the reassigned numbers by putting them in parentheses.

Unsorted original list:

Feature carrier      Number of children
1                    1
2                    0
3                    2
4                    0

Primary table:

Feature carrier      Number of children
(1)                  0
(2)                  0
(3)                  1
(4)                  2
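The reassignment of code numbers can be sketched as follows. The helper `recode` is hypothetical; keeping the old code number in a third field is an assumption added here so that the mapping stays traceable, and it is not part of the tables above.

```python
# Unsorted original list from the example: (code number, number of children)
unsorted_list = [(1, 1), (2, 0), (3, 2), (4, 0)]


def recode(records):
    """Sort by feature value and assign new consecutive codes in parentheses.

    Returns tuples of (new code, value, old code). The old code is kept
    only for illustration.
    """
    ordered = sorted(records, key=lambda record: record[1])
    return [
        (f"({new_code})", value, old_code)
        for new_code, (old_code, value) in enumerate(ordered, start=1)
    ]


recoded = recode(unsorted_list)
```

Here the carrier originally coded 2 becomes (1), 4 becomes (2), 1 becomes (3) and 3 becomes (4).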

Advantages and disadvantages

The original list contains all observed values and is therefore free of omissions, transmission errors and loss of information (advantages). On the other hand, original lists can in practice contain thousands or millions of records which, taken on their own, are confusing and cannot be evaluated. In addition, an uncorrected original list can still contain obvious errors such as transposed digits or impossible values (disadvantages).
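A simple plausibility check can flag impossible values before further processing. The sketch below is an assumption for illustration only: the function name, the sample data and the bounds (0 to 20 children) are hypothetical and would have to be chosen to fit the feature in question.

```python
def implausible_entries(values, lower=0, upper=20):
    """Return (position, value) pairs outside the assumed plausible range.

    The bounds are hypothetical defaults for a feature such as
    "number of children"; they are not taken from any standard.
    """
    return [
        (position, value)
        for position, value in enumerate(values)
        if not isinstance(value, int) or value < lower or value > upper
    ]


# Hypothetical raw data: 120 could be a typing error for 12 or 2
raw_values = [1, 0, 2, 0, -1, 120]
suspect = implausible_entries(raw_values)
```

Such a check only finds values that are impossible in themselves. A transposed digit that still yields a plausible value remains undetected.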

In practice, the data in an original list must therefore be prepared before it can serve its purpose. This is usually done by forming frequency distributions.
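For the four values of the example, forming a frequency distribution could look like the following sketch. It uses `collections.Counter` from the Python standard library; the function name and the output layout (absolute and relative frequency per value) are assumptions chosen for this illustration.

```python
from collections import Counter

# Original list without identification of the feature carriers
values = [1, 0, 2, 0]


def frequency_distribution(observations):
    """Map each distinct value to (absolute frequency, relative frequency)."""
    counts = Counter(observations)
    n = len(observations)
    return {value: (counts[value], counts[value] / n) for value in sorted(counts)}


distribution = frequency_distribution(values)
```

For the example this yields the value 0 twice (relative frequency 0.5) and the values 1 and 2 once each (relative frequency 0.25).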

Many of the statistical parameters that are intended to describe or summarize the properties of such a frequency distribution accept a loss of information. This is both an advantage and a disadvantage. If the data set is not too large, a tally sheet can also be kept.
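Both points can be illustrated with a short sketch. The second list `list_b` is hypothetical and was chosen only to show that two different original lists can share the same arithmetic mean; the helper `tally` is likewise an assumption for this example.

```python
from collections import Counter
from statistics import mean

list_a = [1, 0, 2, 0]  # original list from the example
list_b = [0, 0, 0, 3]  # hypothetical second list with the same mean


def tally(observations):
    """Return one stroke per observation for each distinct value."""
    counts = Counter(observations)
    return {value: "|" * counts[value] for value in sorted(counts)}


mean_a = mean(list_a)
mean_b = mean(list_b)
tally_a = tally(list_a)
```

Both lists have the same mean although their values differ, so the mean alone does not allow the original list to be reconstructed. The tally, by contrast, still shows how often each value occurred.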

Prussian census

The term "original list" appears early on in connection with censuses. There it is not treated as a step in the mathematical processing of statistical data. If one nevertheless wants to classify the following original list in terms of the statistical concept of the original list, it could be described as an original list with mainly nominally scaled features. Only the number of residents in a house is measured on an absolute scale.

Columns of the original list for the Prussian census on December 3, 1864
Running no. | House number | First and last name | Status or trade | Year of birth | Religion | Number of house residents | Date of recording | Remarks
