Standardization (statistics)

Densities of a standardized (blue) and two non-standardized normal distributions (red and purple)

In mathematical statistics, standardization (sometimes called the z-transformation in introductory statistics courses) is a transformation of a random variable such that the resulting random variable has expected value zero and variance one. The standard deviation is the square root of the variance and is therefore also equal to one. The standardized random variable is often called the z-score or z-value and forms a basis for the construction of statistical tests.

Standardized random variable

A standardized random variable is a random variable with an expected value of 0 and a variance of 1.

Intended use

Standardization is needed, for example, to compare random variables that follow different distributions. In addition, some statistical methods, such as factor analysis, require standardized random variables.
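The comparison use case can be illustrated with a minimal sketch. It assumes the usual z-score formula $z = (x - \mu) / \sigma$ and two made-up exams with known means and standard deviations; the helper name `z_score` and all numbers are hypothetical and chosen only for illustration.

```python
def z_score(x, mu, sigma):
    """Return the standardized value (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical data: two exams graded on different scales.
# Exam A: mean 70, standard deviation 10, observed score 85.
# Exam B: mean 500, standard deviation 100, observed score 620.
z_a = z_score(85, 70, 10)
z_b = z_score(620, 500, 100)

# On the common z-scale, the larger value lies further above its own mean.
print(z_a, z_b)
print("Exam A score is relatively better:", z_a > z_b)
```

The raw scores 85 and 620 cannot be compared directly, but their standardized values are both expressed in units of standard deviations from the respective mean.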

Derivation of the mathematical formula

If $X$ is a random variable with expected value $\operatorname{E}(X) = \mu$ and variance $\operatorname{Var}(X) = \sigma^2$ (and correspondingly standard deviation $\sigma = \sqrt{\operatorname{Var}(X)}$), the associated standardized random variable $Z$ is obtained by centering and then dividing by the standard deviation:

$Z = \frac{X - \mu}{\sigma}$

The following applies to the random variable $Z$ obtained in this way:

$\operatorname{E}(Z) = \operatorname{E}\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma}\left(\operatorname{E}(X) - \mu\right) = 0$
$\operatorname{Var}(Z) = \operatorname{Var}\left(\frac{X - \mu}{\sigma}\right) = \operatorname{Var}\left(\frac{X}{\sigma}\right) = \frac{1}{\sigma^2}\operatorname{Var}(X) = 1$
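
The two identities can be checked numerically on a small discrete distribution. The following is only a sketch: the fair six-sided die is an assumed example, and the helper functions `expectation` and `variance` are illustrative names, not part of any library.

```python
import math

# Assumed example distribution: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6


def expectation(vals, ps):
    """Expected value of a discrete random variable."""
    return sum(v * p for v, p in zip(vals, ps))


def variance(vals, ps):
    """Variance of a discrete random variable."""
    m = expectation(vals, ps)
    return sum((v - m) ** 2 * p for v, p in zip(vals, ps))


mu = expectation(values, probs)
sigma = math.sqrt(variance(values, probs))

# Standardize every outcome: center, then divide by the standard deviation.
z_values = [(x - mu) / sigma for x in values]

mean_z = expectation(z_values, probs)
var_z = variance(z_values, probs)

# Up to floating-point rounding, these are 0 and 1.
print(mean_z, var_z)
```

The probabilities are unchanged by the transformation; only the outcomes are shifted and rescaled, which is why the same `probs` list is reused for $Z$.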

If $X$ is normally distributed with expected value $\mu$ and variance $\sigma^2$, then $Z = \frac{X - \mu}{\sigma}$ is standard normally distributed, i.e. $Z \sim \mathcal{N}(0, 1)$.
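
One practical consequence is that probabilities for any normal distribution can be read off the standard normal distribution. A minimal sketch using `NormalDist` from Python's standard `statistics` module follows; the parameters $\mu = 100$, $\sigma = 15$ and the point $x = 130$ are assumptions chosen for illustration.

```python
from statistics import NormalDist

# Assumed parameters of a normally distributed X.
mu, sigma = 100.0, 15.0
x = 130.0

# Standardize the point x.
z = (x - mu) / sigma

# P(X <= x) computed directly from N(mu, sigma^2) ...
p_direct = NormalDist(mu, sigma).cdf(x)
# ... and via the standard normal distribution N(0, 1) at z.
p_standard = NormalDist(0.0, 1.0).cdf(z)

# Both probabilities agree up to floating-point rounding.
print(z, p_direct, p_standard)
```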

Differentiation from studentization

Many statistical programs, such as SPSS and Statistica, have a built-in option for standardizing measurement results. Strictly speaking, this is studentization, since the exact distribution of the underlying random variables is not known, so the arithmetic mean must be used in place of the expected value and the empirical variance in place of the variance. In practice, however, the terms studentization and standardization are often used synonymously.
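
The distinction can be sketched in code: with observed data, the unknown $\mu$ and $\sigma$ are replaced by estimates from the sample. The example below uses made-up measurements and assumes the empirical variance is computed with the $n - 1$ denominator, which is what `statistics.stdev` does; other software may use $n$ instead.

```python
from statistics import mean, stdev

# Made-up measurements; the true mu and sigma are treated as unknown.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

x_bar = mean(sample)  # arithmetic mean replaces the expected value
s = stdev(sample)     # sample standard deviation (n - 1 denominator, an assumption)

# Studentized values: same form as the z-score, but with estimated parameters.
studentized = [(x - x_bar) / s for x in sample]

# By construction, the studentized sample has mean 0 and sample standard
# deviation 1, up to floating-point rounding.
print(mean(studentized), stdev(studentized))
```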

Word origin

The z-notation has historically been used for different purposes. Today it mostly stands for a standard normally distributed random variable. The variable $z$ and the term z-distribution were introduced in 1924 by R. A. Fisher in his work "On a Distribution Yielding the Error Functions of Several Well Known Statistics".

Literature

Bortz, Schuster: Statistics for human and social scientists. 7th edition. Springer, 2001.
Falk et al.: Foundations of statistical analyzes and applications with SAS. Birkhäuser, 2002.

References

↑ Jeffrey Wooldridge: Introductory econometrics: A modern approach. 5th edition. Nelson Education, p. 736.
↑ Jeffrey Wooldridge: Introductory econometrics: A modern approach. 4th edition. Nelson Education, 2015, p. 728.
↑ John Aldrich: Earliest Uses of Symbols in Probability and Statistics
