Rank (statistics)

In a series of statistical observations, the rank of an individual observation is its position if all observation values are ordered according to size and numbered.

It is possible that two or more observations have the same value. Such values are called ties (tied values). In that case the rank is not uniquely defined.

In probability theory, however, the ranks are almost surely uniquely defined if the individual observations are independent and continuously distributed. A number of statistical tests in nonparametric statistics are based on evaluating the ranks within samples. The observation values arranged according to their rank are called order statistics.

Definition

The observation values are sorted according to size. If no value occurs more than once, the smallest value is usually assigned rank 1, the next larger (i.e. the second smallest) rank 2, and so on. Possible procedures for values that occur several times (so-called ties) are listed below.

The usual notation for the observation value with rank $i$ is $x_{(i)}$.
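The definition for the tie-free case can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `ranks_without_ties` and `order_statistic` and the sample values are made up for this sketch and do not come from the sources cited here.

```python
def ranks_without_ties(values):
    """Return the rank (1 = smallest) of each value; all values must be distinct."""
    if len(set(values)) != len(values):
        raise ValueError("values contain ties, so the rank is not uniquely defined")
    # Indices of the observations, ordered by increasing observation value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def order_statistic(values, i):
    """Return x_(i), the observation value with rank i (1-based)."""
    return sorted(values)[i - 1]


# Made-up sample without ties (illustrative values only).
sample = [3.1, 0.4, 2.7, 1.9]
sample_ranks = ranks_without_ties(sample)   # rank of each observation
second_smallest = order_statistic(sample, 2)  # x_(2)
```

Here `sample_ranks[k]` is the rank of the observation at position `k`, while `order_statistic(sample, i)` goes in the opposite direction, from a rank to the value that carries it.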

Example

The following observations were made for the monthly expenditures for leisure goods and vacation in two-person households:

Observation number	1	2	3	4
Observation value	220	240	220	180
Rank	2 or 3	4	2 or 3	1

So $x_{(4)} = 240 = x_2$, i.e. $x_{(4)}$ is the observation value with rank $4$ and $x_2$ is the second observation value in the data series.

The observations can be arranged in a ranking list:

List rank	Observation number	Observation value
1	4	180
2-3	1	220
2-3	3	220
4	2	240
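A ranking list of this kind can be produced by sorting the observations and grouping equal values. The following is a rough sketch under the assumption that a tied group is reported with the range of list ranks it occupies; the function name `ranking_list` and the row layout are choices made for this illustration.

```python
from itertools import groupby


def ranking_list(values):
    """Return rows of ((first_rank, last_rank), observation_numbers, value)."""
    # Pair each value with its 1-based observation number, then sort by value.
    # sorted() is stable, so tied observations keep their original order.
    indexed = sorted(enumerate(values, start=1), key=lambda pair: pair[1])
    rows = []
    next_rank = 1
    for value, group in groupby(indexed, key=lambda pair: pair[1]):
        numbers = [number for number, _ in group]
        first, last = next_rank, next_rank + len(numbers) - 1
        rows.append(((first, last), numbers, value))
        next_rank = last + 1
    return rows


# Observation values from the example above.
expenditures = [220, 240, 220, 180]
rows = ranking_list(expenditures)
```

For the example data, the row for the value 220 carries the rank range (2, 3) and the observation numbers 1 and 3, which mirrors the two tied lines of the ranking list.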

Ties

In practice, observed values can occur several times; one then says that ties occur in the observed values. Since observations with the same value should not receive different ranks, ties must be handled explicitly. Because sums of ranks are often considered in statistics, a requirement frequently placed on tie-handling methods is that the sum of the ranks of $n$ observations equals $1 + 2 + 3 + \ldots + n = \tfrac{n(n+1)}{2}$.

Various methods can be used to obtain a unique ranking:

Average: Observations with the same value are assigned the arithmetic mean of the ranks falling on them.

Example: The following observations were made for the monthly expenditures for leisure goods and vacation in two-person households:

Observation number	1	2	3	4	5	6	7	8	9	10
Observation value	125	315	215	105	200	170	170	220	220	220
Rank	2	10	6	1	5	3.5	3.5	8	8	8

Ranks 3 and 4 fall on the observation values 170. Their arithmetic mean is $\tfrac{3 + 4}{2} = 3.5$.
Ranks 7, 8 and 9 fall on the observation values 220. Their arithmetic mean is $\tfrac{7 + 8 + 9}{3} = 8$.

Randomization: Observations with the same value are each randomly assigned one of the ranks falling on them.

A fortiori method: If a test is carried out, the ranking is chosen in such a way that the null hypothesis $H_0$ is favored.
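The averaging and randomization methods can be sketched as follows. This is only one possible implementation in plain Python; the function names `average_ranks` and `random_tie_ranks`, and the trick of breaking ties with a random secondary sort key, are assumptions made for this illustration. The a fortiori method is not sketched here because it depends on the specific test and null hypothesis.

```python
import random


def average_ranks(values):
    """Assign each tied group the arithmetic mean of the ranks falling on it."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` to the last position holding the same value.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # The group occupies ranks start+1 .. end+1; their mean is the midpoint.
        mean_rank = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def random_tie_ranks(values, rng=None):
    """Assign tied observations the ranks falling on them in random order."""
    if rng is None:
        rng = random.Random()
    # A random number as secondary sort key shuffles observations within a tie.
    keyed = sorted((value, rng.random(), index) for index, value in enumerate(values))
    ranks = [0] * len(values)
    for position, (_, _, index) in enumerate(keyed, start=1):
        ranks[index] = position
    return ranks


# Observation values from the averaging example above.
expenditures = [125, 315, 215, 105, 200, 170, 170, 220, 220, 220]
averaged = average_ranks(expenditures)
randomized = random_tie_ranks(expenditures, random.Random(0))  # seeded for repeatability
```

With averaging, the two observations of 170 both receive 3.5 and the three observations of 220 all receive 8. With randomization, the same observations share out the ranks 3 and 4, and 7, 8 and 9, in an order that depends on the random number generator.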

Properties

The sum of the ranks of a data series with $n$ observations is

$1 + 2 + 3 + \ldots + n = \frac{n(n+1)}{2}$

(Gauss summation formula). This property is also retained when the arithmetic mean is used to calculate the ranks of ties.
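The rank-sum property can be checked numerically on the ten observations from the averaging example. The sketch below assumes a counting formulation of the average rank: a value with `less` strictly smaller observations and `equal` observations of the same value occupies ranks `less + 1` to `less + equal`, whose mean is `less + (equal + 1) / 2`. The helper name `average_rank` is hypothetical.

```python
def average_rank(values, x):
    """Average rank of the value x within values, by counting."""
    less = sum(1 for v in values if v < x)    # observations strictly smaller
    equal = sum(1 for v in values if v == x)  # observations tied with x (including x)
    return less + (equal + 1) / 2


# Observation values from the averaging example in the Ties section.
expenditures = [125, 315, 215, 105, 200, 170, 170, 220, 220, 220]
ranks = [average_rank(expenditures, x) for x in expenditures]

n = len(expenditures)
rank_sum = sum(ranks)
expected_sum = n * (n + 1) / 2
```

Comparing `rank_sum` with `expected_sum` shows that averaging leaves the total unchanged: each tied group receives the same total rank it would have received under any ordering of its members.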

References

Ulrich Krengel: Introduction to probability theory and statistics. 8th edition. Vieweg, 2005, pp. 187-188.
Roland Jeske: Having fun with statistics. 4th edition. Oldenbourg, 2003, pp. 172-173.
Jürgen Bortz, Gustav A. Lienert, Klaus Boehnke: Distribution-free methods in biostatistics. 3rd edition. Springer Verlag, 2008, pp. 69-70.