Trend model

from Wikipedia, the free encyclopedia

The seasonal trend model is the traditional approach to time series analysis. The series is described by a mathematical model that comprises the following components:

  • a trend component,
  • a seasonal component and
  • a residual (noise) component.

If one of these is missing, for example the seasonal component, one speaks of a trend model.

Model construction

If $x_t$ is the observed time series, a trend $\hat{x}_t$ is estimated first. Linear, polynomial or exponential trends are possible, as are moving averages.

An additive or multiplicative seasonal component can then be estimated from the residuals. The assumption is that the deviations of the observed values from the trend function follow a seasonal pattern.

Example

The graphic below shows the unemployment figures in the Federal Republic of Germany from January 2005 to December 2008 together with a linear trend function (top left). The top right panel shows the deviation between the observed unemployment figures and the trend estimates. In the spring of each year the trend function underestimates the unemployment figures, and in the fall it overestimates them (same color = same month). The bottom left panel shows the deviation for each month, averaged over the years. This averaged deviation is added to the trend function for the corresponding month, which yields the seasonal trend model (red line) in the bottom right panel.

Linear trend with additive seasonal fluctuations for unemployment figures in Germany 2005–2008.

Trend estimation

Different trend models for the unemployment figures in Germany from 2005–2011

The trend of a time series describes its global course. Different regression approaches are used to estimate it:

  • a linear or polynomial model: $\hat{x}_t = b_0 + b_1 t + \dots + b_p t^p$,
  • an exponential model: $\hat{x}_t = b_0 \cdot b_1^t$, or
  • moving averages of a sufficiently high order.

Linear or polynomial trend model

In the linear or polynomial trend model, a linear or polynomial regression of the observations on the time variable $t$ is carried out to estimate the trend:

$$\hat{x}_t = \hat{b}_0 + \hat{b}_1 t + \dots + \hat{b}_p t^p$$

While the estimated coefficients $\hat{b}_0$, $\hat{b}_1$, ... depend on how time is parameterized, the estimated trend values $\hat{x}_t$ do not.

The following table shows two parameterizations of time for a linear trend model:

  • in the first trend model,
    • January 2005 corresponds to $t = 1$ and
    • February 2005 to $t = 2$;
  • in the second trend model,
    • January 2005 corresponds to $t = -83$ and
    • February 2005 to $t = -81$.

This fixes the values of $t$ for all subsequent months in both parameterizations.

Time          Unemployed       Linear trend model 1                Linear trend model 2
              (in millions)    $t$     $\hat{x}_t$                 $t$     $\hat{x}_t$
Jan 2005      5.09             1       4.80                        −83     4.80
Feb 2005      5.29             2       4.77                        −81     4.77
March 2005    5.27             3       4.75                        −79     4.75
...           ...              ...     ...                         ...     ...
Dec 2011      2.78             84      2.63                        +83     2.63
Trend model                    $\hat{x}_t = 4.825 - 0.02615\,t$    $\hat{x}_t = 3.71363 - 0.013075\,t$

Since both parameterizations produce the same estimated values, you can choose any one:

  • The first parameterization allows an easy interpretation of the trend function $\hat{x}_t = 4.825 - 0.02615\,t$. Starting from an unemployment figure of 4.825 million in December 2004 ($t = 0$), the number of unemployed falls by an average of around 26,150 people per month until December 2011.
  • The second parameterization is useful if the regression coefficients have to be calculated by hand. Among other things, the arithmetic mean $\bar{t}$ is required, which here is $\bar{t} = 0$. The intercept also shows that an average of 3.71363 million people were unemployed between January 2005 and December 2011.
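The invariance of the fitted trend values can be checked numerically. The following is a minimal sketch in plain Python, not the article's own computation: the helper name `fit_line` is hypothetical, and only the twelve monthly values for 2005 from the table in the section on seasonal estimation are used, so the coefficients differ from those of the full 84-month series.

```python
def fit_line(t, x):
    """Ordinary least squares for x ~ b0 + b1 * t; returns (b0, b1)."""
    n = len(t)
    t_mean = sum(t) / n
    x_mean = sum(x) / n
    sxy = sum((ti - t_mean) * (xi - x_mean) for ti, xi in zip(t, x))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    b1 = sxy / sxx
    b0 = x_mean - b1 * t_mean
    return b0, b1


# Unemployment figures for Jan-Dec 2005 (in millions), an excerpt only.
x = [5.09, 5.29, 5.27, 5.05, 4.88, 4.78, 4.84, 4.80, 4.65, 4.56, 4.53, 4.60]
n = len(x)

t1 = list(range(1, n + 1))        # parameterization 1: 1, 2, ..., n
t2 = list(range(-(n - 1), n, 2))  # parameterization 2: -(n-1), ..., n-1

b0_1, b1_1 = fit_line(t1, x)
b0_2, b1_2 = fit_line(t2, x)

# The coefficients differ, but the fitted trend values coincide.
trend1 = [b0_1 + b1_1 * t for t in t1]
trend2 = [b0_2 + b1_2 * t for t in t2]
```

Because the second parameterization has mean zero, its intercept equals the mean of the observations, and its slope is half that of the first, since $t$ advances by 2 per month.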

With the available data, however, a linear trend function is unsuitable because it reflects the global course of the time series only poorly, as the previous graphic shows. The graphic also shows that a quadratic trend function of the form $\hat{x}_t = \hat{b}_0 + \hat{b}_1 t + \hat{b}_2 t^2$ would fit better.

Exponential model

Number of phones in the United States from 1891 to 1979 with a linear and an exponential trend

An exponential trend model is used when the data suggest it. The graph on the right shows the number of phones (in thousands) in the US from 1891 to 1979, together with an exponential and a linear trend function. The exponential trend clearly describes the data better than the linear trend.

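Furthermore, the exponential trend model

$$\hat{x}_t = \hat{b}_0 \cdot \hat{b}_1^{\,t}$$

has the advantage that back-transforming the regression fitted on the logarithmic scale,

$$\hat{x}_t = \exp\left(\widehat{\log b_0} + \widehat{\log b_1}\, t\right),$$

yields an estimated value $\hat{x}_t > 0$ for every $t$.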

The regression coefficients are estimated by reduction to the linear model: taking logarithms turns $x_t = b_0 \cdot b_1^t$ into $\log x_t = \log b_0 + t \log b_1$, and $\log b_0$ and $\log b_1$ are then estimated by linear regression.
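The reduction to the linear model can be sketched as follows. This is an illustration under assumptions, not the computation behind the telephone graph: the function names `fit_line` and `fit_exponential` are hypothetical, and the data are made up so that the result is easy to verify.

```python
import math


def fit_line(t, y):
    """Ordinary least squares for y ~ a0 + a1 * t; returns (a0, a1)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    a1 = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sum(
        (ti - t_mean) ** 2 for ti in t
    )
    return y_mean - a1 * t_mean, a1


def fit_exponential(t, x):
    """Fit x ~ b0 * b1**t by regressing log(x) on t, then back-transform."""
    log_b0, log_b1 = fit_line(t, [math.log(xi) for xi in x])
    return math.exp(log_b0), math.exp(log_b1)


# Hypothetical series that doubles in every period: x_t = 1 * 2**t.
t = [1, 2, 3, 4, 5]
x = [2.0, 4.0, 8.0, 16.0, 32.0]

b0, b1 = fit_exponential(t, x)
trend = [b0 * b1 ** ti for ti in t]  # positive for every t by construction
```

Since both coefficients come out of `math.exp`, they are positive, which is why every trend value is positive as well. The approach requires all observations to be strictly positive, because their logarithm is taken.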

In contrast to the linear or polynomial trend function, both the estimated regression coefficients and the estimated values $\hat{x}_t$ depend on how time is parameterized. In the graph, fixed values of $t$ are assigned to the years 1891 and 1892, which determines the parameterization for all later years.

Moving averages

Another alternative for trend estimation is a moving average of sufficiently high order $k$. The value $\hat{x}_t$ is calculated as the average of the observed values around the point $t$. A distinction must be made between odd and even orders:

$$\hat{x}_t = \frac{1}{k} \sum_{i=-m}^{m} x_{t+i} \quad \text{for odd } k = 2m + 1,$$

$$\hat{x}_t = \frac{1}{k} \left( \tfrac{1}{2} x_{t-m} + \sum_{i=-m+1}^{m-1} x_{t+i} + \tfrac{1}{2} x_{t+m} \right) \quad \text{for even } k = 2m.$$

In the case of an even order, the boundary points $x_{t-m}$ and $x_{t+m}$ enter with weight 1/2 and all points between them with weight 1.

However, this is only one way to calculate moving averages; for more, see the main article Moving Average.
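A centered moving average with the weighting described above might be implemented as in the following sketch. The function name `moving_average` is hypothetical; the data are the first fourteen monthly unemployment figures (January 2005 to February 2006, in millions) from the table in the section on seasonal estimation.

```python
def moving_average(x, k):
    """Centered moving average of order k.

    Returns a list of the same length as x, with None wherever the
    window would extend beyond the data.
    """
    m = k // 2
    out = [None] * len(x)
    for t in range(m, len(x) - m):
        if k % 2 == 1:
            # Odd order: k = 2m + 1 points, all with weight 1.
            total = sum(x[t - m : t + m + 1])
        else:
            # Even order: boundary points with weight 1/2, inner points 1.
            total = 0.5 * x[t - m] + sum(x[t - m + 1 : t + m]) + 0.5 * x[t + m]
        out[t] = total / k
    return out


x = [5.09, 5.29, 5.27, 5.05, 4.88, 4.78, 4.84,
     4.80, 4.65, 4.56, 4.53, 4.60, 5.01, 5.05]

trend = moving_average(x, 13)
print(round(trend[6], 2))  # July 2005: 4.87, as in the article's table
```

With order 13 the first and last six positions stay empty, which is the margin problem discussed below for the full series.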

Moving averages of various orders for estimating the trend in unemployment figures in Germany from 2005 to 2011

However, the moving averages pose three problems:

  1. Which order should one choose for the trend estimation? If the order is too small, the moving average may also capture the seasonality of the data. If the order is too large, then the trend no longer adapts so well to the data. The graphic shows different orders: seven corresponds to a quarter before and after, thirteen corresponds to half a year before and after and twenty-five corresponds to a year before and after.
  2. At the margins, i.e. January 2005 and December 2011 in the graphic opposite, estimated values can no longer be calculated, since the data set contains neither values before January 2005 nor values after December 2011.
  3. With the linear, polynomial and exponential trend models, one can in principle also extrapolate into the future. This is not possible with a moving average, as the future values would have to be known. It is therefore only suitable for describing the data.

The advantage of moving averages, however, is that they better fit a non-linear trend in the data.

Seasonal estimate

The seasonal estimate is based on the assumption that there is a structure in the time series that is repeated seasonally. The length of a season is known in advance. When it comes to unemployment figures, it is known that due to the weather conditions, unemployment figures rise regularly towards winter, while they fall again towards summer. So there is an annual pattern in the data.

In essence, seasonal fluctuations are modeled either additively or multiplicatively:

$$x_t = \hat{x}_t + s_j \quad \text{(additive)} \qquad \text{or} \qquad x_t = \hat{x}_t \cdot s_j \quad \text{(multiplicative)},$$

with $\hat{x}_t$ the value from a trend estimate and $j$ an index that repeats every season.

The following table shows the unemployment figures in Germany from January 2005 to December 2011 ($x_t$), a trend estimate ($\hat{x}_t$) from a moving average of order 13, and the deviations between the observed values and the trend estimate for an additive and a multiplicative seasonal model.

Time          Unemployed       Trend estimate      Add. deviation       Mult. deviation      Season index
              (in millions)    (mov. avg., 13)     $x_t - \hat{x}_t$    $x_t / \hat{x}_t$    $j$
Jan 2005      5.09             -                   -                    -                    1
Feb 2005      5.29             -                   -                    -                    2
March 2005    5.27             -                   -                    -                    3
Apr 2005      5.05             -                   -                    -                    4
May 2005      4.88             -                   -                    -                    5
Jun 2005      4.78             -                   -                    -                    6
Jul 2005      4.84             4.87                −0.04                0.993                7
Aug 2005      4.80             4.87                −0.07                0.985                8
Sep 2005      4.65             4.85                −0.20                0.959                9
Oct 2005      4.56             4.81                −0.25                0.947                10
Nov 2005      4.53             4.77                −0.24                0.950                11
Dec 2005      4.60             4.73                −0.13                0.973                12
Jan 2006      5.01             4.70                +0.31                1.066                1
Feb 2006      5.05             4.67                +0.38                1.082                2
March 2006    4.98             4.62                +0.35                1.077                3
Apr 2006      4.79             4.58                +0.21                1.046                4
May 2006      4.54             4.54                0.00                 1.000                5
Jun 2006      4.40             4.50                −0.10                0.978                6
Jul 2006      4.39             4.47                −0.08                0.981                7
Aug 2006      4.37             4.41                −0.04                0.991                8
Sep 2006      4.24             4.34                −0.10                0.977                9
Oct 2006      4.08             4.26                −0.17                0.959                10
Nov 2006      4.00             4.18                −0.19                0.955                11
Dec 2006      4.01             4.12                −0.11                0.974                12
...           ...              ...                 ...                  ...                  ...

Additive seasonal variation

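A season index $j = 1, \dots, J$ is assigned to each point in time of a season of given length $J$. Then the difference between the observed value and the estimated trend value is formed:

$$s_t = x_t - \hat{x}_t.$$

All values $s_t$ that belong to a fixed $j$ are then averaged to give $\hat{s}_j$.

In the unemployment example ($J = 12$), all January deviations are averaged ($j = 1$):

$$\hat{s}_1 = \operatorname{mean}(0.31,\ 0.20,\ \dots) = 0.23$$

This is repeated for all months up to December ($j = 12$):

$$\hat{s}_{12} = \operatorname{mean}(-0.13,\ -0.11,\ \dots) = -0.12$$

The final time series estimate $\hat{x}_t + \hat{s}_j$ can then be calculated from the trend estimate $\hat{x}_t$ and the averaged seasonal deviations $\hat{s}_j$.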
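The additive procedure can be sketched in a few lines of Python. The helper name `seasonal_means` is hypothetical, and only the 18 months from July 2005 to December 2006 shown in the table above are used (with the rounded trend values), so the averages are based on at most two years and differ from the article's figures, which use the whole series.

```python
def seasonal_means(values, season):
    """Average all values that share the same season index j."""
    groups = {}
    for v, j in zip(values, season):
        groups.setdefault(j, []).append(v)
    return {j: sum(vs) / len(vs) for j, vs in groups.items()}


# Observed values x_t and moving-average trend, Jul 2005 to Dec 2006.
x = [4.84, 4.80, 4.65, 4.56, 4.53, 4.60, 5.01, 5.05, 4.98,
     4.79, 4.54, 4.40, 4.39, 4.37, 4.24, 4.08, 4.00, 4.01]
trend = [4.87, 4.87, 4.85, 4.81, 4.77, 4.73, 4.70, 4.67, 4.62,
         4.58, 4.54, 4.50, 4.47, 4.41, 4.34, 4.26, 4.18, 4.12]

# Season index j (1 = January, ..., 12 = December); the excerpt starts in July.
season = [(6 + i) % 12 + 1 for i in range(len(x))]

deviation = [xi - ti for xi, ti in zip(x, trend)]  # s_t = x_t - trend_t
s_hat = seasonal_means(deviation, season)          # averaged per month
estimate = [ti + s_hat[j] for ti, j in zip(trend, season)]
```

For December the two deviations −0.13 and −0.11 average to −0.12, so the estimate for December 2006 is 4.12 − 0.12 = 4.00, which happens to agree with the article's table.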

Time        $x_t$    $\hat{x}_t$    $x_t - \hat{x}_t$    $j$    $\hat{s}_j$    $\hat{x}_t + \hat{s}_j$
Jan 2005    5.09     -              -                    1      -              -
...         ...      ...            ...                  ...    ...            ...
Dec 2005    4.60     4.73           −0.13                12     −0.12          4.61
Jan 2006    5.01     4.70           0.31                 1      0.23           4.93
...         ...      ...            ...                  ...    ...            ...
Dec 2006    4.01     4.12           −0.11                12     −0.12          4.00
Jan 2007    4.26     4.06           0.20                 1      0.23           4.29
...         ...      ...            ...                  ...    ...            ...

Multiplicative seasonal variation

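A season index $j = 1, \dots, J$ is assigned to each point in time of a season of given length $J$. Then the quotient of the observed value and the estimated trend value is formed:

$$s_t = \frac{x_t}{\hat{x}_t}.$$

All values $s_t$ that belong to a fixed $j$ are then averaged to give $\hat{s}_j$.

In the unemployment example ($J = 12$), all January quotients are averaged ($j = 1$):

$$\hat{s}_1 = \operatorname{mean}(1.066,\ 1.049,\ \dots) = 1.063$$

This is repeated for all months up to December ($j = 12$):

$$\hat{s}_{12} = \operatorname{mean}(0.973,\ 0.974,\ \dots) = 0.967$$

The final time series estimate $\hat{x}_t \cdot \hat{s}_j$ can then be calculated from the trend estimate $\hat{x}_t$ and the averaged seasonal factors $\hat{s}_j$.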
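The multiplicative variant differs from the additive one only in that quotients replace differences and the final estimate is a product. The sketch below makes the same assumptions as before: the helper name `seasonal_means` is hypothetical, and only the months from July 2005 to December 2006 with rounded trend values are used, so the factors differ slightly from the article's figures.

```python
def seasonal_means(values, season):
    """Average all values that share the same season index j."""
    groups = {}
    for v, j in zip(values, season):
        groups.setdefault(j, []).append(v)
    return {j: sum(vs) / len(vs) for j, vs in groups.items()}


# Observed values x_t and moving-average trend, Jul 2005 to Dec 2006.
x = [4.84, 4.80, 4.65, 4.56, 4.53, 4.60, 5.01, 5.05, 4.98,
     4.79, 4.54, 4.40, 4.39, 4.37, 4.24, 4.08, 4.00, 4.01]
trend = [4.87, 4.87, 4.85, 4.81, 4.77, 4.73, 4.70, 4.67, 4.62,
         4.58, 4.54, 4.50, 4.47, 4.41, 4.34, 4.26, 4.18, 4.12]
season = [(6 + i) % 12 + 1 for i in range(len(x))]  # 7, 8, ..., 12, 1, ...

quotient = [xi / ti for xi, ti in zip(x, trend)]  # s_t = x_t / trend_t
s_hat = seasonal_means(quotient, season)          # seasonal factor per month
estimate = [ti * s_hat[j] for ti, j in zip(trend, season)]
```

A factor above 1 (as in January) marks a month in which unemployment lies above the trend, a factor below 1 (as in December) a month in which it lies below.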

Time        $x_t$    $\hat{x}_t$    $x_t / \hat{x}_t$    $j$    $\hat{s}_j$    $\hat{x}_t \cdot \hat{s}_j$
Jan 2005    5.09     -              -                    1      -              -
...         ...      ...            ...                  ...    ...            ...
Dec 2005    4.60     4.73           0.973                12     0.967          4.58
Jan 2006    5.01     4.70           1.066                1      1.063          5.00
...         ...      ...            ...                  ...    ...            ...
Dec 2006    4.01     4.12           0.974                12     0.967          3.98
Jan 2007    4.26     4.06           1.049                1      1.063          4.32
...         ...      ...            ...                  ...    ...            ...

Goodness of fit of a seasonal trend model

Since there are different options for both the trend estimate and the seasonal estimate, the question arises as to which model is best. Because both components can be non-linear, one cannot necessarily proceed in two stages, i.e. first choose the "best" trend model and then select the best seasonal model for it; instead, each combination of trend and seasonal estimate should be assessed as a whole.

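By analogy with linear regression, a coefficient of determination is defined for a seasonal trend model:

$$R^2 = 1 - \frac{\sum_t (x_t - \tilde{x}_t)^2}{\sum_t (x_t - \bar{x})^2}$$

where $\tilde{x}_t$ is the model estimate for $x_t$ and $\bar{x}$ is the mean of all $x_t$ for which a prediction is made. As a rule, the coefficient of determination of a seasonal trend model is significantly greater than that of linear regression.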
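A small sketch of this coefficient of determination follows. The function name `r_squared` is hypothetical, and the five rows are taken from the additive example table purely for illustration, so the result says nothing about the fit over the whole series.

```python
def r_squared(observed, estimated):
    """Coefficient of determination over the points that have a prediction."""
    pairs = [(x, e) for x, e in zip(observed, estimated) if e is not None]
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    ss_res = sum((x - e) ** 2 for x, e in pairs)   # unexplained variation
    ss_tot = sum((x - mean_x) ** 2 for x, _ in pairs)  # total variation
    return 1 - ss_res / ss_tot


# Jan 2005, Dec 2005, Jan 2006, Dec 2006, Jan 2007 (in millions).
# January 2005 has no prediction and is therefore left out of the mean.
observed = [5.09, 4.60, 5.01, 4.01, 4.26]
estimated = [None, 4.61, 4.93, 4.00, 4.29]

r2 = r_squared(observed, estimated)
```

Restricting the mean to the points with a prediction matters for moving-average trends, where the margins of the series have no estimate.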

The following table shows the coefficients of determination for various trend and trend season models for the unemployment data in Germany from January 2005 to December 2011.

Trend model             Linear                       Exponential                  Moving average
$R^2$                   0.817                        0.765                        0.917
Seasonal fluctuation    additive   multiplicative    additive   multiplicative    additive   multiplicative
$R^2$                   0.868      0.870             0.791      0.767             0.993      0.994

The graphic shows the nine models: three trend models, each without a seasonal component and with an additive or a multiplicative one. Neither the blue (linear trend) nor the green (exponential trend) models fit the data well. The red models (moving averages) fit the data best.

Various seasonal trend models for unemployment data in Germany from January 2005 to December 2011
