# ARMA model

ARMA models (ARMA, acronym for **a**uto**r**egressive **m**oving **a**verage) and their extensions (ARMAX models and ARIMA models) are linear, discrete-time models for stochastic processes. They are used for the statistical analysis of time series, especially in economics, the social sciences and engineering. The specification, estimation, validation and practical application of ARMA models are dealt with in the Box-Jenkins approach. The most important application is short-term forecasting. These models take the form of linear difference equations and are used to represent linear stochastic processes or to approximate more complex processes.

## Mathematical representation

If both past noise terms and past values of the time series itself enter the model, one speaks of a mixed ARMA model. If only current and past noise terms appear, it is a (pure) moving-average or MA model. If, besides the current noise term, only past values of the time series are included, it is a (pure) autoregressive or AR model.

### Moving Average or MA model

$$y_{t} = c + \epsilon_{t} + \sum_{j=1}^{q} b_{j}\,\epsilon_{t-j}$$

The signal $y_{t}$ to be modeled is formed as a weighted moving average of the noise terms $\epsilon_{t-j}$, $j = 0, \ldots, q$, of the current and the $q$ previous periods, plus a constant $c$. The MA coefficients $b_{j}$, $j = 1, \ldots, q$, give the weight with which each noise term enters the signal. The noise terms $\epsilon_{t}$ are assumed to be independent over time and identically (usually Gaussian) distributed, with expected value 0 and variance $0 < \sigma^{2} < \infty$.

### Autoregressive or AR model

$$y_{t} = c + \epsilon_{t} + \sum_{i=1}^{p} a_{i}\,y_{t-i}$$

The signal is composed of a constant $c$, a noise term $\epsilon_{t}$ and a weighted sum of the $p$ previous signal values, with the AR coefficients $a_{i}$, $i = 1, \ldots, p$, as weights.

### ARMA model

$$y_{t} = c + \epsilon_{t} + \sum_{i=1}^{p} a_{i}\,y_{t-i} + \sum_{j=1}^{q} b_{j}\,\epsilon_{t-j}$$

This model is also known as the ARMA(p, q) model, where p and q denote the autoregressive and moving-average orders of the process, respectively. Pure AR(p) or MA(q) models are special ARMA models with q = 0 or p = 0.
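As a minimal sketch, the ARMA difference equation above can be iterated directly to generate a sample path; the function below is illustrative (the coefficient lists `a` and `b` follow the notation of the text, all other names are our own):

```python
import random

def simulate_arma(a, b, c=0.0, n=200, sigma=1.0, seed=0):
    """Simulate an ARMA(p, q) process
    y_t = c + eps_t + sum_i a_i*y_{t-i} + sum_j b_j*eps_{t-j}.
    Illustrative sketch; earlier terms than available are treated as 0."""
    rng = random.Random(seed)
    p, q = len(a), len(b)
    y, eps = [], []
    for t in range(n):
        e = rng.gauss(0.0, sigma)  # white-noise term eps_t
        ar = sum(a[i] * y[t - 1 - i] for i in range(min(p, t)))
        ma = sum(b[j] * eps[t - 1 - j] for j in range(min(q, t)))
        eps.append(e)
        y.append(c + e + ar + ma)
    return y

# ARMA(1, 1) with a1 = 0.5, b1 = 0.3; the theoretical mean is c/(1 - a1) = 2
series = simulate_arma(a=[0.5], b=[0.3], c=1.0, n=500)
```

Setting `b=[]` or `a=[]` yields the pure AR or MA special cases described above.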

With the help of the so-called shift or lag operator $L$ (from *lag*, "time shift"), defined by $L^{d} x_{t} = x_{t-d}$, one also writes more compactly:

$$a(L)\,y_{t} = c + b(L)\,\epsilon_{t}$$

where $a(\cdot)$ and $b(\cdot)$ are polynomials (of degrees p and q):

$$a(x) = 1 - a_{1}x - \cdots - a_{p}x^{p},$$
$$b(x) = 1 + b_{1}x + \cdots + b_{q}x^{q}.$$
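In this notation, applying the lag polynomial $a(L)$ to a series is just a finite weighted sum of lagged values. A small sketch (function name is our own):

```python
def apply_ar_poly(a, y, t):
    """Evaluate a(L) y_t = y_t - a1*y_{t-1} - ... - ap*y_{t-p}
    for the coefficient list a = [a1, ..., ap]; requires t >= len(a).
    Illustrative sketch of the lag-operator notation."""
    return y[t] - sum(a[i] * y[t - 1 - i] for i in range(len(a)))

# For an exact AR(1) series y_t = 0.5*y_{t-1} (no constant, no noise),
# a(L) y_t vanishes for t >= 1:
y = [8.0, 4.0, 2.0, 1.0]
print(apply_ar_poly([0.5], y, 3))  # 1.0 - 0.5*2.0 = 0.0
```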

### Alternative representations

#### Pure MA representation

Assume that $a$ and $b$ have no common zeros. Then an ARMA process can be expressed as an MA($\infty$) process if and only if $|z| > 1$ for all zeros $z \in \mathbb{C}$ of $a$. Under these conditions the process has a representation of the form

$$y_{t} = \frac{c}{a(L)} + \frac{b(L)}{a(L)}\,\epsilon_{t} = \mu + \sum_{j=0}^{\infty} c_{j}\,\epsilon_{t-j}$$

where the expected value of $y_{t}$ is given by $\mu = c/a(1)$ and the coefficients $c_{j}$ of the pure MA representation are given by the power series $c(L) = b(L)/a(L)$.

#### Pure AR representation

The analogue of the pure MA representation is the pure AR representation. It requires the process to be invertible, i.e. all zeros of the MA polynomial $b(L)$ must have modulus greater than one. Then

$$d(L)\,y_{t} = \frac{a(L)}{b(L)}\,y_{t} = \frac{c}{b(L)} + \epsilon_{t}$$

or, equivalently,

$$y_{t} = \nu + \epsilon_{t} + \sum_{j=1}^{\infty} d_{j}\,y_{t-j}$$

## Special cases and extensions

### White noise

If $y_{t}$ is an ARMA(0, 0) process, i.e. simply the noise term plus a possible constant, $y_{t} = c + \epsilon_{t}$, one speaks of white noise.

### Random walk

A random walk is a first-order AR process (p = 1) in which the AR coefficient has the value 1, i.e.

$$y_{t} = c + y_{t-1} + \epsilon_{t}$$

If $c \neq 0$ holds for the constant, one speaks of a random walk with drift, otherwise of a random walk without drift. A random walk is always integrated of order 1.

### ARIMA

In the case of non-stationary time series, stationarity can often be induced by taking differences. The first difference of $y_{t}$ is defined by $\Delta y_{t} = y_{t} - y_{t-1}$, where $\Delta = 1 - L$ is the so-called difference operator. If the d-th difference $\Delta^{d} y_{t}$ is modeled as an ARMA(p, q) process, one speaks of an integrated ARMA model of orders p, d and q, or in short: an ARIMA(p, d, q) model. Values for the original, undifferenced time series $y_{t}$ are obtained by integrating $\Delta^{d} y_{t}$ d times (undoing the differencing).

### ARMAX

If one or more exogenous variables are required to model the time series $y_{t}$, one speaks of an ARMAX model. In the case of one exogenous variable $x_{t}$:

$$a(L)\,y_{t} = c + b(L)\,\epsilon_{t} + e(L)\,x_{t}$$

where the polynomial $e(L)$ describes the lag structure with which the exogenous variable $x_{t}$ influences the variable to be explained, $y_{t}$.

### Seasonal ARMA models

Seasonal effects often occur in economic and other time series. Examples are monthly unemployment figures, quarterly retail sales, etc. To account for them, seasonal AR or MA components can also be specified. For data with seasonal period s (e.g. s = 12 for monthly data and s = 4 for quarterly data), the seasonal ARMA model has the form:

$$a_{S}(L^{s})\,a(L)\,y_{t} = c + b_{S}(L^{s})\,b(L)\,\epsilon_{t}$$

where $a_{S}(L^{s}) = 1 - a_{S,1}L^{s} - \cdots - a_{S,P}L^{sP}$ is the seasonal AR polynomial of order $P$ and $b_{S}(L^{s}) = 1 + b_{S,1}L^{s} + \cdots + b_{S,Q}L^{sQ}$ is the seasonal MA polynomial of order $Q$. In short: ARMA(p, q) × (P, Q, s).

### VARMA

VARMA models are a natural generalization of ARMA models. VAR models are linear, discrete-time models for stochastic processes with $N$ endogenous variables: each variable depends on previous values of all variables. VMA models are the generalization of MA models and are useful for impulse-response analysis. A VAR model of order $p$ is

$${\vec{y}}_{t} = {\vec{c}} + {\vec{u}}_{t} + \sum_{i=1}^{p} A_{i}\,{\vec{y}}_{t-i}$$

with ${\vec{c}}$ a constant vector, ${\vec{u}}_{t}$ a vector of white noise, and $A_{1}, A_{2}, \dotsc, A_{p}$ $(N \times N)$ matrices.

## Modeling

In practice, ARMA modeling mostly follows the Box-Jenkins method , which consists of the steps of model identification, estimation, validation and application.

### Identification

The aim of identification is to determine the ARMA specification parameters d, p and q. Unit-root tests can be used to determine d, the required order of differencing. For the ARMA orders p and q, the autocorrelation function (ACF) and the partial autocorrelation function (PACF) are often used, as well as model selection criteria such as the Akaike information criterion or the Bayesian information criterion.
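The sample ACF used in this step can be sketched in a few lines; for an MA(q) process the ACF is (approximately) zero beyond lag q, which is exactly what identification exploits. The function name and defaults below are our own:

```python
def acf(y, max_lag=10):
    """Sample autocorrelation function up to max_lag (illustrative sketch)."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y)
    return [
        sum((y[t] - mean) * (y[t + k] - mean) for t in range(n - k)) / var
        for k in range(max_lag + 1)
    ]

# An alternating series is strongly negatively correlated at lag 1
# and positively at lag 2:
r = acf([1.0, -1.0] * 10, max_lag=2)
```

The PACF can be used analogously: it cuts off after lag p for a pure AR(p) process.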

### Estimation

The model parameters are usually estimated by maximum likelihood or least squares. For pure AR models the least squares estimator is a linear estimator; otherwise a nonlinear least squares estimation is required.
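For a pure AR(1) model the least squares estimate reduces to a simple regression of $y_t$ on $y_{t-1}$. A minimal sketch with the closed-form slope and intercept (names are our own, no standard errors or diagnostics):

```python
def fit_ar1(y):
    """Ordinary least squares for y_t = c + a1*y_{t-1} + eps_t.
    Illustrative sketch using the simple-regression formulas."""
    x, z = y[:-1], y[1:]          # regressor y_{t-1} and response y_t
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    a1 = (sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
          / sum((xi - mx) ** 2 for xi in x))
    c = mz - a1 * mx
    return c, a1

# A noise-free series obeying y_t = 1 + 0.5*y_{t-1} is recovered exactly:
c, a1 = fit_ar1([0.0, 1.0, 1.5, 1.75, 1.875])
```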

### Validation

Various criteria can be used to assess the suitability of an estimated model. As a rule, one checks whether the residuals, i.e. the estimated $\epsilon_{t}$, are uncorrelated and behave like white noise. In addition, the forecast quality can be evaluated. If a model does not appear adequate, running through the identification and estimation steps again may provide a remedy.

### Application

After successful validation, the model can be put to use. Often this means short-term forecasting. A one-step forecast is obtained, for example, by shifting the difference equation of the estimated ARMA model one period into the future and computing the conditional expectation. This can be repeated recursively for multi-step forecasts.
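The recursive forecast described above can be sketched directly: future noise terms are replaced by their conditional expectation 0, and each prediction feeds the next step. The coefficients and series values below are made up for illustration:

```python
def forecast_arma(a, b, c, y, eps, h=1):
    """Recursive h-step forecast for an ARMA(p, q) model (sketch).
    y and eps hold the observed series and residuals
    (len(y) >= len(a), len(eps) >= len(b) required)."""
    y_ext, eps_ext = list(y), list(eps)
    preds = []
    for _ in range(h):
        ar = sum(a[i] * y_ext[-1 - i] for i in range(len(a)))
        ma = sum(b[j] * eps_ext[-1 - j] for j in range(len(b)))
        pred = c + ar + ma
        y_ext.append(pred)   # prediction feeds the next step
        eps_ext.append(0.0)  # E[eps_{t+k}] = 0 for future noise
        preds.append(pred)
    return preds

# One-step: 1 + 0.5*2.0 + 0.3*0.4 = 2.12; two-step: 1 + 0.5*2.12 = 2.06
preds = forecast_arma(a=[0.5], b=[0.3], c=1.0, y=[2.0], eps=[0.4], h=2)
```

Note that beyond the first q steps the MA part no longer contributes, so long-horizon forecasts converge to the process mean.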