# 2.2: Estimation of the Autocovariance Function

This section deals with the estimation of the ACVF and ACF at lag \(h\). Recall from equation (1.2.1) that the estimator

\[ \hat{\gamma}(h)=\frac 1n\sum_{t=1}^{n-|h|}(X_{t+|h|}-\bar{X}_n)(X_t-\bar{X}_n), \qquad h=0,\pm 1,\ldots, \pm(n-1), \]

may be used as a proxy for the unknown \(\gamma(h)\). As an estimator for the ACF \(\rho(h)\),

\[ \hat{\rho}(h)=\frac{\hat{\gamma}(h)}{\hat\gamma(0)},\qquad h=0,\pm 1,\ldots,\pm(n-1), \]

was identified. Some of the theoretical properties of \(\hat{\rho}(h)\) are briefly collected in the following. They are not as straightforward to derive as in the case of the sample mean, and all proofs are omitted. Similar statements hold for \(\hat{\gamma}(h)\) as well.

- The estimator \(\hat{\rho}(h)\) is generally biased, that is, \(E[\hat{\rho}(h)]\not=\rho(h)\). It holds, however, under non-restrictive assumptions that

\[ E[\hat{\rho}(h)]\to\rho(h)\qquad (n\to\infty). \]

This property is called *asymptotic unbiasedness*.

- The estimator \(\hat{\rho}(h)\) is consistent for \(\rho(h)\) under an appropriate set of assumptions, that is, \(\mathrm{Var}(\hat{\rho}(h)-\rho(h))\to 0\) as \(n\to\infty\).

It was already established in Section 1.5 how the sample ACF \(\hat{\rho}\) can be used to test if residuals consist of white noise variables. For more general statistical inference, one needs to know the sampling distribution of \(\hat{\rho}\). Since the estimation of \(\rho(h)\) is based on only a few observations for \(h\) close to the sample size \(n\), estimates at such lags tend to be unreliable. As a rule of thumb, given by Box and Jenkins (1976), \(n\) should be at least 50 and \(h\) less than or equal to \(n/4\).
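The two estimators and the rule of thumb can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `sample_acvf` and `sample_acf` and the toy series are assumptions made for this example, not part of the text.

```python
def sample_acvf(x, h):
    """Sample autocovariance at lag h, using the 1/n normalisation of the text."""
    n = len(x)
    h = abs(h)  # the estimator depends on h only through |h|
    if h >= n:
        raise ValueError("lag must satisfy |h| <= n - 1")
    mean = sum(x) / n
    return sum((x[t + h] - mean) * (x[t] - mean) for t in range(n - h)) / n


def sample_acf(x, h):
    """Sample autocorrelation at lag h: gamma_hat(h) / gamma_hat(0)."""
    return sample_acvf(x, h) / sample_acvf(x, 0)


# Toy series with made-up values, for illustration only.
x = [2.0, 4.0, 6.0, 4.0, 2.0, 4.0, 6.0, 4.0]

# Rule of thumb h <= n/4. Note that n = 8 is far below the recommended
# minimum of 50, so these estimates only illustrate the arithmetic.
max_lag = len(x) // 4
acf = [sample_acf(x, h) for h in range(max_lag + 1)]
print(acf)  # [1.0, 0.0, -0.75]
```

Dividing by \(n\) rather than by \(n-|h|\) follows the definition of \(\hat{\gamma}(h)\) given above, so the sum at large lags contains few terms but is still scaled by the full sample size.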

**Theorem 2.2.1.** For \(m\geq 1\), let \(\mathbf{\rho}_m=(\rho(1),\ldots,\rho(m))^T\) and \(\mathbf{\hat{\rho}}_m=(\hat{\rho}(1),\ldots,\hat{\rho}(m))^T\), where \(^T\) denotes the transpose of a vector. Under a set of suitable assumptions, it holds that

\[ \sqrt{n}(\mathbf{\hat{\rho}}_m-\mathbf{\rho}_m)\sim AN(\mathbf{0},\Sigma)\qquad (n\to\infty), \]

where \(\sim AN(\mathbf{0},\Sigma)\) stands for approximately normally distributed with mean vector \(\mathbf{0}\) and covariance matrix \(\Sigma=(\sigma_{ij})\) given by Bartlett's formula

\[ \sigma_{ij}=\sum_{k=1}^\infty\big[\rho(k+i)+\rho(k-i)-2\rho(i)\rho(k)\big]\big[\rho(k+j)+\rho(k-j)-2\rho(j)\rho(k)\big]. \]
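Bartlett's formula can be evaluated numerically by truncating the infinite series. The sketch below assumes that the ACF is supplied as a Python function and that it decays fast enough for the truncation point `terms` to be harmless; the helper names and the illustrative ACF (equal to 0.4 at lag \(\pm 1\) and zero beyond) are assumptions made for this example.

```python
def bartlett_sigma(rho, i, j, terms=1000):
    """Approximate sigma_ij by summing the first `terms` terms of Bartlett's series.

    `rho` is a callable returning the ACF at any integer lag.
    """
    total = 0.0
    for k in range(1, terms + 1):
        a = rho(k + i) + rho(k - i) - 2.0 * rho(i) * rho(k)
        b = rho(k + j) + rho(k - j) - 2.0 * rho(j) * rho(k)
        total += a * b
    return total


def bartlett_matrix(rho, m, terms=1000):
    """Approximate the m x m covariance matrix Sigma = (sigma_ij)."""
    return [[bartlett_sigma(rho, i, j, terms) for j in range(1, m + 1)]
            for i in range(1, m + 1)]


def rho_example(h):
    """Illustrative ACF (an assumption): 1 at lag 0, 0.4 at lags +/-1, 0 otherwise."""
    h = abs(h)
    if h == 0:
        return 1.0
    return 0.4 if h == 1 else 0.0


sigma = bartlett_matrix(rho_example, m=2)
for row in sigma:
    print([round(v, 4) for v in row])
```

For this particular ACF only the terms with \(k\leq 3\) are nonzero, so the truncation is exact here; for an ACF with infinite support, `terms` would have to be chosen large enough that the neglected tail is negligible.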

The section is concluded with two examples. The first one recalls the results already known for independent, identically distributed random variables; the second deals with the autoregressive process of Example 2.1.1.

**Example 2.2.1.** Let \((X_t\colon t\in\mathbb{Z})\sim\mathrm{IID}(0,\sigma^2)\). Then, \(\rho(0)=1\) and \(\rho(h)=0\) for all \(h\not=0\). The covariance matrix \(\Sigma\) is therefore given by

\[ \sigma_{ij}=1\quad\mbox{if $i=j$} \qquad\mbox{and}\qquad \sigma_{ij}=0\quad\mbox{if $i\not=j$}. \]

This means that \(\Sigma\) is a diagonal matrix, in fact the identity matrix. In view of Theorem 2.2.1, the estimators \(\hat{\rho}(1),\ldots,\hat{\rho}(m)\) are therefore approximately independent and identically distributed normal random variables with mean 0 and variance \(1/n\). This was the basis for Methods 1 and 2 in Section 1.5 (see also Theorem 1.2.1).
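A small simulation can illustrate this approximation. The sketch below assumes Gaussian noise, an arbitrary seed, and illustrative values for the sample size and the number of lags; none of these choices come from the text. It counts how many sample autocorrelations fall inside the approximate 95% band \(\pm 1.96/\sqrt{n}\) implied by a normal distribution with variance \(1/n\).

```python
import math
import random


def sample_acf(x, h):
    """Sample autocorrelation at lag h with the 1/n normalisation."""
    n = len(x)
    h = abs(h)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    gamma_h = sum(dev[t + h] * dev[t] for t in range(n - h)) / n
    gamma_0 = sum(d * d for d in dev) / n
    return gamma_h / gamma_0


random.seed(0)   # arbitrary seed, used only for reproducibility
n, m = 2000, 20  # illustrative sample size and number of lags (assumptions)
x = [random.gauss(0.0, 1.0) for _ in range(n)]  # IID N(0, 1) observations

bound = 1.96 / math.sqrt(n)  # approximate 95% band for variance 1/n
acf = [sample_acf(x, h) for h in range(1, m + 1)]
inside = sum(abs(r) <= bound for r in acf)
print(f"{inside} of {m} sample autocorrelations lie within +/-{bound:.4f}")
```

If the approximation is adequate, roughly 95% of the values should fall inside the band, although the exact count varies from one simulated series to the next.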

**Example 2.2.2.** Reconsider the autoregressive process \((X_t\colon t\in\mathbb{Z})\) from Example 2.1.1 with \(\mu=0\). Dividing \(\gamma(h)\) by \(\gamma(0)\) yields that

\[ \rho(h)=\phi^{|h|},\qquad h\in\mathbb{Z}. \]

Now the diagonal entries of \(\Sigma\) are computed as

\begin{align*}
\sigma_{ii}&=\sum_{k=1}^\infty\big[\rho(k+i)+\rho(k-i)-2\rho(i)\rho(k)\big]^2\\[.2cm]
&=\sum_{k=1}^i\phi^{2i}(\phi^{-k}-\phi^k)^2+\sum_{k=i+1}^\infty\phi^{2k}(\phi^{-i}-\phi^i)^2\\[.2cm]
&=(1-\phi^{2i})(1+\phi^2)(1-\phi^2)^{-1}-2i\phi^{2i}.
\end{align*}
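The closed form can be checked numerically against a truncated version of the defining series. The following sketch assumes \(|\phi|<1\) and uses the arbitrary value \(\phi=0.5\); the function names and the truncation point are likewise choices made for this illustration.

```python
def rho_ar1(h, phi):
    """ACF of the autoregressive process: rho(h) = phi^|h|."""
    return phi ** abs(h)


def sigma_ii_series(i, phi, terms=500):
    """Diagonal entry sigma_ii from Bartlett's series, truncated after `terms` terms."""
    total = 0.0
    for k in range(1, terms + 1):
        bracket = (rho_ar1(k + i, phi) + rho_ar1(k - i, phi)
                   - 2.0 * rho_ar1(i, phi) * rho_ar1(k, phi))
        total += bracket ** 2
    return total


def sigma_ii_closed(i, phi):
    """Closed form (1 - phi^(2i)) (1 + phi^2) / (1 - phi^2) - 2 i phi^(2i)."""
    return ((1 - phi ** (2 * i)) * (1 + phi ** 2) / (1 - phi ** 2)
            - 2 * i * phi ** (2 * i))


phi = 0.5  # illustrative value, must satisfy |phi| < 1
for i in (1, 2, 3):
    print(i, sigma_ii_series(i, phi), sigma_ii_closed(i, phi))
```

For \(i=1\) the closed form reduces to \(1-\phi^2\), so by Theorem 2.2.1 the approximate variance of \(\hat{\rho}(1)\) is \((1-\phi^2)/n\).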