Statistics LibreTexts

4.3: Large Sample Properties


Let $(X_t\colon t\in\mathbb{Z})$ be a weakly stationary time series with mean $\mu$, absolutely summable ACVF $\gamma(h)$ and spectral density $f(\omega)$. Proceeding as in the proof of Proposition 4.2.2, one obtains

$$I(\omega_j)=\frac{1}{n}\sum_{h=-n+1}^{n-1}\left(\sum_{t=1}^{n-|h|}(X_{t+|h|}-\mu)(X_t-\mu)\right)\exp(-2\pi i\omega_jh),$$

provided $\omega_j\not=0$. Using this representation, the limiting behavior of the periodogram can be established.
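As a numerical check of this representation, the following minimal R sketch computes the periodogram at a nonzero Fourier frequency once from the discrete Fourier transform and once from the autocovariance-type double sum. The simulated AR(1) series, the seed, and the helper name `pgram_acov` are illustrative assumptions and not part of the original notes.

```r
# Minimal sketch: two ways of computing the periodogram at a nonzero
# Fourier frequency omega_j = j/n. Series and helper names are hypothetical.
set.seed(1)
n  <- 64
mu <- 3                                   # mean of the simulated series
x  <- as.numeric(arima.sim(list(ar = 0.6), n = n)) + mu

# (a) Direct definition: squared modulus of the DFT divided by n.
#     fft(x)[j + 1] corresponds to the frequency j/n.
pgram_fft <- Mod(fft(x))^2 / n

# (b) Autocovariance-type representation with lags h = -(n-1), ..., n-1.
pgram_acov <- function(j) {
  omega <- j / n
  h <- -(n - 1):(n - 1)
  inner <- sapply(abs(h), function(k) {
    sum((x[(1 + k):n] - mu) * (x[1:(n - k)] - mu))
  })
  Re(sum(inner * exp(-2i * pi * omega * h))) / n
}

j <- 5
c(fft = pgram_fft[j + 1], acov = pgram_acov(j))
```

The two values should agree up to floating-point error. Because the complex exponentials sum to zero over a full period at nonzero Fourier frequencies, any constant could replace `mu` in (b) without changing the result, which is why the restriction to nonzero frequencies is needed.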

Proposition 4.3.1

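Let $I(\cdot)$ be the periodogram based on observations $X_1,\ldots,X_n$ of a weakly stationary process $(X_t\colon t\in\mathbb{Z})$. Then, for any $\omega\not=0$,

$$E[I(\omega_{j:n})]\to f(\omega)\qquad(n\to\infty),$$

where $\omega_{j:n}=j_n/n$ with $(j_n)_{n\in\mathbb{N}}$ chosen such that $\omega_{j:n}\to\omega$ as $n\to\infty$. If $\omega=0$, then

$$E[I(0)]-n\mu^2\to f(0)\qquad(n\to\infty).$$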

Proof. There are two limits involved in the computation of the periodogram mean. First, the limit is taken as $n\to\infty$. Second, this requires working with a different set of Fourier frequencies for each $n$. To adjust for this, the notation $\omega_{j:n}$ has been introduced. If $\omega_j\not=0$ is a Fourier frequency ($n$ fixed!), then

$$E[I(\omega_j)]=\sum_{h=-n+1}^{n-1}\left(\frac{n-|h|}{n}\right)\gamma(h)\exp(-2\pi i\omega_jh).$$

Therefore ($n\to\infty$!),

$$E[I(\omega_{j:n})]\to\sum_{h=-\infty}^{\infty}\gamma(h)\exp(-2\pi i\omega h)=f(\omega),$$

thus proving the first claim. The second follows from $I(0)=n\bar{X}_n^2$ (see Proposition 4.2.2), so that $E[I(0)]-n\mu^2=n(E[\bar{X}_n^2]-\mu^2)=n\mathrm{Var}(\bar{X}_n)\to f(0)$ as $n\to\infty$ as in Chapter 2. The proof is complete.

Proposition 4.3.1 shows that the periodogram $I(\omega)$ is asymptotically unbiased for $f(\omega)$. It is, however, inconsistent. This is implied by the following proposition, which is given without proof. It is not surprising considering that each value $I(\omega_j)$ is the sum of squares of only two random variables irrespective of the sample size.
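Both statements can be illustrated by simulation. The sketch below is a minimal Monte Carlo experiment under assumed settings (a Gaussian AR(1) process with coefficient 0.5 and unit noise variance, frequency 0.25, 1000 replications); none of these choices come from the notes. It tracks the mean and the variance of the periodogram at a fixed frequency as the sample size grows.

```r
# Minimal simulation sketch (assumed AR(1) model, phi = 0.5, unit noise variance).
set.seed(42)
phi   <- 0.5
omega <- 0.25

# Spectral density in the convention f(omega) = sum_h gamma(h) exp(-2 pi i omega h).
f_true <- 1 / (1 - 2 * phi * cos(2 * pi * omega) + phi^2)

# Simulate the periodogram value at the Fourier frequency closest to omega.
sim_pgram <- function(n, reps = 1000) {
  j <- round(omega * n)
  replicate(reps, {
    x <- as.numeric(arima.sim(list(ar = phi), n = n))
    Mod(fft(x)[j + 1])^2 / n
  })
}

# Monte Carlo mean and variance for increasing sample sizes.
res <- sapply(c(64, 256, 1024), function(n) {
  v <- sim_pgram(n)
  c(n = n, mean = mean(v), variance = var(v))
})
f_true
t(res)
```

In such a run the Monte Carlo mean should settle near the true spectral density value, while the variance should stay at roughly the square of that value instead of shrinking as the sample size increases.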

Proposition 4.3.2.

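If $(X_t\colon t\in\mathbb{Z})$ is a (causal or noncausal) weakly stationary time series such that

$$X_t=\sum_{j=-\infty}^{\infty}\psi_jZ_{t-j},\qquad t\in\mathbb{Z},$$

with $\sum_{j=-\infty}^{\infty}|\psi_j|<\infty$ and $(Z_t)_{t\in\mathbb{Z}}\sim\mathrm{WN}(0,\sigma^2)$, then

$$\left(\frac{2I(\omega_{1:n})}{f(\omega_1)},\ldots,\frac{2I(\omega_{m:n})}{f(\omega_m)}\right)\stackrel{\mathcal{D}}{\to}(\xi_1,\ldots,\xi_m),$$

where $\omega_1,\ldots,\omega_m$ are $m$ distinct frequencies with $\omega_{j:n}\to\omega_j$ and $f(\omega_j)>0$. The variables $\xi_1,\ldots,\xi_m$ are independent and identically chi-squared distributed with two degrees of freedom.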

The result of this proposition can be used to construct confidence intervals for the value of the spectral density at frequency $\omega$. To this end, denote by $\chi_2^2(\alpha)$ the $\alpha$-quantile of the chi-squared variable $\xi_j$, that is, the value with lower tail probability $\alpha$:

$$P(\xi_j\leq\chi_2^2(\alpha))=\alpha.$$

Then, Proposition 4.3.2 implies that an approximate confidence interval with level $1-\alpha$ is given by

$$\frac{2I(\omega_{j:n})}{\chi_2^2(1-\alpha/2)}\leq f(\omega)\leq\frac{2I(\omega_{j:n})}{\chi_2^2(\alpha/2)}.$$
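The coverage of this interval can be checked empirically. The following minimal sketch assumes a Gaussian AR(1) process with coefficient 0.5 and unit noise variance, sample size 512 and 2000 replications (all illustrative choices), and counts how often the interval computed from the raw periodogram contains the true spectral density value.

```r
# Minimal coverage sketch for the chi-squared interval (assumed AR(1) model).
set.seed(7)
phi   <- 0.5
n     <- 512
j     <- 128                      # Fourier frequency j/n = 0.25
alpha <- 0.05

# True spectral density of the assumed AR(1) process at j/n.
f_true <- 1 / (1 - 2 * phi * cos(2 * pi * j / n) + phi^2)

covered <- replicate(2000, {
  x <- as.numeric(arima.sim(list(ar = phi), n = n))
  I <- Mod(fft(x)[j + 1])^2 / n                     # raw periodogram value
  lower <- 2 * I / qchisq(1 - alpha / 2, df = 2)    # upper quantile gives lower bound
  upper <- 2 * I / qchisq(alpha / 2, df = 2)        # lower quantile gives upper bound
  lower <= f_true && f_true <= upper
})

mean(covered)                     # empirical coverage, to be compared with 1 - alpha
```

Note that the upper chi-squared quantile produces the lower end of the interval and vice versa, since the quantiles appear in the denominators.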

Proposition 4.3.2 also suggests that confidence intervals can be derived simultaneously for several frequency components. Before confidence intervals are computed for the dominant frequencies of the recruitment data, return for a moment to the computation of the FFT, which is the basis for computing the periodogram. To ensure a quick computation time, highly composite integers $n^\prime$ have to be used. To achieve this in general, the length of the time series is adjusted by padding the original but detrended data with zeroes. In R, spectral analysis is performed with the function spec.pgram. To find out which $n^\prime$ is used for your particular data, type nextn(length(x)), assuming that your series is in x.
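The padding step can be sketched by hand as follows. The white-noise series is a hypothetical stand-in of length 453, chosen only so that `nextn` returns 480. Centering instead of linear detrending and dividing by the original length are assumptions made to line the computation up with `spec.pgram(..., detrend = FALSE, demean = TRUE)`.

```r
# Minimal sketch of zero padding before the FFT. The white-noise series is a
# hypothetical stand-in of length 453.
set.seed(10)
x <- rnorm(453)
n <- length(x)
n_prime <- nextn(n)          # smallest integer >= n with only the factors 2, 3, 5

# Center the data, then append n_prime - n zeroes.
x_pad <- c(x - mean(x), rep(0, n_prime - n))

# Periodogram on the padded grid j/n_prime, divided by the original length n.
pgram_pad <- Mod(fft(x_pad))^2 / n

# Comparison with spec.pgram (mean removed, no linear detrending, no taper).
sp <- spec.pgram(x, taper = 0, detrend = FALSE, demean = TRUE, plot = FALSE)
c(n = n, n_prime = n_prime, n_freq = length(sp$freq))
max(abs(sp$spec - pgram_pad[2:(n_prime / 2 + 1)]))   # largest discrepancy
```

The frequency zero is dropped in the comparison because `spec.pgram` reports only the frequencies $1/n^\prime,\ldots,1/2$.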

Figure 4.6: Averaged periodogram of the recruitment data discussed in Example 4.3.1.

Example 4.3.1.

Figure 4.5 displays the periodogram of the recruitment data which has been discussed in Example 3.3.5. It shows a strong annual frequency component at $\omega=1/12$ as well as several spikes in the neighborhood of the El Niño frequency $\omega=1/48$. Higher frequency components with $\omega>.3$ are virtually absent. Even though an AR(2) model was fitted to this data in Chapter 3 to produce future values based on this fit, the periodogram does not validate this fit, as the spectral density of an AR(2) process (as computed in Example 4.2.3) is qualitatively different. In R, the following commands can be used (nextn(length(rec)) gives $n^\prime=480$ here if the recruitment data is stored in rec as before).

> rec.pgram = spec.pgram(rec, taper=0, log="no")
> abline(v=1/12, lty=2)
> abline(v=1/48, lty=2)

The function spec.pgram allows you to fine-tune the spectral analysis. For our purposes, we always use the specifications given above for the raw periodogram (taper sets the proportion of the data at each end of the series to which a split cosine bell taper is applied, so taper=0 switches tapering off; log="no" plots the periodogram on the original scale, whereas the R default is the log-periodogram).

To compute the confidence intervals for the two dominating frequencies 1/12 and 1/48, you can use the following R code, noting that 1/12=40/480 and 1/48=10/480.

> rec.pgram$spec[40]
[1] 21332.94
> rec.pgram$spec[10]
[1] 14368.42
> u = qchisq(.025, 2); l = qchisq(.975, 2)
> 2*rec.pgram$spec[40]/l
> 2*rec.pgram$spec[40]/u
> 2*rec.pgram$spec[10]/l
> 2*rec.pgram$spec[10]/u

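Using the numerical values of this analysis, the following confidence intervals are obtained at the level $\alpha=.05$ (the quantiles .025 and .975 used in the code correspond to $\alpha/2$ and $1-\alpha/2$):

$$f(1/12)\in(5783.041,842606.2)\qquad\mbox{and}\qquad f(1/48)\in(3895.065,567522.5).$$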
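The index arithmetic behind these commands, namely that $1/12$ and $1/48$ sit at positions 40 and 10 of the frequency grid, can be reproduced without the data. In the sketch below, the grid is rebuilt from the padded length 480 quoted in the text, and `freq_index` is a hypothetical helper name introduced for illustration.

```r
# Minimal sketch: locating a target frequency on the Fourier grid j/n_prime.
# n_prime = 480 is the padded length quoted in the text; freq_index() is a
# hypothetical helper introduced for illustration.
n_prime <- 480
freq <- (1:(n_prime / 2)) / n_prime      # grid of nonzero Fourier frequencies

# Position of the grid point closest to a given frequency omega.
freq_index <- function(omega) which.min(abs(freq - omega))

c(annual = freq_index(1 / 12), el_nino = freq_index(1 / 48))
```

With an actual `spec.pgram` result, the same lookup could be applied to its `freq` component rather than to a reconstructed grid, which avoids hard-coding the indices.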

These are much too wide and alternatives to the raw periodogram are needed. These are provided, for example, by a smoothing approach which uses an averaging procedure over a band of neighboring frequencies. This can be done as follows.

> k = kernel("daniell", 4)
> rec.ave = spec.pgram(rec, k, taper=0, log="no")
> abline(v=1/12, lty=2)
> abline(v=1/48, lty=2)
> rec.ave$bandwidth
[1] 0.005412659

The resulting smoothed periodogram is shown in Figure 4.6. It is less noisy, as is expected from taking averages. More precisely, a two-sided Daniell filter with $m=4$ was used here with $L=2m+1$ neighboring frequencies

$$\omega_k=\omega_j+\frac{k}{n},\qquad k=-m,\ldots,m,$$

to compute the periodogram at $\omega_j=j/n$. The resulting plot in Figure 4.6 shows, on the other hand, that the sharp annual peak has been flattened considerably. The bandwidth reported in R can be computed as $b=L/(\sqrt{12}\,n)$, where the series length after padding is used (here 480). To compute confidence intervals one has to adjust the previously derived formula. This is done by changing the degrees of freedom from 2 to $df=2Ln/n^\prime$, where $n$ is the original and $n^\prime$ the padded series length (if zeroes were appended), and leads to

$$\frac{df}{\chi_{df}^2(1-\alpha/2)}\sum_{k=-m}^{m}f\Big(\omega_j+\frac{k}{n}\Big)\leq f(\omega)\leq\frac{df}{\chi_{df}^2(\alpha/2)}\sum_{k=-m}^{m}f\Big(\omega_j+\frac{k}{n}\Big)$$

for $\omega\approx\omega_j$. For the recruitment data the following R code can be used:

> df = ceiling(rec.ave$df)
> u = qchisq(.025, df); l = qchisq(.975, df)
> df*rec.ave$spec[40]/l
> df*rec.ave$spec[40]/u
> df*rec.ave$spec[10]/l
> df*rec.ave$spec[10]/u

Figure 4.7: The modified Daniell periodogram of the recruitment data discussed in Example 4.3.1.

to get the confidence intervals

$$f(1/12)\in(1482.427,5916.823)\qquad\mbox{and}\qquad f(1/48)\in(4452.583,17771.64).$$
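The bandwidth and degrees-of-freedom arithmetic for the Daniell smoother can be retraced without the data. The sketch below assumes an original series length of 453, chosen so that `nextn` gives the padded length 480 quoted in the text. The helper `spec_ci` is a hypothetical name, and the comparison with `bandwidth.kernel` and `df.kernel` from the stats package is included only as a cross-check.

```r
# Minimal sketch of the bandwidth and degrees-of-freedom arithmetic for the
# Daniell smoother. n = 453 is an assumed original series length, chosen so
# that nextn(n) equals the padded length 480 quoted in the text.
m <- 4
L <- 2 * m + 1
n <- 453
n_prime <- nextn(n)

b  <- L / (sqrt(12) * n_prime)    # bandwidth on the padded frequency grid
df <- 2 * L * n / n_prime         # degrees of freedom adjusted for padding

# Cross-check against the kernel helpers in the stats package.
k <- kernel("daniell", m)
b_kernel  <- bandwidth.kernel(k) / n_prime
df_kernel <- df.kernel(k) * n / n_prime

# Chi-squared interval for f(omega) from a (smoothed) periodogram value.
# spec_ci() is a hypothetical helper; df = 2 recovers the raw-periodogram case.
spec_ci <- function(spec_value, df, alpha = 0.05) {
  c(lower = df * spec_value / qchisq(1 - alpha / 2, df),
    upper = df * spec_value / qchisq(alpha / 2, df))
}

c(bandwidth = b, bandwidth_kernel = b_kernel, df = df, df_kernel = df_kernel)
spec_ci(21332.94, df = 2)         # raw periodogram value at 1/12 from the text
```

Rounding `df` up with `ceiling`, as in the commands above, gives an integer number of degrees of freedom. Since `qchisq` also accepts non-integer values, the rounding is a convenience and not a requirement.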

The compromise between the noisy raw periodogram and further smoothing as described here (with $L=9$) reverses the magnitude of the $1/12$ annual frequency and the $1/48$ El Niño component. This is due to the fact that the annual peak is a very sharp one, with neighboring frequencies being basically zero. For the $1/48$ component, there is a whole band of neighboring frequencies which also contribute (the El Niño phenomenon is irregular and appears only on average every four years). Moreover, the annual cycle is now distributed over a whole range. One way around this issue is provided by the use of other kernels such as the modified Daniell kernel given in R as kernel("modified.daniell", c(3,3)). This leads to the spectral density in Figure 4.7.
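The flattening of a sharp peak, and the effect of switching kernels, can be seen on a toy example. The sketch below applies both kernels to a made-up "periodogram" consisting of a flat background with one isolated spike; the toy values are hypothetical and are not the recruitment data.

```r
# Minimal sketch comparing the Daniell and modified Daniell kernels on a toy
# "periodogram" with one sharp spike. The toy values are hypothetical.
k_dan <- kernel("daniell", 4)
k_mod <- kernel("modified.daniell", c(3, 3))

# Two-sided weights for lags -m, ..., m; each set sums to one.
w_dan <- k_dan[-k_dan$m:k_dan$m]
w_mod <- k_mod[-k_mod$m:k_mod$m]

toy <- rep(0.1, 101)
toy[51] <- 100                       # isolated sharp peak

# kernapply() drops m values at each end, so the peak moves to index 51 - m.
sm_dan <- kernapply(toy, k_dan)
sm_mod <- kernapply(toy, k_mod)

c(daniell_peak    = sm_dan[51 - k_dan$m],
  modified_peak   = sm_mod[51 - k_mod$m],
  daniell_by_hand = mean(toy[(51 - 4):(51 + 4)]))
```

The Daniell kernel spreads the spike evenly over nine neighbors. The convolved modified Daniell kernel covers more frequencies but places more weight at the center, so in this toy setting it retains more of the peak height.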

Contributors



This page titled 4.3: Large Sample Properties is shared under a not declared license and was authored, remixed, and/or curated by Alexander Aue.
