
SIO 223: GEOPHYSICAL DATA ANALYSIS

A Few Remarks on Non-Stationarity and Dynamic Spectra

All of the methods for spectral analysis that we have discussed to date have relied on our preliminary

assumptions that the physical process under consideration can be regarded as statistically stationary. To

date our only strategy for dealing with a non-stationary signal has been to detrend or to prewhiten the data

series. While this can be an effective way of eliminating signals that occur at very long period and have

not been adequately sampled by the available measurements, we have no strategy for dealing with spectra

whose overall content changes over the length of the time series.

1. Detrending and Prewhitening

We begin with some simple examples in which the time series available are typical of geophysical data

series in that the power spectra are red. Figure 1 shows monthly sea level measured at the end of Scripps

pier. We see a long term variation and some apparent periodic oscillations. In this case the long term

variation is largely removed by fitting a straight line to the observations: the result of detrending the

data by subtracting the best fit line is shown as the residual in the lower part of the figure. In this example

most of the very low frequency power is associated with a linear trend. If we don’t remove it prior to power

spectral estimation there will be a problem with spectral leakage.
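The detrending step described above can be sketched in a few lines of NumPy. This is only an illustration: the series below is synthetic (the 2 mm/yr trend, the offset and the annual cycle are made-up values, not the Scripps pier record), and `detrend_linear` is a hypothetical helper name.

```python
import numpy as np

def detrend_linear(t, x):
    """Subtract the least-squares straight line from x(t)."""
    slope, intercept = np.polyfit(t, x, 1)   # coefficients, highest power first
    trend = slope * t + intercept
    return x - trend, slope, intercept

# Synthetic stand-in for a monthly sea level record (values are made up):
# a 2 mm/yr trend plus an annual cycle, 20 years at monthly sampling.
t = np.arange(240) / 12.0                        # time in years
x = 7000.0 + 2.0 * t + 30.0 * np.sin(2 * np.pi * t)

residual, slope, intercept = detrend_linear(t, x)
residual_slope = np.polyfit(t, residual, 1)[0]   # refit: no linear trend remains
```

The residual is what one would pass on to a spectral estimator. Note that the fitted slope need not equal the true trend exactly, because the periodic part of the signal also projects slightly onto a straight line.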

As we have already noted the most difficult spectra to estimate are those which contain sharp peaks or rapid

changes with frequency, because rapid changes with frequency lead to spectral leakage or bias. Remember

the Magsat data of Chapter 3, Figures 5 and 9, which you experimented with for the psd homework. A

strategy which can help with spectral leakage is to make a preliminary transformation on the data that

makes the spectrum of the process more nearly white: the "nearly white" spectrum can then be estimated

more accurately, and the transformation corrected for in the frequency domain. This procedure, known as

pre-whitening, was discussed in section 9 of chapter 3.

The general idea is to take the zero-mean process, $X_t$, and pass it through a linear filter constructed so that

its output $Y_t$ has a spectrum that is nearly white. That is, we write

$$Y_t = \sum_s g_s X_{t-s}$$

Then, with $g(f)$ denoting the Fourier transform of the filter coefficients $g_s$, the corresponding spectral density functions satisfy

$$S_Y(f) = |g(f)|^2 S_X(f)$$

and we can recover

$$S_X(f) = |g(f)|^{-2} S_Y(f)$$

Ideally one would choose g so that

$$|g(f)|^2 \propto S_X^{-1}(f)$$

which seems to require you to know the form of the spectrum a priori. In practice one can exploit the fact

that often the worst leakage comes from low frequency signals and use a rather short autoregressive filter

for g. As we noted earlier this can be estimated efficiently using the Yule-Walker equations.
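As a rough sketch of prewhitening with a short autoregressive filter, the code below estimates AR coefficients from the Yule-Walker equations, applies the filter $g = (1, -a_1, \ldots, -a_p)$, and then divides the spectrum of the filtered series by $|g(f)|^2$. The function names, the AR(1) test series with coefficient 0.9, and the use of a raw periodogram for $S_Y$ are all assumptions made for illustration; in practice one would use a better estimator for the whitened series.

```python
import numpy as np

def yule_walker(x, order):
    """AR coefficients a_1..a_p from the Yule-Walker equations."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = x.size
    # Biased sample autocovariance at lags 0..order
    acov = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    # Toeplitz system R a = r built from the autocovariances
    R = np.array([[acov[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, acov[1:])

def prewhiten(x, a):
    """Apply g = (1, -a_1, ..., -a_p): Y_t = X_t - sum_s a_s X_{t-s}."""
    g = np.concatenate(([1.0], -np.asarray(a)))
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.convolve(x, g, mode="valid"), g

def gain(g, freqs, dt=1.0):
    """|g(f)|^2 for filter coefficients g_s sampled at interval dt."""
    s = np.arange(g.size)
    g_f = np.exp(-2j * np.pi * np.outer(freqs, s) * dt) @ g
    return np.abs(g_f) ** 2

# Synthetic red series (an AR(1) process; the coefficient 0.9 is arbitrary)
rng = np.random.default_rng(0)
e = rng.standard_normal(20000)
x = np.empty_like(e)
x[0] = e[0]
for t in range(1, e.size):
    x[t] = 0.9 * x[t - 1] + e[t]

a = yule_walker(x, order=1)
y, g = prewhiten(x, a)

# Raw periodogram of the whitened series, then correct: S_X = S_Y / |g(f)|^2
freqs = np.fft.rfftfreq(y.size, d=1.0)
S_y = np.abs(np.fft.rfft(y)) ** 2 / y.size
S_x = S_y / gain(g, freqs)
```

For the AR(1) case the correction is $|g(f)|^2 = 1 - 2a_1\cos(2\pi f\,\delta t) + a_1^2$, which is small at low frequency and therefore restores the red part of the spectrum after estimation.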

Figure 1: Monthly sea level measured at the end of SIO pier; data from Permanent Service for Mean Sea Level.

2. Dynamic Spectra

The above examples illustrate strategies for dealing with data sets we could think of as being "too short to be

considered stationary". But they are of no use in situations where the nature of the signal actually changes

with time. Again we consider sea level at the end of the Scripps pier, but now measured at 1 second intervals

so that one can see the influence of ocean waves. In Figure 2 we have 10 minutes of data taken 24 hours

apart. The dominant characteristic on both days is an oscillation of amplitude about 15 cm and period about

10 seconds, caused by ocean swell.

Taking power spectra based on 1 hour of measurements from each day, we see in Figure 3 that, indeed, most

of the variation takes place between periods of 8-15 seconds (or frequencies of 0.06-0.12 Hz), but that on

the two days the decrease at higher and lower frequencies has different forms, suggesting a change in the

physics. The peak of the energy on day 241 seems to have moved to higher frequency.

Figure 2: Sea level measured at end of SIO pier; data from Coastal Data Information Program. Zero level is arbitrary. Top frame starts from 1997:240:00:00:00; bottom frame starts from 1997:241:00:00:00.

Figure 3: Spectrum of the wave data in Figure 2. In each frame the spectrum of the other is shown as a light line at low frequencies, to indicate the shift in frequency of the lowest-frequency peak.


We can make a dynamic spectrum by taking a series of such spectra that moves along the time series, and

then make a contour plot of the power spectrum over time. In Figure 4 we see a prominent ridge line in

the power spectrum which gradually changes frequency with time. The physical explanation is that surface

waves in liquids are dispersive: the wave energy was generated over a broad range of frequencies in one

place, but the lower frequency energy traveled faster and arrived at Scripps pier first. Given the frequency

shift over 3 days and knowing the physics of wave propagation one can find the distance to the source (for

details see Snodgrass et al. (1966), Phil. Trans. Roy. Soc. Ser. A, 259, 431–497).
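A minimal sketch of a dynamic spectrum is given below: a tapered periodogram is computed on overlapping segments and stacked into an array indexed by time and frequency. The Hann taper, the segment length and step, and the synthetic chirp are assumptions chosen for illustration, not the processing used for Figure 4. The helper `source_distance` additionally assumes deep-water gravity waves, for which the group velocity is $g/(4\pi f)$, so that energy from a source at distance $x$ arrives with $df/dt = g/(4\pi x)$.

```python
import numpy as np

def dynamic_spectrum(x, dt, nwin, nstep):
    """Spectra of overlapping Hann-tapered segments of x.

    Returns segment-centre times, frequencies, and psd[segment, frequency]
    (two-sided density evaluated at non-negative frequencies).
    """
    x = np.asarray(x, dtype=float)
    taper = np.hanning(nwin)
    norm = dt / np.sum(taper ** 2)
    starts = np.arange(0, x.size - nwin + 1, nstep)
    freqs = np.fft.rfftfreq(nwin, d=dt)
    psd = np.empty((starts.size, freqs.size))
    for i, k in enumerate(starts):
        seg = x[k:k + nwin]
        seg = seg - seg.mean()                    # remove the segment mean
        psd[i] = norm * np.abs(np.fft.rfft(taper * seg)) ** 2
    times = (starts + nwin / 2) * dt
    return times, freqs, psd

def source_distance(dfdt, g=9.81):
    """Distance in metres from ridge slope df/dt (Hz/s), deep-water assumption."""
    return g / (4 * np.pi * dfdt)

# Synthetic test signal: a ridge drifting from 0.05 Hz to 0.15 Hz
dt = 1.0
t = np.arange(4096) * dt
rate = 0.10 / t[-1]                               # Hz per second
x = np.sin(2 * np.pi * (0.05 * t + 0.5 * rate * t ** 2))

times, freqs, psd = dynamic_spectrum(x, dt, nwin=512, nstep=256)
psd_db = 10 * np.log10(np.maximum(psd, 1e-30))    # dB relative to 1 unit^2/Hz
ridge = freqs[np.argmax(psd, axis=1)]             # ridge frequency per segment
slope = np.polyfit(times, ridge, 1)[0]            # df/dt along the ridge
```

Contouring `psd_db` against `times` and `freqs` gives a plot of the same kind as Figure 4. The choice of segment length trades time resolution against frequency resolution, which is the limitation that motivates the wavelet approach of the next section.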

2:1 A short digression on decibels

A decibel is a relationship between two values of power. Decibels are designed for talking about numbers

of greatly different magnitude. We could and have often used scientific notation, but many people prefer to

measure power in terms of the logarithm of a ratio with some arbitrary standard.

$$\text{Power difference in dB} = 10 \log_{10}\left[\frac{\text{power A}}{\text{power B}}\right]$$

Consider for example how the ear perceives loudness. First of all, the ear is very sensitive. The softest

audible sound has a power of about $10^{-12}$ watt/sq. meter and the threshold of pain is around 1 watt/sq.

meter, giving a total range of 120 dB. In the second place, our judgment of relative levels of loudness is

somewhat logarithmic. If a sound has 10 times the power of a reference (10 dB) we hear it as twice as loud.

If we merely double the power (3 dB), the difference will be just noticeable. As we noted earlier many power

spectra naturally follow logarithmic scales. You will often see power spectra plotted in dB relative to some

specified (or unspecified) standard as in Figures 3 and 4.
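The numbers quoted above follow directly from the definition. A small check, with `power_ratio_db` as a hypothetical helper name:

```python
import math

def power_ratio_db(power_a, power_b):
    """Power difference in dB between power_a and power_b."""
    return 10.0 * math.log10(power_a / power_b)

# Threshold of pain (1 W/m^2) relative to the softest audible sound (1e-12 W/m^2)
hearing_range_db = power_ratio_db(1.0, 1e-12)

# Doubling the power
doubling_db = power_ratio_db(2.0, 1.0)
```

Here `hearing_range_db` evaluates to 120 dB and `doubling_db` to about 3 dB, matching the values in the text.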

3. Wavelet Analysis and the Wavelet Transform

A strategy that has become increasingly popular over the past two decades for the analysis of non-stationary

time series involves the use of wavelets and the wavelet transform. As we have seen, many physical

processes can be illuminated by expressing their properties in terms of the power spectral density function.

Detractors might argue that although we can effectively determine the frequency content of a signal using

the techniques described so far, we do not necessarily know when power is present at a particular frequency.

Wavelets attempt to overcome this by simultaneously representing the signal in both frequency and time

domain. One might wonder how this differs from the dynamic spectrum example of the previous section.

We will shortly see that wavelet analysis can claim to have a collection of time-frequency representations of

the signal each of which has a different resolution: it is sometimes described as a multi-resolution analysis.

Figure 4: Power spectral density as a function of time for sea level at end of SIO pier. Contours are in dB relative to 1 $\mathrm{m}^2$/Hz. The two dashed lines show the slices represented by the spectra of Figure 3.

It is time to define what we mean by a wavelet. A wavelet function $\psi_0(\eta)$ depends on a non-dimensional

time (or perhaps space) parameter, $\eta$. In order to be considered a wavelet function, $\psi_0$ must be localized in

both time and frequency domains. It must satisfy

$$\int \frac{|\hat{\psi}(f)|^2}{f}\, df < \infty$$

and

$$\int \psi(\eta)\, d\eta = 0$$

A consequence of these properties is that the wavelets have a spectrum like that of a band-pass filter. One

example is the Morlet wavelet, which consists of a plane wave modulated by a Gaussian:

$$\psi_0(\eta) = \pi^{-1/4} e^{i\omega_0\eta} e^{-\eta^2/2} \qquad (1)$$
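Equation (1) is easy to evaluate numerically, which also lets one check the two defining properties. The sketch below assumes $\omega_0 = 6$, the value used by Torrence and Compo (1998); the notes do not fix $\omega_0$, and `morlet` is a hypothetical function name.

```python
import numpy as np

def morlet(eta, omega0=6.0):
    """Morlet wavelet of equation (1); omega0 = 6 is an assumed choice."""
    return np.pi ** -0.25 * np.exp(1j * omega0 * eta) * np.exp(-eta ** 2 / 2)

eta = np.linspace(-8.0, 8.0, 3201)
d_eta = eta[1] - eta[0]
psi = morlet(eta)

energy = np.sum(np.abs(psi) ** 2) * d_eta   # integral of |psi_0|^2
mean = np.sum(psi) * d_eta                  # integral of psi_0
```

With this choice of $\omega_0$ the energy integral is 1 and the mean is very small but not exactly zero, so the Morlet wavelet satisfies the zero-mean condition only approximately.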

By convention a set of orthogonal wavelet functions is called a wavelet basis. Several wavelet bases are

illustrated in Figure 5. The continuous wavelet transform of a discrete sequence $x_n$ is defined as the

convolution of $x_n$ with a scaled and translated version of $\psi_0(\eta)$:

$$W_n(s) = \sum_{n'=0}^{N-1} x_{n'}\, \psi^*\left[\frac{(n'-n)\,\delta t}{s}\right] \qquad (2)$$

Torrence and Compo (1998), A Practical Guide to Wavelet Analysis, Bulletin of the American Meteorological Society, 79, 61-78.

[Figure 5 panels: a. Morlet; b. Paul (m=4); c. DOG (m=2); d. DOG (m=6). Left column: $\psi(t/s)$ against $t/s$. Right column: $\hat{\psi}(s\omega)$ against $s\omega/(2\pi)$.]

FIG. 5. Four different wavelet bases, from Table 1. The plots on the left give the real part (solid) and imaginary part (dashed) for the wavelets in the time domain. The plots on the right give the corresponding wavelets in the frequency domain. For plotting purposes, the scale was chosen to be $s = 10\,\delta t$. (a) Morlet, (b) Paul (m = 4), (c) Mexican hat (DOG m = 2), and (d) DOG (m = 6).

The key to recovering a dynamic view of the signal is that the wavelet scale s can be varied, and it can be

translated along the localized time index n. Changing s shows how the amplitude of any feature varies with

scale while the translation gives the amplitude variations with time. One can calculate the wavelet transform

in the time domain, but it is faster in Fourier space. Remembering that we have a DFT for $x_n$ as

$$\hat{x}_k = \frac{1}{N} \sum_{n=0}^{N-1} x_n e^{-2\pi i k n/N} \qquad (3)$$

with $k = 0, \ldots, N-1$ as the frequency index. In the continuous limit the FT of $\psi(t/s)$ is given by $\hat{\psi}(s\omega)$.

Using the convolution theorem we have:

$$W_n(s) = \sum_{k=0}^{N-1} \hat{x}_k\, \hat{\psi}^*(s\omega_k)\, e^{i\omega_k n\,\delta t} \qquad (4)$$

with the angular frequency defined as

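$$\omega_k = \begin{cases} \dfrac{2\pi k}{N\,\delta t}, & k \le \dfrac{N}{2} \\[1.5ex] -\dfrac{2\pi k}{N\,\delta t}, & k > \dfrac{N}{2} \end{cases} \qquad (5)$$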

Using (4) one can obtain the continuous wavelet transform for a given value of s at all n simultaneously.

But we need to be careful with normalization. At each scale the wavelet function must be normalized to

have unit energy:

$$\hat{\psi}(s\omega_k) = \left[\frac{2\pi s}{\delta t}\right]^{1/2} \hat{\psi}_0(s\omega_k) \qquad (6)$$

The wavelet spectrum is then just taken to be $|W_n(s)|^2$. Why might this be a bad idea? An example is shown

in Figure 6 using sea surface temperature over the central Pacific for the time period 1871-1996. Note the

cross hatching which indicates the region known as the cone of influence associated with edge effects from

the convolution.
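The normalization in (6) can be checked numerically. The sketch below assumes the Fourier-space form of the Morlet wavelet given by Torrence and Compo (1998), $\hat{\psi}_0(s\omega) = \pi^{-1/4} H(\omega)\, e^{-(s\omega - \omega_0)^2/2}$ with $H$ the Heaviside step function and $\omega_0 = 6$; neither the formula nor the value of $\omega_0$ appears in these notes. The angular frequencies come from `np.fft.fftfreq`, which assigns index $k > N/2$ the frequency $(k - N)/(N\,\delta t)$.

```python
import numpy as np

def morlet_ft_normalized(s, omega, dt, omega0=6.0):
    """Morlet wavelet in Fourier space, scaled as in equation (6)."""
    heaviside = omega > 0                      # only positive frequencies contribute
    psi0_hat = np.pi ** -0.25 * heaviside * np.exp(-(s * omega - omega0) ** 2 / 2)
    return np.sqrt(2 * np.pi * s / dt) * psi0_hat

N, dt = 512, 1.0
omega = 2 * np.pi * np.fft.fftfreq(N, d=dt)    # angular frequencies omega_k
psi_hat = morlet_ft_normalized(10 * dt, omega, dt)

total = np.sum(np.abs(psi_hat) ** 2)           # unit energy corresponds to total = N
```

With this scaling the sum of $|\hat{\psi}(s\omega_k)|^2$ over $k$ is equal to $N$ at every scale that is well resolved by the sampling, so transforms at different scales are directly comparable.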

4. Wavelet Analysis: Nuts and Bolts

For software and examples see http://paos.colorado.edu/research/wavelets

(1) Fourier transform the (possibly padded) time series

(2) Choose a wavelet function and a set of scales to analyze

(3) For each scale construct the normalized wavelet function using (6)

(4) Find the wavelet transform at that scale using (4)

(5) Determine the cone of influence and the Fourier wavelength at that scale

(6) Repeat steps (3)-(5) for all scales, remove any padding, and contour plot the wavelet power spectrum.

(7) Assess the significance level relative to a background Fourier power spectrum (e.g., white or red noise)

at each scale, then use the χ-squared distribution to find the 95% confidence contour.
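Steps (1) to (6) above can be sketched as follows for the Morlet wavelet. This is not the software distributed at the URL above; it is a minimal illustration under several assumptions: $\omega_0 = 6$, zero padding to a power of two at least twice the series length, a cone of influence taken as one e-folding time $\sqrt{2}\,s$ from each end, and the Fourier period $4\pi s/(\omega_0 + \sqrt{2 + \omega_0^2})$, the last two taken from Torrence and Compo (1998). Step (7), the significance test, is omitted, and the function names are hypothetical.

```python
import numpy as np

def morlet_ft(s, omega, dt, omega0=6.0):
    """Normalized Morlet wavelet in Fourier space, equation (6)."""
    psi0_hat = np.pi ** -0.25 * (omega > 0) * np.exp(-(s * omega - omega0) ** 2 / 2)
    return np.sqrt(2 * np.pi * s / dt) * psi0_hat

def cwt_morlet(x, dt, scales, omega0=6.0):
    """Return W[scale, time], the Fourier periods, and an edge-effect mask."""
    x = np.asarray(x, dtype=float)
    n = x.size
    npad = 2 ** int(np.ceil(np.log2(2 * n)))          # assumed padding length
    x_hat = np.fft.fft(x - x.mean(), n=npad)          # step (1): FFT with zero padding
    omega = 2 * np.pi * np.fft.fftfreq(npad, d=dt)    # angular frequencies
    W = np.empty((scales.size, n), dtype=complex)
    for j, s in enumerate(scales):                    # step (6): loop over scales
        psi_hat = morlet_ft(s, omega, dt, omega0)     # step (3)
        # step (4): equation (4) as an inverse FFT, then drop the padding
        W[j] = np.fft.ifft(x_hat * np.conj(psi_hat))[:n]
    # step (5): Fourier period and cone of influence at each scale
    period = 4 * np.pi * scales / (omega0 + np.sqrt(2 + omega0 ** 2))
    t_edge = dt * np.minimum(np.arange(n), n - 1 - np.arange(n))
    in_coi = t_edge[None, :] < np.sqrt(2) * scales[:, None]
    return W, period, in_coi

# Step (2): choose scales (an assumed geometric grid), then analyze a test sinusoid
dt = 1.0
n = 1024
x = np.cos(2 * np.pi * np.arange(n) * dt / 32.0)      # period of 32 samples
scales = 2 * dt * 2.0 ** (0.125 * np.arange(49))
W, period, in_coi = cwt_morlet(x, dt, scales)
power = np.abs(W) ** 2                                # wavelet power spectrum
best_period = period[np.argmax(power[:, n // 2])]     # strongest period at mid-series
```

For the test sinusoid the power at the middle of the series peaks at the scale whose Fourier period is closest to 32 samples. Entries of `power` where `in_coi` is true are affected by the zero padding and should be interpreted with caution.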

[Figure 6 panels: a. NINO3 SST (°C) against time (year); b. Morlet and c. DOG, each showing period (years, left axis) and scale (years, right axis) against time (year).]

FIG. 6. (a) The Nino3 SST time series used for the wavelet analysis. (b) The local wavelet power spectrum of (a) using the Morlet wavelet, normalized by 1/σ² (σ² = 0.54 °C²). The left axis is the Fourier period (in yr) corresponding to the wavelet scale on the right axis. The bottom axis is time (yr). The shaded contours are at normalized variances of 1, 2, 5, and 10. The thick contour encloses regions of greater than 95% confidence for a red-noise process with a lag-1 coefficient of 0.72. Cross-hatched regions on either end indicate the "cone of influence", where edge effects become important. (c) Same as (b) but using the real-valued Mexican hat wavelet (derivative of a Gaussian; DOG m = 2). The shaded contour is at normalized variance of 2.0. (After Torrence & Compo, 1998, Bull. Amer. Meteorol. Soc., 79, 61-78.)

Figure 8: Wavelet analysis of Wolf’s sunspot numbers

Another example of wavelet estimation is given for Wolf’s sunspot numbers in Figure 8. The highs in

the wavelet spectrum correspond reassuringly with times when sunspot activity appears greatest. For

comparison with our known methods, the figure below shows the sine multitaper power spectral

density estimate for the sunspot numbers calculated using PSD. This should in principle give the same result

as the global wavelet variance found by averaging the wavelet power spectrum in time.

Sine Multitaper Spectrum from Wolf's Sunspot Numbers

In practice one might feel somewhat insecure in relying on the above method for acquiring a wavelet

spectrum. The single estimate opens up the possibility of some of the same pitfalls as one might anticipate

with the raw periodogram: high variance and poor protection against spectral leakage for power spectra with

large dynamic range. In a paper in Geophysical Journal International, Lilly and Park (1995, 122, 1001-1021)

propose designing the wavelet to generate optimal concentration in frequency. They devise a multiwavelet

algorithm that averages a number of mutually orthogonal Slepian wavelets to produce an estimate that is both

low-variance and resistant to broad-band bias. The same idea can also be used to estimate the time-varying

spectral density matrix for two or more time series. See their paper for details, along with the application to

spectral polarization in seismic records. The method seems quite well suited to the analysis of non-stationary

data processes.


Irregularly Spaced Data

This class has introduced you to some basic ideas in time series analysis, but there are a number of topics

we have not dealt with. We have supposed that evenly sampled data are available, but a practical problem

often encountered is that of irregularly spaced data. In some cases one can simply interpolate the data onto

a regular sampling in time using a suitable algorithm, but sometimes the gaps are too long, and the resulting

interpolation effectively lowers the Nyquist frequency. Similarly, gaps in the middle of a data set will limit

the frequency resolution if one is forced to deal with spectral estimates based on shorter time intervals.

Solutions to these problems have been suggested in the form of the Lomb periodogram (see Press et al.),

and modifications to multitaper spectral estimation for irregularly sampled time series (Bronez, 1988; Fodor

& Stark, 2000).
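For orientation, a direct (slow) evaluation of the Lomb periodogram is sketched below, using the normalization by the sample variance given in Press et al. The sampling times, test frequency and noise level are synthetic choices made for illustration, and `lomb_periodogram` is a hypothetical name; faster algorithms exist for long records.

```python
import numpy as np

def lomb_periodogram(t, y, freqs):
    """Normalized Lomb periodogram at positive frequencies freqs."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    var = y.var(ddof=1)
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        w = 2 * np.pi * f
        # Time offset tau that makes the sine and cosine terms orthogonal
        tau = np.arctan2(np.sum(np.sin(2 * w * t)), np.sum(np.cos(2 * w * t))) / (2 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        power[i] = ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s)) / (2 * var)
    return power

# Synthetic, irregularly sampled sinusoid at 0.1 cycles per unit time, plus noise
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 200.0, 150))
y = np.sin(2 * np.pi * 0.1 * t) + 0.3 * rng.standard_normal(t.size)

freqs = np.linspace(0.01, 0.5, 491)
power = lomb_periodogram(t, y, freqs)
f_peak = freqs[np.argmax(power)]
```

The periodogram peaks close to the true frequency even though no interpolation onto a regular grid was performed.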

Bronez, T.P., 1988. Spectral estimation of irregularly sampled multidimensional processes by generalized prolate spheroidal sequences. IEEE Trans. Acoust., Speech, Signal Processing, 36, 1862–1873.

Fodor, I.K., & P.B. Stark, 2000. Multitaper spectrum estimation for time series with gaps. IEEE Transactions on Signal Processing, 48, 3472–3483.

Press, W.H. et al., 1992. Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd Edition. Cambridge. pp. 569-577.
