SpanLib - Spectral Analysis Library

Version 1.1

Project hosted by: SourceForge.net

This documentation was generated on 12 September 2006

Observed or simulated multi-channel time series generally contain a sum of different signals that can hardly be distinguished from one another, even if their respective origins are fundamentally different. Analysis methods that are able to extract the most coherent modes of variability generally help to identify signals of interest.

SpanLib currently focuses on linear analysis methods that rely on the eigen solutions of covariance or correlation matrices.

This package provides an F90 library (as a module) containing a minimal collection of subroutines to perform Principal Component Analysis (PCA), Multi-channel Singular Spectrum Analysis (MSSA), reconstruction of components and phase composites. The package also provides a Python module that calls the F90 library and gives the user a set of useful functions to perform analyses.

In future versions, SpanLib will also include other methods, such as Singular Value Decomposition or Principal Oscillation Pattern analysis.

PCA is also known as Empirical Orthogonal Function (EOF) decomposition: it decomposes a space-time signal into pairs of spatial EOFs and temporal Principal Components (PCs) that are the eigen solutions of the covariance (or correlation) matrix of the initial signal. The first EOFs represent the dominant, purely spatial patterns of variability, and their associated PCs are the time coefficients that modulate these patterns.
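The decomposition described above can be sketched with NumPy. This is only an illustration of the principle on a synthetic dataset, not the SpanLib implementation: the function name pca, its arguments and the (channel, time) array layout are assumptions made for this example.

```python
import numpy as np

def pca(data, nkeep):
    """PCA of a (nchannel, ntime) array via its covariance matrix.

    Hypothetical helper, not part of SpanLib. Returns the EOFs
    (nchannel, nkeep), the PCs (ntime, nkeep) and the fraction of
    variance explained by each retained mode.
    """
    anom = data - data.mean(axis=1, keepdims=True)   # remove the time mean
    cov = anom @ anom.T / anom.shape[1]              # cross-channel covariance
    evals, evecs = np.linalg.eigh(cov)               # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:nkeep]          # indices of the largest ones
    eofs = evecs[:, order]
    pcs = anom.T @ eofs                              # project anomalies onto the EOFs
    return eofs, pcs, evals[order] / evals.sum()

# Synthetic dataset: one spatial pattern oscillating in time, plus weak noise.
rng = np.random.default_rng(0)
time = np.arange(200)
pattern = np.array([1.0, 2.0, -1.0, 0.5])
data = np.outer(pattern, np.sin(2 * np.pi * time / 25))
data += 0.05 * rng.standard_normal(data.shape)

eofs, pcs, explained = pca(data, nkeep=2)
print("explained variance fractions:", explained)
```

In this synthetic case the first EOF should be aligned (up to its sign) with the prescribed spatial pattern, and its PC should follow the prescribed oscillation.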

Note

In this document, "space" refers to the more general notion of "channel", as opposed to "time". In climate studies, the channel dimension generally coincides with space.

SSA (Singular Spectrum Analysis) is mathematically very similar to PCA: the input dataset now has only one channel, and the eigenmodes are computed from the lag-covariance matrix (instead of the cross-covariance matrix between channels). The EOFs have only a temporal dimension. SSA is therefore intended to provide information on a purely temporal signal, like a classical Fourier decomposition. However, SSA has several advantages over the latter method:

  • It removes incoherent noise (white noise): the noisy part of the signal is relegated to the low-variance modes, which form a "background" that can easily be neglected.
  • It naturally extracts regular oscillations (with a narrow spectral peak). These oscillations, which can be intermittent, are identified as pairs of modes whose PCs and EOFs are in phase quadrature.
  • Coherent nonlinear trends are identified as the lowest-frequency modes.
  • Compared to other methods, it is efficient on short signals.

The maximal lag (the only parameter of SSA) is known as the window.
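As an illustration of the SSA principle, the following sketch builds the lag-covariance matrix of a single noisy oscillation and diagonalizes it. The function name ssa and the way the lag-covariance matrix is estimated (from overlapping time segments) are assumptions made for this example; the SpanLib routines may differ in their details.

```python
import numpy as np

def ssa(series, window):
    """SSA of a 1D series (hypothetical helper, not part of SpanLib).

    Returns the eigenvalues (descending), the temporal EOFs
    (window, window) and the PCs (ntime - window + 1, window).
    """
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    nseg = x.size - window + 1
    # Trajectory matrix: row i is the series shifted by a lag of i time steps.
    traj = np.array([x[lag:lag + nseg] for lag in range(window)])
    cov = traj @ traj.T / nseg                       # lag-covariance matrix
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    evals, eofs = evals[order], evecs[:, order]
    pcs = traj.T @ eofs
    return evals, eofs, pcs

# Oscillation with a period of 20 time steps, buried in white noise.
rng = np.random.default_rng(0)
time = np.arange(300)
series = np.sin(2 * np.pi * time / 20) + 0.3 * rng.standard_normal(time.size)

window = series.size // 3                            # one third of the time dimension
evals, eofs, pcs = ssa(series, window)
pair_fraction = evals[:2].sum() / evals.sum()
print("variance fraction of the leading pair:", pair_fraction)
```

Here the oscillation should appear as a leading pair of modes with nearly equal eigenvalues, while the white noise is spread over the remaining low-variance modes.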

MSSA is a combination of PCA and SSA: it is an SSA on several channels. The diagonalized matrix is built from covariances between channels (cross) and between time segments (lag). Therefore, it has the advantage of PCA for extracting the dominant "spatial" patterns of variability, and it also has the spectral filtering capabilities of SSA. All identified modes have spatio-temporal properties. For example, oscillations are not constrained to a fixed spatial pattern, but can also have a propagative signature over their cycle. This advanced spatial and spectral filtering is helpful to identify the most coherent (and more especially oscillatory) spatio-temporal modes in a short noisy signal.
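The construction of the cross- and lag-covariance matrix can be sketched as follows, on a synthetic wave that propagates across five channels. The function name mssa, the array layout and the ordering of the stacked (channel, lag) rows are assumptions made for this example, not the conventions of sl_mssa.

```python
import numpy as np

def mssa(data, window):
    """MSSA of a (nchannel, ntime) array (hypothetical helper).

    Returns the eigenvalues (descending), the space-time EOFs as a
    (nmode, nchannel, window) array and the PCs (nsegment, nmode).
    """
    anom = data - data.mean(axis=1, keepdims=True)
    nchan, ntime = anom.shape
    nseg = ntime - window + 1
    # Augmented trajectory matrix: one row per (channel, lag) pair.
    traj = np.array([anom[chan, lag:lag + nseg]
                     for chan in range(nchan) for lag in range(window)])
    cov = traj @ traj.T / nseg                       # cross- and lag-covariances
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    steofs = evecs.T.reshape(-1, nchan, window)      # mode, channel, lag
    stpcs = traj.T @ evecs
    return evals, steofs, stpcs

# Wave with a period of 20 time steps whose phase shifts from channel to channel.
rng = np.random.default_rng(0)
time = np.arange(150)
phases = np.linspace(0, np.pi, 5)
data = np.sin(2 * np.pi * time / 20 - phases[:, None])
data += 0.2 * rng.standard_normal(data.shape)

evals, steofs, stpcs = mssa(data, window=30)
print("variance fraction of the leading pair:", evals[:2].sum() / evals.sum())
```

The size of the matrix to diagonalize is the number of channels times the window, which is why reducing the number of channels beforehand matters for large datasets.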

All these analysis methods act as linear filters. For each of them, it is possible to reconstruct part of the filtered signal. A reconstructed mode is the "multiplication" of its EOF by its PC, and it has the same dimensions as the initial dataset. Such an operation is necessary to go back from the EOF space to the physical space.
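For the PCA case, this "multiplication" is an outer product of an EOF and its PC, as in the minimal sketch below. The helper name reconstruct and the random test data are assumptions made for this example; for SSA and MSSA the reconstruction also involves an average over lags, which is not shown here.

```python
import numpy as np

# Random (nchannel, ntime) dataset and its full PCA.
rng = np.random.default_rng(1)
data = rng.standard_normal((5, 60))
anom = data - data.mean(axis=1, keepdims=True)
evals, evecs = np.linalg.eigh(anom @ anom.T / anom.shape[1])
eofs = evecs[:, ::-1]                                # modes sorted by decreasing variance
pcs = anom.T @ eofs                                  # (ntime, nmode)

def reconstruct(eofs, pcs, modes):
    """Sum of the outer products EOF x PC over the selected modes."""
    return sum(np.outer(eofs[:, m], pcs[:, m]) for m in modes)

first_mode = reconstruct(eofs, pcs, [0])             # same shape as the dataset
all_modes = reconstruct(eofs, pcs, range(eofs.shape[1]))
print(np.allclose(all_modes, anom))
```

Summing the reconstructions of all modes gives back the original anomalies, which shows that the decomposition itself loses no information; the filtering comes from the selection of modes.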

Finally, PCA may also be used simply to reduce the number of degrees of freedom (d-o-f) of a dataset. For example, you can keep only the first PCs that together explain 80% of the variance. These PCs are then used as an input dataset for other analyses. This methodology is useful for MSSA, since solving the eigen problem may be very time consuming: it is possible, for example, to reduce the number of channels from several hundred or thousand to fewer than 20.
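The number of PCs to retain for a given fraction of variance follows directly from the eigenvalues. The helper below is a hypothetical illustration (its name and arguments are not part of SpanLib), and the eigenvalues are made-up numbers.

```python
from itertools import accumulate

def modes_for_variance(eigenvalues, fraction):
    """Smallest number of leading modes explaining at least `fraction`
    of the total variance (hypothetical helper, not part of SpanLib)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for nmode, cumulated in enumerate(accumulate(ordered), start=1):
        if cumulated >= fraction * total:
            return nmode
    return len(ordered)

# Made-up eigenvalue spectrum of a 6-channel dataset.
eigenvalues = [12.0, 6.0, 1.0, 0.5, 0.3, 0.2]
nkeep = modes_for_variance(eigenvalues, 0.75)
print("number of PCs to keep:", nkeep)
```

The retained PCs then replace the original channels as the input of the next analysis.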

Similarly, it is not useful to analyse masked points (for example, grid points situated on land when one analyses oceanic data). The F90 subroutine sl_pca assumes that no channel is masked (all channels are analysed). However, as for the weights, it is possible to associate a spatial mask with a dataset in order to remove masked points when using the Python module. Then, spanlib.pack can be used to "pack" (compress) data before they are analysed.
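The idea of packing can be illustrated with plain NumPy boolean indexing. The two helpers below are hypothetical and only mimic the principle: they do not reproduce the signature or the behaviour of spanlib.pack, and the (time, y, x) layout and the convention "True means valid" are assumptions.

```python
import numpy as np

def pack(field, mask):
    """Keep only the valid grid points of a (ntime, ny, nx) field.

    `mask` is a (ny, nx) boolean array, True where data are valid.
    Returns a (nvalid, ntime) array of channels.
    """
    return field[:, mask].T

def unpack(packed, mask, fill_value=np.nan):
    """Put (nvalid, n) packed results back on the (ny, nx) grid."""
    out = np.full((packed.shape[1],) + mask.shape, fill_value)
    out[:, mask] = packed.T
    return out

# Small grid where two points are flagged as land.
field = np.arange(24.0).reshape(2, 3, 4)
mask = np.ones((3, 4), dtype=bool)
mask[0, 0] = mask[2, 3] = False

packed = pack(field, mask)                           # 10 channels instead of 12
restored = unpack(packed, mask)                      # land points filled with NaN
print(packed.shape, restored.shape)
```

The same mask is reused after the analysis to put results such as EOFs back on the original grid.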

One may be interested in analysing several variables at the same time. These variables may come from different regions or datasets, and may even be of a completely different nature. The essential problem of units may be solved using simple normalisations. The Python function spanlib.stackData can be used to stack these variables into a single dataset before they are analysed. Then, using spanlib.unStackData, you can unstack the results of your analysis. Raynaud et al. (2006) present an example of use where variables such as sea surface temperature, wind stress modulus and air-sea CO2 fluxes are analysed at the same time: the simultaneous variability of the variables is filtered and the dominant oscillations are extracted for each of these variables.
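A minimal sketch of stacking with normalisation is given below. The helpers stack_data and unstack_data are hypothetical and do not reproduce the signatures of spanlib.stackData and spanlib.unStackData; dividing each variable by its overall standard deviation is one possible normalisation among others, and the two variables are random stand-ins.

```python
import numpy as np

def stack_data(variables):
    """Stack several (nchannel_i, ntime) variables along the channel axis.

    Each variable is turned into anomalies and divided by its overall
    standard deviation so that units no longer matter.
    """
    anomalies = [v - v.mean(axis=1, keepdims=True) for v in variables]
    norms = [a.std() for a in anomalies]
    sizes = [a.shape[0] for a in anomalies]
    stacked = np.concatenate([a / n for a, n in zip(anomalies, norms)], axis=0)
    return stacked, norms, sizes

def unstack_data(stacked, norms, sizes):
    """Split a stacked array back into variables and restore their units."""
    parts = np.split(stacked, np.cumsum(sizes)[:-1], axis=0)
    return [p * n for p, n in zip(parts, norms)]

# Two stand-in variables with very different magnitudes.
rng = np.random.default_rng(3)
temperature = 2.0 * rng.standard_normal((3, 50))
flux = 300.0 * rng.standard_normal((2, 50))

stacked, norms, sizes = stack_data([temperature, flux])
restored = unstack_data(stacked, norms, sizes)
print(stacked.shape, [r.shape for r in restored])
```

After normalisation each variable contributes with unit variance, so that none of them dominates the covariance matrix only because of its units.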

Reconstructions (F90:sl_pcarec, Python:<SpAn_object>.reconstruct) are not necessarily the multiplication of an EOF by its associated PC. When PCA is used for a reduction of d-o-f (see Section 1.2, “Fundamentals”), the original PCs are first filtered and then converted back to the original space using the saved EOFs.
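The chain "reduce, filter, convert back" can be sketched as follows. The running mean applied to the PCs is only a stand-in for the actual filtering step (for example an MSSA reconstruction), and all variable names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(2)
nchan, ntime, nkeep = 50, 120, 5
data = rng.standard_normal((nchan, ntime))
anom = data - data.mean(axis=1, keepdims=True)

# Step 1: PCA used as a reduction of d-o-f; the EOFs are saved.
evals, evecs = np.linalg.eigh(anom @ anom.T / ntime)
eofs = evecs[:, ::-1][:, :nkeep]                     # (nchan, nkeep)
pcs = eofs.T @ anom                                  # (nkeep, ntime)

# Step 2: filter the PCs (stand-in: 5-point running mean).
kernel = np.ones(5) / 5
filtered_pcs = np.array([np.convolve(pc, kernel, mode="same") for pc in pcs])

# Step 3: convert the filtered PCs back to the original space.
filtered = eofs @ filtered_pcs                       # (nchan, ntime)
print(filtered.shape)
```

Only the 5 retained PCs go through the filtering step, yet the result is expressed on the 50 original channels.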

The window is the only, and essential, parameter of SSA and MSSA (F90:sl_mssa, Python:<SpAn_object>.mssa). It defines the maximal value of the lags used when building the covariance matrix. It acts as a spectral parameter: the spectral resolution is higher for periods shorter than the window. A standard value is one third of the length of the time dimension.

One of the main interests of MSSA is its ability to extract intermittent space-time oscillations from the signal. To first order, an oscillation is described by its "typical" cycle. sl_phasecomp (F90) and spanlib.computePhases (Python) perform phase composites: they compute an averaged cycle and cut it into homogeneous parts (as one can do when dividing the annual cycle into 12 months).
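One possible way to build such composites is sketched below. It is an assumption, not a description of sl_phasecomp or spanlib.computePhases: the phase of the cycle is defined here from a reference series and its time tendency, and the field is averaged over all time steps falling in the same phase bin.

```python
import numpy as np

def phase_composites(field, index, nphase=8):
    """Average a (nchannel, ntime) field over the phases of an oscillation.

    `index` is a reference time series of the oscillation (hypothetical
    choice: for example the first PC of a reconstructed oscillation).
    Returns the composites (nphase, nchannel) and the phase bin of
    each time step.
    """
    index = (index - index.mean()) / index.std()
    tendency = np.gradient(index)
    tendency = tendency / tendency.std()
    # Angle in the (tendency, index) plane; it increases along the cycle.
    angle = np.arctan2(index, tendency) % (2 * np.pi)
    bins = np.floor(angle / (2 * np.pi) * nphase).astype(int) % nphase
    composites = np.full((nphase, field.shape[0]), np.nan)
    for phase in range(nphase):
        members = bins == phase
        if members.any():
            composites[phase] = field[:, members].mean(axis=1)
    return composites, bins

# Synthetic oscillation with a period of 24 time steps on three channels.
time = np.arange(240)
index = np.sin(2 * np.pi * time / 24)
pattern = np.array([1.0, -0.5, 2.0])
field = np.outer(pattern, index)

composites, bins = phase_composites(field, index, nphase=8)
print(composites.shape)
```

Each row of the result is the mean state of the field during one eighth of the typical cycle, whatever the moment at which this phase occurred in the series.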

This document was generated using xsltproc and perl.