The Mathematics of GeoEnergy

Filtering of Climate Data

One of the frustrating aspects of climatology as a science is its often cavalier treatment of data, in particular the potential loss of information through filtering. A group of scientists at NASA JPL (Perigaud et al) and elsewhere have pointed out how constraining it is to remove what are considered errors (or nuisance parameters) from a time series by assuming that they relate to known tidal or seasonal factors and so can be safely filtered out and ignored. The problem is that this is only appropriate IF those factors arise from an independent process and don't also interact non-linearly with the rest of the data. So if a model predicts both a linear component and a non-linear component, it's not helpful to eliminate the portions of the data that could help distinguish the two.

As an example, this extends to the premature filtering of annual data. If you dig enough you will find that the NINO3.4 data is filtered to remove the annual signal, and that the filtering is over-zealous in that it removes all the annual harmonics as well. Worse yet, the weighting of these harmonics changes over time, which means that parts of the spectrum unrelated to the annual signal are being removed too. Found in an “ensostuff” subdirectory on the NOAA.gov site:

This makes me cringe now that I look at the portion of the filtered data (which I independently extracted, shown below) and notice how well it matches the annual impulse I am applying in the ENSO model. The impulse, which is required to amplify the tidal cycles, is clearly phase-correlated with the observed annual temperature cycling.
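To see why this kind of filtering is over-zealous, consider a minimal sketch with synthetic data (the series, amplitudes, and periods below are illustrative, not the actual NINO3.4 values). Subtracting a monthly climatology, which is the standard way anomalies are produced, absorbs the annual cycle and every one of its harmonics, because each is a fixed function of calendar month:

```python
import numpy as np

# Hypothetical monthly series: an annual cycle, a semi-annual harmonic,
# and a slower "ENSO-like" oscillation (all amplitudes illustrative).
months = np.arange(240)                              # 20 years, monthly
annual = 1.0 * np.cos(2 * np.pi * months / 12)       # annual fundamental
harmonic = 0.3 * np.cos(2 * np.pi * months / 6)      # semi-annual harmonic
enso_like = 0.8 * np.sin(2 * np.pi * months / 50)    # slow oscillation

series = annual + harmonic + enso_like

# Standard anomaly filtering: subtract the mean of each calendar month.
climatology = np.array([series[months % 12 == m].mean() for m in range(12)])
anomaly = series - climatology[months % 12]

# The climatology soaks up the annual cycle AND its harmonics wholesale,
# while the slow oscillation survives (mostly) intact.
spectrum = np.abs(np.fft.rfft(anomaly))
freq = np.fft.rfftfreq(len(anomaly), d=1.0)          # cycles per month
annual_power = spectrum[np.argmin(np.abs(freq - 1 / 12))]
slow_power = spectrum[np.argmin(np.abs(freq - 1 / 50))]
print(f"annual-band residual: {annual_power:.2f}, slow-band: {slow_power:.2f}")
```

Note that because the slow oscillation also leaks slightly into the monthly means, the subtraction smears a small amount of non-annual signal into and out of the annual band, which is exactly the kind of cross-contamination that matters when the model predicts an interaction between the two.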

This may sound like an innocent error correction, but it eliminates the possibility of tracking correlations, which is the core of any science, such as climatology, that does not allow experimental control or laboratory experimentation.

That’s just the annual signal. This is what Perigaud et al wrote about removing tidal signals:

1) The first question follows a long series of controversies between oceanic and atmospheric communities on the genesis of Tropical Instability Waves (TIW). We find that both fluids receive mid-latitude energy increase every 14.7 days from lunar and solar gravitational attractions of the Earth. The biggest challenge that we faced to ensure the validity of this finding came from the common widespread habit of saving climate signals once per day, which contaminates datasets and model outputs that have lunar tidal content into the 14.7 day aliased period. This issue got resolved when we found that the main energy source is not semidiurnal, but diurnal.

2) The biweekly mechanism now allows to solve the inconsistency between Sea Surface Temperature (SSTs) well explained by the symmetric recharge of warm events from the subtropics (Ref 20), and the intriguing results recorded by coral reefs (Ref 21): every 18.6 years the Moon weakens the 14.7 day inter-hemispheric luni-solar forcing of cold and salty subsurface waters (mass centred in the South) between continents up to the surface. Our manuscript is at the core of the research we conduct to detect remaining uncertainties in satellite datasets that may matter for climate monitoring/modeling. We do so by extracting information from satellites on OVW and rain activity to compensate for the lack of ocean climate model skill in reproducing the observed weather. We had to replace the model OVW climatology with atmospheric products, because the “satellite OVWs” used as if they were atmospheric winds yield unrealistic sea level variations. We have reported this satellite OVW inconsistency with model sea level to the weather and climate science teams responsible for satellite and atmospheric OVW products (Refs 17, 18), then consulted experts for tides and orbits and proposed solutions (Ref 51). We have now reached certainty that the biweekly ocean-atmosphere instabilities are triggered by the Earth-Moon-Sun alignments that facilitate the cross-equatorial and vertical fluid circulations. We continue finding possible improvements for climate modelling/monitoring from satellites, and invite the reader to discuss alternatives to traditional assumptions in weather, climate and tidal modelling.

This was in a cover letter sent to Nature Climate Change, with the emphasis theirs. The letter not only pointed out the laxity in sampling climate data (bullet 1) but pointed to potential findings related to that data (bullet 2). Apparently the paper it refers to was rejected for inclusion in the journal.
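The aliasing arithmetic behind bullet 1 is easy to verify for yourself. A tidal cycle sampled once per day folds into the Nyquist band, and the standard Darwin constituent periods land right at a ~14-day alias. The constituent periods below are the textbook values; which constituent actually dominates is of course the letter's claim, not something the arithmetic decides:

```python
# A tidal line sampled once per day folds to a low "alias" frequency.
def alias_period_days(period_hours, sample_days=1.0):
    """Alias period (in days) when a cycle of the given period (in hours)
    is sampled once every `sample_days` days."""
    f = 24.0 * sample_days / period_hours   # frequency in cycles per sample
    f_alias = abs(f - round(f))             # fold into the Nyquist band
    return sample_days / f_alias

# M2 semidiurnal tide, period 12.4206 h -> aliases to ~14.77 days
print(f"M2 alias: {alias_period_days(12.4206):.2f} days")
# O1 diurnal tide, period 25.8193 h -> aliases to ~14.19 days
print(f"O1 alias: {alias_period_days(25.8193):.2f} days")
```

So a once-per-day archive cannot distinguish a genuine fortnightly cycle from a daily-sampled semidiurnal or diurnal tide, which is exactly the contamination the letter describes.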

These examples of over-processing of data bring up the notion of working smart, not hard. Consider the tweet below: what kind of reality is it where someone requires 1 PB = 1 million gigabytes of model output?

for example I have 1PB of netcdf files archived for my group (mostly climate model output). It's hard to know what the alternative format would be if not netcdf. any other format would be literally unusable (imagine how big/cumbersome that would be as a csv file).

So what they will do is take the 1 petabyte of output and filter out all the important details anyway. That’s counter-productive. I would rather work on 2000 pairs of SOI data points (which would easily fit on a 360K floppy disk) than struggle chasing ghost correlations in such a large dataset, when a noteworthy answer may be right in front of your face.

Here, we use eigen analysis to study the principal modes of daily surface air temperature. Note that we use the original temperature rather than abnormal temperature. Previous studies (29–31) mentioned above focused mainly on the inter-annual principal modes after removal of seasonality. Instead, we investigate the intra-annual principal modes including seasonality.

I think they mean the anomalous temperature and not abnormal, i.e. the anomaly equals the variation from the mean.

It's important to include seasonality in order to determine the impact of seasonality, but of course!
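The contrast the quoted authors draw is easy to demonstrate. Here is a minimal sketch on synthetic station data (the stations, loadings, and periods are all made up for illustration): run an eigen analysis on the raw temperatures, as they do, and the leading mode is the seasonal cycle itself, which is precisely the information that anomaly-based studies throw away before they start.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily "surface temperature" at five stations: a shared annual
# cycle plus a weaker, slower internal mode, plus noise (all illustrative).
days = np.arange(3 * 365)
seasonal = np.cos(2 * np.pi * days / 365)           # shared annual cycle
slow = np.sin(2 * np.pi * days / 600)               # slower internal mode

load_seasonal = np.array([1.0, 0.9, 0.8, 0.7, 0.6]) # per-station loading
load_slow = np.array([1.0, -1.0, 0.5, -0.5, 0.0])

X = (np.outer(seasonal, load_seasonal)
     + 0.3 * np.outer(slow, load_slow)
     + 0.05 * rng.standard_normal((len(days), 5)))

def leading_mode(M):
    """First principal component (time series) of a time-by-space matrix."""
    M = M - M.mean(axis=0)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, 0] * s[0]

pc1 = leading_mode(X)
corr = abs(np.corrcoef(pc1, seasonal)[0, 1])
print(f"correlation of leading mode with the annual cycle: {corr:.2f}")
```

Subtract a day-of-year climatology first, as the earlier studies did, and this leading mode vanishes from the analysis entirely, so any coupling between the seasonal cycle and the slower modes is invisible by construction.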