Modelling Arctic Sea Ice (part 1)
Using a supplemented dataset incorporating NSIDC’s Sea Ice Index (SII) to explore the relationship between sea ice extent and sea surface and land surface temperature anomalies
Back in Arctic Sea Ice (part 6) I mentioned something about creating some “fabulous dishes”, and by this I mean some statistical models. We ought to realise right up front that all models are wrong (not my words but those of statistical legend George Box), though George does qualify this by going on to say, “but some models are useful”. In the fine art of cooking even over-baked flapjack has a useful function, for it can be crumbled into ice cream, and so it is that I’m going to bake some goodies here today using just four variables:
Mean Arctic Annual Sea Ice Extent (NSIDC SII supplemented by Vinnikov et al. 1999)
Mean Arctic Sea Surface Temperature (HadISST v1.1)
Mean Arctic Land Surface Temperature Anomaly (GHCNd, 16 station sample)
Mean Annual Atmospheric CO2 Concentration (Mauna Loa in situ supplemented with IAC CMIP6 data)
You’ll need to re-read several earlier articles for background to these and how I went about stringing datasets together but for now all we need to realise is that this is top quality kosher data obtained from leading organisations.
Too Many Nuts
Despite being kosher, the first three of these time series contain too many nuts. That is to say, errors of measurement and wildly fluctuating real-world values give rise to outliers and rather noisy data. Noise can be a real problem when it comes to modelling, and the first thing a statistician will do is examine and process outliers. Smoothing is a commonly used technique and there are all manner of ways we can go about this. My favoured method is to apply the T4253H filter.
Dashing Away With The T4253H Smoothing Iron
For those not familiar with the T4253H smoothing function, the process kicks off with a running median of 4, which is centred by a running median of 2. It then re-smooths these values by applying a running median of 5, a running median of 3, and ending with Hanning running weighted averages (span 3). Residuals are computed by subtracting the smoothed series from the original series, and this whole process is then repeated on the computed residuals. Finally, the smoothed residuals are computed by subtracting the smoothed values obtained the first time through the process. A bit of a head banger I admit, but there is a partially useful summary here with nowt to be found on Wiki!
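For anyone who prefers code to words, here is a minimal sketch of that recipe in plain Python. It is not the routine from any particular statistics package. Two things in it are my assumptions: the ends of the series are handled by simply repeating the first and last values (real implementations use fancier end-point rules), and the final step is taken to be the usual "twicing" recombination, where the smoothed residuals are added back to the first-pass smooth. The function names are made up for illustration.

```python
from statistics import median


def _pad(values, k):
    """Repeat the first and last value k times (assumed end-point rule)."""
    return [values[0]] * k + list(values) + [values[-1]] * k


def _running_median(values, span):
    """Running median; returns len(values) - span + 1 points."""
    return [median(values[i:i + span]) for i in range(len(values) - span + 1)]


def _smooth_once(values):
    """One pass of 4, 2, 5, 3 running medians followed by Hanning."""
    s = _running_median(_pad(values, 2), 4)  # n + 1 points, sitting between years
    s = _running_median(s, 2)                # n points, re-centred on the years
    s = _running_median(_pad(s, 2), 5)
    s = _running_median(_pad(s, 1), 3)
    p = _pad(s, 1)
    # Hanning: weighted average with weights 1/4, 1/2, 1/4
    return [0.25 * p[i] + 0.5 * p[i + 1] + 0.25 * p[i + 2] for i in range(len(s))]


def t4253h(values):
    """Sketch of T4253H: smooth, smooth the residuals, then recombine."""
    x = [float(v) for v in values]
    first_pass = _smooth_once(x)
    residuals = [a - b for a, b in zip(x, first_pass)]
    smoothed_residuals = _smooth_once(residuals)
    # Assumed "twicing" step: add the smoothed residuals to the first pass
    return [a + b for a, b in zip(first_pass, smoothed_residuals)]
```

Calling `t4253h(series)` on a list of annual values returns a list of the same length. Because the early steps are medians rather than means, a lone one-year spike is ironed flat instead of being smeared across its neighbours.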
At this stage it might do well for me to throw out three examples to show what sea surface temperature (SST), land surface temperature (LST) and Arctic sea ice extent (SIE) look like in the flesh and when subject to the smoothing iron:
That huge dip centred on 1967 is rather interesting. I presume this is real and not some artefact of data collection in which case I either need to find an explanation or create an indicator variable to flag up a very different period when it comes to time series modelling. The same goes for the lesser dips of 1918 and 1995.
What caught my eye here – aided by the T4253H smoothed orange wiggle – is just how periodic this data series is. I hadn’t appreciated this before, and it’s all rather curious because we don’t see this strong periodic pattern with sea surface temperature. Using my eyeballs alone I’m guessing 12ish-year periodicity which isn’t far off the solar cycle of 10 – 13 years. H’mmmm, ok, so we better have a big think about this later on!
I think T4253H has done a spiffing job of this. We’ve smoothed out some noise whilst retaining the underlying character. The little kink upward at the end is noteworthy: is this the beginning of a new, ice-laden era or just a blip?
Sea & Land Tango
I am sure that there will be readers trying to compare the sea surface and land surface time series, so here they both are converted to Standard Scores (Z Scores) in one tidy plot:
T4253H smoothing has clarified the situation so we may see just how well these two series correspond. In terms of overall correlation, the Pearson bivariate coefficient fetches up at r = 0.782 (p<0.001, n=123), which is good going for two variable variables! We can also see that the huge dip around 1967 was observed both on land and sea, so is very likely a real thing.
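For readers wanting to try this at home, the standard score and the Pearson coefficient can be sketched in a few lines of plain Python. This is an illustration rather than the software I used, the function names are hypothetical, and the t-statistic shown is the textbook one that treats the yearly values as independent observations.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(values):
    """Convert a series to standard scores: zero mean, unit sample std dev."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pearson_r(x, y):
    """Pearson correlation as the average product of paired z-scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def t_statistic(r, n):
    """Textbook t-statistic for testing r against zero (assumes independence)."""
    return r * sqrt((n - 2) / (1 - r * r))


# Plugging in the figures quoted above: r = 0.782 with n = 123
t_value = t_statistic(0.782, 123)  # roughly 13.8
```

Standardising does not change the correlation itself; it only puts two series measured in different units on the same axis so they can share one plot.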
One natty thing we can do at this stage is resort to cross-correlation analysis (CCA) to determine what sort of a dance these two major series actually do. My eyeballs pick out instances where land surface temperature (LST) leads sea surface temperature (SST), and instances where SST leads LST. This is where CCA comes in mighty handy, for it is a gem of a spanner for looking at periodic signals beating together over time. Here’s the CCA plot:
NOTE: If land surface temperature follows sea surface temperature and this relationship is positive then we’ll see a palisade of positive value red bars sticking up past the 95% confidence limit dashed line at positive lags. Whereas if sea surface temperature follows land surface temperature and this relationship is positive then we’ll see a palisade of positive value red bars sticking up past the 95% confidence limit dashed line at negative lags. If the relationship between the two variables is negative (sea surface warming whilst the land is cooling, and vice versa), then we’ll see negative-going red bars that push beyond the lower 95% confidence limit dashed line.
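The lag convention in that note can be sketched in plain Python as follows. This is a bare-bones illustration, not the routine that drew the plot: the function names are hypothetical, the demo series are synthetic sine waves rather than real temperatures, and the 95% limit uses the common rule of thumb of 1.96 divided by the square root of n, which may differ slightly from what a given package draws.

```python
from math import pi, sin, sqrt
from statistics import mean


def cross_correlation(x, y, max_lag):
    """Return {lag: r}, where r correlates x[t] with y[t + lag].

    With x = SST and y = LST, a peak at a positive lag means LST follows SST,
    and a peak at a negative lag means SST follows LST.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    denom = sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            pairs = zip(dx[:n - lag], dy[lag:])
        else:
            pairs = zip(dx[-lag:], dy[:n + lag])
        result[lag] = sum(a * b for a, b in pairs) / denom
    return result


def confidence_limit_95(n):
    """Approximate 95% limit for a single cross-correlation bar."""
    return 1.96 / sqrt(n)


# Synthetic demo: "lst_demo" is "sst_demo" delayed by one step
sst_demo = [sin(2 * pi * t / 12) for t in range(120)]
lst_demo = [sin(2 * pi * (t - 1) / 12) for t in range(120)]
ccf_demo = cross_correlation(sst_demo, lst_demo, max_lag=5)
peak_lag = max(ccf_demo, key=ccf_demo.get)  # lag with the tallest positive bar
```

In this toy case the second series is a one-step-delayed copy of the first, so the tallest bar lands at a lag of +1, which is exactly the "land follows sea" pattern described in the note.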
So now, what have we got here? We’ve got an upright bar at a positive lag of +1 year, indicating the land warms a year after Arctic seas warm, and we’ve got an upright bar at a negative lag of -1 year, indicating Arctic seas warm a year after the land warms. However, the biggest positive-going bar of all is plonked at lag zero, meaning the land and sea warm and cool together within the same year. Hopefully all that makes sense with things warming and cooling together, and with two-way energy transfer.
What will hurt our brains are those statistically significant negative-going bars down at a lag of -5 years and thereabouts. These indicate that either the land starts cooling five years before the seas start warming or the land starts warming five years before the seas start cooling. To say this is most curious is an understatement!
We might start to get a clearer idea of what is going on if we drink in the whole slide. I am sure we can all see the undulating pattern that incessantly flips from positive to negative. This situation arises when we have two oscillating variables with slightly differing periodicities such that they drift in and out of phase over time. A good example is that wow-wow-wow effect (beat frequency) when tuning a guitar string to another and the frequency of the new string is getting close. Thus it isn’t a simple matter of warm Arctic seas warming the land or the land warming the seas; there’s a feedback mechanism and a degree of independence that produces a complex dynamic.
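The guitar analogy boils down to one line of arithmetic: two tones of nearby frequency drift in and out of phase at a rate equal to the difference between their frequencies. A minimal sketch, with hypothetical function names and string frequencies picked purely for illustration:

```python
def beat_frequency(f1, f2):
    """Beats per unit time for two tones with frequencies f1 and f2."""
    return abs(f1 - f2)


def beat_period(period1, period2):
    """Time between successive beats for two oscillations given by their periods."""
    return 1.0 / abs(1.0 / period1 - 1.0 / period2)


# Illustrative guitar example: one string at 440 Hz, the other at 438 Hz
wow_rate = beat_frequency(440.0, 438.0)  # 2 "wows" per second
```

The closer the two frequencies get, the slower the wow-wow-wow becomes, which is why it fades away just as the new string comes into tune.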
If you get your fingers to do the walking over this plot you’ll see the positive peak-to-peak lag offering 11 and 14 year beat cycles, with the negative peak-to-peak lag offering 12 and 13 year beat cycles. This feels very solar, and I can see that I’m going to have to expand the study in a cosmic direction!
Kettle On!
Apart from statistics, there is also physics. Could it simply be the effect of different heat capacities? Water heats and cools more slowly than land; is that the reason for the different time lags? The energy source, the Sun, is external to both.
My brain hurts! Maths always did this to me. However, fascinating that the solar cycle may well be important in the fluctuations in sea and land temperatures; it's not surprising when you think about it. I'm pretty sure that global temperatures are much more influenced by solar activity than by piffling amounts of trace gases. I look forward to the next installment when I hope you will include the last of the four variables promised above (mean annual CO2 levels).