Selecting a Bias Correction Method: A Case Study on Multivariate Indicators
- camilleguedamour
- 16 hours ago
In a previous tutorial, we saw that climate projections can contain systematic errors that must be corrected before use. The method to use for this "bias-correction" can depend on the variables studied, the intended use, the geography of the region, or the resources available for the project. This article provides an illustration of the choice of a debiasing methodology... and the risks if this step is neglected.
Our objective in this case study is to prepare climate projections to quantify the evolution of the health impacts of heat in a city in northern Italy.
With global warming, heatwaves are becoming more intense and frequent, and health risks are increasing. The health impact of heat does not depend solely on temperature; humidity, in particular, is an aggravating factor. To account for this, we will use an indicator that combines temperature and humidity: the wet-bulb temperature.
This indicator is not directly simulated by climate models. We will therefore calculate it from two standard meteorological variables: the daily maximum temperature and relative humidity.
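As an illustration of this calculation, here is a minimal Python sketch. It assumes the empirical approximation published by Stull (2011), which is valid near standard sea-level pressure for relative humidity between roughly 5% and 99% and temperatures between roughly -20°C and 50°C. The article does not say which formula was actually used, so the function name and the choice of formula are assumptions made for this example.

```python
import math

def wet_bulb_stull(temp_c: float, rh_percent: float) -> float:
    """Approximate wet-bulb temperature in °C.

    temp_c: air temperature in °C (here, the daily maximum).
    rh_percent: relative humidity in percent (0-100).
    Uses the Stull (2011) empirical fit; this is an assumption,
    not necessarily the formula used in the case study.
    """
    t, rh = temp_c, rh_percent
    return (
        t * math.atan(0.151977 * math.sqrt(rh + 8.313659))
        + math.atan(t + rh)
        - math.atan(rh - 1.676331)
        + 0.00391838 * rh ** 1.5 * math.atan(0.023101 * rh)
        - 4.686035
    )

def kelvin_to_celsius(temp_k: float) -> float:
    """Climate model output is usually in kelvin; the fit expects °C."""
    return temp_k - 273.15

# Example: a 20°C day at 50% relative humidity.
tw = wet_bulb_stull(20.0, 50.0)
print(f"Wet-bulb temperature: {tw:.1f} °C")
```

At a given temperature, the result increases with relative humidity, which is why the indicator is sensitive to the way the two variables are combined.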
As we will see, particular care is required when correcting projections used to calculate multivariate indicators, that is, metrics that combine several climate variables.
First: why is it necessary to correct climate projections for bias?
For this demonstration, the exact projection used is not very important: the results would be broadly similar regardless of the climate model, the project it comes from (CORDEX, CMIP5, CMIP6...), the variables, or the location. Here we will use climate projections from the EURO-CORDEX project, more specifically those produced by the British general circulation model HadGEM2-ES and downscaled with the Danish regional model HIRHAM5.
As is generally the case, the results of this model pair are biased. This can be easily verified by comparing the values it gives over a sufficiently long reference period (here 1976–2005) with the actual values over the same period:
We do not expect to obtain exactly the same value for the same day, but the two series should have the same statistical properties. This is not the case, as can be easily seen by comparing the distributions and quantiles. The quantile-quantile plot, in particular, shows that the model generally underestimates the temperature but overestimates it during the coldest days.
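The quantile comparison described above can be sketched in a few lines of Python. The two series below are synthetic and purely illustrative (they are not the EURO-CORDEX or reference data); their parameters were chosen only so that the "model" is too cold on average but too warm on the coldest days, as in the case study.

```python
import random
import statistics

def quantile_pairs(model, reference, n=10):
    """Return (model quantile, reference quantile) at the n-1 cut points."""
    q_model = statistics.quantiles(model, n=n, method="inclusive")
    q_ref = statistics.quantiles(reference, n=n, method="inclusive")
    return list(zip(q_model, q_ref))

# Synthetic 30-year daily series in kelvin (illustrative values only).
rng = random.Random(0)
n_days = 30 * 365
reference = [rng.gauss(290.0, 8.0) for _ in range(n_days)]
model = [rng.gauss(288.5, 6.0) for _ in range(n_days)]

pairs = quantile_pairs(model, reference)
for decile, (q_m, q_r) in enumerate(pairs, start=1):
    print(f"{decile * 10:3d}%  model={q_m:6.1f} K  "
          f"reference={q_r:6.1f} K  bias={q_m - q_r:+5.1f} K")
```

Plotting the model quantiles against the reference quantiles gives the quantile-quantile plot: points below the diagonal indicate underestimation, points above it overestimation.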
Before using the projections, these errors must be corrected. We will use CDFt, a widely used bias-correction method. For example, it was used by Météo-France to create the bias-corrected projection dataset DRIAS 2014.
Once the correction is applied, the two series have practically identical distributions and quantiles:
The two temperature series are still not identical, but they have very similar statistical properties, so one could study either one and obtain comparable results.
If we want to obtain the maximum daily temperatures for the future, we will take the simulated values in the period to be studied, for example, 2071–2100, and apply the same correction.
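To make the idea of "fitting a correction on the reference period and applying it elsewhere" concrete, here is a minimal sketch of empirical quantile mapping. This is a simpler technique than CDFt, used here as a stand-in for illustration only: CDFt additionally accounts for the change in the model's distribution between the historical and future periods, whereas this sketch simply clamps values that fall outside the calibration range. The function name is hypothetical.

```python
import bisect

def fit_quantile_mapping(model_hist, reference):
    """Fit an empirical quantile mapping on the reference period.

    Returns a function that maps a model value to the reference value
    with the same empirical rank. Simplified stand-in for CDFt.
    """
    m_sorted = sorted(model_hist)
    r_sorted = sorted(reference)
    n_m, n_r = len(m_sorted), len(r_sorted)

    def correct(x):
        # Number of historical model values less than or equal to x.
        count = bisect.bisect_right(m_sorted, x)
        # Nearest-rank inverse CDF of the reference, in integer arithmetic:
        # ceil(count * n_r / n_m) - 1.
        idx = -((-count * n_r) // n_m) - 1
        # Values outside the calibration range are clamped (a limitation).
        idx = min(max(idx, 0), n_r - 1)
        return r_sorted[idx]

    return correct

# Toy example: fit on the reference period, then apply to other values.
correct = fit_quantile_mapping([1, 2, 3, 4], [10, 20, 30, 40])
print([correct(v) for v in [1, 2.5, 4, 100]])
```

The clamping of out-of-range values is precisely why plain quantile mapping is a poor choice for a warming climate, and one reason to prefer a method designed for projections.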
The problem of consistency between variables
This is relatively simple when we are interested in a single variable, but in our case, we want to use two variables (temperature and humidity) to compute another indicator, the wet-bulb temperature.
There is a problem: meteorological variables are not independent. But the CDFt method, like most debiasing methods, is univariate: it can only be applied separately to each of the variables of interest. In this case, nothing guarantees that their relationship will be preserved.
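One way to see why a univariate correction cannot repair the link between variables: when the correction applied to each variable is monotone, as with quantile-based methods, it changes the values but not their order. The rank dependence between the corrected series is therefore the one simulated by the model, whether or not it matches the reference. The sketch below illustrates this with a Spearman rank correlation written in plain Python; the toy values and the two "corrections" are invented for the example, and ties are assumed absent.

```python
def ranks(values):
    """Rank of each value (0 = smallest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation between two series without ties."""
    rx, ry = ranks(x), ranks(y)
    mean = (len(x) - 1) / 2
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)
    return cov / var

# Toy "model" series: temperature in K, relative humidity in %.
temp = [295.0, 301.0, 308.0, 299.0, 312.0, 304.0]
rh = [70.0, 62.0, 55.0, 48.0, 58.0, 35.0]

# Two arbitrary monotone univariate "corrections" (illustrative only).
temp_corr = [1.1 * (t - 300.0) + 301.5 for t in temp]
rh_corr = [r ** 0.9 for r in rh]

print(spearman(temp, rh), spearman(temp_corr, rh_corr))
```

Both printed values are identical: each variable now has a different distribution, but the way hot days pair with humid or dry days is unchanged.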
If we go back to our example, we can use the reference dataset to see if there is a relationship between relative humidity and daily maximum temperature:
Clearly, yes, the two variables are linked. In particular, humidity decreases rapidly when the temperature rises above 300 K, or about 27°C.
This relationship makes physical sense: the higher the temperature, the more water vapor the air can hold. Relative humidity is the ratio of the amount of vapor in the air to the maximum possible amount or, more rigorously, the ratio of the vapor pressure to the saturation vapor pressure. Unless there is a significant input of water vapor, it is normal for relative humidity to decrease when the temperature rises.
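This effect can be checked numerically. The sketch below assumes a Magnus-type approximation of the saturation vapor pressure (the coefficients are those of Bolton, 1980) and holds the amount of water vapor constant while the temperature rises. The function names are chosen for this example.

```python
import math

def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Saturation vapor pressure over water in hPa (Bolton 1980 fit)."""
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))

def relative_humidity(vapor_pressure_hpa: float, temp_c: float) -> float:
    """Relative humidity in %, as vapor pressure over saturation pressure."""
    return 100.0 * vapor_pressure_hpa / saturation_vapor_pressure_hpa(temp_c)

# Air at 25°C and 60% relative humidity...
e = 0.60 * saturation_vapor_pressure_hpa(25.0)

# ...then warmed with no additional water vapor.
for t in (25.0, 30.0, 35.0):
    print(f"{t:.0f} °C -> relative humidity {relative_humidity(e, t):.0f} %")
```

With the same water vapor content, warming the air by 10°C brings relative humidity from 60% down to roughly a third.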
Let's now redraw the same graph with the bias-corrected temperature and humidity series. What do we see?
The relationship between the two variables is much less marked. The rapid decrease in relative humidity when the maximum temperature exceeds about 300 K seems to have disappeared...
The risk: producing false results
When calculating multivariate indicators, the loss of relationships between the different variables can seriously distort the results.
In our example, we want to evaluate the maximum values of the wet-bulb temperature. The rapid decrease in relative humidity as the daily maximum temperature rises, observed in the reference data, has a moderating effect: when it is hot, the air is generally drier, which limits the impact of heat on comfort and health.
With the corrected data, this relationship has been lost. In the reference data, we see for example that it is rare to have a humidity level above 50% when the maximum temperature exceeds 310K (about 37°C), but with the corrected data, this is no longer the case...
The bias-corrected projections of temperature and humidity are correct independently of each other, but their decorrelation leads to an overestimation of the probability that a hot day will also be humid. Therefore, it distorts our result: we risk overestimating the extremes of wet-bulb temperature.
As in this example, the use of classical bias correction methods, most of which are univariate, in the calculation of multivariate indicators does not reliably reduce biases and can even increase them.
This can be a serious problem, because multivariate indicators are numerous in agronomy (plant growth models, soil dryness...), industry (cooling capacity, solar yield...), risk assessment (thermal comfort indicators, forest fire index...), and many other fields.
Bias correction for multivariate indicators
For this type of study, it is essential to use a bias correction methodology appropriate for multivariate analysis. There are two types of solutions:
Either choose a bias correction method that preserves the relationships between the different variables, for example, dOTC or MRec.
Or correct each variable independently using a conventional bias-correction method, then reconstruct the relationships between variables after the correction. This is the principle of methods such as R²D² or MBCn.
For our example, let's use the R²D² method. This consists first of applying a univariate bias correction (often CDFt) and then reorganizing the values obtained to achieve coherent combinations between the different variables using a method called the Schaake Shuffle:
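The reordering step can be sketched as follows. This is only the basic Schaake Shuffle idea, not the full R²D² method: each bias-corrected series keeps exactly the same values, but they are rearranged in time so that their ranks follow those of a template series, here the reference data. The function name and the toy values are invented for this example, and both series are assumed to have the same length and no ties.

```python
def schaake_shuffle(corrected, template):
    """Reorder `corrected` so that its ranks follow those of `template`.

    The k-th smallest corrected value is placed at the time step where
    the template has its k-th smallest value.
    """
    if len(corrected) != len(template):
        raise ValueError("series must have the same length")
    sorted_values = sorted(corrected)
    order = sorted(range(len(template)), key=lambda i: template[i])
    shuffled = [None] * len(template)
    for rank, i in enumerate(order):
        shuffled[i] = sorted_values[rank]
    return shuffled

# Reference data: hot days are dry (temperature in K, humidity in %).
ref_temp = [300.0, 310.0, 295.0, 305.0]
ref_rh = [60.0, 30.0, 75.0, 45.0]

# Univariately corrected series whose pairing has been lost.
corr_temp = [296.0, 304.0, 309.0, 301.0]
corr_rh = [72.0, 62.0, 48.0, 33.0]

new_temp = schaake_shuffle(corr_temp, ref_temp)
new_rh = schaake_shuffle(corr_rh, ref_rh)
print(list(zip(new_temp, new_rh)))
```

After the shuffle, the hottest corrected day is again paired with the lowest corrected humidity, while the distribution of each variable taken separately is unchanged.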
At the end of this process, the appearance of the results over the reference period is much more satisfactory. We can now apply the same transformations (bias correction + shuffle) to the simulations over the future period studied.
It should be noted that research on multivariate bias correction methods is recent and ongoing: all the methodologies cited as examples were proposed after 2018. Use cases are still quite rare, and today most accessible projections are either not corrected (for example, the IPCC atlas) or are debiased with a univariate method (for example, DRIAS 2014 or 2020 in France).
If you want to use these data sources to calculate a multivariate indicator, you will therefore have to perform the correction yourself with an appropriate method...
👋Thank you for reading this far. We strive to share our expertise and hope you have learned something. But let's be realistic: even with our tutorials, reliably evaluating the effects of climate change on a territory or activity is a complex exercise and certainly not accessible to everyone...
Callendar is one of the few organizations in Europe that can help companies produce climate projections that are both tailored to their needs and in line with scientific best practices. If you have a project, contact us to discuss it!