An attempt of reproduction of Sovacool et al.’s “Differences in carbon emissions reduction between countries pursuing renewable electricity versus nuclear power”

In this paper, we attempt to reproduce the results obtained by Sovacool et al. in their recent paper on the differences in carbon emissions reduction between countries pursuing renewable electricity versus nuclear power. We have found several flaws in the models and the statistical analysis performed therein, notably in the correlations computed between the fractions of renewable and nuclear power and greenhouse gas emissions per capita, and in the lack of consideration for natural bias between the variables examined.


Introduction
Lowering Greenhouse Gas (GHG) emissions, foremost amongst which are carbon dioxide emissions, has been established as a priority in order to mitigate the effects of anthropogenic climate change. While it is clear that abandoning fossil fuels is imperative, there is still some debate about the details of the transition to decarbonized sources of energy. As reported in Chapter 2 of the 2018 IPCC report [1], the role of nuclear energy increases along most pathways to decarbonization, although the variance in the share of nuclear energy is quite large across the spectrum of models and paths considered in the literature [2,3]. For instance, scenarios of 100% renewable energy have been considered by some authors [4,5], although the validity of the assumptions of these high-renewable models has been contested [6]. By contrast, there are also examples where the role of nuclear power is greatly increased, such as in [7][8][9][10].
In the context of this debate, Sovacool et al. performed a study concluding that "the implication for electricity planning is that diverse renewables are generally proving in the real world to be significantly more effective than nuclear power at reducing climate disruption" [11]. Fell et al. have since published a response [12], criticizing the methodology and other aspects of their paper. In particular, the authors find that "nuclear power and renewable energy are both associated with lower per capita CO2 emissions with effects of similar magnitude and statistical significance".
Similarly, Wagner [13] has recently obtained results in direct contradiction to the conclusions drawn by Sovacool et al. This paper provides supplementary criticism of the validity of the conclusions of [11] and concludes that "both, nuclear and renewable power allow a reduction of national CO2 emission levels. [...] The analysis of the databases employed in this study did not yield evidence for any further hidden variables, national CO2 emission might possibly depend on. Specifically, no evidence for "crowding-out" can be detected".
Sovacool et al.'s paper relies on the statistical analysis of historical data available for different variables across a variety of countries. In particular, it relies on establishing correlations between the share of nuclear energy (henceforth denoted N) or renewable energy (henceforth denoted R) as a fraction of the electrical mix and CO2eq emissions per capita, while taking Gross Domestic Product (GDP) per capita as a confounding variable.
After attempting to reproduce the results of Sovacool et al. [11], we have found that the analysis performed is considerably flawed, both because of mistakes in the statistical analysis and because of inconsistencies in the logic of the authors, in particular concerning:
• the "crowding out" hypothesis, i.e. that renewables and nuclear power are structurally incompatible, so that there is an anticorrelation between them;
• the rejection of the "climate mitigation" hypothesis, which states that "the relative scale of national attachments to nuclear electricity production will vary negatively with carbon emissions".
Both of these elements involved regressions of the shares of low-carbon sources of electricity against GHG emissions, despite the fact that the shares of decarbonized energy sources are not good predictors of GHG emissions. The rest of this paper is organized as follows. First, we give a more detailed account of each of the arguments above. Then, we give some complementary technical details regarding the data set and our analysis of the complementary data provided by the authors.

Criticism
2.1 Fossil fuels as the real predictor and the "crowding out" hypothesis
Both renewable and nuclear energy emit little to no GHGs, but energy stemming from fossil fuels does. With respect to GHG emissions per capita, the only relevant variable is the fraction of fossil fuels in the electricity production of each country (which we will denote F). It follows that the fraction of nuclear energy or renewables in the electrical mix is not a good predictor of GHG emissions, independent of the statistical treatment of the data. As shown by our in-depth analysis in the appendix, the authors' rejection of the "climate mitigation" hypothesis arises from an inadequate statistical analysis (cf. Sect. 2.2) and from the following fact.
The fractions of electricity produced from renewable, nuclear and fossil sources satisfy the tautological relation
R + N + F = 1. (1)
This relation implies that these three variables are necessarily correlated with one another. In light of this reasoning, no matter the statistical treatment of the data, the predictive power of R or N for the GHG emissions can only stem from that of F (cf. Proposition A.1 and Appendix A.4). In particular, the analysis of the authors of [11] on the effect of renewables and nuclear energy, as well as their rejection of the "climate mitigation" hypothesis, reflects nothing other than relation (1).
Moreover, the reasoning behind the "crowding out" hypothesis is flawed. Indeed, the authors of [11] motivate the proposal of the "crowding out" hypothesis as follows. Intermittent renewables require a decentralized electrical infrastructure as soon as they occupy a significant fraction of the electricity produced. By contrast, the optimal electrical infrastructure for non-intermittent power sources, such as fossil fuels, hydroelectricity and nuclear power, is centralized [14]. The authors then suggest that, for these reasons, there should be an anticorrelation between R and N, which is the statement of the so-called "crowding out" hypothesis. They back this statement by verifying that R and N are indeed anticorrelated and use this to justify their claims.
However, this explanation is inconsistent with the data studied, since most of the electricity production considered to be "renewable" from 1990 to 2015 was hydroelectricity, with intermittent power sources such as wind and solar contributing only negligible amounts according to the BP Statistical Review of World Energy (2019) [15]. Furthermore, given any three positive random variables satisfying relation (1), one can always find at least two pairs of variables that are negatively correlated (cf. Proposition A.1). It is thus of little to no surprise that R and N are negatively correlated, but this has nothing to do with the causal relation posited by the "crowding out" hypothesis of Sovacool et al. It is simply the consequence of the mathematical relation between the variables studied. This statement is backed by our in-depth analysis in the appendix of this paper.
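The claim that shares of a common whole are negatively correlated by construction can be checked numerically. The following is a minimal sketch on synthetic data, not the data set of [11]: the gamma-distributed raw outputs, the sample size and the helper name `pairwise_covariances` are assumptions made purely for illustration.

```python
import numpy as np


def pairwise_covariances(shares):
    """Return Cov(R, N), Cov(R, F), Cov(N, F) for an (n_samples, 3) array."""
    cov = np.cov(shares, rowvar=False)
    return cov[0, 1], cov[0, 2], cov[1, 2]


rng = np.random.default_rng(0)

# Hypothetical data: 500 synthetic "countries" whose raw outputs from
# renewables, nuclear and fossil sources are drawn independently, then
# normalised so that R + N + F = 1 for each country.
raw = rng.gamma(shape=[2.0, 0.3, 3.0], scale=1.0, size=(500, 3))
shares = raw / raw.sum(axis=1, keepdims=True)

cov_rn, cov_rf, cov_nf = pairwise_covariances(shares)

# Because Cov(X, Y) + Cov(X, Z) = -Var(X) < 0 for each of the three shares,
# at most one of the three pairwise covariances can be non-negative.
n_negative = sum(c < 0 for c in (cov_rn, cov_rf, cov_nf))
print(f"Cov(R,N)={cov_rn:.4f}  Cov(R,F)={cov_rf:.4f}  Cov(N,F)={cov_nf:.4f}")
print("negative pairs:", n_negative)
```

The raw outputs are independent here, so any negative correlation between the shares comes from the normalisation alone and not from a structural incompatibility between sources.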
2.2 Flaws in the statistical analysis and the rejection of the "climate mitigation" hypothesis
Sovacool et al. propose two timeframes (1990-2004 and 2000-2014) along which the data are split and averaged, and justify this by claiming this is "an optimal use of the data", because "renewable energy figures were only recorded since the nineties". However, the averaging procedure of the authors is not justified from a time series analysis (TSA) perspective, nor does it exploit the data in any sense of optimality (from a statistical standpoint). Furthermore, the authors should at the very least have shown that this averaging procedure does not affect the conclusions of the paper by demonstrating its stability, i.e. by checking whether a change in the time step and in the number of timeframes considered changes the conclusions of the regression analysis. However, this was never made explicit in the paper. In general, disregarding TSA considerations may lead to modifications in the results of any subsequent analysis, as many potential time series complications could arise, in particular non-stationarity [16][17][18]. Note also that such concerns could have been easily foreseen, as it is not surprising that the data studied are non-stationary, since many countries underwent rapid industrialization during the studied time period. This a priori arbitrary treatment of the data calls into question the integrity of the data set used for the subsequent analysis and, by extension, the entire analysis itself and its conclusions.
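A stability check of the kind described above could be sketched as follows. This is only an illustration under stated assumptions: the panel is synthetic, the windows are consecutive and non-overlapping (unlike the two overlapping timeframes of [11]), and the function name `window_slopes` is hypothetical.

```python
import numpy as np


def window_slopes(years, x_panel, y_panel, window_length):
    """Regress window-averaged y on window-averaged x for consecutive windows.

    x_panel and y_panel have shape (n_countries, n_years). One slope is
    returned per complete window of `window_length` years.
    """
    slopes = []
    last_start = years[-1] - window_length + 1
    for start in range(years[0], last_start + 1, window_length):
        mask = (years >= start) & (years < start + window_length)
        x_mean = x_panel[:, mask].mean(axis=1)
        y_mean = y_panel[:, mask].mean(axis=1)
        slope, _intercept = np.polyfit(x_mean, y_mean, 1)
        slopes.append(slope)
    return np.array(slopes)


rng = np.random.default_rng(6)
years = np.arange(1990, 2015)
n_countries = 40

# Hypothetical panel: a predictor in [0, 1] and a response with a common
# upward drift over time, which makes the series non-stationary.
x_panel = rng.uniform(0.0, 1.0, size=(n_countries, years.size))
drift = 0.05 * (years - years[0])
y_panel = 3.0 - 2.0 * x_panel + drift + 0.2 * rng.standard_normal(x_panel.shape)

slopes_5 = window_slopes(years, x_panel, y_panel, window_length=5)
slopes_10 = window_slopes(years, x_panel, y_panel, window_length=10)
print("5-year windows :", np.round(slopes_5, 2))
print("10-year windows:", np.round(slopes_10, 2))
```

If the fitted slopes vary strongly with the window length or position, the conclusions of a regression on the averaged data depend on the averaging choice and cannot be considered stable.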
However, even if the averaging procedure of the authors turns out to be stable, and assuming there are no TSA complications in the study of this data set, there are many inconsistencies and flaws in the subsequent statistical analysis performed in [11]. These will be treated in more detail in the appendix, but a non-exhaustive list includes:
• Given the nature of the conclusions and the context of Sovacool et al.'s study, the forward selection performed is inadequate (cf. Appendix A.4): a more appropriate approach would be to consider bidirectional selection, as it also excludes independent variables which do not play a significant role in the predictive power of the model [19]. In other words, the strength of the conclusions drawn by the authors and their policy recommendations cannot be backed by the analysis performed in the paper, since one cannot draw strong conclusions about the relative importance of the variables in the model using forward selection. If the objective was to draw these conclusions, bidirectional selection is more appropriate.
• The poor study of the data set before the start of the regression analysis (for instance, there was no check for heteroskedasticity), which inevitably led to a suboptimal model, i.e. one with too many variables (or inappropriate ones), some of which turn out not to be significant, without an increase in goodness of fit or predictive power (cf. Appendix A.4).
• The failure to take into account the concentration of the data set along the fraction-of-nuclear-power axis (most countries have no nuclear power, hence most of the data set lies exactly at zero with respect to this variable, which is a huge bias of the statistics regarding this variable), which biases the regressions performed (cf. Appendix A.3).
• The interpretation of regression coefficients as importance measures of the random variables used in the regression analysis (which are correlated) is not justified. While for independent random variables the regression coefficients may be relatively good metrics of importance, this is no longer the case as soon as the variables become correlated. Indeed, even after standardizing the correlated set of regression variables, one can obtain misleading results. The data set considered in [11] presents exactly this problem: we have already pointed out relations between the regression variables considered, which induce correlations between them. If a metric of importance is to be considered, there are appropriate statistical tools to treat the question of the importance of regressed variables in the multicollinear case. We refer the reader to [20][21][22][23] and to the appendix of this article (Appendix A.2) for a more in-depth description of these methods. The latter have been implemented in R as the packages relaimpo [21] and sensitivity [20]. An in-depth analysis of the data using these tools would be enlightening in future works in this direction.

Other elements previously highlighted by other authors
As previously noted, Sovacool et al.'s paper [11] has been discussed by different authors [12,13]. Let us briefly state some of the main arguments these authors have made regarding this matter. Fell et al. [12] noted that, among other points:
• Sovacool et al.'s paper [11] does not find a positive correlation between N and emissions, but instead finds a negative correlation, which is non-significant. However, due to the small sample size (30) of nuclear countries, it is not surprising that this turns out not to be significant;
• the crowding out hypothesis does not say anything about the ability of nuclear power to avoid emissions;
• the cross-sectional approach of Sovacool et al. with respect to the time frames is also criticized: Fell et al. note that cross-sectional analyses with low sample sizes are sensitive to outliers and sampling choices. Moreover, the choice to study a lagged effect without statistical motivation is criticized by the authors;
• the paper's analysis includes the complete set of countries with low GDP per capita. These countries have low emissions per capita and little to no nuclear power. This choice "appears to establish a weaker correlation between nuclear and low per capita emissions, but likely reveals only that many poorer nations have lower emissions per capita due to greater reliance on agriculture and informal economic activities".
Similarly, Wagner [13] reports, among other points, that:
• "no evidence was found that countries using nuclear power systematically employ more fossil fuels preferentially coal to the extent that the emission-free nature of nuclear energy is not only offset thereby but even overcompensated";
• N and R are anti-correlated, but when performing the study over the smaller set of European countries, statistical criteria do not attribute significance to the correlation. As previously noted, given the small sample of countries, this lack of significance is not in itself surprising;
• finally, "the regression of the total CO2 emissions with [the total amount of energy produced, the amount of renewable energy and the amount of nuclear energy (in absolute value)] as independent variables does not provide any new or deeper insight; specifically, it does not [suggest] inner relationships and correlations with cultural and sociological factors as searched for in [11]".

Conclusion
The analysis of Sovacool et al. does not back their concluding statements. As demonstrated in this paper and its appendix, the conclusions of their paper do not follow from the data or from a proper statistical treatment of it. In particular, the failure to recognize that the predictive power of their model came from the fraction of fossil fuels in the electrical mix, and to take into account the basic relation between the fractions of renewables and nuclear in the electrical mix, is fatal to their conclusions.
This means that, given any situation, one can expect to find negative correlation of fractions of the same whole more than two thirds of the time.

A.2. Some remarks on regression analysis
We will now demonstrate why the use of regression coefficients as an importance metric is misleading for regression variables which are correlated. To do this, let us consider a probability space $(\Omega, \mathcal{F}, P)$ and a random variable $Y \in L^2(\Omega)$ which we wish to regress using the set of random variables $\{X_1, \dots, X_n\} \subset L^2(\Omega)$, which we will assume to be linearly independent, but not necessarily independent (or orthogonal: if $L^2(\Omega)$ is a centered Gaussian space these concepts coincide). As previously noted, linear regression in this context is nothing other than the orthogonal projection of $Y$ onto the hyperplane spanned by the so-called explanatory variables $X_1, \dots, X_n$. We can write
\[ Y = \mathrm{Proj}_{\mathrm{Span}(X_1, \dots, X_n)}(Y) + \varepsilon, \]
where $\mathrm{Proj}_H(Y)$ denotes the projection of $Y$ onto the vector space $H$ and $\varepsilon$ is the component of $Y$ orthogonal to $\mathrm{Span}(X_1, \dots, X_n)$. We may express $\mathrm{Proj}_{\mathrm{Span}(X_1, \dots, X_n)}(Y)$ in the coordinates of the basis $X_1, \dots, X_n$. This yields the classical regression analysis equation
\[ Y = \sum_{k=1}^{n} \beta_k X_k + \varepsilon. \]
If the $X_k$ are orthogonal in $L^2(\Omega)$, the $\beta_k$ can be simply expressed in terms of the inner product of $Y$ with $X_k$,
\[ \beta_k = \frac{\langle Y, X_k \rangle}{\langle X_k, X_k \rangle}. \]
However, if the $X_k$ are not orthogonal (and therefore not independent), the coefficients can be found by orthogonalizing the basis $(X_k)_k$ to perform the projection and finally changing the result back to $(X_k)_k$ coordinates. With this said, as depicted in Figure A.1, it is easy to see geometrically why the regression coefficients $\beta_k$ stemming from correlated variables might be misleading.
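The geometric point made here can be reproduced numerically. The sketch below uses synthetic, zero-mean variables chosen for illustration only (the coefficients 10 and -9, the noise levels and the helper `ols_no_intercept` are assumptions): two almost collinear regressors yield joint coordinates far larger than the coefficient obtained when projecting onto either regressor alone.

```python
import numpy as np


def ols_no_intercept(columns, target):
    """Least-squares coordinates of `target` in the span of `columns`."""
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


rng = np.random.default_rng(1)
n = 2000

x1 = rng.standard_normal(n)
x2 = x1 + 0.05 * rng.standard_normal(n)  # almost collinear with x1
y = 10.0 * x1 - 9.0 * x2 + 0.1 * rng.standard_normal(n)

beta_joint = ols_no_intercept([x1, x2], y)    # coordinates in the (x1, x2) basis
beta_x1_alone = ols_no_intercept([x1], y)[0]  # projection onto x1 only
beta_x2_alone = ols_no_intercept([x2], y)[0]  # projection onto x2 only

print("joint coefficients:", np.round(beta_joint, 2))
print("x1 alone:", round(beta_x1_alone, 2), " x2 alone:", round(beta_x2_alone, 2))
```

Reading the joint coefficients as importances would suggest two strong and opposite effects, whereas each variable taken alone explains the response with a coefficient of order one.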
From Figure A.1 we see that any metric of "importance" considered for correlated variables should avoid using a coordinate-dependent framework. Instead, a coordinate-free approach should be considered. Such a coordinate-free coefficient of importance of variable $X_k$ could be a (weighted) average, over all subsets $A$ of the remaining variables, of the improvement $\Delta R^2_A$ in $R^2$ (or, up to a sign, the difference in unexplained variance) when including variable $X_k$ in a linear model spanned by the variables in $A$. Geometrically, going back to Figure A.1, this is similar (up to a sign and normalization) to taking the average over all such subsets $A$ of the difference between the distance from the hyperplane $\mathrm{Span}(A \cup \{X_k\})$ to $Y$ and the distance from the hyperplane $\mathrm{Span}(A)$ to $Y$. This measure of importance is not coordinate dependent, as it depends only on the vector spaces spanned by the different variables.
As a side note, we remark that this interpretation provides the link between the different formulae typically used for the so-called LMG. On one hand, notice there are exactly $\binom{n-1}{j}$ hyperplanes spanned by $j$ vectors chosen from a set of $n-1$ vectors. It follows that the average of $\Delta R^2_A$ over all possible linear models $A$ not including $X_k$ can be written as
\[ \mathrm{LMG}(X_k) = \frac{1}{n} \sum_{j=0}^{n-1} \binom{n-1}{j}^{-1} \sum_{A_j} \Delta R^2_{A_j}, \]
where the second sum is carried over the subsets $A_j \subset \{X_1, \dots, X_n\} \setminus \{X_k\}$ having cardinality $j$. This is the formula for LMG first discovered by Christensen [25]. On the other hand, it is also easy to see that this average can also be written down as
\[ \mathrm{LMG}(X_k) = \frac{1}{n!} \sum_{\sigma \in S_n} \Delta R^2_{A_\sigma(k)}, \tag{A.12} \]
where the sum runs over the orderings $\sigma$ of the $n$ variables and $A_\sigma(k)$ denotes the set of variables preceding $X_k$ in the ordering $\sigma$, yielding the classical result of the equivalence between all the formulae from the literature [21]. This is exactly the approach taken in [20][21][22][23], which we suggest should be used to further verify the validity of the conclusions of [11]. These methods should be understood as weighting the contribution of $X_2$ in reducing the residual variance of the linear models spanned by the $X_k$. Going back to Figure A.1, it is clear that including $X_2$ yields non-trivial information, as it considerably increases the variance explained by the linear model, but that this importance is not less than that of $X_1$. This is not as easily seen when considering the $(X_1, X_2)$ coordinates of $\mathrm{Proj}_H(Y)$, but is clear when looking at the geometric layout of the vectors $Y$, $X_1$ and $X_2$. As previously stated, these methods have been implemented in R as relaimpo and sensitivity. Preliminary results suggest that this metric of importance for correlated variables yields different results than that of [11]. It would also be of similar interest to attempt to reproduce the results of Fell et al. [12] and Wagner [13] using this metric.
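The subset-averaging form of LMG described above can be sketched directly in a few lines. This is an illustrative brute-force version on synthetic data, not the relaimpo implementation; the function names `r_squared` and `lmg` are hypothetical, and an intercept is included in every model, which is an assumption on our part.

```python
import itertools
import math

import numpy as np


def r_squared(X, y, cols):
    """R^2 of the least-squares fit of y on an intercept plus columns `cols`."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    total = y - y.mean()
    return 1.0 - (resid @ resid) / (total @ total)


def lmg(X, y):
    """Average gain in R^2 from adding each variable, over all sub-models."""
    n = X.shape[1]
    scores = np.zeros(n)
    for k in range(n):
        others = [c for c in range(n) if c != k]
        for j in range(n):  # j = size of the sub-model A_j, from 0 to n - 1
            weight = 1.0 / (n * math.comb(n - 1, j))
            for subset in itertools.combinations(others, j):
                gain = r_squared(X, y, subset + (k,)) - r_squared(X, y, subset)
                scores[k] += weight * gain
    return scores


rng = np.random.default_rng(2)
n_obs = 400
z = rng.standard_normal(n_obs)
# Hypothetical predictors: the first two are strongly correlated through z.
X = np.column_stack([
    z + 0.3 * rng.standard_normal(n_obs),
    z + 0.3 * rng.standard_normal(n_obs),
    rng.standard_normal(n_obs),
])
y = X[:, 0] + 0.5 * X[:, 2] + rng.standard_normal(n_obs)

scores = lmg(X, y)
full_r2 = r_squared(X, y, (0, 1, 2))
print("LMG shares:", np.round(scores, 3), " sum:", round(scores.sum(), 3))
print("full-model R^2:", round(full_r2, 3))
```

The shares are non-negative and add up to the R^2 of the full model, which is what makes them usable as a decomposition of explained variance. The cost grows exponentially with the number of variables, so this brute-force form is only practical for a handful of predictors.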

A.3. Covariances and correlations of N, R and F
If we now let N, R and F denote the fractions of nuclear, renewable and other sources in the electrical production, Proposition A.1 applies. Taking a look at the data from the study, Figures A.2-A.4 show the distributions of each of these variables (the fractions are along the x axis and the y axis is the count of the histogram). Note that F is dominated by fossil fuel contributions. The covariances of these variables can be found in Table A.1 for Timeframe 1 and in Table A.2 for Timeframe 2.
Upon examining the distribution of F, one sees that it is approximately uniform, which should set its variance to be close to 1/12. By contrast, nuclear power tends to play a small role in the electrical mix of most countries, which tells us that Var(N) should be negligible with respect to Var(R) and Var(F), as the observed values of N concentrate around 0. In particular, this immediately implies that Cov(R, F) is relatively large in absolute value and negative (independent of interpretation). This is of capital importance when we examine stepwise selection models, which will yield significance for R, but which we will see actually stems from the greater predictive power of variable F for GHG emissions per capita. In particular, the conclusions of Sovacool et al. about the efficacy of renewables for decarbonization do not follow from any statistical analysis, as this covariance is only large and negative because Var(N) is negligible.
Finally, we must compare Var(F) with Var(R). Here, Var(R) > Var(F), and so we have the negative correlation between N and R mentioned in the paper. Looking at the distributions of R and F, one finds that this is due to the fact that most countries seem to either focus on renewables (mainly hydroelectric power in the timeframes considered) or not have any at all, whereas the distribution of other (fossil) sources is more or less uniform. This negative covariance between N and R is thus explained solely by the latter and the mathematical relation linking the three variables. The negative nature of the correlations can further be emphasized by examining the rank correlation matrix (here, the Spearman ρ coefficient) between each of the variables. Unsurprisingly, we find that the variables all have negative rank correlation. The exact values of the Spearman ρ coefficients between the variables for time frames 1 and 2 are tabulated in Tables A.3 and A.4, respectively.
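The statements on the signs of the covariances follow from the constraint N + R + F = 1 alone. As a short worked derivation (written here for convenience, using only that constraint), substituting F = 1 - N - R and expanding the variance gives the three identities below.

```latex
\begin{align*}
\operatorname{Var}(F) &= \operatorname{Var}(N + R)
  = \operatorname{Var}(N) + \operatorname{Var}(R) + 2\operatorname{Cov}(N, R), \\
\operatorname{Cov}(N, R) &= \tfrac{1}{2}\bigl[\operatorname{Var}(F) - \operatorname{Var}(N) - \operatorname{Var}(R)\bigr], \\
\operatorname{Cov}(R, F) &= \tfrac{1}{2}\bigl[\operatorname{Var}(N) - \operatorname{Var}(R) - \operatorname{Var}(F)\bigr], \\
\operatorname{Cov}(N, F) &= \tfrac{1}{2}\bigl[\operatorname{Var}(R) - \operatorname{Var}(N) - \operatorname{Var}(F)\bigr].
\end{align*}
```

With Var(N) negligible, the second-to-last identity makes Cov(R, F) close to -(Var(R) + Var(F))/2, which is large and negative, and Var(R) > Var(F) makes Cov(N, R) negative whatever the value of Var(N).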

A.4. Stepwise selection
"Hierarchical regression" is more commonly known as stepwise selection in statistics. Stepwise selection can be done in two different directions: forwards or backwards. In forward stepwise selection, one starts with the null model and progressively adds variables while evaluating the significance of each addition, so that, at step n, if variable X_n does not yield a significant improvement in the predictions of the model, this variable is discarded. In backward stepwise selection, the opposite is done. That is, we start with a family of variables and take out variables by examining which removal gives the most statistically insignificant deterioration of the model fit. Finally, one can do both steps simultaneously, that is, go backwards and forwards to provide an extra check that the choice of variables is optimal. Beyond this choice of approach, trying to maximize predictive power via improvement of the goodness of fit (R²) while intending to study causation is wrong. Relying on R² alone can be misleading for two main reasons:
• R² increases monotonically with the number of parameters added to the model.
• The data span multiple orders of magnitude. This causes small relative variations of the points at large scales to have a considerable effect on the significance of the increase in R², despite there being no real meaning behind this significance.
Appropriate statistical tools should have been used, such as the adjusted R² of the fit, which takes into account the number of parameters in the model. The second point is more delicate to address, so we will do it stepwise, by attempting to reproduce and correct each of the steps taken in [11]. The data set studied will be that of timeframe 1.
A.4.1. GDP and GHG emissions per capita
Plotting the GDP per capita and the CO2eq emissions per capita (henceforth denoted GDP and CO2, respectively, for simplicity) for the countries considered in [11] yields the results in Figures A.5 and A.6. Following [11], we first consider the linear model
CO2 = β0 + β1 GDP.
However, after performing a regression analysis in this model, we notice that β0 is not significant (although we do retrieve their result of an R² of 0.48 for this model).
Applying the principles of bidirectional selection, we exclude β0 and examine instead:
CO2 = β1 GDP. (A.14)
For timeframe 1, the parameter estimates given by the regression can be found in Table A.5, and the adjusted R² is 0.64, a result which already rivals the (non-adjusted) R² they obtain at the end of their forward selection (0.66).
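The difference between R² and adjusted R², and the fact that R² cannot decrease when predictors are added, can be illustrated as follows. This is a sketch on synthetic data: the single informative predictor, the ten pure-noise predictors and the function name `r2_and_adjusted` are assumptions made for illustration.

```python
import numpy as np


def r2_and_adjusted(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit of y on an intercept and X."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    total = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (total @ total)
    adjusted = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, adjusted


rng = np.random.default_rng(3)
n = 60
signal = rng.standard_normal((n, 1))   # the only informative predictor
noise = rng.standard_normal((n, 10))   # predictors unrelated to y
y = 2.0 * signal[:, 0] + rng.standard_normal(n)

r2_values, adjusted_values = [], []
for extra in range(11):
    # Model with the informative predictor plus `extra` noise predictors.
    X = np.column_stack([signal, noise[:, :extra]])
    r2, adjusted = r2_and_adjusted(X, y)
    r2_values.append(r2)
    adjusted_values.append(adjusted)
    print(f"{extra:2d} noise predictors: R2={r2:.3f}  adjusted R2={adjusted:.3f}")
```

R² creeps upwards with every added noise variable, while the adjusted value is penalized for the extra parameters, which is why the latter is the more appropriate figure when comparing models of different sizes.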
Remark A.1. The reported P-values are grossly underestimated, since the underlying distribution of the resid-
Heteroskedastic data spanning many orders of magnitude are often a sign of an underlying Pareto distribution (or power law). This hypothesis can be checked by looking at the data on a log-log plot (cf. Figs. A.7 and A.8). By inspection, we can see that this hypothesis seems to be confirmed.
The simplest model we can postulate is given by
log CO2 = β0 + β1 log GDP. (A.15)
This simple model has an adjusted R² of roughly 0.69 for timeframe 1. The full regression results of this model can be found in Table A.6. This shows that poor a priori inspection of the data on the part of the authors of [11] ultimately led to a suboptimal model. In particular, we notice that this adjusted R² is already higher than any of the R² values obtained by the authors at the end of their forward selection (0.66), despite being penalized for the number of variables in the model and only having two parameters.
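A fit of the form (A.15) amounts to an ordinary linear regression on log-transformed variables. The sketch below shows the mechanics on synthetic power-law data; the exponent 0.9, the prefactor and the multiplicative noise are invented for illustration and are not estimates from the data of [11].

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data: GDP per capita spanning several orders of magnitude and
# emissions following a power law with multiplicative (log-normal) noise.
gdp = 10.0 ** rng.uniform(2.5, 5.0, size=150)
co2 = 0.002 * gdp ** 0.9 * np.exp(0.4 * rng.standard_normal(150))

# Fit log CO2 = beta0 + beta1 * log GDP by ordinary least squares.
log_gdp, log_co2 = np.log(gdp), np.log(co2)
beta1, beta0 = np.polyfit(log_gdp, log_co2, 1)

# Goodness of fit on the log scale, with the usual adjustment for one predictor.
resid = log_co2 - (beta0 + beta1 * log_gdp)
total = log_co2 - log_co2.mean()
r2 = 1.0 - (resid @ resid) / (total @ total)
adjusted_r2 = 1.0 - (1.0 - r2) * (len(gdp) - 1) / (len(gdp) - 2)

print(f"beta0={beta0:.3f}  beta1={beta1:.3f}  adjusted R2={adjusted_r2:.3f}")
```

On the log scale the multiplicative noise becomes additive with constant variance, which is the practical reason the transformation helps with heteroskedastic data spanning many orders of magnitude.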
Since the goal of this paper is to attempt to reproduce the results of Sovacool et al., we will keep the model of equation (A.14) in what follows, although going forward one should consider accounting for the confounding variable with a power law and not just a linear model.
From Figures A.7 and A.8, it seems clear that a saturation phenomenon occurs and that we enter a different regime as GDP becomes larger. We could think here of postulating a more complicated non-linear regression model to account for this phenomenon (for instance by imposing a quadratic regression, or a non-parametric smoother model), taking into account that more complex importance measures should then be applied [20]. This is, however, beyond the scope of this paper.

A.4.2. Nuclear, renewables, GDP and CO2eq emissions
We discard the N variable after performing bidirectional selection, as the variable does not prove to be significant or to provide considerable improvement to the adjusted R². Other than the obvious reason that nuclear power emits little to no GHGs, there are other explanations of why this is not a significant explanatory variable in our model. The addition of N as a variable only affects 30 of the data points, many of which lie close to 0% nuclear energy, which does not add much information to the model (around half of them are below the 20% mark). On timeframe 2, one can speculate that there are two trendlines: one before the 30% mark, which is increasing, and the other afterwards, which decreases. Of course, this may purely be an artefact of the data given the low sampling. Furthermore, β1 is not deemed significantly different from 0, meaning that R plays no role in predicting GHG emissions per capita. This is, of course, obvious from the fact that renewable energy emits little to no GHGs.
By contrast, F mostly carries information about the fraction of fossil fuels in the electrical mix, since other sources of energy are negligible once we have excluded fossil fuels, renewables and nuclear power. It follows that a more reasonable model is simply
CO2 = β0 + β1 GDP + β2 F. (A.17)
As before, parameter β0 was found to be non-significant. Following the principles of bidirectional selection, we exclude it and consider model (A.18).
Finally, let us show that the predictive power of R in Sovacool et al.'s suboptimal model came from F. To do this, we compare their model (A.19) to an (also suboptimal) model (A.20). These models have respective regression tables given in Tables A.9 and A.10.
There are a couple of things to note. First, the suboptimality of Model A.20 is reflected by the fact that β0 is evidently not significant. More importantly, β2 is almost exactly the same in absolute value as it was in the previous model. This, in conjunction with the large anti-correlation between F and R, allows us to conclude that the predictive power of R in Model A.19 was in fact inherited from that of F. Of course, there is a tautological causal link behind this correlation, given that F mostly consists of the fraction of fossil fuels in the electrical mix.
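The inheritance mechanism described above can be reproduced on synthetic data in which emissions depend on F only. In this sketch, all numbers (sample size, share of nuclear countries, the coefficients 0.5 and 4.0) and the helper name `fit` are assumptions for illustration; the two fitted models use GDP plus R, and GDP plus F, each with an intercept.

```python
import numpy as np


def fit(columns, target):
    """OLS coefficients (intercept first) of `target` on the given columns."""
    design = np.column_stack([np.ones(len(target))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


rng = np.random.default_rng(5)
n = 300

# Hypothetical mix: F roughly uniform, N equal to zero for most countries,
# and R fixed by the constraint R + N + F = 1.
f = rng.uniform(0.0, 1.0, n)
has_nuclear = rng.random(n) < 0.2
nuc = has_nuclear * rng.uniform(0.0, 0.5, n) * (1.0 - f)
ren = 1.0 - f - nuc

# Emissions are generated from GDP and F only: R has no effect of its own.
gdp = rng.uniform(1.0, 10.0, n)
co2 = 0.5 * gdp + 4.0 * f + rng.standard_normal(n)

beta_with_r = fit([gdp, ren], co2)  # intercept, GDP, R
beta_with_f = fit([gdp, f], co2)    # intercept, GDP, F

print("coefficient on R:", round(beta_with_r[2], 2))
print("coefficient on F:", round(beta_with_f[2], 2))
```

Although R plays no role in generating the emissions, it receives a coefficient of nearly the same magnitude as that of F with the opposite sign, simply because R is close to 1 - F when N is concentrated at zero.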

Fig. A.1. Two correlated random variables X1 and X2 spanning plane H in L²(Ω). The projection of Y onto plane H has strictly larger R² than that of the projection of Y onto either X1 or X2 alone. However, when expressed in (X1, X2) coordinates, the expression of Proj_H(Y) is misleading, since the coordinates of this projection are large due to the fact that variables X1 and X2 are almost collinear (in fact, they are larger than those of the projection of Y onto X1 or X2 individually).

Fig. A.2. Number of countries as a function of N.

Fig. A.3. Number of countries as a function of R.

Fig. A.4. Number of countries as a function of F.

Fig. A.5. CO2 as a function of GDP in time frame 1 for all countries.

Fig. A.6.

Fig. A.7. CO2 as a function of log(GDP) in time frame 1 for all countries.

Fig. A.8. CO2 as a function of log(GDP) in time frame 2 for all countries.

Fig. A.9. CO2 as a function of N for the nuclear countries in time frame 1.

Fig. A.10. CO2 as a function of N for the nuclear countries in time frame 2.
Wagner [13] attempted to reproduce the results of Sovacool et al. on different data sets, namely a compilation of datasets of 26 European countries (we refer the reader to ([13], Appendix B) for the detailed sources and reliability controls performed on these databases) and two world-wide databases extracted from the IEA data bank [24]. In his paper, Wagner points out that, by redoing the study on different publicly available data sets, one fails to reproduce the results of Sovacool et al. Most notably, the fraction of nuclear power in the energy mix correlates negatively with GHG emissions and "the analysis of both European and more global data shows that both renewable and nuclear technologies allow a reduction of CO2 emissions with comparable efficacy".

The covariances of N, R and F are given in Table A.1 for Timeframe 1 and in Table A.2 for Timeframe 2.

Table A.2. Covariances between different variables of timeframe 2 for renewable countries (nuclear countries included).

Table A.4. Spearman ρ between different variables of timeframe 2 for nuclear countries.

Table A.5. Regression results for model A.14.

Table A.6. Regression results for model A.15.

Table A.7. Regression results for model A.16.

Table A.8. Regression results for model A.18.

Table A.9. Regression results for model A.19.

Table A.10. Regression results for model A.20.

We can look at the regression analysis of model A.16, whose details are given in Table A.7. The adjusted R² value for this iteration of the model is 0.64, which does not improve on the previous model.

Excluding β0 from model A.17, we instead consider
CO2 = β1 GDP + β2 F, (A.18)
whose regression table can be found in Table A.8. Both variables are significant predictors, the adjusted R² of this model is 0.80, and the standard error of the predictors decreased.