Issue 
EPJ Nuclear Sci. Technol.
Volume 4, 2018
Special Issue on 4th International Workshop on Nuclear Data Covariances, October 2–6, 2017, Aix en Provence, France – CW2017



Article Number  36  
Number of page(s)  8  
Section  Covariance Evaluation Methodology  
DOI  https://doi.org/10.1051/epjn/2018042  
Published online  14 November 2018 
https://doi.org/10.1051/epjn/2018042
Regular Article
On the use of the BMC to resolve Bayesian inference with nuisance parameters
CEA, DEN, Cadarache,
13108
Saint Paul les Durance, France
^{*} email: edwin.privas@gmail.com
Received:
31
October
2017
Received in final form:
23
January
2018
Accepted:
7
June
2018
Published online: 14 November 2018
Nuclear data are widely used in many research fields. In particular, neutroninduced reaction cross sections play a major role in safety and criticality assessment of nuclear technology for existing power reactors and future nuclear systems as in Generation IV. Because both stochastic and deterministic codes are becoming very efficient and accurate with limited bias, nuclear data remain the main uncertainty sources. A worldwide effort is done to make improvement on nuclear data knowledge thanks to new experiments and new adjustment methods in the evaluation processes. This paper gives an overview of the evaluation processes used for nuclear data at CEA. After giving Bayesian inference and associated methods used in the CONRAD code [P. Archier et al., Nucl. Data Sheets 118, 488 (2014)], a focus on systematic uncertainties will be given. This last can be deal by using marginalization methods during the analysis of differential measurements as well as integral experiments. They have to be taken into account properly in order to give wellestimated uncertainties on adjusted model parameters or multigroup cross sections. In order to give a reference method, a new stochastic approach is presented, enabling marginalization of nuisance parameters (background, normalization...). It can be seen as a validation tool, but also as a general framework that can be used with any given distribution. An analytic example based on a fictitious experiment is presented to show the good adequations between the stochastic and deterministic methods. Advantages of such stochastic method are meanwhile moderated by the time required, limiting it's application for large evaluation cases. Faster calculation can be foreseen with nuclear model implemented in the CONRAD code or using bias technique. The paper ends with perspectives about new problematic and time optimization.
© E. Privas et al., published by EDP Sciences, 2018
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1 Introduction
Nuclear data continue to play a key role, as well as numerical methods and the associated calculation schemes, in reactor design, fuel cycle management and safety calculations. Due to the intensive use of MonteCarlo tools in order to reduce numerical biases, the final accuracy of neutronic calculations depends increasingly on the quality of nuclear data used. The knowledge of neutron induced cross section in the 0 eV and 20 MeV energy range is traduced by the uncertainty levels. This paper focuses on the neutron induced cross sections uncertainties evaluation. The latter is evaluated by using experimental data − either microscopic or integral, and associated uncertainties. It is very common to take into account the statistical part of the uncertainty using the Bayesian inference. However, systematic uncertainties are not often taken into account either because of the lack of information from the experiment or the lack of description by the evaluators.
A first part presents the ingredients needed in the evaluation of nuclear data: theoretical models, microscopic and integral measurements. A second part is devoted to the presentation of a general mathematical framework related to Bayesian parameters estimations. Two approaches are then studied: a deterministic and analytic resolution of the Bayesian inference and a method using MonteCarlo sampling. The next part deals with systematic uncertainties. More precisely, a new method has been developed to solve the Bayesian inference using only MonteCarlo integration. A final part gives a fictitious example on the ^{235}U total cross section and comparison between the different methods.
2 Nuclear data evaluation
2.1 Bayesian inference
Let denote some experimentally measured variables, and let x denote the parameters defining the model used to simulate theoretically these variables and t is the associated calculated values to be compared with y. Using Bayes' theorem [1] and especially its generalization to continuous variables, one can obtain the well known relation between conditional probability density functions (written p(.)) when the analysis of a new dataset y is performed: (1) where U represents the “background” or “prior” information from which the prior knowledge of x is assumed. U is supposed independent of y. In this framework, the denominator is just a normalization constant.
The formal rule [2] used to take into account information coming from new analyzed experiments is: (2)
The idea behind fitting procedures is to find an estimation of at least the first two moments of the posterior density probability of a set of parameters x, knowing an a priori information (or first guess) and a likelihood which gives the probability density function of observing a data set knowing x.
Algorithms described in this section are summarized and detailed description can be found in paper linked to the CONRAD code in which they are implemented [3]. A general overview of the evaluation process and the conrad code is given in Figures 1 and 2.
Fig. 1 General overview of the evaluation process. 
Fig. 2 General overview of the CONRAD code. 
2.2 Deterministic theory
To obtain an equation to be solved, one has to make some assumptions on the prior probability distribution involved.
Given a covariance matrix and mean values, the choice of a multivariate joint normal for the probability density and for the likelihood is maximizing the entropy [4]. Adding Bayes' theorem, equation (1) can be written as follow: (3) where x_{m} (expectation) and M_{x} (covariance matrix) are prior information on x, y an experimental set and M_{y} the associated covariance matrix. t represents the theoretical model predictions.
The Laplace approximation is also made. It enables to approximate the posterior distribution by a multivariate normal distribution with the same maximum and curvature of the right side of equation (3). Then, it can be demonstrated that both the posterior expectation and the covariances can be calculated by finding the minimum of the following cost function (Generalized Least Square): (4)
To take into account nonlinear effects and ensure a proper convergence of the algorithm, a Gauss–Newton iterative solution can be used [5].
Thus, from a mathematical point of view, the evaluation of parameters through a GLS procedure suffers from the Gaussian choice as guessed distribution for the prior and the likelihood, the use of Laplace approximation, the linearization around the prior for GaussNewton algorithm and at last a 2nd order terms neglected in the Gauss–Newton iterative procedure.
2.3 Bayesian MonteCarlo
Bayesian MonteCarlo (BMC) methods are natural solutions for Bayesian inference problems. They avoid approximations and propose alternatives in probability density distribution choice for priors and likelihoods. This paper exposes the use of BMC in the whole energy range from thermal, resonance to continuum range.
BMC can be seen as a reference calculation tool to validate the GLS calculations and approximations. In addition, it allows to test probability density distributions effects and to find higher distribution moments with no approximations.
2.3.1 Classical Bayesian MonteCarlo
The main idea of “classical” BMC is the use of MonteCarlo techniques to calculate integrals. For any function f of random variable set x, any integral can be calculated by a MonteCarlo sampling: (5) where p(x) is a probability density function. One can thus estimate any moments of the probability function p(x). By definition, the mean value is given by: (6)
Application of these simple features to Bayesian inference analysis gives for posterior distribution's expectation values: (7)
The proposed algorithm to find out the first two moments of the posterior distribution is to sample the prior probability distribution function times and for each x_{k} realization evaluate the likelihood: (8)
Finally, the posterior expectation and covariance is given by the following equation: (9) (10)
The choice of the prior distribution depends on what kind of analysis is done. If no prior information is given, a noninformative prior could be chosen (uniform distribution). On the contrary, for an update of a given parameters set, the prior is related to a previous analysis with a known probability distribution function.
The use of weights indicates clearly the major drawbacks of this classical BMC method: if the prior is far from high likelihood values and/or by nature values are small (because of the number of experimental points for example), then the algorithm will have difficulties to converge.
Thus, the main issue is that the covered phase space of sampling is not favorable to convergence. In practice a trial function close to the posterior distribution should be chosen for sampling.
More details can be found in paper [6].
2.3.2 Importance sampling
As previously exposed, the estimation of the integral of f(x) times a probability density function p(x) is not straightforward. Especially in this Bayesian inference case, sampling the prior distribution, when it is far away from the posterior distribution or when the likelihood weights are difficult to evaluate properly, could be very expensive and time consuming without any valuable estimation of posterior distributions. The idea is then to sample in a different phase space region by respecting statistics.
This trial probability density function p_{trial}(x) can be introduced by a trick as follow: (11)
Thus, putting this expression in equation (5), one obtains the following equation: (12)
Then, sampling is done on the trial probability density function p_{trial}(x) getting a new set of {x_{k}}. For each realization x_{k}, an evaluation of additional terms is necessary.
As a result, expectation and covariances are defined by: (13) and (14)with .
The choice of trial functions is crucial and the closer to the true solution p_{trial}(x) is, the quicker the algorithm will be. In this paper, a trial function used by default comes from the result of the generalized least square (with additional standard deviation enhancement). Many other solutions can be used depending on the problem.
3 Systematic uncertainties treatment
3.1 Theory
Let recall some definitions and principles. First, it is possible to link model parameters x and nuisance parameters θ with conditional probability: (15) with U the prior information on both model and nuisance parameters. The latter is supposed independent to the measurement. It means nuisance parameters are considered acting on experimental model. It induces giving the following equation: (16)
Moreover, evaluators are looking at the marginal probability density also written . It is given by the integration of the probability density function over marginal variables as follow: (17)
According to (16), the follow equation is obtained: (18) Then the Bayes theorem is used to calculate the first integral term of (18): (19)This expression is right if both model and nuisance parameters are supposed independent. According to (18) and (19), the marginal probability density function of the a posteriori model parameters is given by: (20)
3.2 Deterministic resolution
The deterministic resolution is well described in Habert thesis [7]. Several works have been first performed in 1972 by H. Mitani and H. Kuroi [8,9] and later by Gandini [10] giving a formalism to the multigroup adjustment and a way to take into account the systematic uncertainties. These were the first attempts to consider the possible presence of systematic errors in a data adjustment process. Equations are not detailed here, only the idea and the final equation is provided.
Let be the a posteriori covariance matrix obtained after an adjustment. The a posteriori covariance after marginalization can be found as follow: (21) with G_{x} sensitivities vector of the calculated model values to the model parameters and G_{θ} sensitivities vector of the calculated model values to the nuisance parameters. Similar expressions have been given in reference [8,9] where two terms appear: one for classical resolution and the second for some added systematic uncertainties. is a square matrix supposed reversal. It is often the case when there are more experimental points than fitted parameters. If numeric issues appeared, it is mandatory to find another way, giving by a stochastic approach. Further study should be undertaken to compare the deterministic method proposed here and the one identified in Mitani's papers in order to provide a more robust approach.
3.3 Semistochastic resolution
This method (written MC_Margi) is easy to understand starting from the equation (20): nuisance parameters are sampled according to a Gaussian distribution and for each history, a deterministic resolution is done (GLS). At the end of every simulation, parameters and covariances are stored. When all the histories have been simulated, the covariance total theorem gives the final model parameters covariance. The methods is not developed here but more detailed can be found in papers [7,11].
3.4 BMC with systematic treatment
BMC method can deal with marginal parameters without deterministic approach. This work has been successfully implemented in the CONRAD code. One wants to find the posterior marginal probability function defined in equation (20). It is similar to the case with no nuisance parameters but with two integrals. Same weighting principle can be applied by replacing the likelihood term by a new weight defined by: (22)
The very close similarities between the case with no marginal parameter enabled a quick implementation and understanding. Finally, the previous equation gives: (23)
Both integrals can be solved using a MonteCarlo approach. The integral calculation of the equation (22) is done as follow: (24)
The n_{θ} histories are calculated by sampling according to . The denominator's integral of equation (24) is then computed: (25)
The n_{x} histories are evaluated by sampling according to . Mixing equations (24) and (25), the new weight is given by: (26)
Let us consider N_{x} is the number of history sampled according the prior parameters and N_{θ} is the number of history sampled according the marginal parameters. The larger those number are, the more converge the results will be. The previous equation can be numerically written with no limits as follow: (27)
In order to simplify the algorithm, N_{θ} and N_{x} are considered equal (introducing N as N =N_{θ} = N_{x}). First, N samples are drawn according to θ_{k} and x_{k}. Equation (26) can be simplified as follow: (28)
In CONRAD, the N*N values of the likelihood are stored (i.e. . Those values are required to perform statistical analysis at the end of the simulation. The weight for an history k is then calculated: (29)
To get the posterior mean values and the posterior correlation, one should apply the statistical definition and get the two next equations: (30) (31) with defined as the weighting mean of the two parameters product: (32)
4 Illustrative analysis on ^{238}U total cross section
4.1 Study case
The selected study case is just an illustrative example giving a very first step towards the validation of the method, its applicability and potential limitations. The ^{238}U total cross section is chosen and fitted on the unresolved resonance range, between 25 and 175 keV. The theoretical cross section is calculated using the R mean matrix model. The main sensitives parameters in this energy range is the two first strength functions S_{l}_{ = 0} and S_{l}_{ = 1}. Tables 1 and 2 give respectively the spin configurations and the prior parameters governing the total cross section. An initial relative uncertainties of 15% is taken into account with no correlations. The experimental dataset used comes from the EXFOR database [12]. A 1% arbitrary statistical uncertainty is chosen.
^{238}U spin configuration considered for resonance waves s and p.
Initial URR parameters with no correlation.
4.2 Classical resolution with no marginal parameters
All the methods have been used to perform the adjustment. One million histories haven been simulated in order to get the statistical values converged below 0.1% for the mean values and 0.4% for the posterior correlation. The convergence is driven by how far the solution is from the prior values. Table 3 shows the overall good results coherence. Very small discrepancies can be seen for the classical methods caused by the convergence issues. For a similar number of history, the importance methods converged better than the classical BMC. The prior covariance on the total cross section is given on Figure 3. The anticorrelation created between S_{0} and S_{1} directly give correlations between the low energy and the high energy (see Fig. 4). The prior and posterior distribution engaged in the BMC methods are given in Figure 5. One can notice the Gaussian distributions for all the parameters (both prior and posterior).
Results comparison between the different methods implemented in CONRAD for ^{238}U total cross section.
Fig. 3 A priori covariance matrix on ^{238}U total cross section. 
Fig. 4 A posteriori covariance matrix on ^{238}U total cross section. 
Fig. 5 S_{0} and S_{1} distributions obtained with the classical BMC method. 
4.3 Using marginalization methods
This case is very closed to the previous paragraph. But this time, a nuisance parameter is taken into account. More precisely, a normalization is considered with a systematic uncertainties of 1%. 10 000 histories are sampled for the MC_Margi case (semistochastic resolution) and 10 000 × 10 000 for BMC methods. For the importance resolution, the biasing function is the posterior solution coming from the deterministic resolution. All the results seem to be equivalent, as shown in Table 4. However, the classical BMC is not fully converged because slight differences are found between the means values. Figures 6 and 7 show the mean values convergence using the stochastic resolutions, showing one more time not converged results with the classical BMC method. Calculation time are longer with marginal parameters. This is explained by the method which the idea is to perform a double MonteCarlo integration. The good coherence on the mean values and correlation between parameters give identical posterior correlation on the total cross section. Figure 8 shows the a posteriori covariance, whatever methods chosen.
Results comparison when a normalisation marginal parameter is taken into account. AN_Margi represents the deterministic method, MC_Margi represents the semistochastic resolution, BMC is the classical method and the last column called Importance is the BMC where an importance function is used for the sampling.
Fig. 6 S_{0} mean value convergence. 
Fig. 7 S_{1} mean value convergence. 
Fig. 8 A posteriori covariance of the ^{238}U total cross section. 
5 Conclusions
The use of BMC methods was exposed in this paper and an illustrative analysis was detailed. One important point is that these methods could be used for resonance range analysis (both resolved and unresolved resonance) as well as higher energy models. In addition, both microscopic and integral data assimilation could be achieved. Nevertheless, the major issue is related to the convergence estimator: depending on which parameters are investigated (central values or correlation between them), the number of histories (sampling) could be very different. Indeed, special care should be taken for the correlations calculation because an additional order of magnitude of sampling histories could be necessary. Furthermore, it was shown that sampling priors are not a problem. It is more efficient to properly sample in the phase space region where the likelihood is large. In this aspect, Importance and Metropolis algorithms are working better than brute force (“classical”) MonteCarlo. It also highlights the fact that presampling prior with a limited number of realizations could be inadequate for further inference analysis. The integral data assimilation with feedback directly on the model parameter is too much time consuming. However, a simplified model can be adopted by using a simple model with a linear approach for instance (to predict integral parameters response to input parameters). Such approximation would be less consuming but will erase nonlinearity effect that may be observed in the posterior distribution. Such study should be performed with extensive cases to improve the MonteCarlo methods.
Concerning BMC inference methods, in the future, other Markov chain algorithms will be developed in CONRAD code and efficient convergence estimators will be proposed as well. The choice of Gaussian probability functions for both prior and likelihood will be challenged and discussed.
More generally, an open range of scientific activities will be investigated. In particular, one major issue is related to a change in paradigm: to go beyond covariance matrices and deal with parameters knowledge taking into account full probability density distributions. In addition, for endup users, it will be necessary to investigate the feasibility of a full MonteCarlo approach, from nuclear reaction models to nuclear reactors or integral experiments (or any other applications) without format/files/processing issues which are most of the time bottlenecks.
The use of MonteCarlo could solve a generic issue in nuclear data evaluation related to difference of information given in evaluated files: in the resonance range where cross section uncertainties and/or nuclear model parameters uncertainties are compiled and in the higher energy range where only cross section uncertainties are formatted. This could simplify the evaluation of full covariance over the whole energy range.
Author contribution statement
The main author (E. Privas) has done the development and the analysis of the new stochastic method. The other authors participated in the verification process and in the overall development of CONRAD.
References
 T. Bayes, Philos. Trans. R. Soc. London 53, 370 (1763) [NASA ADS] [CrossRef] [Google Scholar]
 F. Frohner, JEFF Report 18, 2000 [Google Scholar]
 P. Archier, C. De Saint Jean, O. Litaize, G. Noguère, L. Berge, E. Privas, P. Tamagno, Nucl. Data Sheets 118, 488 (2014) [CrossRef] [Google Scholar]
 T. Cover, J. Thomas, Elements of information theory (WileyInterscience, New York, 2006) [Google Scholar]
 R. Fletcher, Practical methods of optimization (John Wiley & Sons, New York, 1987) [Google Scholar]
 P.A.C. De Saint Jean, E. Privas, G. Noguère, in Proceeding of the International Conference on Nuclear Data, 2016 [Google Scholar]
 B. Habert, Ph.D. thesis, Institut Polytechnique de Grenoble, 2009 [Google Scholar]
 H. Mitani, H. Kuroi, J. Nucl. Sci. Technol. 9, 383 (1972) [CrossRef] [Google Scholar]
 H. Mitani, H. Kuroi, J. Nucl. Sci. Technol. 9, 642 (1972) [CrossRef] [Google Scholar]
 A. Gandini, Y. Ronen Ed. Uncertainty analysis (CRC Press, Boca Raton, 1988) [Google Scholar]
 C. De Saint Jean, P. Archier, E. Privas, G. Noguère, O. Litaize, P. Leconte, Nucl. Data Sheets 123, 178 (2015) [CrossRef] [Google Scholar]
 C. Uttley, C. Newstead, K. Diment, in Nuclear Data For Reactors, in Conference Proceedings, (Paris, 1966), Vol 1, p. 165 [Google Scholar]
Cite this article as: Edwin Privas, Cyrille De Saint Jean, Gilles Noguere, On the use of the BMC to resolve Bayesian inference with nuisance parameters, EPJ Nuclear Sci. Technol. 4, 36 (2018)
All Tables
Results comparison between the different methods implemented in CONRAD for ^{238}U total cross section.
Results comparison when a normalisation marginal parameter is taken into account. AN_Margi represents the deterministic method, MC_Margi represents the semistochastic resolution, BMC is the classical method and the last column called Importance is the BMC where an importance function is used for the sampling.
All Figures
Fig. 1 General overview of the evaluation process. 

In the text 
Fig. 2 General overview of the CONRAD code. 

In the text 
Fig. 3 A priori covariance matrix on ^{238}U total cross section. 

In the text 
Fig. 4 A posteriori covariance matrix on ^{238}U total cross section. 

In the text 
Fig. 5 S_{0} and S_{1} distributions obtained with the classical BMC method. 

In the text 
Fig. 6 S_{0} mean value convergence. 

In the text 
Fig. 7 S_{1} mean value convergence. 

In the text 
Fig. 8 A posteriori covariance of the ^{238}U total cross section. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.