Issue 
EPJ Nuclear Sci. Technol.
Volume 4, 2018
Special Issue on 4th International Workshop on Nuclear Data Covariances, October 2–6, 2017, Aix en Provence, France – CW2017



Article Number  38  
Number of page(s)  8  
Section  Covariance Evaluation Methodology  
DOI  https://doi.org/10.1051/epjn/2018047  
Published online  14 November 2018 
Regular Article
Choice of positive distribution law for nuclear data
DEN − Service d'études des réacteurs et de mathématiques appliquées (SERMA), CEA, Université Paris-Saclay, 91191 Gif-sur-Yvette, France
* e-mail: sebastien.lahaye@cea.fr
Received: 6 November 2017
Received in final form: 9 March 2018
Accepted: 9 July 2018
Published online: 14 November 2018
Nuclear data evaluation files in the ENDF-6 format provide mean values and associated uncertainties for physical quantities relevant in nuclear physics. Uncertainties are denoted as Δ in the format description, and are commonly understood as standard deviations. Uncertainties can be completed by covariance matrices. The evaluations do not provide any indication of the probability density function to be used when sampling. Three constraints must be observed: the mean value, the standard deviation and the positivity of the physical quantity. The MENDEL code generally uses positively truncated Gaussian distribution laws for small relative standard deviations and a lognormal law for larger uncertainty levels (>50%). Indeed, the use of truncated Gaussian laws can modify the mean value and the standard deviation. In this paper, we make explicit the error in the mean value and the standard deviation when using different types of distribution laws. We also employ the principle of maximum entropy as a criterion to choose among the truncated Gaussian, the fitted Gaussian and the lognormal distribution. Remarkably, the difference in terms of entropy between the candidate distribution laws is a function of the relative standard deviation only. The obtained results therefore provide general guidance for the choice among these distributions.
© S. Lahaye, published by EDP Sciences, 2018
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1 Introduction
Nuclear data evaluation files in the ENDF-6 format [1] provide mean values and associated uncertainties for physical quantities relevant in nuclear physics. These uncertainties are denoted as Δ in the format description for most of the nuclear data parameter types, and are understood as standard deviations. Uncertainties can be completed (for microscopic cross sections, for example) by covariance matrices.
For uncertainty propagation based on random sampling, one needs to know the probability density function associated with each random variable. Current nuclear data evaluation files [2–4] do not provide any indication of which probability density function to use, and users have to choose a distribution law. This is the case for all uncertain data propagated in fuel cycle code systems, such as independent fission yields, radioactive decay constants, radioactive decay energies, radioactive decay branching ratios and multigroup microscopic cross sections. For some data, in particular independent neutron fission yields and some microscopic cross sections, the relative standard deviation can be high (more than 50%), and Gaussian laws then naturally produce negative occurrences, which is not acceptable for those physical quantities.
Three constraints must therefore be respected when sampling:

positivity of the physical quantity;

its mean value;

and its standard deviation.
Often the positively truncated Gaussian law is used, which takes into account positivity but introduces a bias in the mean value and the standard deviation. Furthermore, it is not symmetric around the mean value.
MENDEL [5] is the new generation of the CEA code system for nuclear fuel cycle studies. Its depletion solver is provided to the transport code systems APOLLO3® [6] and TRIPOLI-4® [7]. MENDEL is the successor of DARWIN/PEPIN2 [8].
Uncertainty quantification in MENDEL is based on a propagation method using a Monte Carlo approach by correlated sampling [9], and sampling is done by the CEA uncertainty platform URANIE [10,11]. Nuclear data uncertainties are propagated to physical quantities of interest (such as decay heat or concentrations) [12].
Until now, the distribution law in MENDEL has been chosen as follows:

positively truncated Gaussian law if the relative standard deviation is less than 50%;

lognormal law if the relative standard deviation is more than 50%.
This choice is based on physical reasons, as truncated Gaussian distributions modify the mean value and the standard deviation for large relative uncertainties. The switching point between the Gaussian and the lognormal distributions is a pragmatic choice. This paper aims to give a formal justification for this choice.
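The selection rule above can be sketched in a few lines of Python. This is only an illustration: the function name `choose_law`, the returned labels and the `threshold` parameter are hypothetical and are not part of MENDEL or URANIE. The text does not say which law applies exactly at 50%, so assigning that case to the lognormal law is an assumption.

```python
def choose_law(mean: float, std: float, threshold: float = 0.5) -> str:
    """Return a label for the sampling law of a positive quantity.

    `threshold` is the relative standard deviation at which the rule
    switches (50% in the text). The behaviour exactly at the threshold
    is an assumption: it is assigned to the lognormal law here.
    """
    if mean <= 0.0 or std < 0.0:
        raise ValueError("expected mean > 0 and std >= 0")
    relative_std = std / mean
    if relative_std < threshold:
        return "truncated_gaussian"
    return "lognormal"


# A 20% uncertainty keeps the truncated Gaussian, a 75% one switches law.
print(choose_law(1.0, 0.2))
print(choose_law(2.0, 1.5))
```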
The structure of this paper is as follows. First, we investigate the bias that truncating the Gaussian distribution introduces in both the mean value and the standard deviation. We then describe how to modify the Gaussian law parameters so that, after truncation, the mean value and the standard deviation match those specified in evaluation files.
In the second part, we employ the principle of maximum entropy [13,14] to choose between different distribution laws. We will show that the choice of the distribution law depends on the relative standard deviation.
We limit our study in this paper to the distribution laws themselves, without propagation in numerical code systems.
2 Maximum entropy method
For a continuous probability density function p(x) defined on I, we introduce the differential entropy defined as:
$$H(p) = -\int_{I} p(x)\,\ln p(x)\,\mathrm{d}x \qquad (1)$$
We define p(x) ln p(x) = 0 when p(x) = 0 (due to $\lim_{u \to 0^{+}} u \ln u = 0$).
This entropy function appears in statistical physics and thermodynamics where higher entropy is associated with states closer to equilibrium.
The maximum entropy principle [14] states that, for a given set of constraints (for example a known mean value and standard deviation), the probability distribution with the largest entropy should be chosen.
For given constraints, the law with the largest entropy is the one that contains the least amount of information about the physical quantity. For example, the maximum entropy principle will lead to the following choices:

a uniform law if the constraints are minimal and maximal values;

and a Gaussian law if the constraints are a given mean value and a standard deviation.
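As a small numerical illustration of the second case, the sketch below estimates the differential entropy by a midpoint rule and compares a Gaussian law with a uniform law of the same mean value and standard deviation. The helper names (`differential_entropy`, `gaussian_pdf`, `uniform_pdf`), the integration bounds and the values μ = 1, σ = 0.25 are choices made for this example only.

```python
import math


def differential_entropy(pdf, lower, upper, steps=20_000):
    """Midpoint-rule estimate of -integral of p(x) ln p(x) on [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        p = pdf(lower + (i + 0.5) * width)
        if p > 0.0:  # convention: p ln p = 0 where p = 0
            total -= p * math.log(p) * width
    return total


def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def uniform_pdf(x, mu, sigma):
    half_width = sigma * math.sqrt(3.0)  # uniform law with standard deviation sigma
    return 1.0 / (2.0 * half_width) if abs(x - mu) <= half_width else 0.0


mu, sigma = 1.0, 0.25
h_gauss = differential_entropy(lambda x: gaussian_pdf(x, mu, sigma),
                               mu - 12.0 * sigma, mu + 12.0 * sigma)
h_unif = differential_entropy(lambda x: uniform_pdf(x, mu, sigma),
                              mu - 2.0 * sigma, mu + 2.0 * sigma)
h_gauss_exact = 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

print(f"Gaussian: numerical {h_gauss:.6f}, closed form {h_gauss_exact:.6f}")
print(f"Uniform with the same mean and standard deviation: {h_unif:.6f}")
```

With identical first two moments, the Gaussian estimate comes out larger than the uniform one, as the principle predicts for these constraints.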
3 Candidate distribution laws
Candidate distribution laws must respect the three following criteria:

positivity of the realizations: P(X < 0) = 0 (i.e. $I = \mathbb{R}_{+}$), as nuclear data are positive;

they must match the mean value μ (within a given tolerance) as specified in the evaluation file;

they must match the standard deviation σ (within a given tolerance) as specified in the evaluation file.
3.1 Gaussian distribution
Let G(μ,σ) be the non-truncated Gaussian distribution with mean value μ and standard deviation σ. Its probability density function reads:
$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) \qquad (2)$$
3.1.1 Entropy
The Gaussian distribution maximizes the differential entropy among all distribution laws on ℝ for given mean value and standard deviation. The Gaussian distribution entropy is given by:
$$H\big(G(\mu,\sigma)\big) = \ln\left(\sigma\sqrt{2\pi e}\right) \qquad (3)$$
3.1.2 Positivity
A Gaussian distribution yields negative values with the probability:
$$P(X < 0) = \frac{1}{2}\left[1 - \operatorname{erf}\left(\frac{\mu}{\sigma\sqrt{2}}\right)\right] \qquad (4)$$
This negative occurrence probability is given as a function of the relative standard deviation in Figure 1.
Due to this negative occurrence probability, which is non-negligible when the relative standard deviation σ/μ is large, it is necessary to use other distribution laws to enforce the positivity constraint.
Fig. 1 Probability of negative occurrences for non truncated Gaussian distributions. 
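The probability of negative occurrences for a non-truncated Gaussian law depends only on the ratio σ/μ and can be evaluated with the standard error function. The following sketch does so for a few relative standard deviations; the function name `negative_probability` and the chosen sample values are illustrative only.

```python
import math


def negative_probability(relative_std: float) -> float:
    """P(X < 0) for a Gaussian law with mean mu > 0 and relative_std = sigma / mu."""
    if relative_std <= 0.0:
        raise ValueError("relative_std must be positive")
    # Standard normal cdf at -mu/sigma, written with erfc for accuracy in the tail.
    return 0.5 * math.erfc(1.0 / (relative_std * math.sqrt(2.0)))


for percent in (25, 50, 75, 100):
    p_neg = negative_probability(percent / 100.0)
    print(f"sigma/mu = {percent:3d}%  ->  P(X < 0) = {p_neg:.3e}")
```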
3.2 Positively truncated Gaussian distribution
3.2.1 Distribution law
We define a Gaussian distribution with mean value μ and standard deviation σ, and then set the probability density to zero for negative values. The resulting distribution − after normalization − is called a positively truncated Gaussian distribution.
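Draws can be realized by sampling from the original Gaussian distribution and rejecting all negative values. Its probability density function reads:
$$p(x) = \begin{cases} \beta \exp\left(-\dfrac{(x-\mu)^{2}}{2\sigma^{2}}\right) & \text{if } x \geq 0, \\ 0 & \text{otherwise.} \end{cases} \qquad (5)$$
The constant β is defined so that ∫_{ℝ}p(x)dx = 1, which means:
$$\beta = \frac{2}{\sigma\sqrt{2\pi}\left[1 + \operatorname{erf}\left(\dfrac{\mu}{\sigma\sqrt{2}}\right)\right]}$$
If we substitute δ = σ/μ, we get the form:
$$\beta = \frac{2}{\mu\,\delta\sqrt{2\pi}\left[1 + \operatorname{erf}\left(\dfrac{1}{\delta\sqrt{2}}\right)\right]} \qquad (6)$$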
The positively truncated Gaussian law will be denoted by PG(μ,σ).
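The rejection procedure mentioned above (sample the original Gaussian, discard negative values) can be sketched with the Python standard library. The function name `sample_truncated_gaussian`, the seed and the sample size are arbitrary choices for this illustration; this is not the URANIE implementation.

```python
import random


def sample_truncated_gaussian(mu, sigma, size, rng=None):
    """Draw `size` values from PG(mu, sigma) by rejecting negative Gaussian draws."""
    rng = rng or random.Random()
    draws = []
    while len(draws) < size:
        x = rng.gauss(mu, sigma)
        if x >= 0.0:  # keep only non-negative realizations
            draws.append(x)
    return draws


rng = random.Random(12345)
draws = sample_truncated_gaussian(1.0, 1.0, 50_000, rng)
empirical_mean = sum(draws) / len(draws)
print(f"smallest draw: {min(draws):.4f}, empirical mean: {empirical_mean:.4f}")
```

For a 100% relative standard deviation the empirical mean lands visibly above the input value μ = 1, which is the bias on the moments that truncation introduces.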
3.2.2 Entropy
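The differential entropy can be computed as a sum of ln μ and a function of δ = σ/μ:
$$H\big(PG(\mu,\sigma)\big) = \ln\mu + \ln\left(\delta\sqrt{\frac{\pi e}{2}}\left[1 + \operatorname{erf}\left(\frac{1}{\delta\sqrt{2}}\right)\right]\right) - \frac{\exp\left(-\dfrac{1}{2\delta^{2}}\right)}{\delta\sqrt{2\pi}\left[1 + \operatorname{erf}\left(\dfrac{1}{\delta\sqrt{2}}\right)\right]} \qquad (7)$$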
3.2.3 Errors on moments
With this distribution, we modify the distribution moments, particularly for large relative uncertainties.
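The truncated Gaussian distribution mean value is equal to:
$$E\big[PG(\mu,\sigma)\big] = \mu + \sigma\,\lambda(\delta), \qquad \lambda(\delta) = \sqrt{\frac{2}{\pi}}\;\frac{\exp\left(-\dfrac{1}{2\delta^{2}}\right)}{1 + \operatorname{erf}\left(\dfrac{1}{\delta\sqrt{2}}\right)}, \qquad \delta = \frac{\sigma}{\mu} \qquad (8)$$
And the truncated Gaussian distribution variance is equal to:
$$V\big[PG(\mu,\sigma)\big] = \sigma^{2}\left[1 - \frac{\lambda(\delta)}{\delta} - \lambda(\delta)^{2}\right] \qquad (9)$$
We obtain the following relative error on the expected value, which is a function of the relative standard deviation of the original input data (i.e. the μ and σ parameters of the non-truncated Gaussian distribution):
$$\frac{E\big[PG(\mu,\sigma)\big] - \mu}{\mu} = \delta\,\lambda(\delta) \qquad (10)$$
This bias is represented in Figure 2 as a function of the parameter δ.
The relative error of the standard deviation is also a function of δ only:
$$\frac{\sqrt{V\big[PG(\mu,\sigma)\big]} - \sigma}{\sigma} = \sqrt{1 - \frac{\lambda(\delta)}{\delta} - \lambda(\delta)^{2}} - 1 \qquad (11)$$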
This bias is represented in Figure 3 as a function of the parameter δ.
The squared relative discrepancy between the relative standard deviations of the Gaussian distribution and the truncated Gaussian distribution is given in equation (12). (12)
The relative standard deviation discrepancy is represented in Figure 4. When choosing this truncated law, we obtain the numerical values for the errors shown in Table 1.
We can conclude that, while the truncation is entirely acceptable up to 25% uncertainty, it becomes problematic at 50% uncertainty and is unacceptable at 100% uncertainty.
These formal results confirm the need to switch to another law, which has already been introduced heuristically in the URANIE/MENDEL sampling scheme.
Fig. 2 Relative error on mean value. 
Fig. 3 Relative error on standard deviation value. 
Fig. 4 Relative error on standard deviation value. 
Table 1 Relative error between moments of the approximate law and expected moments.
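The bias on the first two moments can be evaluated directly from the closed-form moments of a Gaussian law truncated at zero. The sketch below is an illustration under that assumption; the function names are hypothetical, the sign convention (truncated value minus input value, divided by the input value) is a choice made here, and the printed numbers are computed by the code rather than copied from Table 1.

```python
import math


def inverse_mills(t):
    """Ratio phi(t) / Phi(t) of the standard normal pdf and cdf."""
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-t / math.sqrt(2.0))
    return pdf / cdf


def truncated_moments(mu, sigma):
    """Mean and standard deviation of the Gaussian(mu, sigma) truncated to x >= 0."""
    t = mu / sigma
    lam = inverse_mills(t)
    mean = mu + sigma * lam
    variance = sigma ** 2 * (1.0 - t * lam - lam ** 2)
    return mean, math.sqrt(variance)


def relative_errors(delta):
    """Relative errors on mean and standard deviation for mu = 1, sigma = delta."""
    mean, std = truncated_moments(1.0, delta)
    return mean - 1.0, (std - delta) / delta


for percent in (25, 50, 75, 100):
    err_mean, err_std = relative_errors(percent / 100.0)
    print(f"delta = {percent:3d}%  mean error = {err_mean:+.4%}  std error = {err_std:+.4%}")
```

The mean is pushed upwards and the standard deviation is reduced, and both effects grow quickly with δ.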
3.3 Gaussian distribution with correct mean and variance
3.3.1 Distribution law
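We consider a random variable in an evaluated nuclear data file characterized by a mean value μ and a standard deviation σ. We define a truncated Gaussian law, with parameters $\tilde{\mu}$ and $\tilde{\sigma}$, so that its mean value equals μ and its standard deviation equals σ:
$$p(x) = \begin{cases} \tilde{\beta} \exp\left(-\dfrac{(x-\tilde{\mu})^{2}}{2\tilde{\sigma}^{2}}\right) & \text{if } x \geq 0, \\ 0 & \text{otherwise.} \end{cases} \qquad (13)$$
Please note the changed notation compared to equation (5). The variables $\tilde{\mu}$, $\tilde{\sigma}$ here correspond to μ, σ there.
The following specification of $\tilde{\beta}$ ensures the proper normalization of the distribution:
$$\tilde{\beta} = \frac{2}{\tilde{\sigma}\sqrt{2\pi}\left[1 + \operatorname{erf}\left(\dfrac{\tilde{\mu}}{\tilde{\sigma}\sqrt{2}}\right)\right]} \qquad (14)$$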
3.3.2 Coefficient determination
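Writing $\tilde{\mu}$ and $\tilde{\sigma}$ for the parameters of the fitted truncated Gaussian law, we obtain from equation (8):
$$\mu = \tilde{\mu} + \tilde{\sigma}\,\tilde{\lambda}, \qquad \tilde{\lambda} = \sqrt{\frac{2}{\pi}}\;\frac{\exp\left(-\dfrac{\tilde{\mu}^{2}}{2\tilde{\sigma}^{2}}\right)}{1 + \operatorname{erf}\left(\dfrac{\tilde{\mu}}{\tilde{\sigma}\sqrt{2}}\right)} \qquad (15)$$
And from equation (9):
$$\sigma^{2} = \tilde{\sigma}^{2}\left[1 - \frac{\tilde{\mu}}{\tilde{\sigma}}\,\tilde{\lambda} - \tilde{\lambda}^{2}\right]$$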
Leading to: (16) which leads to (17) by substituting as given in (16) in (15): (17)
Equation (17) contains a single unknown variable, but it is too complex to admit a closed-form analytical solution.
Hence, we compute the solutions numerically for several (μ,σ) tuples.
To do so, we introduce a pseudo relative standard deviation in equation (17).
The distribution parameters are summarized in Table 2 for different values of the relative standard deviation.
Table 2 Gaussian distribution parameters for correct mean and standard deviation.
Problematic values, i.e. negative and unreasonably large values, appear in bold red letters.
The reader should note that the fitted parameters $\tilde{\mu}$ and $\tilde{\sigma}$ (the location and scale of the underlying Gaussian law) are not the mean value and standard deviation of the truncated Gaussian distribution, which are respectively equal to μ and σ. A negative value of $\tilde{\mu}$ means that more than half of the underlying Gaussian distribution is truncated and the mode is positioned at zero, which may not be desirable.
For this reason, it is reasonable not to use the truncated Gaussian distribution for relative standard deviations larger than 75%.
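One possible way to carry out the numerical determination is sketched below. It is not the procedure used for Table 2: it assumes the standard closed-form moments of a Gaussian law truncated at zero, reduces the two moment conditions to a single equation in the ratio t = location/scale, and solves it by bisection, assuming a single root inside the bracket. The function names, the bracket [−8, 30] and the iteration count are choices made for this example.

```python
import math


def inverse_mills(t):
    """Ratio phi(t) / Phi(t) of the standard normal pdf and cdf."""
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-t / math.sqrt(2.0))
    return pdf / cdf


def truncated_relative_std(t):
    """Relative standard deviation of a Gaussian truncated at 0, t = location / scale."""
    lam = inverse_mills(t)
    return math.sqrt(1.0 - t * lam - lam * lam) / (t + lam)


def fit_truncated_gaussian(mu, sigma, t_low=-8.0, t_high=30.0, iterations=200):
    """Return (location, scale) such that the truncated law has mean mu and std sigma."""
    delta = sigma / mu
    if not truncated_relative_std(t_high) < delta < truncated_relative_std(t_low):
        raise ValueError("relative standard deviation outside the bracketed range")
    for _ in range(iterations):
        t_mid = 0.5 * (t_low + t_high)
        if truncated_relative_std(t_mid) > delta:
            t_low = t_mid   # relative std still too large: increase t
        else:
            t_high = t_mid
    t = 0.5 * (t_low + t_high)
    scale = mu / (t + inverse_mills(t))  # enforces the mean value mu
    return t * scale, scale


location, scale = fit_truncated_gaussian(1.0, 0.5)
print(f"location = {location:.4f}, scale = {scale:.4f}")
```

With this bracket the sketch only covers relative standard deviations roughly between 4% and 97%; a request at 100% raises an error, which is consistent with the fitted parameters becoming unreasonable for very large uncertainties.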
3.3.3 Entropy
The differential entropy of the fitted Gaussian distribution is computed in the same way as in equation (7).
3.4 Lognormal distribution
3.4.1 Distribution law
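The probability density function of a lognormal distribution characterized by parameters m and s reads:
$$p(x) = \frac{1}{x\,s\sqrt{2\pi}}\exp\left(-\frac{(\ln x - m)^{2}}{2s^{2}}\right), \qquad x > 0 \qquad (19)$$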
3.4.2 Coefficient determination
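The mean value of a lognormal distribution is equal to:
$$E[X] = \exp\left(m + \frac{s^{2}}{2}\right) \qquad (20)$$
The variance of a lognormal distribution is equal to:
$$V[X] = \left(e^{s^{2}} - 1\right)\exp\left(2m + s^{2}\right) \qquad (21)$$
which, by requiring the mean value μ and the variance σ², leads to:
$$s^{2} = \ln\left(1 + \frac{\sigma^{2}}{\mu^{2}}\right) \qquad (22)$$
$$m = \ln\mu - \frac{s^{2}}{2} \qquad (23)$$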
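The parameter determination for the lognormal law has a closed form, which the following sketch implements and checks by recomputing the moments. The function name `lognormal_parameters`, the test values μ = 2, σ = 1.5 and the seed are illustrative; `random.lognormvariate` from the Python standard library takes the parameters of the underlying normal law, i.e. m and s.

```python
import math
import random


def lognormal_parameters(mu, sigma):
    """Return (m, s) of the lognormal law with mean mu and standard deviation sigma."""
    s_squared = math.log(1.0 + (sigma / mu) ** 2)
    m = math.log(mu) - 0.5 * s_squared
    return m, math.sqrt(s_squared)


m, s = lognormal_parameters(2.0, 1.5)

# Recompute the moments from (m, s) to check the inversion.
mean_check = math.exp(m + 0.5 * s * s)
std_check = math.sqrt((math.exp(s * s) - 1.0) * math.exp(2.0 * m + s * s))
print(f"m = {m:.6f}, s = {s:.6f}, mean = {mean_check:.6f}, std = {std_check:.6f}")

# Every draw is strictly positive by construction.
rng = random.Random(2018)
draws = [rng.lognormvariate(m, s) for _ in range(10)]
```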
3.4.3 Entropy
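The differential entropy of the lognormal distribution reads, with δ = σ/μ:
$$H\big(LN(\mu,\sigma)\big) = m + \frac{1}{2} + \ln\left(s\sqrt{2\pi}\right) = \ln\mu + \frac{1}{2}\ln\left(\frac{2\pi e\,\ln\left(1+\delta^{2}\right)}{1+\delta^{2}}\right) \qquad (24)$$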
4 Choice of the law
4.1 Entropy principle
The different distribution laws will now be compared based on their differential entropies.
We show in Table 3 the entropy values for the truncated Gaussian law PG(μ,σ), the fitted truncated Gaussian law (consistent with the mean value and the standard deviation provided in the evaluated nuclear data file) and the lognormal law LN(μ,σ). The values were obtained for μ = 1. The differences between the entropies are independent of the choice of μ.
Table 3 Entropy values for several laws.
Inspecting equations (7) and (24) shows that the difference between the truncated Gaussian law and the lognormal law is independent of μ.
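Even though the mode value of the fitted Gaussian distribution, which has to be used in equation (7), differs from μ, the difference to the lognormal distribution or the truncated Gaussian distribution is still a function of the relative standard deviation δ only. Writing $\tilde{\mu}$ and $\tilde{\sigma}$ for the parameters of the fitted law, $\tilde{\delta} = \tilde{\sigma}/\tilde{\mu}$ for its pseudo relative standard deviation, and f for the δ-dependent part of equation (7), equation (7) for the fitted Gaussian law reads:
$$H = \ln\tilde{\mu} + f(\tilde{\delta}) = \ln\mu + \ln\frac{\tilde{\mu}}{\mu} + f(\tilde{\delta})$$
In fact, $\tilde{\mu}/\mu$ is a function of δ only, as
$$\frac{\mu}{\tilde{\mu}} = 1 + \tilde{\delta}\,\sqrt{\frac{2}{\pi}}\;\frac{\exp\left(-\dfrac{1}{2\tilde{\delta}^{2}}\right)}{1 + \operatorname{erf}\left(\dfrac{1}{\tilde{\delta}\sqrt{2}}\right)}$$
and:

$\tilde{\delta}$ depends on δ only as it is the solution of equation (18);

and the right-hand side above, which involves $\tilde{\delta}$ alone, is therefore also a function of δ only.
In conclusion, the difference between the fitted truncated Gaussian law entropy and another candidate law entropy will be a function of δ plus $\ln(\tilde{\mu}/\mu)$, which is also a function of δ only.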
Consequently, the difference in differential entropies is a function of the relative standard deviation only.
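This scale independence can be checked numerically for the truncated Gaussian and lognormal laws. The sketch below uses the standard closed-form entropies of a Gaussian law truncated at zero and of a lognormal law; the function names and the tested values of μ and δ are choices made for this illustration, and the fitted truncated Gaussian law is left out to keep the example short.

```python
import math


def entropy_truncated_gaussian(mu, sigma):
    """Differential entropy of the Gaussian(mu, sigma) truncated to x >= 0."""
    t = mu / sigma
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-t / math.sqrt(2.0))
    return math.log(sigma * math.sqrt(2.0 * math.pi * math.e) * cdf) - t * pdf / (2.0 * cdf)


def entropy_lognormal(mu, sigma):
    """Differential entropy of the lognormal law with mean mu and std sigma."""
    s_squared = math.log(1.0 + (sigma / mu) ** 2)
    m = math.log(mu) - 0.5 * s_squared
    return m + 0.5 * math.log(2.0 * math.pi * math.e * s_squared)


def entropy_gap(mu, delta):
    """Entropy of PG(mu, delta*mu) minus entropy of LN(mu, delta*mu)."""
    sigma = delta * mu
    return entropy_truncated_gaussian(mu, sigma) - entropy_lognormal(mu, sigma)


for delta in (0.25, 0.5, 0.75, 1.0):
    gaps = [entropy_gap(mu, delta) for mu in (1.0e-3, 1.0, 1.0e6)]
    print(f"delta = {delta:.2f}  gaps for three values of mu: "
          + ", ".join(f"{g:.6f}" for g in gaps))
```

Each entropy shifts by ln μ when the scale changes, so the three gaps printed on a line agree up to rounding.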
Despite entropy considerations, large discrepancies between the mode and the mean value are not desirable, which is an argument against the fitted truncated Gaussian in case of large relative uncertainties.
Using the entropy principle, we can see in Table 3 that, among those three laws, the modified (fitted) Gaussian law is optimal when the relative standard deviation is less than 80%.
When the relative standard deviation exceeds 80%, the truncated Gaussian distribution has the largest entropy. Nevertheless, for a truncated Gaussian with a high level of uncertainty, users need to take into account the large discrepancy between the target moments and the effective moments. Truncated Gaussian laws should therefore not be used in this context.
The comparison of the candidate laws for several values of the relative standard deviation in Figures 5 and 6 shows the similarity of the truncated Gaussian distribution and the Gaussian distribution in the case of small relative uncertainties. The fitted truncated Gaussian distribution is the first distribution to diverge from the Gaussian distribution, and resembles the lognormal distribution for large relative uncertainties (100% uncertainty, right part of Fig. 6).
Fig. 5 Candidate law probability density function for relative standard deviation of 25% (left) and 50% (right). 
Fig. 6 Candidate law probability density function for relative standard deviation of 75% (left) and 100% (right). 
4.2 Mode and mean
For a nontruncated Gaussian, mean value and mode are equal.
For the truncated Gaussian law (resp. the fitted truncated Gaussian law), the parameter μ (resp. its fitted counterpart $\tilde{\mu}$) indicates the distribution mode. In general, it is preferable to use distributions where the mode does not differ too much from the mean value.
If we want to limit to 10% the discrepancy between mode and mean value when using fitted truncated Gaussian distribution, we cannot employ it for . According to Table 2, this constraint is equivalent to:
5 Conclusion
The Gaussian distribution is not adequate for positive physical quantities, especially for large relative uncertainties, as it leads to an excessive number of negative occurrences.
For this reason, we investigated several positive distributions: the truncated Gaussian distribution, the fitted truncated Gaussian distribution and the lognormal distribution. All three laws exhibit zero probability density for negative values.
First, we studied the impact on the mean value and the standard deviation of the use of the truncated Gaussian distribution. We then compared the three distributions in terms of differential entropy. Despite the fact that the differential entropy is not scale invariant, the difference between two differential entropies is a function of the relative standard deviation only.
Both the maximum entropy principle and physical considerations have been considered in this work.
In summary, we suggest the following recipe for choosing among the three distributions, depending on the relative standard deviation:

for small values of relative uncertainties , the mean and standard deviation of the three laws are nearly identical. The differential entropy is slightly better for Gaussian laws. Hence, users can choose indifferently between the truncated Gaussian law PG(μ,σ) and the fitted truncated Gaussian law ;

for intermediate values of relative uncertainties, id est , the principle of maximum entropy and favors the fitted truncated Gaussian law ;

for large values of relative uncertainties , positiveness of the mode and accuracy of the moments impose the choice of a lognormal law.
In conclusion we propose the following two laws: or
Future work could study other distribution laws, such as asymmetric Gaussian and mixed Gaussian laws [15]. Further prospects include the use of the proposed distribution laws in uncertainty quantification problems and in uncertainty propagation for nuclear reactor fuel cycle studies.
References
[1] Cross Sections Evaluation Working Group, ENDF-6 Formats Manual (National Nuclear Data Center, Brookhaven National Laboratory, 2012)
[2] M.A. Kellett, O. Bersillon, R.W. Mills, The JEFF-3.1/-3.1.1 Radioactive Decay Data and Fission Yields Sub-libraries (OECD, 2009)
[3] M.B. Chadwick, M. Herman, ENDF/B-VII.1 Nuclear data for science and technology: cross sections, covariances, fission product yields and decay data, Nucl. Data Sheets 112, 2887 (2011)
[4] J. Katakura, Data/Code, Technical Report 2011025, JAEA, 2012
[5] S. Lahaye, P. Bellier, H. Mao, A. Tsilanizara, Y. Kawamoto, First verification and validation steps of MENDEL release V1.0 cycle code system, in PHYSOR 2014 − The role of reactor physics toward a sustainable future (Kyoto, Japan, 2014)
[6] H. Golfier, R. Lenain, J.J. Lautard, P. Fougeras, P. Magat, E. Martinolli, Y. Dutheillet, APOLLO3: a common project of CEA, AREVA and EDF for the development of new deterministic multipurpose code for physics analysis, in M&C 2009 (New York, USA, 2009)
[7] E. Brun, F. Damian, C.M. Diop, E. Dumonteil, F.X. Hugot, C. Jouanne, Y.K. Lee, F. Malvagi, A. Mazzolo, O. Petit, J.C. Trama, T. Visonneau, A. Zoia, TRIPOLI-4®, CEA, EDF and AREVA reference Monte Carlo code, Ann. Nucl. Energy 82, 151 (2015)
[8] A. Tsilanizara, C.M. Diop, B. Nimal, M. Detoc, L. Luneville, M. Chiron, T.D. Huynh, I. Bresard, M. Eid, J.C. Klein, DARWIN: an evolution code system for a large range of applications, Nucl. Sci. Technol. Suppl. 1, 845 (2000)
[9] A. Tsilanizara, N. Gilardi, T.D. Huynh, C. Jouanne, S. Lahaye, J.M. Martinez, C.M. Diop, Probabilistic approach for decay heat uncertainty estimation under URANIE platform by using MENDEL depletion code, Ann. Nucl. Energy 90, 62 (2016)
[10] F. Gaudier, URANIE: the CEA/DEN Uncertainty and Sensitivity platform, Procedia Soc. Behav. Sci. 2, 7660 (2010)
[11] M.D. McKay, R.J. Beckman, W.J. Conover, Comparison of three methods for selecting values of input variables in the analysis of output from a computer code, Technometrics 21, 239 (1979)
[12] S. Lahaye, T.D. Huynh, A. Tsilanizara, Comparison of deterministic and stochastic approaches for isotopic concentration and decay heat uncertainty quantification on elementary fission pulse, in EPJ Web Conf. 111, 09002 (2016)
[13] C.E. Shannon, A mathematical theory of communication, Bell Syst. Tech. J. 27, 379 (1948)
[14] E.T. Jaynes, Information theory and statistical mechanics, Phys. Rev. Ser. II 106, 620 (1957)
[15] J.V. Michalowicz, J.M. Nichols, F. Bucholtz, Calculation of differential entropy for a mixed Gaussian distribution, Entropy 10, 200 (2008)
Cite this article as: Sébastien Lahaye, Choice of positive distribution law for nuclear data, EPJ Nuclear Sci. Technol. 4, 38 (2018)