Reference Input Data Library (RIPL)¶

In this brief notebook, we use the processed RIPL data to explore and visualize some attributes. Let us start by importing the necessary packages.

NOTE: This notebook is not meant to be a complete exploration resource. You are responsible for exploring and validating any provided data.

[2]:

# # PROTOTYPE
# import sys
# sys.path.append("../..")

[3]:

import pandas as pd
import numpy as np
pd.set_option('display.max_columns', 500)
pd.set_option('display.max_rows', 50)

import seaborn as sns
import matplotlib.pyplot as plt
import os

import nucml.datasets as nuc_data

[4]:

# Specifying directory to save figures
figure_dir = "Figures/"

Loading RIPL/ENSDF Data¶

Let us first load both the original and the cut-off RIPL data. Recall that the cut-off version is based on the RIPL cut-off parameters.

[5]:

ensdf_df = nuc_data.load_ensdf(append_ame=True)
ensdf_cutoff_df = nuc_data.load_ensdf(cutoff=True, append_ame=True)

INFO:root:Reading data from C:/Users/Pedro/Desktop/ML_Nuclear_Data/ENSDF\CSV_Files/ensdf.csv
INFO:root:AME: Reading and loading Atomic Mass Evaluation files from:
 C:/Users/Pedro/Desktop/ML_Nuclear_Data/AME/CSV_Files\AME_all_merged_no_NaN.csv
INFO:root:Reading data from C:/Users/Pedro/Desktop/ML_Nuclear_Data/ENSDF\CSV_Files/ensdf_cutoff.csv
INFO:root:AME: Reading and loading Atomic Mass Evaluation files from:
 C:/Users/Pedro/Desktop/ML_Nuclear_Data/AME/CSV_Files\AME_all_merged_no_NaN.csv

Plotting Some Feature Distributions¶

Isotope Energy Distribution¶

Let us compare the level energy as a function of the number of protons for all known levels against the cut-off dataset.

[6]:

sns.set(font_scale = 2)
sns.set_style("white")

[7]:

plt.figure(figsize=(14, 8))
# Pass x and y as keyword arguments; positional use is deprecated in seaborn
sns.scatterplot(x=ensdf_df.Z, y=ensdf_df.Energy, alpha=0.5, label="All Known Levels")
sns.scatterplot(x=ensdf_cutoff_df.Z, y=ensdf_cutoff_df.Energy, alpha=0.5, label="RIPL Cut-Off")
plt.ylabel("Level Energy (MeV)")
plt.xlabel("Protons")
plt.savefig(os.path.join(figure_dir, 'ENSDF_Z_E.png'), transparent=False, bbox_inches='tight', dpi=600)

Similarly, for neutrons:

[8]:

plt.figure(figsize=(14, 8))
# Pass x and y as keyword arguments; positional use is deprecated in seaborn
sns.scatterplot(x=ensdf_df.N, y=ensdf_df.Energy, alpha=0.5, label="All Known Levels")
sns.scatterplot(x=ensdf_cutoff_df.N, y=ensdf_cutoff_df.Energy, alpha=0.5, label="RIPL Cut-Off")
plt.ylabel("Level Energy (MeV)")
plt.xlabel("Neutrons")
plt.savefig(os.path.join(figure_dir, 'ENSDF_N_E.png'), transparent=False, bbox_inches='tight', dpi=600)

../_images/notebooks_2_EDA_RIPL_10_1.png

Atomic Mass Number Distribution¶

[9]:

plt.figure(figsize=(14,8))
g = sns.kdeplot(ensdf_df.A, shade=True);
g.set(xlabel="Atomic Mass Number (A)", ylabel="Fraction")
plt.savefig(os.path.join(figure_dir, 'ENSDF_Atomic_Mass_Dist.png'), bbox_inches='tight', dpi=600)

../_images/notebooks_2_EDA_RIPL_12_0.png

Energy Distribution¶

[10]:

plt.figure(figsize=(14, 8))
sns.kdeplot(ensdf_df.Energy.values, shade=True, label="All Known Levels");
sns.kdeplot(ensdf_cutoff_df.Energy.values, shade=True, label='RIPL Cut-Off');
plt.xlabel("Energy [MeV]")
plt.ylabel("Fraction")
plt.savefig(os.path.join(figure_dir, 'ENSDF_E_Dist.png'), bbox_inches='tight', dpi=600)

../_images/notebooks_2_EDA_RIPL_14_0.png

Level Number Distribution¶

[11]:

plt.figure(figsize=(14,8))
sns.kdeplot(ensdf_df.Level_Number.values, shade=True, label="All Known Levels");
sns.kdeplot(ensdf_cutoff_df.Level_Number.values, shade=True, label="RIPL Cut-Off");
plt.xlabel("Level Number")
plt.ylabel("Fraction")
plt.savefig(os.path.join(figure_dir, 'ENSDF_L_Dist.png'), bbox_inches='tight', dpi=600)

../_images/notebooks_2_EDA_RIPL_16_0.png

Uranium and Chlorine Energy Distribution¶

[12]:

plt.figure(figsize=(14, 8))
chlorine = ensdf_df[ensdf_df.Element_w_A == "35Cl"]
uranium = ensdf_df[ensdf_df.Element_w_A == "233U"]
# Labels match the isotopes selected above (35Cl and 233U)
sns.kdeplot(chlorine.Energy.values, shade=True, label="Chlorine-35");
sns.kdeplot(uranium.Energy.values, shade=True, label='Uranium-233');
plt.xlabel("Energy [MeV]")
plt.ylabel("Fraction")
plt.legend()
plt.savefig(os.path.join(figure_dir, 'ENSDF_Cl_U_E_Dist.png'), bbox_inches='tight', dpi=600)

../_images/notebooks_2_EDA_RIPL_18_0.png

Energy vs Level Number¶

Ideally, we would like to model the level energy as a function of the level number. This is a difficult challenge because of collective states, among other effects. Furthermore, when applying the RIPL cut-off parameters, we lose more than half of the available data for training.
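One way to quantify how much data the cut-off removes is to count the levels per isotope in each DataFrame. The sketch below is only illustrative: it uses two small made-up frames in place of `ensdf_df` and `ensdf_cutoff_df`, the helper name `retained_fraction` is hypothetical, and the only column assumed from the notebook is `Element_w_A`.

```python
import pandas as pd

def retained_fraction(full_df, cutoff_df, key="Element_w_A"):
    """Return, per isotope, the fraction of levels kept by the cut-off."""
    full_counts = full_df.groupby(key).size()
    cutoff_counts = cutoff_df.groupby(key).size()
    # Isotopes that are absent from the cut-off frame keep zero levels.
    cutoff_counts = cutoff_counts.reindex(full_counts.index, fill_value=0)
    return cutoff_counts / full_counts

# Toy stand-ins for ensdf_df and ensdf_cutoff_df (made-up values).
toy_full = pd.DataFrame({
    "Element_w_A": ["35Cl"] * 4 + ["233U"] * 2,
    "Level_Number": [1, 2, 3, 4, 1, 2],
})
toy_cutoff = pd.DataFrame({
    "Element_w_A": ["35Cl"],
    "Level_Number": [1],
})

fractions = retained_fraction(toy_full, toy_cutoff)
overall = len(toy_cutoff) / len(toy_full)
print(fractions)
print("Overall fraction retained:", overall)
```

With the real data, the same call on the two loaded DataFrames would show which isotopes lose the most levels to the cut-off.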

[13]:

plt.figure(figsize=(14, 8))
sns.scatterplot(x='Level_Number', y='Energy', data=ensdf_df, alpha=0.4, label="All Known Levels")
sns.scatterplot(x='Level_Number', y='Energy', data=ensdf_cutoff_df, alpha=0.4, label="RIPL Cut-Off")
plt.xlabel("Level Number")
plt.ylabel("Level Energy (MeV)")
plt.savefig(os.path.join(figure_dir, 'ENSDF_E_vs_L.png'), bbox_inches='tight', dpi=600)

../_images/notebooks_2_EDA_RIPL_20_0.png

NucML Plotting Utilities: Level and Level Density¶

For EXFOR, a useful feature might be the level density at the incident energies. To model RIPL/XUNDL, we can take two approaches: (1) model the level energy as a function of the level number with the cut-off dataset, or (2) model the level density. We can visualize the level density using the NucML ENSDF plotting utilities. Let us import them.

[14]:

import nucml.ensdf.plot as ensdf_plot

We can plot the level density of both the original and the cut-off datasets. All you need to do is pass both datasets to the level_density() function. For example, let us plot it for Chlorine-35:

[15]:

ensdf_plot.level_density(ensdf_df, 17, 35, df2=ensdf_cutoff_df, save=True, save_dir=figure_dir)

../_images/notebooks_2_EDA_RIPL_24_0.png

and for Uranium-233:

[16]:

ensdf_plot.level_density(ensdf_df, 92, 233, df2=ensdf_cutoff_df, save=True, save_dir=figure_dir)

../_images/notebooks_2_EDA_RIPL_26_0.png
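To make the plotted quantities concrete, the following sketch computes a cumulative level count and a simple binned level density from a list of level energies. It is not NucML's implementation of `level_density()`; the function names, the fixed-width binning, and the toy energies are all assumptions made for illustration.

```python
import numpy as np

def cumulative_levels(energies):
    """Return sorted energies and the cumulative number of levels N(E)."""
    sorted_e = np.sort(np.asarray(energies, dtype=float))
    counts = np.arange(1, sorted_e.size + 1)
    return sorted_e, counts

def binned_level_density(energies, bin_width):
    """Estimate levels per MeV by counting levels in fixed-width bins."""
    energies = np.asarray(energies, dtype=float)
    n_bins = max(1, int(np.ceil(energies.max() / bin_width)))
    edges = np.arange(n_bins + 1) * bin_width
    counts, edges = np.histogram(energies, bins=edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, counts / bin_width

# Made-up level energies in MeV (assumed non-negative).
toy_energies = [0.0, 1.2, 1.8, 2.6, 2.7, 3.0, 3.1, 3.2]
sorted_e, n_levels = cumulative_levels(toy_energies)
centers, density = binned_level_density(toy_energies, bin_width=1.0)
print(n_levels)
print(centers, density)
```

The bin width is a free choice here; a narrower bin resolves more structure but gives noisier estimates when few levels are known.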

Statistics Example - Chlorine-35 and Uranium-235¶

[17]:

from scipy import stats

[18]:

def pearson_corr(protons, neutrons, df):
    to_plot = df[(df["Z"] == protons) & (df["N"] == neutrons)].sort_values(by='Level_Number', ascending=True)
    pearson_coef, p_value = stats.pearsonr(to_plot['Level_Number'], to_plot['Energy'])
    print("Results for {}:".format(to_plot.Element_w_A.iloc[0]))
    print("The Pearson Correlation Coefficient is", pearson_coef, " with a P-value of P =", p_value)
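For reference, the coefficient reported by `stats.pearsonr` is the covariance of the two variables divided by the product of their standard deviations. The sketch below computes it by hand with NumPy on made-up values and compares it against `np.corrcoef`; the helper name `pearson_r` and the toy arrays are hypothetical.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

# Made-up level numbers and energies, for illustration only.
toy_level_number = [1, 2, 3, 4, 5]
toy_energy = [0.0, 1.2, 1.8, 2.6, 3.1]

r_manual = pearson_r(toy_level_number, toy_energy)
r_numpy = np.corrcoef(toy_level_number, toy_energy)[0, 1]
print(r_manual, r_numpy)
```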

[19]:

chlorine = ensdf_df[ensdf_df.Element_w_A == "35Cl"].sort_values(by='Level_Number', ascending=True)
uranium = ensdf_df[ensdf_df.Element_w_A == "235U"].sort_values(by='Level_Number', ascending=True)

[20]:

chlorine.iloc[:,:10].describe()

[20]:

	Level_Number	Energy	Spin	Parity	Half_Life	Gammas	Num_Decay_Modes
count	352.000000	352.000000	352.000000	352.000000	1.020000e+02	352.000000	352.0
mean	176.500000	8.579085	0.735795	0.085227	4.559895e-13	3.227273	0.0
std	101.757883	2.029201	2.180220	0.648125	3.147776e-12	4.781201	0.0
min	1.000000	0.000000	-1.000000	-1.000000	3.740000e-20	0.000000	0.0
25%	88.750000	7.691525	-1.000000	0.000000	6.124750e-18	0.000000	0.0
50%	176.500000	8.734000	0.500000	0.000000	1.901000e-16	0.000000	0.0
75%	264.250000	9.877750	2.500000	1.000000	1.350000e-14	6.000000	0.0
max	352.000000	13.900000	11.500000	1.000000	3.080000e-11	17.000000	0.0

[21]:

pd.DataFrame(chlorine.iloc[:,:10].corr()).sort_values(by='Energy', ascending=False).head()

[21]:

	Level_Number	Energy	Spin	Parity	Half_Life	Gammas	Num_Decay_Modes
Energy	0.939196	1.000000	-0.170705	-0.138014	-0.199070	-0.302143	NaN
Level_Number	1.000000	0.939196	-0.129374	-0.080478	-0.149506	-0.425753	NaN
Parity	-0.080478	-0.138014	0.092596	1.000000	-0.118308	-0.015462	NaN
Spin	-0.129374	-0.170705	1.000000	0.092596	0.174749	0.180695	NaN
Half_Life	-0.149506	-0.199070	0.174749	-0.118308	1.000000	-0.039308	NaN

[22]:

pearson_corr(17, 35-17, ensdf_df)

Results for 35Cl:
The Pearson Correlation Coefficient is 0.9391959945279038  with a P-value of P = 1.508712295824325e-164

[23]:

pearson_corr(17, 35-17, ensdf_cutoff_df)

Results for 35Cl:
The Pearson Correlation Coefficient is 0.9020790472746514  with a P-value of P = 1.0588253532475289e-14

We can obtain a stronger linear correlation by taking the base-10 logarithm of both the energy and the level number.

[24]:

ensdf_df_log = ensdf_df.copy()
ensdf_cutoff_df_log = ensdf_cutoff_df.copy()

[25]:

ensdf_df_log.Level_Number = np.log10(ensdf_df_log.Level_Number)
ensdf_df_log = ensdf_df_log[ensdf_df_log.Energy != 0]
ensdf_df_log.Energy = np.log10(ensdf_df_log.Energy)

[26]:

ensdf_cutoff_df_log.Level_Number = np.log10(ensdf_cutoff_df_log.Level_Number)
ensdf_cutoff_df_log = ensdf_cutoff_df_log[ensdf_cutoff_df_log.Energy != 0]
ensdf_cutoff_df_log.Energy = np.log10(ensdf_cutoff_df_log.Energy)
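The two cells above apply the same transformation to each dataset. As a possible alternative, it could be wrapped in a small helper that filters out the zero-energy levels first and works on an explicit copy, which avoids assigning into a filtered view. The helper name `log10_levels` is hypothetical, and the toy frame only mimics the `Level_Number` and `Energy` columns.

```python
import numpy as np
import pandas as pd

def log10_levels(df):
    """Return a copy with log10 applied to Level_Number and Energy.

    Levels with zero energy (ground states) are dropped because
    log10(0) is undefined.
    """
    out = df[df["Energy"] != 0].copy()
    out["Level_Number"] = np.log10(out["Level_Number"])
    out["Energy"] = np.log10(out["Energy"])
    return out

# Made-up values, for illustration only.
toy = pd.DataFrame({"Level_Number": [1, 10, 100], "Energy": [0.0, 1.0, 10.0]})
toy_log = log10_levels(toy)
print(toy_log)
```

The input frame is left untouched, so the same helper could be applied to both the full and the cut-off datasets.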

[27]:

pearson_corr(17, 35-17, ensdf_df_log)

Results for 35Cl:
The Pearson Correlation Coefficient is 0.9757198447187734  with a P-value of P = 2.959948935021139e-232

[28]:

pearson_corr(17, 35-17, ensdf_cutoff_df_log)

Results for 35Cl:
The Pearson Correlation Coefficient is 0.9635162177216403  with a P-value of P = 1.279072260851587e-21

On the log scale, the level energy and the level number are highly correlated. The p-values are far below 0.01, so the correlation is statistically significant at the 1% level. We therefore expect a linear model to work reasonably well on the log-transformed data. However, it is not quite that simple: we know that this dependency might start out linear but deviate at some point.
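One simple way to look for such a deviation is to fit a straight line in log-log space and inspect the residuals. The sketch below does this with `np.polyfit` on synthetic data that follows an exact power law, so the residuals vanish by construction; the function name `fit_log_log` and the synthetic values are assumptions, and the real level data would be expected to show structured residuals where the linear trend breaks down.

```python
import numpy as np

def fit_log_log(level_number, energy):
    """Fit log10(E) = slope * log10(n) + intercept and return residuals."""
    level_number = np.asarray(level_number, dtype=float)
    energy = np.asarray(energy, dtype=float)
    keep = energy > 0  # log10 is undefined for the zero-energy ground state
    log_n = np.log10(level_number[keep])
    log_e = np.log10(energy[keep])
    slope, intercept = np.polyfit(log_n, log_e, deg=1)
    residuals = log_e - (slope * log_n + intercept)
    return slope, intercept, residuals

# Synthetic power law E = 0.5 * n**0.8 (made-up, not nuclear data).
toy_n = np.arange(1, 51)
toy_e = 0.5 * toy_n ** 0.8

slope, intercept, residuals = fit_log_log(toy_n, toy_e)
print(slope, intercept, np.abs(residuals).max())
```

Plotting the residuals against the log of the level number would show where, if anywhere, the straight-line description stops holding.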