Reference Input Data Library (RIPL)

In this brief notebook, we use the processed RIPL data to explore and visualize some attributes. Let us start by importing the necessary packages.

NOTE: This notebook is not meant to be a complete exploration resource. You are responsible for exploring and validating any provided data.

[2]:
# # PROTOTYPE
# import sys
# sys.path.append("../..")
[3]:
import pandas as pd
import numpy as np
pd.set_option('display.max_columns', 500)
pd.set_option('display.max_rows', 50)

import seaborn as sns
import matplotlib.pyplot as plt
import os

import nucml.datasets as nuc_data
[4]:
# Specifying directory to save figures
figure_dir = "Figures/"

Loading RIPL/ENSDF Data

Let us first load both the original and the cut-off RIPL data. Recall that the cut-off version is based on the RIPL cut-off parameters.

[5]:
ensdf_df = nuc_data.load_ensdf(append_ame=True)
ensdf_cutoff_df = nuc_data.load_ensdf(cutoff=True, append_ame=True)
INFO:root:Reading data from C:/Users/Pedro/Desktop/ML_Nuclear_Data/ENSDF\CSV_Files/ensdf.csv
INFO:root:AME: Reading and loading Atomic Mass Evaluation files from:
 C:/Users/Pedro/Desktop/ML_Nuclear_Data/AME/CSV_Files\AME_all_merged_no_NaN.csv
INFO:root:Reading data from C:/Users/Pedro/Desktop/ML_Nuclear_Data/ENSDF\CSV_Files/ensdf_cutoff.csv
INFO:root:AME: Reading and loading Atomic Mass Evaluation files from:
 C:/Users/Pedro/Desktop/ML_Nuclear_Data/AME/CSV_Files\AME_all_merged_no_NaN.csv

Plotting Some Features Distributions

Isotope Energy Distribution

Let us observe what is the energy distribution as a function of the number of protons for all known levels against the cut-off dataset.

[6]:
sns.set(font_scale = 2)
sns.set_style("white")
[7]:
plt.figure(figsize=(14, 8))
sns.scatterplot(ensdf_df.Z, ensdf_df.Energy, alpha=0.5, label="All Known Levels")
sns.scatterplot(ensdf_cutoff_df.Z, ensdf_cutoff_df.Energy, alpha=0.5, label="RIPL Cut-Off")
plt.ylabel("Level Energy (MeV)")
plt.xlabel("Protons")
plt.savefig(os.path.join(figure_dir, 'ENSDF_Z_E.png'), transparent=False, bbox_inches='tight', dpi=600)
C:\Users\Pedro\Anaconda3\envs\nucml\lib\site-packages\seaborn\_decorators.py:36: FutureWarning: Pass the following variables as keyword args: x, y. From version 0.12, the only valid positional argument will be `data`, and passing other arguments without an explicit keyword will result in an error or misinterpretation.
  warnings.warn(
C:\Users\Pedro\Anaconda3\envs\nucml\lib\site-packages\seaborn\_decorators.py:36: FutureWarning: Pass the following variables as keyword args: x, y. From version 0.12, the only valid positional argument will be `data`, and passing other arguments without an explicit keyword will result in an error or misinterpretation.
  warnings.warn(
../_images/notebooks_2_EDA_RIPL_8_1.png

Similarly, for neutrons:

[8]:
plt.figure(figsize=(14, 8))
sns.scatterplot(ensdf_df.N, ensdf_df.Energy, alpha=0.5, label="All Known Levels")
sns.scatterplot(ensdf_cutoff_df.N, ensdf_cutoff_df.Energy, alpha=0.5, label="RIPL Cut-Off")
plt.ylabel("Level Energy (MeV)")
plt.xlabel("Neutrons")
plt.savefig(os.path.join(figure_dir, 'ENSDF_N_E.png'), transparent=False, bbox_inches='tight', dpi=600)
C:\Users\Pedro\Anaconda3\envs\nucml\lib\site-packages\seaborn\_decorators.py:36: FutureWarning: Pass the following variables as keyword args: x, y. From version 0.12, the only valid positional argument will be `data`, and passing other arguments without an explicit keyword will result in an error or misinterpretation.
  warnings.warn(
C:\Users\Pedro\Anaconda3\envs\nucml\lib\site-packages\seaborn\_decorators.py:36: FutureWarning: Pass the following variables as keyword args: x, y. From version 0.12, the only valid positional argument will be `data`, and passing other arguments without an explicit keyword will result in an error or misinterpretation.
  warnings.warn(
../_images/notebooks_2_EDA_RIPL_10_1.png

Atomic Mass Number Distribution

[9]:
plt.figure(figsize=(14,8))
g = sns.kdeplot(ensdf_df.A, shade=True);
g.set(xlabel="Atomic Mass Number (A)", ylabel="Fraction")
plt.savefig(os.path.join(figure_dir, 'ENSDF_Atomic_Mass_Dist.png'), bbox_inches='tight', dpi=600)
../_images/notebooks_2_EDA_RIPL_12_0.png

Energy Distribution

[10]:
plt.figure(figsize=(14, 8))
sns.kdeplot(ensdf_df.Energy.values, shade=True, label="All Known Levels");
sns.kdeplot(ensdf_cutoff_df.Energy.values, shade=True, label='RIPL Cut-Off');
plt.xlabel("Energy [MeV]")
plt.ylabel("Fraction")
plt.savefig(os.path.join(figure_dir, 'ENSDF_E_Dist.png'), bbox_inches='tight', dpi=600)
../_images/notebooks_2_EDA_RIPL_14_0.png

Level Number Distribution

[11]:
plt.figure(figsize=(14,8))
sns.kdeplot(ensdf_df.Level_Number.values, shade=True, label="All Known Levels");
sns.kdeplot(ensdf_cutoff_df.Level_Number.values, shade=True, label="RIPL Cut-Off");
plt.xlabel("Level Number")
plt.ylabel("Fraction")
plt.savefig(os.path.join(figure_dir, 'ENSDF_L_Dist.png'), bbox_inches='tight', dpi=600)
../_images/notebooks_2_EDA_RIPL_16_0.png

Uranium and Chlorine Energy Distribution

[12]:
plt.figure(figsize=(14, 8))
chlorine = ensdf_df[ensdf_df.Element_w_A == "35Cl"]
uranium = ensdf_df[ensdf_df.Element_w_A == "233U"]
sns.kdeplot(chlorine.Energy.values, shade=True, label="Chlorine-25");
sns.kdeplot(uranium.Energy.values, shade=True, label='Uranium-235');
plt.xlabel("Energy [MeV]")
plt.ylabel("Fraction")
plt.legend()
plt.savefig(os.path.join(figure_dir, 'ENSDF_Cl_U_E_Dist.png'), bbox_inches='tight', dpi=600)
../_images/notebooks_2_EDA_RIPL_18_0.png

Energy vs Level Number

Ideally, we would like to model the level energy vs level number. It is a difficult challenge as there are collective states and more. Furthemore, when implementing the RIPL cut-off parameters, we lose more than half of the avaliable data for training.

[13]:
plt.figure(figsize=(14, 8))
sns.scatterplot(x='Level_Number', y='Energy', data=ensdf_df, alpha=0.4, label="All Known Levels")
sns.scatterplot(x='Level_Number', y='Energy', data=ensdf_cutoff_df, alpha=0.4, label="RIPL Cut-Off")
plt.xlabel("Level Number")
plt.ylabel("Level Energy")
plt.savefig(os.path.join(figure_dir, 'ENSDF_E_vs_L.png'), bbox_inches='tight', dpi=600)
../_images/notebooks_2_EDA_RIPL_20_0.png

NucML Plotting Utilities: Level and Level Density

For EXFOR, a useful future might be the level density at incident energies. To model RIPL/XUNDL we can take two approaches: (1) model the level energy as a function of the level number with the cut-off dataset or (2) model the level density. We can visualize the level density using NucML ensdf plotting utilities. Let us import it.

[14]:
import nucml.ensdf.plot as ensdf_plot

We can plot the level density of both the original and the cut-off dataset. All you need to do is pass both datasets to the level_density() function. For example, let us plot it for Chlorine 35:

[15]:
ensdf_plot.level_density(ensdf_df, 17, 35, df2=ensdf_cutoff_df, save=True, save_dir=figure_dir)
../_images/notebooks_2_EDA_RIPL_24_0.png

and for Uranium-233:

[16]:
ensdf_plot.level_density(ensdf_df, 92, 233, df2=ensdf_cutoff_df, save=True, save_dir=figure_dir)
../_images/notebooks_2_EDA_RIPL_26_0.png

Statistics Example - Chlorine-35 and Uranium-235

[17]:
from scipy import stats
[18]:
def pearson_corr(protons, neutrons, df):
    to_plot = df[(df["Z"] == protons) & (df["N"] == neutrons)].sort_values(by='Level_Number', ascending=True)
    pearson_coef, p_value = stats.pearsonr(to_plot['Level_Number'], to_plot['Energy'])
    print("Results for {}:".format(to_plot.Element_w_A.iloc[0]))
    print("The Pearson Correlation Coefficient is", pearson_coef, " with a P-value of P =", p_value)
[19]:
chlorine = ensdf_df[ensdf_df.Element_w_A == "35Cl"].sort_values(by='Level_Number', ascending=True)
uranium = ensdf_df[ensdf_df.Element_w_A == "235U"].sort_values(by='Level_Number', ascending=True)
[20]:
chlorine.iloc[:,:10].describe()
[20]:
Level_Number Energy Spin Parity Half_Life Gammas Num_Decay_Modes
count 352.000000 352.000000 352.000000 352.000000 1.020000e+02 352.000000 352.0
mean 176.500000 8.579085 0.735795 0.085227 4.559895e-13 3.227273 0.0
std 101.757883 2.029201 2.180220 0.648125 3.147776e-12 4.781201 0.0
min 1.000000 0.000000 -1.000000 -1.000000 3.740000e-20 0.000000 0.0
25% 88.750000 7.691525 -1.000000 0.000000 6.124750e-18 0.000000 0.0
50% 176.500000 8.734000 0.500000 0.000000 1.901000e-16 0.000000 0.0
75% 264.250000 9.877750 2.500000 1.000000 1.350000e-14 6.000000 0.0
max 352.000000 13.900000 11.500000 1.000000 3.080000e-11 17.000000 0.0
[21]:
pd.DataFrame(chlorine.iloc[:,:10].corr()).sort_values(by='Energy', ascending=False).head()
[21]:
Level_Number Energy Spin Parity Half_Life Gammas Num_Decay_Modes
Energy 0.939196 1.000000 -0.170705 -0.138014 -0.199070 -0.302143 NaN
Level_Number 1.000000 0.939196 -0.129374 -0.080478 -0.149506 -0.425753 NaN
Parity -0.080478 -0.138014 0.092596 1.000000 -0.118308 -0.015462 NaN
Spin -0.129374 -0.170705 1.000000 0.092596 0.174749 0.180695 NaN
Half_Life -0.149506 -0.199070 0.174749 -0.118308 1.000000 -0.039308 NaN
[22]:
pearson_corr(17, 35-17, ensdf_df)
Results for 35Cl:
The Pearson Correlation Coefficient is 0.9391959945279038  with a P-value of P = 1.508712295824325e-164
[23]:
pearson_corr(17, 35-17, ensdf_cutoff_df)
Results for 35Cl:
The Pearson Correlation Coefficient is 0.9020790472746514  with a P-value of P = 1.0588253532475289e-14

We can create correlation by using the log on both energy and level number.

[24]:
ensdf_df_log = ensdf_df.copy()
ensdf_cutoff_df_log = ensdf_cutoff_df.copy()
[25]:
ensdf_df_log.Level_Number = np.log10(ensdf_df_log.Level_Number)
ensdf_df_log = ensdf_df_log[ensdf_df_log.Energy != 0]
ensdf_df_log.Energy = np.log10(ensdf_df_log.Energy)
[26]:
ensdf_cutoff_df_log.Level_Number = np.log10(ensdf_cutoff_df_log.Level_Number)
ensdf_cutoff_df_log = ensdf_cutoff_df_log[ensdf_cutoff_df_log.Energy != 0]
ensdf_cutoff_df_log.Energy = np.log10(ensdf_cutoff_df_log.Energy)
[27]:
pearson_corr(17, 35-17, ensdf_df_log)
Results for 35Cl:
The Pearson Correlation Coefficient is 0.9757198447187734  with a P-value of P = 2.959948935021139e-232
[28]:
pearson_corr(17, 35-17, ensdf_cutoff_df_log)
Results for 35Cl:
The Pearson Correlation Coefficient is 0.9635162177216403  with a P-value of P = 1.279072260851587e-21

In the log scale, the energy is highly correlated. The p-value results in 1% confidence that this correlation is significant. We, therefore, expect that a linear model will work in this data. However is not as simple as that, we know that this dependency might start as linear but start deviating at some point.