Atomic Mass Evaluation (2016)

In this brief notebook, we use the processed AME data to explore and visualize some attributes. Let us start by importing the necessary packages.

NOTE: This notebook is not meant to be a complete exploration resource. You are responsible for exploring and validating the data.

[1]:
# # PROTOTYPE
# import sys
# sys.path.append("../..")
[2]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import os

pd.set_option('display.max_columns', 500)

import nucml.datasets as nuc_data
[3]:
# This is where our figures will be stored
fig_dir = "Figures/"

Loading Merged AME Files with Natural Data, with and without NaNs

[4]:
ame = nuc_data.load_ame()
ame_filled = nuc_data.load_ame(natural=True, imputed_nan=True)
INFO:root:AME: Reading and loading Atomic Mass Evaluation files from:
 C:/Users/Pedro/Desktop/ML_Nuclear_Data/AME/CSV_Files\AME_all_merged.csv
INFO:root:AME: Reading and loading Atomic Mass Evaluation files from:
 C:/Users/Pedro/Desktop/ML_Nuclear_Data/AME/CSV_Files\AME_Natural_Properties_no_NaN.csv
[5]:
ame.shape
[5]:
(3436, 65)
[6]:
# How many rows with missing values exist?
rows_w_missing = ame[ame.isnull().any(axis=1)].shape[0]
print("{:.2f}% of the rows have missing values.".format(100 * (rows_w_missing/ame.shape[0])))
46.16% of the rows have missing values.

We can get some quick statistics for each numerical feature of the AME dataset. The count row gives the number of non-missing values per feature, so its difference from the total number of rows (3436) is the number of missing values for that feature.

[16]:
# What is the minimum/maximum of the features in ame
ame.describe()
[16]:
N Z A Mass_Excess dMass_Excess Binding_Energy dBinding_Energy B_Decay_Energy dB_Decay_Energy Atomic_Mass_Micro dAtomic_Mass_Micro S(2n) dS(2n) S(2p) dS(2p) Q(a) dQ(a) Q(2B-) dQ(2B-) Q(ep) dQ(ep) Q(B-n) dQ(B-n) S(n) dS(n) S(p) dS(p) Q(4B-) dQ(4B-) Q(d,a) dQ(d,a) Q(p,a) dQ(p,a) Q(n,a) dQ(n,a) Q(g,p) Q(g,n) Q(g,pn) Q(g,d) Q(g,t) Q(g,He3) Q(g,2p) Q(g,2n) Q(g,a) Q(p,n) Q(p,2p) Q(p,pn) Q(p,d) Q(p,2n) Q(p,t) Q(p,3He) Q(n,2p) Q(n,np) Q(n,d) Q(n,2n) Q(n,t) Q(n,3He) Q(d,t) Q(d,3He) Q(3He,t) Q(3He,a) Q(t,a)
count 3436.000000 3436.000000 3436.000000 3436.000000 3436.000000 3436.000000 3436.000000 3141.000000 3141.000000 3.436000e+03 3436.000000 3199.000000 3199.000000 3081.000000 3081.000000 3298.000000 3298.000000 2848.000000 2848.000000 2964.000000 2964.000000 3023.000000 3023.000000 3318.000000 3318.000000 3258.000000 3258.000000 2284.000000 2284.000000 3367.000000 3367.000000 3331.000000 3331.000000 3195.000000 3195.000000 3258.000000 3318.000000 3367.000000 3367.000000 3331.000000 3195.000000 3081.000000 3199.000000 3298.000000 3141.000000 3258.000000 3318.000000 3318.000000 3023.000000 3199.000000 3367.000000 2964.000000 3258.000000 3258.000000 3318.000000 3367.000000 3081.000000 3318.000000 3258.000000 3141.000000 3318.000000 3258.000000
mean 82.034051 57.857392 139.891444 -24144.120957 123.588536 7959.806728 1.838921 -100.991337 155.851106 1.398655e+08 132.654412 15464.510253 156.349187 13711.908945 153.876813 -1028.352902 141.749175 -158.406791 148.478968 -6755.747156 150.761589 -7807.666279 151.842233 7755.557459 164.216263 6869.773398 160.008898 -324.475306 162.191690 11414.314298 175.347868 5894.960498 169.043993 6792.835383 160.967815 -6869.773398 -7755.557459 -14656.779602 -12432.213602 -13918.904402 -13784.784017 -13711.908945 -15464.510253 -1028.352902 -883.337837 -6869.773398 -7755.557459 -5530.991459 -8590.012779 -6982.715353 -6938.739202 -5973.400656 -6869.773398 -4645.207398 -7755.557459 -6174.984602 -5993.868545 -1498.328459 -1376.298998 -119.583337 12822.061941 12944.091502
std 43.293558 27.809406 70.599410 56200.705700 197.547987 738.982115 15.031735 8063.858254 239.079983 7.063095e+07 212.043923 6550.042919 242.338186 10078.915472 235.628998 6989.405614 233.060316 14319.652534 217.688173 12278.480250 225.980065 10536.785128 229.821244 3631.746683 254.377651 5444.802214 250.538488 23373.457604 203.785478 3977.016875 264.596607 4615.611122 257.035502 8215.124941 246.518545 5444.802214 3631.746683 3977.016875 3977.016875 4615.611122 8215.124941 10078.915472 6550.042919 6989.405614 8063.858254 5444.802214 3631.746683 3631.746683 10536.785128 6550.042919 3977.016875 12278.480250 5444.802214 5444.802214 3631.746683 3977.016875 10078.915472 3631.746683 5444.802214 8063.858254 3631.746683 5444.802214
min 0.000000 0.000000 1.000000 -91652.853000 0.000000 -2267.000000 0.000000 -28945.000000 0.000000 1.007825e+06 0.000000 -3120.000000 0.000000 -7630.000000 0.000000 -25474.730000 0.000000 -37359.770000 0.000000 -52959.000000 0.000000 -39622.000000 0.000000 -2488.000000 0.000000 -4527.000000 0.000000 -59615.000000 0.140000 -4128.000000 0.000000 -13545.000000 0.000000 -26083.000000 0.000000 -31008.000000 -27715.000000 -30199.093900 -27974.527900 -33358.864900 -46660.619400 -55187.000000 -40541.000000 -25474.730000 -29727.346500 -31008.000000 -27715.000000 -25490.434000 -40404.346500 -32059.205100 -22481.053500 -52176.653500 -31008.000000 -28783.434000 -27715.000000 -21717.298900 -47468.959600 -21457.771000 -25514.525600 -28963.592000 -7137.380600 -11194.135100
25% 47.000000 36.000000 84.000000 -65400.443000 3.005750 7730.734250 0.024000 -5356.454000 6.406000 8.392781e+07 3.226500 11068.100000 5.400000 5819.440000 6.550000 -6262.357500 3.782500 -9937.682500 8.100000 -14125.250000 7.277500 -14561.930000 7.100000 5414.787500 5.140000 2772.250000 5.842500 -17485.152500 11.670000 8767.635000 6.160000 3353.955000 5.900000 1986.590000 6.190000 -9949.867500 -9654.657500 -17303.458900 -15078.892900 -16459.909900 -18591.029400 -19423.390000 -19082.195000 -6262.357500 -6138.800500 -9949.867500 -9654.657500 -7430.091500 -15344.276500 -10600.400100 -9585.418500 -13342.903500 -9949.867500 -7725.301500 -9654.657500 -8821.663900 -11705.349600 -3397.428500 -4456.393100 -5375.046000 10922.961900 9863.997400
50% 81.000000 58.000000 139.000000 -39335.024000 14.474000 8073.000000 0.094000 -858.530000 27.916000 1.389127e+08 15.537500 14614.250000 23.320000 11755.380000 26.170000 -430.320000 18.690000 -1685.250000 28.090000 -4924.250000 28.120000 -8243.100000 27.950000 7249.960000 25.005000 5799.335000 24.815000 -2903.405000 42.070000 11813.880000 28.990000 6308.000000 26.760000 8407.020000 27.600000 -5799.335000 -7249.960000 -14257.213900 -12032.647900 -13505.864900 -12170.599400 -11755.380000 -14614.250000 -430.320000 -1640.876500 -5799.335000 -7249.960000 -5025.394000 -9025.446500 -6132.455100 -6539.173500 -4141.903500 -5799.335000 -3574.769000 -7249.960000 -5775.418900 -4037.339600 -992.731000 -305.860600 -877.122000 13327.659400 14014.529900
75% 114.000000 80.000000 194.000000 1172.301000 196.000000 8367.375000 1.000000 4700.000000 236.000000 1.939668e+08 211.000000 19082.195000 228.000000 19423.390000 229.590000 4561.017500 200.000000 8501.250000 263.052500 1947.500000 251.000000 -1284.525000 235.500000 9654.657500 269.057500 9949.867500 232.250000 14141.572500 298.000000 14414.505000 284.000000 9053.995000 277.000000 13247.000000 259.275000 -2772.250000 -5414.787500 -11656.588900 -9432.022900 -10759.869900 -7330.619400 -5819.440000 -11068.100000 4561.017500 3917.653500 -2772.250000 -5414.787500 -3190.221500 -2066.871500 -2586.305100 -3938.548500 2729.846500 -2772.250000 -547.684000 -5414.787500 -3174.793900 1898.600400 842.441500 2721.224400 4681.408000 15162.831900 17041.614900
max 177.000000 118.000000 295.000000 201512.000000 2003.000000 8794.553000 667.000000 31687.000000 2003.000000 2.952163e+08 2150.000000 40541.000000 2003.000000 55187.000000 2014.000000 11920.000000 2042.000000 52098.000000 2003.000000 28352.000000 2003.000000 31755.000000 2003.000000 27715.000000 2011.000000 31008.000000 2832.000000 77377.000000 2019.000000 23846.530000 2830.000000 21413.860000 2003.000000 24299.000000 2830.000000 4527.000000 2488.000000 -2224.563900 0.002100 1599.995100 3721.380600 7630.000000 3120.000000 11920.000000 30904.653500 4527.000000 2488.000000 4712.566000 30972.653500 11601.794900 5493.476500 29134.346500 4527.000000 6751.566000 2488.000000 6257.231100 15348.040400 8745.229000 10020.474400 31668.408000 23065.619400 24340.864900
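
The missing-value count per feature can also be computed directly rather than read off the count row. The following is a minimal sketch on a small made-up DataFrame (the `toy` frame and its values are hypothetical stand-ins, not AME data); the same calls should apply to `ame`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the AME DataFrame; the values are made up.
toy = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "S(n)": [np.nan, 1.0, 2.0, 3.0],
    "Q(a)": [np.nan, np.nan, np.nan, 4.0],
})

# describe() reports the number of non-null entries per column ...
non_null = toy.describe().loc["count"]

# ... so rows minus count equals the number of missing values,
# which is what isnull().sum() returns directly.
missing = toy.isnull().sum()
missing_pct = 100 * missing / len(toy)

summary = pd.DataFrame({"missing": missing, "missing_pct": missing_pct})
print(summary.sort_values("missing", ascending=False))
```

Sorting by the missing count makes it easy to see which features rely most heavily on imputation.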

Visualizing some Features

[19]:
sns.set(style="white", font_scale=2)

Binding Energy per Nucleon

We can easily plot the binding energy per nucleon using seaborn:

[20]:
sns.relplot(x="A", y="Binding_Energy", kind="line", data=ame, height=8, aspect=1.5, ci="sd")
plt.xlabel("Mass Number (A)")
plt.ylabel("Binding Energy per Nucleon (keV)")
plt.savefig(os.path.join(fig_dir, "AME_BE_per_A.png"), bbox_inches='tight', dpi=500)
../_images/notebooks_2_EDA_AME_13_0.png

Packing Fraction

[23]:
ame["packing_fraction"] = ((ame.Atomic_Mass_Micro/1E6) - ame.A) / ame.A
[24]:
g = sns.relplot(x="A", y="packing_fraction", kind="line", data=ame, height=8, aspect=1.5, ci="sd")
plt.xlabel("Mass Number (A)")
plt.ylabel("Packing Fraction")
g.axes[0][0].axhline(0, ls='--')
plt.savefig(os.path.join(fig_dir, "AME_packing_fraction.png"), bbox_inches='tight', dpi=500)
../_images/notebooks_2_EDA_AME_16_0.png
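
The packing fraction computed above is (M - A) / A, where M is the atomic mass in u (the `Atomic_Mass_Micro` column is in micro-u, hence the division by 1E6). As a quick sanity check, the sketch below evaluates the same expression for two single values; the helper name `packing_fraction` is introduced here only for illustration.

```python
def packing_fraction(atomic_mass_micro_u, mass_number):
    """Return (M - A) / A, with M converted from micro-u to u."""
    mass_u = atomic_mass_micro_u / 1e6
    return (mass_u - mass_number) / mass_number

# Smallest Atomic_Mass_Micro in the describe() output (1.007825e+06),
# which corresponds to hydrogen-1 (A = 1).
print(packing_fraction(1.007825e6, 1))

# Carbon-12 defines the atomic mass unit, so its mass is exactly 12 u
# and its packing fraction is zero.
print(packing_fraction(12.0e6, 12))
```

The dashed horizontal line in the figure marks this zero level: nuclides below it have an atomic mass smaller than their mass number.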

Proton-Neutron Behaviour

[25]:
# DataFrame.plot creates its own figure, so no separate plt.figure() call is needed
ax = ame.plot(x="Z", y="N", kind='scatter', figsize=(14,10))
# Reference line N = Z
ax.plot(ame.Z, ame.Z, linewidth=4, color="orange")
ax.set_xlabel("Protons (Z)")
ax.set_ylabel("Neutrons (N)")
plt.savefig(os.path.join(fig_dir, "AME_Z_vs_N.png"), bbox_inches='tight', dpi=500)
../_images/notebooks_2_EDA_AME_18_2.png

Original vs Imputed AME Dataset

As mentioned in the loading data section of the documentation, the values for the imputed version of the AME dataset were obtained by linear interpolation. We can visualize the imputed values by overlaying the AME datasets with and without NaN values. For example, let us visualize the Mass Excess for uranium (Z = 92), natural uranium included (remember that we needed natural data for EXFOR’s natural target samples):

[27]:
plt.figure(figsize=(18,8))

plt.scatter(ame_filled[ame_filled.Z == 92].sort_values(by="A").A,
            ame_filled[ame_filled.Z == 92].sort_values(by="A").Mass_Excess)
plt.scatter(ame[ame.Z == 92].sort_values(by="A").A,
            ame[ame.Z == 92].sort_values(by="A").Mass_Excess)
plt.ylabel("Mass Excess (keV)")
plt.xlabel("Atomic Mass Number")
[27]:
Text(0.5, 0, 'Atomic Mass Number')
../_images/notebooks_2_EDA_AME_20_1.png
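
To make the idea of linear interpolation concrete, here is a minimal sketch using pandas on a made-up isotope chain. It is not the routine used to build the imputed AME file; the `chain` frame, its `feature` column, and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical isotope chain: the feature is known at A = 230 and A = 233 only.
chain = pd.DataFrame({
    "A": [230, 231, 232, 233],
    "feature": [10.0, np.nan, np.nan, 16.0],
})

# Index by A so that method="index" interpolates linearly in the mass number
# rather than in the row position.
filled = chain.set_index("A")["feature"].interpolate(method="index")
print(filled)
```

The two missing entries are placed on the straight line joining the known neighbours, which is why imputed points in the figure above fall between reported values.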

We can automate the creation of these plots:

[28]:
def plot_comparison(protons, feature, ax=None):
    """Overlay imputed (ame_filled) and original (ame) values of `feature` for a given Z."""
    if ax is None:
        ax = plt.gca()
    # Filter and sort each dataset once
    filled = ame_filled[ame_filled.Z == protons].sort_values(by="A")
    original = ame[ame.Z == protons].sort_values(by="A")
    # Imputed dataset first, so the original points are drawn on top
    ax.scatter(filled.A, filled[feature])
    ax.scatter(original.A, original[feature])
    ax.set_ylabel(feature.replace("_", " "))
    # The x-axis label is left to the caller

In the following plots, the blue points represent imputed values, either for a natural data point or for an isotope for which no value was reported. In some cases these values are not applicable, but imputation is necessary for compatibility with some ML models. Let us plot the Mass Excess, Binding Energy, Beta Decay Energy, and Neutron Separation Energy in a single figure with four subplots:

[29]:
# make figure with subplots
f, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, sharex=False, figsize=(25,14))
plot_comparison(92, "Mass_Excess", ax1)
plot_comparison(92, "B_Decay_Energy", ax2)
plot_comparison(92, "Binding_Energy", ax3)
plot_comparison(92, "S(n)", ax4)

ax1.set_xlabel("Atomic Mass Number")
ax2.set_xlabel("Atomic Mass Number")
ax3.set_xlabel("Atomic Mass Number")
ax4.set_xlabel("Atomic Mass Number")

f.savefig(os.path.join(fig_dir, "AME_NaN.png"), bbox_inches='tight', dpi=500)
../_images/notebooks_2_EDA_AME_24_0.png

While not perfect, the imputed dataset is better than not using the AME database at all. The AME database contains very useful information that a model can leverage to make better predictions.