Experimental Nuclear Reaction Data (EXFOR)¶
The Experimental Nuclear Reaction Data (EXFOR) library contains reaction data for a variety of projectiles. NucML makes it easy to download and parse the EXFOR C4 files to create ready-to-use ML datasets. Let us start by importing the nucml.datasets module.
[1]:
# # For prototype
# import sys
# sys.path.append("../..")
[2]:
import nucml.datasets as nuc_data
import pandas as pd
pd.set_option('display.max_columns', 500)
Original EXFOR Tables¶
During setup, several EXFOR CSV files were created for each reaction-inducing particle. The original tables are stored in your local directory and follow a specific naming convention: EXFOR/CSV_Files/EXFOR_<projectile>/EXFOR_<projectile>_ORIGINAL.csv. You can read them directly with pandas or, alternatively, let NucML handle the loading.
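If you prefer to read one of these files directly with pandas, a minimal sketch could look like the following. The helper names exfor_raw_path and read_exfor_raw and the data_root argument are hypothetical (they are not part of NucML); only the directory convention comes from the text above. Passing low_memory=False is one way to avoid the mixed-type DtypeWarning that pandas can raise on large files.

```python
from pathlib import Path

import pandas as pd


def exfor_raw_path(data_root, projectile):
    """Build the path to the original EXFOR CSV for one projectile.

    Follows the EXFOR/CSV_Files/EXFOR_<projectile>/EXFOR_<projectile>_ORIGINAL.csv
    convention. `data_root` (hypothetical) is the folder that contains EXFOR/.
    """
    folder = f"EXFOR_{projectile}"
    return Path(data_root) / "EXFOR" / "CSV_Files" / folder / f"{folder}_ORIGINAL.csv"


def read_exfor_raw(data_root, projectile):
    """Read the original EXFOR table for one projectile into a DataFrame."""
    # low_memory=False makes pandas infer each column's dtype from the whole
    # file instead of chunk by chunk, at the cost of more memory.
    return pd.read_csv(exfor_raw_path(data_root, projectile), low_memory=False)


if __name__ == "__main__":
    # "ML_Nuclear_Data" is only an example root folder.
    path = exfor_raw_path("ML_Nuclear_Data", "neutrons")
    print(path)
    if path.exists():  # only read when the file is actually present
        neutrons = read_exfor_raw("ML_Nuclear_Data", "neutrons")
        print(neutrons.shape)
```

Keeping the path construction separate from the read makes it easy to loop over several projectiles or to check that a file exists before loading it.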
[4]:
neutrons = nuc_data.load_exfor_raw(mode="neutrons")
C:\Users\Pedro\Anaconda3\envs\ml_gpu\lib\site-packages\IPython\core\interactiveshell.py:3357: DtypeWarning: Columns (17,30,31) have mixed types.Specify dtype option on import or set low_memory=False.
if (await self.run_code(code, result, async_=asy)):
[6]:
print("There are {} neutron-related data points.".format(neutrons.shape[0]))
There are 6007126 neutron-related data points.
[8]:
neutrons.columns
[8]:
Index(['Projectile', 'Target_Metastable_State', 'MF', 'MT',
'Product_Metastable_State', 'EXFOR_Status', 'Center_of_Mass_Flag',
'Energy', 'dEnergy', 'Data', 'dData', 'Cos/LO', 'dCos/LO', 'ELV/HL',
'dELV/HL', 'I78', 'Short_Reference', 'EXFOR_Accession_Number',
'EXFOR_SubAccession_Number', 'EXFOR_Pointer', 'Z', 'A', 'N',
'Reaction_Notation', 'Title', 'Year', 'Author', 'Institute', 'Date',
'Reference', 'Dataset_Number', 'EXFOR_Entry', 'Reference_Code',
'Projectile_Z', 'Projectile_A', 'Projectile_N', 'Isotope', 'Element'],
dtype='object')
Other supported projectiles are protons, alphas, deuterons, gammas, and helions. Alternatively, you can load the data for all projectiles at once by specifying mode="all".
Reaction Data (MF3) Only¶
EXFOR includes a variety of files (MF). In our work, we dealt only with reaction cross section versus energy data points (MF=3). NucML therefore offers many utilities for creating ML-ready, particle-induced reaction cross section datasets.
We can load all neutron-induced reaction cross section EXFOR data points by calling the load_exfor() function:
[9]:
exfor_neutrons = nuc_data.load_exfor(mode="neutrons")
INFO:root: MODE: neutrons
INFO:root: LOW ENERGY: False
INFO:root: LOG: False
INFO:root: BASIC: -1
INFO:root:Reading data from C:/Users/Pedro/Desktop/ML_Nuclear_Data/EXFOR/CSV_Files\EXFOR_neutrons/EXFOR_neutrons_MF3_AME_no_RawNaN.csv
INFO:root:Data read into dataframe with shape: (4255409, 104)
INFO:root:Finished. Resulting dataset has shape (4255409, 104)
When setting up NucML, the parsing and data creation utilities automatically created a CSV file that is almost ready for ML modeling. The load_exfor() function uses these preprocessed files to filter, prepare, and return a truly ML-ready dataset.
NOTE: You should always analyze and inspect any recommended dataset. The following loading options reflect our own choices. Ideally, you can start from the raw original EXFOR tables and build your own dataset.
You will notice that the returned dataset contains fewer data points than the original tables (4255409 versus 6007126 rows for neutrons). This is because we are only using the MF=3 points and applying some basic filtering operations. Notice also that the AME database information is already appended and that no missing values remain. For information on the filtering and imputation techniques used, see the documentation.
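To make the row reduction concrete, here is a small sketch of this kind of selection on a toy table. The column names MF, MT, Energy, and Data come from the raw table shown earlier; the values are made up, and the dropna step is only a hypothetical stand-in for the library's actual filtering and imputation, which is described in the documentation.

```python
import pandas as pd

# Toy stand-in for the raw table; the values are made up for illustration.
raw = pd.DataFrame(
    {
        "MF": [3, 3, 4, 3, 6],
        "MT": [1, 102, 2, 18, 5],
        "Energy": [1.0e3, 2.5e4, 1.0e6, None, 3.0e6],
        "Data": [12.1, 0.8, 0.05, 1.9, 0.3],
    }
)

# Keep only the cross-section-versus-energy points (MF=3).
mf3 = raw[raw["MF"] == 3]

# Drop rows with missing Energy or Data values (hypothetical stand-in for the
# library's own filtering and imputation steps).
mf3_clean = mf3.dropna(subset=["Energy", "Data"]).reset_index(drop=True)

print(f"raw: {len(raw)} rows, MF=3: {len(mf3)} rows, after cleaning: {len(mf3_clean)} rows")
```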
[10]:
exfor_neutrons.head()
[10]:
| | Projectile | Target_Metastable_State | MT | Product_Metastable_State | EXFOR_Status | Center_of_Mass_Flag | Energy | dEnergy | Data | dData | ELV/HL | dELV/HL | I78 | Short_Reference | EXFOR_Accession_Number | EXFOR_SubAccession_Number | EXFOR_Pointer | Z | Reaction_Notation | Title | Year | Author | Institute | Date | Reference | Dataset_Number | EXFOR_Entry | Reference_Code | Projectile_Z | Projectile_A | Projectile_N | Isotope | Element | N | A | Element_Flag | Nucleus_Radius | Neutron_Nucleus_Radius_Ratio | O | Mass_Excess | dMass_Excess | Binding_Energy | dBinding_Energy | B_Decay_Energy | dB_Decay_Energy | Atomic_Mass_Micro | dAtomic_Mass_Micro | S(2n) | dS(2n) | S(2p) | dS(2p) | Q(a) | dQ(a) | Q(2B-) | dQ(2B-) | Q(ep) | dQ(ep) | Q(B-n) | dQ(B-n) | S(n) | dS(n) | S(p) | dS(p) | Q(4B-) | dQ(4B-) | Q(d,a) | dQ(d,a) | Q(p,a) | dQ(p,a) | Q(n,a) | dQ(n,a) | Q(g,p) | Q(g,n) | Q(g,pn) | Q(g,d) | Q(g,t) | Q(g,He3) | Q(g,2p) | Q(g,2n) | Q(g,a) | Q(p,n) | Q(p,2p) | Q(p,pn) | Q(p,d) | Q(p,2n) | Q(p,t) | Q(p,3He) | Q(n,2p) | Q(n,np) | Q(n,d) | Q(n,2n) | Q(n,t) | Q(n,3He) | Q(d,t) | Q(d,3He) | Q(3He,t) | Q(3He,a) | Q(t,a) | N_valence | Z_valence | P_factor | N_tag | Z_tag | NZ_tag |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | neutron | All_or_Total | 1 | All_or_Total | Dependent | Lab | 88200000.0 | 882000.0 | 0.0300 | 0.001523 | 0.0 | 0.0 | Other | D.F.MEASDAY,ET.AL. (66) | 11152 | 2 | No Pointer | 0 | 0-NN-1(N,TOT),,SIG | NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO... | 1966 | D.F.Measday+ | 1USAHRV | 1980/08/04 | Jour. Nuclear Physics Vol.85, p.142, 1966 | 11152002 | 11152 | (J,NP,85,142,6609) | 0 | 1 | 1 | 1n | n | 1 | 1 | I | 1.25 | 0.64 | Other | 8071.31713 | 0.00046 | 0.0 | 0.0 | 782.347 | 0.0 | 1.008665e+06 | 0.00049 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0005 | 0.0 | 0.0 | 2224.566 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 6257.229 | 0.0 | 763.755 | 20577.6194 | 0.0 | 1 | 2 | 0.666667 | odd | even | odd_even |
1 | neutron | All_or_Total | 1 | All_or_Total | Dependent | Lab | 98100000.0 | 981000.0 | 0.0291 | 0.001516 | 0.0 | 0.0 | Other | D.F.MEASDAY,ET.AL. (66) | 11152 | 2 | No Pointer | 0 | 0-NN-1(N,TOT),,SIG | NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO... | 1966 | D.F.Measday+ | 1USAHRV | 1980/08/04 | Jour. Nuclear Physics Vol.85, p.142, 1966 | 11152002 | 11152 | (J,NP,85,142,6609) | 0 | 1 | 1 | 1n | n | 1 | 1 | I | 1.25 | 0.64 | Other | 8071.31713 | 0.00046 | 0.0 | 0.0 | 782.347 | 0.0 | 1.008665e+06 | 0.00049 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0005 | 0.0 | 0.0 | 2224.566 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 6257.229 | 0.0 | 763.755 | 20577.6194 | 0.0 | 1 | 2 | 0.666667 | odd | even | odd_even |
2 | neutron | All_or_Total | 1 | All_or_Total | Dependent | Lab | 110000000.0 | 1100000.0 | 0.0279 | 0.001415 | 0.0 | 0.0 | Other | D.F.MEASDAY,ET.AL. (66) | 11152 | 2 | No Pointer | 0 | 0-NN-1(N,TOT),,SIG | NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO... | 1966 | D.F.Measday+ | 1USAHRV | 1980/08/04 | Jour. Nuclear Physics Vol.85, p.142, 1966 | 11152002 | 11152 | (J,NP,85,142,6609) | 0 | 1 | 1 | 1n | n | 1 | 1 | I | 1.25 | 0.64 | Other | 8071.31713 | 0.00046 | 0.0 | 0.0 | 782.347 | 0.0 | 1.008665e+06 | 0.00049 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0005 | 0.0 | 0.0 | 2224.566 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 6257.229 | 0.0 | 763.755 | 20577.6194 | 0.0 | 1 | 2 | 0.666667 | odd | even | odd_even |
3 | neutron | All_or_Total | 1 | All_or_Total | Dependent | Lab | 119600000.0 | 1196000.0 | 0.0264 | 0.001403 | 0.0 | 0.0 | Other | D.F.MEASDAY,ET.AL. (66) | 11152 | 2 | No Pointer | 0 | 0-NN-1(N,TOT),,SIG | NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO... | 1966 | D.F.Measday+ | 1USAHRV | 1980/08/04 | Jour. Nuclear Physics Vol.85, p.142, 1966 | 11152002 | 11152 | (J,NP,85,142,6609) | 0 | 1 | 1 | 1n | n | 1 | 1 | I | 1.25 | 0.64 | Other | 8071.31713 | 0.00046 | 0.0 | 0.0 | 782.347 | 0.0 | 1.008665e+06 | 0.00049 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0005 | 0.0 | 0.0 | 2224.566 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 6257.229 | 0.0 | 763.755 | 20577.6194 | 0.0 | 1 | 2 | 0.666667 | odd | even | odd_even |
4 | neutron | All_or_Total | 1 | All_or_Total | Dependent | Lab | 129400000.0 | 1294000.0 | 0.0256 | 0.001397 | 0.0 | 0.0 | Other | D.F.MEASDAY,ET.AL. (66) | 11152 | 2 | No Pointer | 0 | 0-NN-1(N,TOT),,SIG | NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO... | 1966 | D.F.Measday+ | 1USAHRV | 1980/08/04 | Jour. Nuclear Physics Vol.85, p.142, 1966 | 11152002 | 11152 | (J,NP,85,142,6609) | 0 | 1 | 1 | 1n | n | 1 | 1 | I | 1.25 | 0.64 | Other | 8071.31713 | 0.00046 | 0.0 | 0.0 | 782.347 | 0.0 | 1.008665e+06 | 0.00049 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0005 | 0.0 | 0.0 | 2224.566 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 6257.229 | 0.0 | 763.755 | 20577.6194 | 0.0 | 1 | 2 | 0.666667 | odd | even | odd_even |
You can similarly load other particle-induced data, such as proton-induced reaction cross section data.
ML-ready EXFOR datasets¶
The returned exfor_neutrons DataFrame is only a starting point for a truly ML-ready dataset. The load_exfor() function has more than 10 customizable options that allow you to apply specific normalizers, transformations, numerical thresholds, data splitting, and more.
Let us create an example ML-ready dataset:
[11]:
data, x_train, x_test, y_train, y_test, to_scale, scaler = nuc_data.load_exfor(
    mode="neutrons", log=True, low_en=True, max_en=2.0E7, num=True, basic=0, normalize=True, filters=True)
INFO:root: MODE: neutrons
INFO:root: LOW ENERGY: True
INFO:root: LOG: True
INFO:root: BASIC: 0
INFO:root:Reading data from C:/Users/Pedro/Desktop/ML_Nuclear_Data/EXFOR/CSV_Files\EXFOR_neutrons/EXFOR_neutrons_MF3_AME_no_RawNaN.csv
INFO:root:Data read into dataframe with shape: (4184115, 8)
INFO:root:Splitting dataset into training and testing...
INFO:root:Normalizing dataset...
INFO:root:Fitting new scaler.
[12]:
x_train.head()
[12]:
| | Energy | Z | N | A | MT_1 | MT_101 | MT_102 | MT_103 | MT_104 | MT_105 | MT_106 | MT_107 | MT_108 | MT_111 | MT_112 | MT_113 | MT_155 | MT_158 | MT_159 | MT_16 | MT_17 | MT_18 | MT_2 | MT_22 | MT_24 | MT_28 | MT_29 | MT_3 | MT_32 | MT_33 | MT_4 | MT_41 | MT_51 | MT_9000 | MT_9001 | Center_of_Mass_Flag_Center_of_Mass | Center_of_Mass_Flag_Lab | Element_Flag_I | Element_Flag_N |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1544604 | 7.024445 | -0.788298 | -0.851543 | -0.828075 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
2653877 | 2.323788 | 0.637075 | 0.590841 | 0.608525 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
3608162 | 4.986324 | 1.225816 | 1.246470 | 1.239227 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
3587225 | 2.129271 | 1.225816 | 1.246470 | 1.239227 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
784173 | 5.957368 | -0.974217 | -0.963937 | -0.968231 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
Let us review what the code is doing here:
- mode="neutrons": specifies which projectile particle to extract data for.
- log=True: takes the logarithm of both the Energy and Cross Section features.
- low_en=True: keeps only data below a certain energy threshold (useful for modeling low-energy reaction data).
- max_en=2.0E7: the energy threshold used by the low_en argument.
- basic=0: the set of features to be returned. See the documentation for other options.
- normalize=True: indicates that we want to normalize/standardize our dataset.
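The following sketch illustrates, on a toy table, the kind of transformations these arguments describe. It is not NucML's implementation: the strict inequality for the threshold, the base-10 logarithm, eV as the energy unit, and the use of pd.get_dummies for the MT columns are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Toy data; Energy is assumed to be in eV and Data is the cross section.
df = pd.DataFrame(
    {
        "Energy": [1.0e2, 1.0e5, 1.0e7, 1.0e8],
        "Data": [10.0, 1.0, 0.1, 0.01],
        "MT": [1, 102, 1, 18],
    }
)

max_en = 2.0e7

# low_en=True with max_en: keep only points below the energy threshold
# (whether the bound is inclusive is an assumption here).
low = df[df["Energy"] < max_en].copy()

# log=True: log-transform Energy and cross section (base 10 is an assumption).
low["Energy"] = np.log10(low["Energy"])
low["Data"] = np.log10(low["Data"])

# One-hot encode the reaction channel, giving columns such as MT_1 and MT_102.
features = pd.get_dummies(low, columns=["MT"], dtype=int)
print(features)
```

In this toy case the 1.0e8 point is dropped by the threshold, so no MT_18 column appears in the result. This mirrors how the set of MT_* columns in x_train depends on which reactions survive the filtering.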
Notice that the loader returned 7 objects:
- data: contains the raw unprocessed data.
- x_train, x_test, y_train, y_test: the splits of the data that can be used to train an ML model. The default fraction is 0.1. You can change it by setting a different frac value.
- to_scale: the names of the features that were subject to normalization.
- scaler: the scaler object, which can be used to transform future data or to inverse-transform any split back to its original scale.
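As a rough illustration of how these objects relate to each other, the sketch below builds a split, scales only the columns listed in to_scale, and then recovers the original values. MeanStdScaler is a hypothetical minimal stand-in for whatever scaler object the loader returns, the toy values are random, and interpreting the 0.1 fraction as the held-out share is an assumption.

```python
import numpy as np
import pandas as pd


class MeanStdScaler:
    """Hypothetical minimal scaler: standardizes columns to zero mean, unit variance."""

    def fit(self, frame):
        self.mean_ = frame.mean()
        self.std_ = frame.std(ddof=0)
        return self

    def transform(self, frame):
        return (frame - self.mean_) / self.std_

    def inverse_transform(self, frame):
        return frame * self.std_ + self.mean_


# Toy numeric features and target; the values are random, for illustration only.
rng = np.random.default_rng(0)
data = pd.DataFrame(
    {
        "Energy": rng.uniform(0.0, 7.0, size=50),
        "Z": rng.integers(1, 93, size=50).astype(float),
        "MT_1": rng.integers(0, 2, size=50),
        "Data": rng.normal(size=50),
    }
)

# Hold out a fraction of the rows (0.1 here, assumed to be the held-out share).
test = data.sample(frac=0.1, random_state=0)
train = data.drop(test.index)
x_train, y_train = train.drop(columns="Data"), train["Data"]
x_test, y_test = test.drop(columns="Data"), test["Data"]

# Only continuous features are scaled; one-hot columns are left untouched.
to_scale = ["Energy", "Z"]
scaler = MeanStdScaler().fit(x_train[to_scale])  # fit on the training split only

x_train_scaled = x_train.copy()
x_test_scaled = x_test.copy()
x_train_scaled[to_scale] = scaler.transform(x_train[to_scale])
x_test_scaled[to_scale] = scaler.transform(x_test[to_scale])

# Recover the original scales of the test split.
x_test_restored = scaler.inverse_transform(x_test_scaled[to_scale])
```

Fitting the scaler on the training split only, and reusing it for the test split and any future data, avoids leaking test-set statistics into the model inputs.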
There are many more options you can customize, including the type of normalizer and transformer. Additionally, all core arguments reviewed here apply to the other projectile particles. For example, for protons:
[15]:
data, x_train, x_test, y_train, y_test, to_scale, scaler = nuc_data.load_exfor(
    mode="protons", log=True, low_en=False, num=True, basic=0, normalize=True, filters=True)
INFO:root: MODE: protons
INFO:root: LOW ENERGY: False
INFO:root: LOG: True
INFO:root: BASIC: 0
INFO:root:Reading data from C:/Users/Pedro/Desktop/ML_Nuclear_Data/EXFOR/CSV_Files\EXFOR_protons\EXFOR_protons_MF3_AME_no_RawNaN.csv
INFO:root:Data read into dataframe with shape: (137408, 8)
INFO:root:Splitting dataset into training and testing...
INFO:root:Normalizing dataset...
INFO:root:Fitting new scaler.
Notice that we set low_en=False, since proton-induced reaction data are usually measured at high energies.
[14]:
x_train.head()
[14]:
| | Energy | Z | N | A | MT_102 | MT_103 | MT_104 | MT_105 | MT_106 | MT_107 | MT_108 | MT_109 | MT_111 | MT_112 | MT_114 | MT_115 | MT_116 | MT_117 | MT_152 | MT_153 | MT_155 | MT_156 | MT_16 | MT_160 | MT_161 | MT_162 | MT_165 | MT_168 | MT_17 | MT_179 | MT_18 | MT_190 | MT_191 | MT_192 | MT_193 | MT_198 | MT_22 | MT_23 | MT_24 | MT_25 | MT_28 | MT_29 | MT_3 | MT_37 | MT_4 | MT_41 | MT_42 | MT_44 | MT_45 | MT_51 | MT_9000 | MT_9001 | Center_of_Mass_Flag_Center_of_Mass | Center_of_Mass_Flag_Lab | Element_Flag_I | Element_Flag_N |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
28098 | 7.653213 | -0.741252 | -0.737797 | -0.739916 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 |
10127 | 7.017033 | -1.338033 | -1.218441 | -1.265850 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
110965 | 7.079181 | 0.878583 | 0.810945 | 0.837888 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 |
38532 | 8.447158 | -0.570743 | -0.630987 | -0.608432 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 |
72892 | 7.960851 | -0.101844 | -0.150343 | -0.131804 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 |
See the load_exfor() documentation for the full list of arguments and additional functionality.