{ "cells": [ { "cell_type": "markdown", "id": "connected-principle", "metadata": {}, "source": [ "# Experimental Nuclear Reaction Data (EXFOR)\n", "\n", "The Experimental Nuclear Reaction Data (EXFOR) contains reaction data information for a variety of projectiles. `NucML` makes it easy to download and parse the EXFOR `C4` files to create ready to use ML-datasets. Let us start by importing the `nucml.datasets` module." ] }, { "cell_type": "code", "execution_count": 1, "id": "exclusive-gamma", "metadata": { "ExecuteTime": { "end_time": "2021-05-06T18:32:30.049601Z", "start_time": "2021-05-06T18:32:30.046100Z" } }, "outputs": [], "source": [ "# # For prototype\n", "# import sys\n", "# sys.path.append(\"../..\")" ] }, { "cell_type": "code", "execution_count": 2, "id": "abroad-liechtenstein", "metadata": { "ExecuteTime": { "end_time": "2021-05-06T18:32:34.681334Z", "start_time": "2021-05-06T18:32:33.165630Z" } }, "outputs": [], "source": [ "import nucml.datasets as nuc_data\n", "import pandas as pd\n", "pd.set_option('display.max_columns', 500)" ] }, { "cell_type": "markdown", "id": "human-liverpool", "metadata": {}, "source": [ "## Original EXFOR Tables\n", "\n", "When setting up, several EXFOR `CSV` files were created for each reaction-inducing particle. These are stored in your local directory and follow a specific structure/convention: `EXFOR/CSV_Files/EXFOR_/EXFOR__ORIGINAL.csv`. You can simply read them using `pandas`. Alternatively, you can use `NucML` to handle loading scenarios." 
] }, { "cell_type": "code", "execution_count": 4, "id": "8e4deb1d", "metadata": { "ExecuteTime": { "end_time": "2021-05-06T18:33:18.371837Z", "start_time": "2021-05-06T18:32:50.630337Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\Pedro\\Anaconda3\\envs\\ml_gpu\\lib\\site-packages\\IPython\\core\\interactiveshell.py:3357: DtypeWarning: Columns (17,30,31) have mixed types.Specify dtype option on import or set low_memory=False.\n", " if (await self.run_code(code, result, async_=asy)):\n" ] } ], "source": [ "neutrons = nuc_data.load_exfor_raw(mode=\"neutrons\")" ] }, { "cell_type": "code", "execution_count": 6, "id": "e3e5891e", "metadata": { "ExecuteTime": { "end_time": "2021-05-06T18:33:19.393356Z", "start_time": "2021-05-06T18:33:19.388854Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "There are 6007126 neutron-related data points.\n" ] } ], "source": [ "print(\"There are {} neutron-related data points.\".format(neutrons.shape[0]))" ] }, { "cell_type": "code", "execution_count": 8, "id": "brown-sight", "metadata": { "ExecuteTime": { "end_time": "2021-05-06T18:33:49.232218Z", "start_time": "2021-05-06T18:33:49.227716Z" } }, "outputs": [ { "data": { "text/plain": [ "Index(['Projectile', 'Target_Metastable_State', 'MF', 'MT',\n", " 'Product_Metastable_State', 'EXFOR_Status', 'Center_of_Mass_Flag',\n", " 'Energy', 'dEnergy', 'Data', 'dData', 'Cos/LO', 'dCos/LO', 'ELV/HL',\n", " 'dELV/HL', 'I78', 'Short_Reference', 'EXFOR_Accession_Number',\n", " 'EXFOR_SubAccession_Number', 'EXFOR_Pointer', 'Z', 'A', 'N',\n", " 'Reaction_Notation', 'Title', 'Year', 'Author', 'Institute', 'Date',\n", " 'Reference', 'Dataset_Number', 'EXFOR_Entry', 'Reference_Code',\n", " 'Projectile_Z', 'Projectile_A', 'Projectile_N', 'Isotope', 'Element'],\n", " dtype='object')" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "neutrons.columns" ] }, { "cell_type": "markdown", "id": 
"disciplinary-serum", "metadata": {}, "source": [ "Other supported projectiles are `protons`, `alphas`, `deuterons`, `gammas`, and `helions`. Alternatively, you can load all data from all projectiles by specifying `mode=\"all\"`" ] }, { "cell_type": "markdown", "id": "accessory-tuesday", "metadata": {}, "source": [ "## Reaction Data (MF3) Only\n", "\n", "There are a variety of `Files (MT)` included in `EXFOR`. In our work, we dealt only with reaction cross section vs energy datapoints (`MT=3`). `NucML` therefore offers many functionalities to create ML-ready particle-induced cross section reaction data.\n", "\n", "We can simply load all `neutron` induce cross section reaction EXFOR data points by calling the `load_exfor()` method:" ] }, { "cell_type": "code", "execution_count": 9, "id": "cellular-capital", "metadata": { "ExecuteTime": { "end_time": "2021-05-06T18:38:46.102508Z", "start_time": "2021-05-06T18:37:30.735813Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:root: MODE: neutrons\n", "INFO:root: LOW ENERGY: False\n", "INFO:root: LOG: False\n", "INFO:root: BASIC: -1\n", "INFO:root:Reading data from C:/Users/Pedro/Desktop/ML_Nuclear_Data/EXFOR/CSV_Files\\EXFOR_neutrons/EXFOR_neutrons_MF3_AME_no_RawNaN.csv\n", "INFO:root:Data read into dataframe with shape: (4255409, 104)\n", "INFO:root:Finished. Resulting dataset has shape (4255409, 104)\n" ] } ], "source": [ "exfor_neutrons = nuc_data.load_exfor(mode=\"neutrons\")" ] }, { "cell_type": "markdown", "id": "residential-german", "metadata": {}, "source": [ "When setting up `NucML`, the parsing and data creation utilities automatically created a `CSV` file ready which is almost ready for ML-modeling. The `.load_exfor()` method uses these preprocessed files to filter, prepare and return a truly ML-ready dataset. \n", "\n", "**NOTE: You should always analyze and inspect any recommended dataset. The following loading options are purely an opinion. 
Ideally, you would start from the raw original EXFOR tables and build your own dataset.**\n", "\n", "You will notice that the returned dataset contains fewer data points than the original tables (4,255,409 rows versus 6,007,126). This is because we are only using the `MF=3` points and applying some basic filtering operations. Note that the AME (Atomic Mass Evaluation) information is already appended and the dataset contains no missing values. For information on the filtering and imputation techniques used, see the documentation." ] }, { "cell_type": "code", "execution_count": 10, "id": "fluid-manhattan", "metadata": { "ExecuteTime": { "end_time": "2021-05-06T18:38:55.571006Z", "start_time": "2021-05-06T18:38:55.494507Z" }, "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ProjectileTarget_Metastable_StateMTProduct_Metastable_StateEXFOR_StatusCenter_of_Mass_FlagEnergydEnergyDatadDataELV/HLdELV/HLI78Short_ReferenceEXFOR_Accession_NumberEXFOR_SubAccession_NumberEXFOR_PointerZReaction_NotationTitleYearAuthorInstituteDateReferenceDataset_NumberEXFOR_EntryReference_CodeProjectile_ZProjectile_AProjectile_NIsotopeElementNAElement_FlagNucleus_RadiusNeutron_Nucleus_Radius_RatioOMass_ExcessdMass_ExcessBinding_EnergydBinding_EnergyB_Decay_EnergydB_Decay_EnergyAtomic_Mass_MicrodAtomic_Mass_MicroS(2n)dS(2n)S(2p)dS(2p)Q(a)dQ(a)Q(2B-)dQ(2B-)Q(ep)dQ(ep)Q(B-n)dQ(B-n)S(n)dS(n)S(p)dS(p)Q(4B-)dQ(4B-)Q(d,a)dQ(d,a)Q(p,a)dQ(p,a)Q(n,a)dQ(n,a)Q(g,p)Q(g,n)Q(g,pn)Q(g,d)Q(g,t)Q(g,He3)Q(g,2p)Q(g,2n)Q(g,a)Q(p,n)Q(p,2p)Q(p,pn)Q(p,d)Q(p,2n)Q(p,t)Q(p,3He)Q(n,2p)Q(n,np)Q(n,d)Q(n,2n)Q(n,t)Q(n,3He)Q(d,t)Q(d,3He)Q(3He,t)Q(3He,a)Q(t,a)N_valenceZ_valenceP_factorN_tagZ_tagNZ_tag
0neutronAll_or_Total1All_or_TotalDependentLab88200000.0882000.00.03000.0015230.00.0OtherD.F.MEASDAY,ET.AL. (66)111522No Pointer00-NN-1(N,TOT),,SIGNEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...1966D.F.Measday+1USAHRV1980/08/04Jour. Nuclear Physics Vol.85, p.142, 19661115200211152(J,NP,85,142,6609)0111nn11I1.250.64Other8071.317130.000460.00.0782.3470.01.008665e+060.000490.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00050.00.02224.5660.00.00.00.00.00.00.00.00.06257.2290.0763.75520577.61940.0120.666667oddevenodd_even
1neutronAll_or_Total1All_or_TotalDependentLab98100000.0981000.00.02910.0015160.00.0OtherD.F.MEASDAY,ET.AL. (66)111522No Pointer00-NN-1(N,TOT),,SIGNEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...1966D.F.Measday+1USAHRV1980/08/04Jour. Nuclear Physics Vol.85, p.142, 19661115200211152(J,NP,85,142,6609)0111nn11I1.250.64Other8071.317130.000460.00.0782.3470.01.008665e+060.000490.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00050.00.02224.5660.00.00.00.00.00.00.00.00.06257.2290.0763.75520577.61940.0120.666667oddevenodd_even
2neutronAll_or_Total1All_or_TotalDependentLab110000000.01100000.00.02790.0014150.00.0OtherD.F.MEASDAY,ET.AL. (66)111522No Pointer00-NN-1(N,TOT),,SIGNEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...1966D.F.Measday+1USAHRV1980/08/04Jour. Nuclear Physics Vol.85, p.142, 19661115200211152(J,NP,85,142,6609)0111nn11I1.250.64Other8071.317130.000460.00.0782.3470.01.008665e+060.000490.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00050.00.02224.5660.00.00.00.00.00.00.00.00.06257.2290.0763.75520577.61940.0120.666667oddevenodd_even
3neutronAll_or_Total1All_or_TotalDependentLab119600000.01196000.00.02640.0014030.00.0OtherD.F.MEASDAY,ET.AL. (66)111522No Pointer00-NN-1(N,TOT),,SIGNEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...1966D.F.Measday+1USAHRV1980/08/04Jour. Nuclear Physics Vol.85, p.142, 19661115200211152(J,NP,85,142,6609)0111nn11I1.250.64Other8071.317130.000460.00.0782.3470.01.008665e+060.000490.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00050.00.02224.5660.00.00.00.00.00.00.00.00.06257.2290.0763.75520577.61940.0120.666667oddevenodd_even
4neutronAll_or_Total1All_or_TotalDependentLab129400000.01294000.00.02560.0013970.00.0OtherD.F.MEASDAY,ET.AL. (66)111522No Pointer00-NN-1(N,TOT),,SIGNEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO...1966D.F.Measday+1USAHRV1980/08/04Jour. Nuclear Physics Vol.85, p.142, 19661115200211152(J,NP,85,142,6609)0111nn11I1.250.64Other8071.317130.000460.00.0782.3470.01.008665e+060.000490.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00050.00.02224.5660.00.00.00.00.00.00.00.00.06257.2290.0763.75520577.61940.0120.666667oddevenodd_even
\n", "
" ], "text/plain": [ " Projectile Target_Metastable_State MT Product_Metastable_State EXFOR_Status \\\n", "0 neutron All_or_Total 1 All_or_Total Dependent \n", "1 neutron All_or_Total 1 All_or_Total Dependent \n", "2 neutron All_or_Total 1 All_or_Total Dependent \n", "3 neutron All_or_Total 1 All_or_Total Dependent \n", "4 neutron All_or_Total 1 All_or_Total Dependent \n", "\n", " Center_of_Mass_Flag Energy dEnergy Data dData ELV/HL \\\n", "0 Lab 88200000.0 882000.0 0.0300 0.001523 0.0 \n", "1 Lab 98100000.0 981000.0 0.0291 0.001516 0.0 \n", "2 Lab 110000000.0 1100000.0 0.0279 0.001415 0.0 \n", "3 Lab 119600000.0 1196000.0 0.0264 0.001403 0.0 \n", "4 Lab 129400000.0 1294000.0 0.0256 0.001397 0.0 \n", "\n", " dELV/HL I78 Short_Reference EXFOR_Accession_Number \\\n", "0 0.0 Other D.F.MEASDAY,ET.AL. (66) 11152 \n", "1 0.0 Other D.F.MEASDAY,ET.AL. (66) 11152 \n", "2 0.0 Other D.F.MEASDAY,ET.AL. (66) 11152 \n", "3 0.0 Other D.F.MEASDAY,ET.AL. (66) 11152 \n", "4 0.0 Other D.F.MEASDAY,ET.AL. (66) 11152 \n", "\n", " EXFOR_SubAccession_Number EXFOR_Pointer Z Reaction_Notation \\\n", "0 2 No Pointer 0 0-NN-1(N,TOT),,SIG \n", "1 2 No Pointer 0 0-NN-1(N,TOT),,SIG \n", "2 2 No Pointer 0 0-NN-1(N,TOT),,SIG \n", "3 2 No Pointer 0 0-NN-1(N,TOT),,SIG \n", "4 2 No Pointer 0 0-NN-1(N,TOT),,SIG \n", "\n", " Title Year Author \\\n", "0 NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO... 1966 D.F.Measday+ \n", "1 NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO... 1966 D.F.Measday+ \n", "2 NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO... 1966 D.F.Measday+ \n", "3 NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO... 1966 D.F.Measday+ \n", "4 NEUTRON TOTAL CROSS SECTIONS FOR NEUTRONS, PRO... 1966 D.F.Measday+ \n", "\n", " Institute Date Reference \\\n", "0 1USAHRV 1980/08/04 Jour. Nuclear Physics Vol.85, p.142, 1966 \n", "1 1USAHRV 1980/08/04 Jour. Nuclear Physics Vol.85, p.142, 1966 \n", "2 1USAHRV 1980/08/04 Jour. Nuclear Physics Vol.85, p.142, 1966 \n", "3 1USAHRV 1980/08/04 Jour. 
Nuclear Physics Vol.85, p.142, 1966 \n", "4 1USAHRV 1980/08/04 Jour. Nuclear Physics Vol.85, p.142, 1966 \n", "\n", " Dataset_Number EXFOR_Entry Reference_Code Projectile_Z Projectile_A \\\n", "0 11152002 11152 (J,NP,85,142,6609) 0 1 \n", "1 11152002 11152 (J,NP,85,142,6609) 0 1 \n", "2 11152002 11152 (J,NP,85,142,6609) 0 1 \n", "3 11152002 11152 (J,NP,85,142,6609) 0 1 \n", "4 11152002 11152 (J,NP,85,142,6609) 0 1 \n", "\n", " Projectile_N Isotope Element N A Element_Flag Nucleus_Radius \\\n", "0 1 1n n 1 1 I 1.25 \n", "1 1 1n n 1 1 I 1.25 \n", "2 1 1n n 1 1 I 1.25 \n", "3 1 1n n 1 1 I 1.25 \n", "4 1 1n n 1 1 I 1.25 \n", "\n", " Neutron_Nucleus_Radius_Ratio O Mass_Excess dMass_Excess \\\n", "0 0.64 Other 8071.31713 0.00046 \n", "1 0.64 Other 8071.31713 0.00046 \n", "2 0.64 Other 8071.31713 0.00046 \n", "3 0.64 Other 8071.31713 0.00046 \n", "4 0.64 Other 8071.31713 0.00046 \n", "\n", " Binding_Energy dBinding_Energy B_Decay_Energy dB_Decay_Energy \\\n", "0 0.0 0.0 782.347 0.0 \n", "1 0.0 0.0 782.347 0.0 \n", "2 0.0 0.0 782.347 0.0 \n", "3 0.0 0.0 782.347 0.0 \n", "4 0.0 0.0 782.347 0.0 \n", "\n", " Atomic_Mass_Micro dAtomic_Mass_Micro S(2n) dS(2n) S(2p) dS(2p) Q(a) \\\n", "0 1.008665e+06 0.00049 0.0 0.0 0.0 0.0 0.0 \n", "1 1.008665e+06 0.00049 0.0 0.0 0.0 0.0 0.0 \n", "2 1.008665e+06 0.00049 0.0 0.0 0.0 0.0 0.0 \n", "3 1.008665e+06 0.00049 0.0 0.0 0.0 0.0 0.0 \n", "4 1.008665e+06 0.00049 0.0 0.0 0.0 0.0 0.0 \n", "\n", " dQ(a) Q(2B-) dQ(2B-) Q(ep) dQ(ep) Q(B-n) dQ(B-n) S(n) dS(n) S(p) \\\n", "0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "\n", " dS(p) Q(4B-) dQ(4B-) Q(d,a) dQ(d,a) Q(p,a) dQ(p,a) Q(n,a) dQ(n,a) \\\n", "0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "\n", " Q(g,p) Q(g,n) Q(g,pn) Q(g,d) Q(g,t) Q(g,He3) Q(g,2p) Q(g,2n) \\\n", "0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "\n", " Q(g,a) Q(p,n) Q(p,2p) Q(p,pn) Q(p,d) Q(p,2n) Q(p,t) Q(p,3He) \\\n", "0 0.0 0.0005 0.0 0.0 2224.566 0.0 0.0 0.0 \n", "1 0.0 0.0005 0.0 0.0 2224.566 0.0 0.0 0.0 \n", "2 0.0 0.0005 0.0 0.0 2224.566 0.0 0.0 0.0 \n", "3 0.0 0.0005 0.0 0.0 2224.566 0.0 0.0 0.0 \n", "4 0.0 0.0005 0.0 0.0 2224.566 0.0 0.0 0.0 \n", "\n", " Q(n,2p) Q(n,np) Q(n,d) Q(n,2n) Q(n,t) Q(n,3He) Q(d,t) Q(d,3He) \\\n", "0 0.0 0.0 0.0 0.0 0.0 0.0 6257.229 0.0 \n", "1 0.0 0.0 0.0 0.0 0.0 0.0 6257.229 0.0 \n", "2 0.0 0.0 0.0 0.0 0.0 0.0 6257.229 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 0.0 6257.229 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 0.0 6257.229 0.0 \n", "\n", " Q(3He,t) Q(3He,a) Q(t,a) N_valence Z_valence P_factor N_tag Z_tag \\\n", "0 763.755 20577.6194 0.0 1 2 0.666667 odd even \n", "1 763.755 20577.6194 0.0 1 2 0.666667 odd even \n", "2 763.755 20577.6194 0.0 1 2 0.666667 odd even \n", "3 763.755 20577.6194 0.0 1 2 0.666667 odd even \n", "4 763.755 20577.6194 0.0 1 2 0.666667 odd even \n", "\n", " NZ_tag \n", "0 odd_even \n", "1 odd_even \n", "2 odd_even \n", "3 odd_even \n", "4 odd_even " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "exfor_neutrons.head()" ] }, { "cell_type": "markdown", "id": "unexpected-madness", "metadata": {}, "source": [ "You can similarly load data for other projectiles, such as proton-induced cross section reaction data. \n", "\n", "## ML-ready EXFOR datasets\n", "\n", "The returned `exfor_neutrons` DataFrame is only a starting point for a truly ML-ready dataset. 
The `load_exfor()` method contains more than 10 customizable options that allow you to implement specific normalizers, transformations, numerical thresholds, splitting, and more.\n", "\n", "Let us create an ML-ready dataset example:" ] }, { "cell_type": "code", "execution_count": 11, "id": "handy-voluntary", "metadata": { "ExecuteTime": { "end_time": "2021-05-06T18:40:52.163455Z", "start_time": "2021-05-06T18:39:07.615697Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:root: MODE: neutrons\n", "INFO:root: LOW ENERGY: True\n", "INFO:root: LOG: True\n", "INFO:root: BASIC: 0\n", "INFO:root:Reading data from C:/Users/Pedro/Desktop/ML_Nuclear_Data/EXFOR/CSV_Files\\EXFOR_neutrons/EXFOR_neutrons_MF3_AME_no_RawNaN.csv\n", "INFO:root:Data read into dataframe with shape: (4184115, 8)\n", "INFO:root:Splitting dataset into training and testing...\n", "INFO:root:Normalizing dataset...\n", "INFO:root:Fitting new scaler.\n" ] } ], "source": [ "data, x_train, x_test, y_train, y_test, to_scale, scaler = nuc_data.load_exfor(\n", " mode=\"neutrons\", log=True, low_en=True, max_en=2.0E7, num=True, basic=0, normalize=True, filters=True)" ] }, { "cell_type": "code", "execution_count": 12, "id": "equal-reader", "metadata": { "ExecuteTime": { "end_time": "2021-05-06T18:40:52.185455Z", "start_time": "2021-05-06T18:40:52.164955Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
EnergyZNAMT_1MT_101MT_102MT_103MT_104MT_105MT_106MT_107MT_108MT_111MT_112MT_113MT_155MT_158MT_159MT_16MT_17MT_18MT_2MT_22MT_24MT_28MT_29MT_3MT_32MT_33MT_4MT_41MT_51MT_9000MT_9001Center_of_Mass_Flag_Center_of_MassCenter_of_Mass_Flag_LabElement_Flag_IElement_Flag_N
15446047.024445-0.788298-0.851543-0.82807510000000000000000000000000000000110
26538772.3237880.6370750.5908410.60852510000000000000000000000000000000110
36081624.9863241.2258161.2464701.23922700000000000000000010000000000000110
35872252.1292711.2258161.2464701.23922710000000000000000000000000000000110
7841735.957368-0.974217-0.963937-0.96823110000000000000000000000000000000101
\n", "
" ], "text/plain": [ " Energy Z N A MT_1 MT_101 MT_102 MT_103 \\\n", "1544604 7.024445 -0.788298 -0.851543 -0.828075 1 0 0 0 \n", "2653877 2.323788 0.637075 0.590841 0.608525 1 0 0 0 \n", "3608162 4.986324 1.225816 1.246470 1.239227 0 0 0 0 \n", "3587225 2.129271 1.225816 1.246470 1.239227 1 0 0 0 \n", "784173 5.957368 -0.974217 -0.963937 -0.968231 1 0 0 0 \n", "\n", " MT_104 MT_105 MT_106 MT_107 MT_108 MT_111 MT_112 MT_113 \\\n", "1544604 0 0 0 0 0 0 0 0 \n", "2653877 0 0 0 0 0 0 0 0 \n", "3608162 0 0 0 0 0 0 0 0 \n", "3587225 0 0 0 0 0 0 0 0 \n", "784173 0 0 0 0 0 0 0 0 \n", "\n", " MT_155 MT_158 MT_159 MT_16 MT_17 MT_18 MT_2 MT_22 MT_24 \\\n", "1544604 0 0 0 0 0 0 0 0 0 \n", "2653877 0 0 0 0 0 0 0 0 0 \n", "3608162 0 0 0 0 0 0 1 0 0 \n", "3587225 0 0 0 0 0 0 0 0 0 \n", "784173 0 0 0 0 0 0 0 0 0 \n", "\n", " MT_28 MT_29 MT_3 MT_32 MT_33 MT_4 MT_41 MT_51 MT_9000 \\\n", "1544604 0 0 0 0 0 0 0 0 0 \n", "2653877 0 0 0 0 0 0 0 0 0 \n", "3608162 0 0 0 0 0 0 0 0 0 \n", "3587225 0 0 0 0 0 0 0 0 0 \n", "784173 0 0 0 0 0 0 0 0 0 \n", "\n", " MT_9001 Center_of_Mass_Flag_Center_of_Mass Center_of_Mass_Flag_Lab \\\n", "1544604 0 0 1 \n", "2653877 0 0 1 \n", "3608162 0 0 1 \n", "3587225 0 0 1 \n", "784173 0 0 1 \n", "\n", " Element_Flag_I Element_Flag_N \n", "1544604 1 0 \n", "2653877 1 0 \n", "3608162 1 0 \n", "3587225 1 0 \n", "784173 0 1 " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x_train.head()" ] }, { "cell_type": "markdown", "id": "classified-trade", "metadata": {}, "source": [ "Let us review what the code is doing here:\n", "\n", "1. `mode=neutrons`: specifies what projectile particle to extract data for.\n", "2. `log=True`: will take the logarithm of both the Energy and Cross Section feature.\n", "3. `low_en=True`: we want only data below a certain energy threshold (useful for modeling low energy reaction data)\n", "4. `max_en=2.0E`: is the energy threshold used by the `low_en` argument.\n", "5. 
`basic=0`: the set of features to be returned. See the documentation for other options.\n", "6. `normalize=True`: indicates that we want to normalize/standardize our dataset.\n", "\n", "Notice that the loader returned 7 objects:\n", "\n", "1. `data`: contains the raw unprocessed data.\n", "2. `x_train`, `x_test`, `y_train`, `y_test`: are the splits of the data that can be used to train an ML model. The default fraction is 0.1. You can change it by setting a different `frac` number.\n", "3. `to_scale`: the names of the features that were subject to normalization.\n", "4. `scaler`: the scaler object that can be used to transform future data or inverse-transform any split back to its original scale.\n", "\n", "There are many more options you can customize, including the type of normalizer and transformer. Additionally, all core arguments reviewed here are applicable to other projectile particle reactions. For example, for protons:" ] }, { "cell_type": "code", "execution_count": 15, "id": "israeli-springfield", "metadata": { "ExecuteTime": { "end_time": "2021-02-23T18:18:59.937680Z", "start_time": "2021-02-23T18:18:56.441940Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:root: MODE: protons\n", "INFO:root: LOW ENERGY: False\n", "INFO:root: LOG: True\n", "INFO:root: BASIC: 0\n", "INFO:root:Reading data from C:/Users/Pedro/Desktop/ML_Nuclear_Data/EXFOR/CSV_Files\\EXFOR_protons\\EXFOR_protons_MF3_AME_no_RawNaN.csv\n", "INFO:root:Data read into dataframe with shape: (137408, 8)\n", "INFO:root:Splitting dataset into training and testing...\n", "INFO:root:Normalizing dataset...\n", "INFO:root:Fitting new scaler.\n" ] } ], "source": [ "data, x_train, x_test, y_train, y_test, to_scale, scaler = nuc_data.load_exfor(\n", " mode=\"protons\", log=True, low_en=False, num=True, basic=0, normalize=True, filters=True)" ] }, { "cell_type": "markdown", "id": "impossible-bridge", "metadata": {}, "source": [ "Notice that we set `low_en=False`. 
Proton-induced reactions are usually measured at high energies, so the low-energy cutoff is not applied." ] }, { "cell_type": "code", "execution_count": 14, "id": "respected-remains", "metadata": { "ExecuteTime": { "end_time": "2021-02-23T18:18:33.574095Z", "start_time": "2021-02-23T18:18:33.546106Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
EnergyZNAMT_102MT_103MT_104MT_105MT_106MT_107MT_108MT_109MT_111MT_112MT_114MT_115MT_116MT_117MT_152MT_153MT_155MT_156MT_16MT_160MT_161MT_162MT_165MT_168MT_17MT_179MT_18MT_190MT_191MT_192MT_193MT_198MT_22MT_23MT_24MT_25MT_28MT_29MT_3MT_37MT_4MT_41MT_42MT_44MT_45MT_51MT_9000MT_9001Center_of_Mass_Flag_Center_of_MassCenter_of_Mass_Flag_LabElement_Flag_IElement_Flag_N
280987.653213-0.741252-0.737797-0.7399160000000000000000000000000000000000000000000000100101
101277.017033-1.338033-1.218441-1.2658500000010000000000000000000000000000000000000000000110
1109657.0791810.8785830.8109450.8378880000000000000000000000000000000000000000000000010101
385328.447158-0.570743-0.630987-0.6084320000000000000000000000000000000000000000000000100101
728927.960851-0.101844-0.150343-0.1318040000000000000000000000000000000000000000000000100101
\n", "
" ], "text/plain": [ " Energy Z N A MT_102 MT_103 MT_104 \\\n", "28098 7.653213 -0.741252 -0.737797 -0.739916 0 0 0 \n", "10127 7.017033 -1.338033 -1.218441 -1.265850 0 0 0 \n", "110965 7.079181 0.878583 0.810945 0.837888 0 0 0 \n", "38532 8.447158 -0.570743 -0.630987 -0.608432 0 0 0 \n", "72892 7.960851 -0.101844 -0.150343 -0.131804 0 0 0 \n", "\n", " MT_105 MT_106 MT_107 MT_108 MT_109 MT_111 MT_112 MT_114 \\\n", "28098 0 0 0 0 0 0 0 0 \n", "10127 0 0 1 0 0 0 0 0 \n", "110965 0 0 0 0 0 0 0 0 \n", "38532 0 0 0 0 0 0 0 0 \n", "72892 0 0 0 0 0 0 0 0 \n", "\n", " MT_115 MT_116 MT_117 MT_152 MT_153 MT_155 MT_156 MT_16 MT_160 \\\n", "28098 0 0 0 0 0 0 0 0 0 \n", "10127 0 0 0 0 0 0 0 0 0 \n", "110965 0 0 0 0 0 0 0 0 0 \n", "38532 0 0 0 0 0 0 0 0 0 \n", "72892 0 0 0 0 0 0 0 0 0 \n", "\n", " MT_161 MT_162 MT_165 MT_168 MT_17 MT_179 MT_18 MT_190 MT_191 \\\n", "28098 0 0 0 0 0 0 0 0 0 \n", "10127 0 0 0 0 0 0 0 0 0 \n", "110965 0 0 0 0 0 0 0 0 0 \n", "38532 0 0 0 0 0 0 0 0 0 \n", "72892 0 0 0 0 0 0 0 0 0 \n", "\n", " MT_192 MT_193 MT_198 MT_22 MT_23 MT_24 MT_25 MT_28 MT_29 \\\n", "28098 0 0 0 0 0 0 0 0 0 \n", "10127 0 0 0 0 0 0 0 0 0 \n", "110965 0 0 0 0 0 0 0 0 0 \n", "38532 0 0 0 0 0 0 0 0 0 \n", "72892 0 0 0 0 0 0 0 0 0 \n", "\n", " MT_3 MT_37 MT_4 MT_41 MT_42 MT_44 MT_45 MT_51 MT_9000 \\\n", "28098 0 0 0 0 0 0 0 0 1 \n", "10127 0 0 0 0 0 0 0 0 0 \n", "110965 0 0 0 0 0 0 0 0 0 \n", "38532 0 0 0 0 0 0 0 0 1 \n", "72892 0 0 0 0 0 0 0 0 1 \n", "\n", " MT_9001 Center_of_Mass_Flag_Center_of_Mass Center_of_Mass_Flag_Lab \\\n", "28098 0 0 1 \n", "10127 0 0 1 \n", "110965 1 0 1 \n", "38532 0 0 1 \n", "72892 0 0 1 \n", "\n", " Element_Flag_I Element_Flag_N \n", "28098 0 1 \n", "10127 1 0 \n", "110965 0 1 \n", "38532 0 1 \n", "72892 0 1 " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x_train.head()" ] }, { "cell_type": "markdown", "id": "smaller-chrome", "metadata": {}, "source": [ "Look at the full arguments fot the `load_exfor()` 
documentation for more information and more functionalities." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 5 }