nucml package

Submodules

nucml.config module

nucml.configure module

nucml.configure.configure(user_path, ace_path, matlab_exe_path='')[source]

Configures an internal file necessary to enable all NucML functionalities that deal with loading data files from EXFOR, AME, ACE, ENSDF, and RIPL.

The ace_path can be an existing directory from a Serpent distribution. The ML_Nuclear_Data repository contains a version of the ACE files that was used to develop and test all functionalities. If the .ace files have a different structure, the ACE utilities may not work.

Parameters
  • user_path (str) – Path-like string pointing to the project directory.

  • ace_path (str) – Path-like string pointing to the .ace files.

  • matlab_exe_path (str, optional) – Path-like string pointing to the MATLAB executable. Defaults to an empty string.

Returns

None
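A minimal usage sketch, assuming NucML is installed. The wrapper name configure_project and its directory checks are hypothetical additions for illustration and are not part of NucML; only the final call uses the signature documented above.

```python
import os


def configure_project(user_path, ace_path, matlab_exe_path=""):
    """Check that both directories exist, then call nucml.configure.configure."""
    # Hypothetical pre-check: fail early with a clear message on a bad path.
    for label, path in (("user_path", user_path), ("ace_path", ace_path)):
        if not os.path.isdir(path):
            raise NotADirectoryError(f"{label} is not a directory: {path}")
    # Imported lazily so the check above works even without NucML installed.
    import nucml.configure
    nucml.configure.configure(user_path, ace_path, matlab_exe_path=matlab_exe_path)
```

Running the configuration once per project should be enough, since the loaders below read the paths from the internal file it writes.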

nucml.datasets module

nucml.datasets.generate_bigquery_csv()[source]

Creates a single EXFOR data file used to update the Google BigQuery database.

Returns

None

nucml.datasets.generate_exfor_dataset(user_path, modes=['neutrons', 'protons', 'alphas', 'deuterons', 'gammas', 'helions'])[source]

Generates all needed EXFOR datasets for neutron-, proton-, alpha-, deuteron-, gamma-, and helion-induced reactions. NucML must be configured first; see nucml.configure. The modes argument can be modified so that the function generates only user-defined datasets.

Parameters
  • user_path (str) – Path-like string where all information, including the datasets, will be stored.

  • modes (list, optional) – Type of projectile for which to generate the datasets. Defaults to [“neutrons”, “protons”, “alphas”, “deuterons”, “gammas”, “helions”].

Returns

None

nucml.datasets.load_ame(natural=False, imputed_nan=False, file='merged')[source]

Loads the Atomic Mass Evaluation 2016 data generated by NucML using the parsing utilities.

For file=“merged”, there are four AME dataset versions:

1. AME_all_merged (natural=False, imputed_nan=False): Contains all available AME information from the mass, rct1, and rct2 files.

2. AME_all_merged_no_NaN (natural=False, imputed_nan=True): Same as 1, except all missing values are imputed linearly and element-wise.

3. AME_Natural_Properties_w_NaN (natural=True, imputed_nan=False): Similar to 2, except data for natural abundance elements is included.

4. AME_Natural_Properties_no_NaN (natural=True, imputed_nan=True): Same as 3, except all missing values are imputed linearly and element-wise.

Parameters
  • natural (bool) – If True, the AME data containing natural element data will be loaded. Only applicable when file=‘merged’.

  • imputed_nan (bool) – If True, the dataset loaded will not contain any missing values (imputed version will be loaded).

  • file (str) – Dataset to extract. Options include ‘merged’, ‘mass16’, ‘rct1’, and ‘rct2’.

Returns

A pandas DataFrame containing the queried AME data.

Return type

DataFrame
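The flag combinations listed above for file=“merged” can be summarised as a lookup table. This is only an illustrative sketch: the helper name ame_merged_version is hypothetical, and the code does not reproduce how NucML selects files internally.

```python
# Dataset names and flag combinations are copied from the description above.
AME_MERGED_VERSIONS = {
    (False, False): "AME_all_merged",
    (False, True): "AME_all_merged_no_NaN",
    (True, False): "AME_Natural_Properties_w_NaN",
    (True, True): "AME_Natural_Properties_no_NaN",
}


def ame_merged_version(natural=False, imputed_nan=False):
    """Return the merged AME dataset name for a (natural, imputed_nan) pair."""
    return AME_MERGED_VERSIONS[(bool(natural), bool(imputed_nan))]
```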

nucml.datasets.load_ensdf(cutoff=False, append_ame=False)[source]

Loads the Evaluated Nuclear Structure Data File (ENSDF) structure levels data generated through the NucML parsing utilities.

Parameters
  • cutoff (bool, optional) – If True, the excited levels are cut off according to the RIPL cut-off parameters. Defaults to False.

  • append_ame (bool, optional) – Defaults to False.

Returns

DataFrame

nucml.datasets.load_ensdf_ground_states()[source]

Loads only the ground-state information from the ENSDF file.

Returns

DataFrame

nucml.datasets.load_ensdf_headers()[source]

Loads ENSDF headers from RIPL .dat files.

Returns

DataFrame

nucml.datasets.load_ensdf_isotopic(isotope, filetype='levels')[source]

Loads level or gamma records for a given isotope (e.g. U235).

Parameters
  • isotope (str) – Isotope to query (e.g. u235, cl35, 239Pu).

  • filetype (str, optional) – Specifies if level or gamma records are to be extracted. Options include “levels” and “gammas”. Defaults to “levels”.

Returns

DataFrame

nucml.datasets.load_ensdf_ml(cutoff=False, log_sqrt=False, log=False, append_ame=False, basic=-1, num=False, frac=0.3, scaling_type='standard', scaler_dir=None, normalize=True)[source]

EXPERIMENTAL (NOT MEANT FOR USE)

Parameters
  • cutoff (bool, optional) – [description]. Defaults to False.

  • log_sqrt (bool, optional) – [description]. Defaults to False.

  • log (bool, optional) – [description]. Defaults to False.

  • append_ame (bool, optional) – [description]. Defaults to False.

  • basic (int, optional) – [description]. Defaults to -1.

  • num (bool, optional) – [description]. Defaults to False.

  • frac (float, optional) – [description]. Defaults to 0.3.

  • scaling_type (str, optional) – [description]. Defaults to “standard”.

  • scaler_dir ([type], optional) – [description]. Defaults to None.

  • normalize (bool, optional) – [description]. Defaults to True.

Returns

[description]

Return type

[type]

nucml.datasets.load_evaluation(isotope, MT, mode='neutrons', library='endfb8.0', mev_to_ev=True, mb_to_b=True, log=True, drop_u=True)[source]

Reads an evaluation file for a specific isotope, reaction channel, and evaluated library. Inspect the returned data: the function queries a local copy of an external database whose contents were extracted from ENDF with a script, and some reactions are known to be missing. Missing reactions can be added manually for future loading.

Parameters
  • isotope (str) – Isotope to query (e.g. U233, Cl35).

  • MT (int) – Reaction channel ENDF code. Must be an integer (e.g. 1, 2, 3).

  • mode (str) – Type of projectile. Only “neutrons” and “protons” are supported for now.

  • library (str) – Evaluation library to query. Allowed options include endfb8.0, jendl4.0, jeff3.3, and tendl.2019.

  • mev_to_ev (bool) – If True, it converts the energy from MeV to eV.

  • mb_to_b (bool) – If True, it converts the cross sections from millibarns to barns.

  • log (bool) – If True, it applies the log10 to both the Energy and the Cross Section.

  • drop_u (bool) – Sometimes, evaluation files contain uncertainty values. If True, these features are removed.

Returns

pandas DataFrame containing the ENDF datapoints.

Return type

evaluation (DataFrame)
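The three conversion flags (mev_to_ev, mb_to_b, log) can be illustrated on a single data point. This is a sketch under stated assumptions, not the library's code: the helper name convert_point is hypothetical, and applying the unit conversions before the logarithm is an assumption about the ordering.

```python
import math


def convert_point(energy_mev, xs_mb, mev_to_ev=True, mb_to_b=True, log=True):
    """Convert one (energy, cross section) pair the way the flags describe."""
    # 1 MeV = 1e6 eV and 1 b = 1000 mb.
    energy = energy_mev * 1.0e6 if mev_to_ev else energy_mev
    xs = xs_mb / 1000.0 if mb_to_b else xs_mb
    if log:
        # Assumed order: unit conversion first, then log10 of both quantities.
        energy, xs = math.log10(energy), math.log10(xs)
    return energy, xs
```

With log=True, both inputs must be strictly positive, so zero cross sections would need separate handling.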

nucml.datasets.load_exfor(log=False, low_en=False, basic=-1, num=False, frac=0.1, mode='neutrons', scaling_type='standard', scaler_dir=None, filters=False, max_en=20000000.0, mt_coding='one_hot', scale_energy=False, projectile_coding='one_hot', normalize=True, pedro=False, pedro_v2=False)[source]

Loads the EXFOR dataset in its various forms. This function helps load ML-ready EXFOR datasets for a specific particle-induced reaction type or for all of them.

Parameters
  • log (bool, optional) – If True, the log of the Energy and Cross Section is taken. Defaults to False.

  • low_en (bool, optional) – If True, an upper limit in energy is applied given by the max_en argument. Defaults to False.

  • basic (int, optional) – Indicates how many features to load. -1 means all available features. Defaults to -1.

  • num (bool, optional) – If True, only numerical and relevant categorical features are loaded. Defaults to False.

  • frac (float, optional) – Fraction of the dataset for test set. Defaults to 0.1.

  • mode (str, optional) – Dataset to load. Options include neutrons, gammas, and protons. Defaults to “neutrons”.

  • scaling_type (str, optional) – Type of scaler to use for normalizing the dataset. Defaults to “standard”.

  • scaler_dir (str, optional) – Directory in which to store the trained scaler. Defaults to None.

  • filters (bool, optional) – If True, a variety of filters are applied that help discard irregular data. Defaults to False.

  • max_en (float, optional) – Maximum energy threshold by which the dataset is filtered. Defaults to 2.0E7.

  • mt_coding (str, optional) – Method used to process the MT reaction channel codes. Defaults to “one_hot”.

  • scale_energy (bool, optional) – If True, the energy will be normalized along with all other features. Defaults to False.

  • projectile_coding (str, optional) – Method used to process the type of projectile. Defaults to “one_hot”.

  • pedro (bool, optional) – Personal settings. Defaults to False.

Raises
  • FileNotFoundError – If mode is all and one of the files is missing.

  • FileNotFoundError – If the selected mode file does not exist.

Returns

A single DataFrame if num=False; multiple DataFrames and objects if num=True.

Return type

DataFrame
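The interplay between low_en and max_en can be sketched with plain dictionaries standing in for DataFrame rows. The helper name apply_energy_cutoff, the "Energy" key, and the inclusive comparison are assumptions made for illustration; the actual column names and boundary handling in NucML may differ.

```python
def apply_energy_cutoff(rows, low_en=False, max_en=2.0e7):
    """Keep rows at or below max_en when low_en is True; otherwise keep all."""
    if not low_en:
        return list(rows)
    # Assumed: energies are in eV and the threshold is inclusive.
    return [row for row in rows if row["Energy"] <= max_en]
```

Note that max_en has no effect unless low_en=True, which matches the parameter descriptions above.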

nucml.datasets.load_exfor_raw(mode='neutrons')[source]

Loads the original EXFOR library.

Parameters

mode (str, optional) – Projectile type to load data for. Defaults to “neutrons”. Options also include “alphas”, “deuterons”, “gammas”, “helions”, and “protons”.

Returns

pd.DataFrame

nucml.datasets.load_ripl_parameters()[source]

Loads the RIPL level cut-off parameters file.

Returns

DataFrame

nucml.general_utilities module

nucml.general_utilities.check_if_files_exist(files_list)[source]

Checks if all files in a list of filepaths exist.

Parameters

files_list (list) – List of relative or absolute path-like strings to check for existence.

Returns

True if all files exist, False if at least one does not exist.

Return type

bool
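An equivalent check can be written with the standard library. This is a minimal sketch of the described behaviour, not the NucML source; the name all_files_exist is hypothetical.

```python
import os


def all_files_exist(files_list):
    """Return True only if every path in files_list is an existing file."""
    return all(os.path.isfile(path) for path in files_list)
```

An empty list yields True, because all() over an empty iterable is True.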

nucml.general_utilities.func(x, c, d)[source]

Line equation function. Used to interpolate AME features.

Parameters
  • x (int or float) – Input parameter.

  • c (int or float) – Intercept parameter.

  • d (int or float) – Weight parameter.

Returns

Linear equation result.

Return type

float
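Based on the parameter descriptions (c as intercept, d as weight), the line presumably has the form c + d * x; that form is an assumption, since the source does not print the formula. The sketch below also shows how such a line could be fitted through two known points to estimate a value in between. Both helper names are hypothetical.

```python
def line(x, c, d):
    """Evaluate the straight line c + d * x (assumed form)."""
    return c + d * x


def fit_line(x0, y0, x1, y1):
    """Return (c, d) for the line passing through (x0, y0) and (x1, y1)."""
    d = (y1 - y0) / (x1 - x0)
    c = y0 - d * x0
    return c, d
```

For example, fitting through (1, 2.0) and (3, 6.0) and evaluating at x = 2 gives the midpoint value between the two known points.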

nucml.general_utilities.get_files_w_extension(directory, extension)[source]

Gets a list of relative paths to files that match the given extension in the given directory.

Parameters
  • directory (str) – Path-like string to the directory where the search will be conducted.

  • extension (str) – The extension for which to search files in the directory and all subdirectories (e.g. “.csv”).

Returns

List of relative paths to every file found with the given extension.

Return type

list
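A recursive search like the one described can be sketched with os.walk. This is an illustration only; the name find_files_with_extension is hypothetical, and the returned paths are relative only when the directory argument itself is relative.

```python
import os


def find_files_with_extension(directory, extension):
    """Collect paths of files under directory whose names end with extension."""
    matches = []
    # os.walk visits the directory itself and every subdirectory.
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(extension):
                matches.append(os.path.join(root, name))
    return matches
```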

nucml.general_utilities.initialize_directories(directory, reset=False)[source]

Creates and/or resets the given directory path.

Parameters
  • directory (str) – Path-like string to directory to create and/or reset.

  • reset (bool, optional) – If True, the directory will be deleted and created again.

Returns

None
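The create-or-reset behaviour can be sketched with os and shutil. This is an assumed equivalent rather than the NucML implementation, and the name reset_directory is hypothetical.

```python
import os
import shutil


def reset_directory(directory, reset=False):
    """Create directory; if reset is True, delete any existing copy first."""
    if reset and os.path.isdir(directory):
        # Removes the directory and everything inside it.
        shutil.rmtree(directory)
    os.makedirs(directory, exist_ok=True)
```

Because reset=True deletes the whole tree, it is worth double-checking the path before calling it on a directory that holds results.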

nucml.general_utilities.load_obj(file_path)[source]

Loads a saved pickle python object.

Parameters

file_path (str) – Path-like string to the object to be loaded.

Returns

object

nucml.general_utilities.parse_isotope(isotope, parse_for='ENDF')[source]

This is an internal function that transforms element tags (e.g. U235) into formats appropriate for other internal functions.

Parameters
  • isotope (str) – Isotope to format (e.g. U235, 35cl).

  • parse_for (str, optional) – What loader object is requesting the parsing. Options include “EXFOR” and “ENDF”. Defaults to “ENDF”.

Returns

Formatted isotope identifier.

Return type

str
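The accepted input styles (symbol first or mass number first, any letter case) can be handled by splitting the tag into its two parts. The sketch below shows only that splitting step; the name split_isotope is hypothetical, and the exact output formats NucML produces for “ENDF” and “EXFOR” are not described here, so they are not reproduced.

```python
import re


def split_isotope(isotope):
    """Split a tag such as 'U235' or '35cl' into (element symbol, mass number)."""
    symbol = re.search(r"[A-Za-z]+", isotope)
    mass = re.search(r"\d+", isotope)
    if symbol is None or mass is None:
        raise ValueError(f"Cannot parse isotope tag: {isotope!r}")
    # capitalize() normalises 'cl' or 'CL' to 'Cl'.
    return symbol.group().capitalize(), int(mass.group())
```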

nucml.general_utilities.parse_mt(mt_number, mt_for='ENDF', one_hot=False)[source]

Universal ENDF reaction code parser. This internal function is used to parse and format the reaction integer code for internal functions used by NucML.

Parameters
  • mt_number (int) – Reaction channel code as defined by ENDF/EXFOR.

  • mt_for (str, optional) – What loader object is requesting the parsing. Options include “EXFOR” and “ENDF”. Defaults to “ENDF”.

  • one_hot (bool, optional) – If mt_for=“EXFOR”, this argument specifies whether the MT code should be formatted for a one-hot encoded dataframe. Defaults to False.

Returns

The formatted reaction channel code.

Return type

str or int

nucml.general_utilities.save_obj(obj, saving_dir, name)[source]

Saves a python object with pickle in the saving_dir directory using name. Useful to quickly store objects such as lists or numpy arrays. Do not include the extension in the name. The function automatically adds the .pkl extension to all saved files.

Parameters
  • obj (object) – Object to save. Can be a list, np.array, pd.DataFrame, etc.

  • saving_dir (str) – Path-like string where the object will be saved.

  • name (str) – Name of the object without extension.

Returns

None
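The save and load pair described in this module amounts to a pickle round trip. The sketch below is an assumed equivalent with hypothetical names (save_pickle, load_pickle); unlike save_obj, which returns None, this version returns the written path for convenience.

```python
import os
import pickle


def save_pickle(obj, saving_dir, name):
    """Pickle obj to <saving_dir>/<name>.pkl and return the file path."""
    path = os.path.join(saving_dir, name + ".pkl")
    with open(path, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def load_pickle(file_path):
    """Load and return a previously pickled object."""
    with open(file_path, "rb") as handle:
        return pickle.load(handle)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.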

nucml.processing module

nucml.processing.impute_values(df)[source]

Imputes feature values using linear interpolation element-wise. The passed dataframe must contain both the number of protons and the mass number as “Z” and “A” respectively.

Parameters

df (pd.DataFrame) – DataFrame whose missing values will be imputed. All missing values will be filled.

Returns

New imputed DataFrame.

Return type

pd.DataFrame
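Element-wise linear interpolation means that, within each element (fixed Z), a missing value is estimated from neighbouring isotopes along A. The sketch below illustrates the idea with plain dictionaries instead of a DataFrame. It is not the NucML implementation: the name impute_by_element is hypothetical, and it only fills gaps that have a known value on both sides, whereas the function above is documented to fill all missing values.

```python
from collections import defaultdict


def impute_by_element(rows, feature):
    """Return copies of rows with interior None values of feature interpolated."""
    filled = [dict(row) for row in rows]  # leave the input untouched
    groups = defaultdict(list)
    for row in filled:
        groups[row["Z"]].append(row)
    for group in groups.values():
        group.sort(key=lambda r: r["A"])
        known = [(r["A"], r[feature]) for r in group if r[feature] is not None]
        for row in group:
            if row[feature] is not None:
                continue
            lower = [p for p in known if p[0] < row["A"]]
            upper = [p for p in known if p[0] > row["A"]]
            if lower and upper:
                (a0, y0), (a1, y1) = lower[-1], upper[0]
                # Straight line between the two nearest known isotopes.
                row[feature] = y0 + (y1 - y0) * (row["A"] - a0) / (a1 - a0)
    return filled
```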

nucml.processing.normalize_features(df, to_scale, scaling_type='standard', scaler_dir=None)[source]

Applies a transformer or normalizer to a set of specific features in the provided dataframe.

Parameters
  • df (pd.DataFrame) – DataFrame to normalize/transform.

  • to_scale (list) – List of columns to apply the normalization to.

  • scaling_type (str) – Scaling or transformer to use. Options include “poweryeo”, “standard”, “minmax”, “maxabs”, “robust”, and “quantilenormal”. See the scikit-learn documentation for more information on each of these.

  • scaler_dir (str) – Path-like string to a previously saved scaler. If provided, this overrides any other parameter by loading the scaler from the provided path and using it to transform the provided dataframe. Defaults to None.

Returns

Scikit-learn scaler object.

Return type

object
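Assuming the “standard” option corresponds to scikit-learn's StandardScaler, each selected feature is shifted to zero mean and divided by its population standard deviation. The standard-library sketch below shows that arithmetic for one column; the name standard_scale is hypothetical and the code does not involve NucML or scikit-learn.

```python
import statistics


def standard_scale(values):
    """Return (scaled values, mean, std) using the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no spread; map every entry to 0.0.
        return [0.0 for _ in values], mean, std
    return [(v - mean) / std for v in values], mean, std
```

Keeping the mean and standard deviation alongside the scaled values mirrors why a fitted scaler is saved: the same statistics must be reused to transform new data.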

Module contents