nucml package¶
Subpackages¶
Submodules¶
nucml.config module¶
nucml.configure module¶
-
nucml.configure.
configure
(user_path, ace_path, matlab_exe_path='')[source]¶ Configures an internal file necessary to enable all NucML functionalities that deal with loading data files from EXFOR, AME, ACE, ENSDF, and RIPL.
The ace_path can be an already existing directory from a Serpent distribution. The ML_Nuclear_Data repository contains a version of the ACE files that was used to develop and test all functionalities. If the .ace files have a different structure, the ACE utilities may not work.
- Parameters
user_path (str) – Path-like string pointing to the project directory.
ace_path (str) – Path-like string pointing to the .ace files.
matlab_exe_path (str, optional) – Path-like string pointing to the MATLAB executable. Defaults to an empty string.
- Returns
None
nucml.datasets module¶
-
nucml.datasets.
generate_bigquery_csv
()[source]¶ Creates a single EXFOR data file to update Google BigQuery database.
- Returns
None
-
nucml.datasets.
generate_exfor_dataset
(user_path, modes=['neutrons', 'protons', 'alphas', 'deuterons', 'gammas', 'helions'])[source]¶ Generates all needed EXFOR datasets for neutron-, proton-, alpha-, deuteron-, gamma-, and helion-induced reactions. NucML must be configured first; see nucml.configure. The modes argument can be modified so that the function generates only user-defined datasets.
- Parameters
user_path (str) – path-like string where all information including the datasets will be stored.
modes (list, optional) – Type of projectile for which to generate the datasets. Defaults to [“neutrons”, “protons”, “alphas”, “deuterons”, “gammas”, “helions”].
- Returns
None
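The ordering requirement above (configure first, then generate) can be sketched as a small wrapper. This is a minimal sketch, assuming NucML is installed and importable as nucml; prepare_exfor is a hypothetical helper name, and only the two signatures documented on this page are used.

```python
def prepare_exfor(user_path, ace_path, modes=("neutrons", "protons")):
    """Hypothetical helper: configure NucML, then build the EXFOR datasets."""
    # Imported lazily so this sketch can be defined without NucML installed.
    import nucml.configure
    import nucml.datasets

    # Configuration must happen before any dataset is generated.
    nucml.configure.configure(user_path, ace_path)
    # Restrict generation to the requested projectile types.
    nucml.datasets.generate_exfor_dataset(user_path, modes=list(modes))
```

Passing a shorter modes list, as here, limits generation to the projectiles that are actually needed.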
-
nucml.datasets.
load_ame
(natural=False, imputed_nan=False, file='merged')[source]¶ Loads the Atomic Mass Evaluation 2016 data generated by NucML using the parsing utilities.
For file="merged", there are four AME dataset versions:
1. AME_all_merged (natural=False, imputed_nan=False): Contains all available AME information from the mass, rct1, and rct2 files.
2. AME_all_merged_no_NaN (natural=False, imputed_nan=True): Same as 1, except all missing values are imputed linearly and element-wise.
3. AME_Natural_Properties_w_NaN (natural=True, imputed_nan=False): Similar to 2, except data for natural abundance elements is included.
4. AME_Natural_Properties_no_NaN (natural=True, imputed_nan=True): Same as 3, except all missing values are imputed linearly and element-wise.
- Parameters
natural (bool) – if True, the AME data containing natural element data will be loaded. Only applicable when file=’merged’.
imputed_nan (bool) – If True, the dataset loaded will not contain any missing values (imputed version will be loaded).
file (str) – Dataset to extract. Options include ‘merged’, ‘mass16’, ‘rct1’, and ‘rct2’.
- Returns
A pandas DataFrame containing the queried AME data.
- Return type
DataFrame
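The way the two flags select one of the four merged versions can be written as a lookup table. This is only an illustration of the mapping described above; merged_ame_name and AME_MERGED_VERSIONS are hypothetical names, not part of NucML, and no file is actually loaded.

```python
# Dataset names as listed above for file="merged".
AME_MERGED_VERSIONS = {
    (False, False): "AME_all_merged",
    (False, True): "AME_all_merged_no_NaN",
    (True, False): "AME_Natural_Properties_w_NaN",
    (True, True): "AME_Natural_Properties_no_NaN",
}


def merged_ame_name(natural=False, imputed_nan=False):
    """Hypothetical helper: name of the merged AME version the flags select."""
    return AME_MERGED_VERSIONS[(bool(natural), bool(imputed_nan))]


print(merged_ame_name(natural=True))  # AME_Natural_Properties_w_NaN
```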
-
nucml.datasets.
load_ensdf
(cutoff=False, append_ame=False)[source]¶ Loads the Evaluated Nuclear Structure Data File structure levels data generated through NucML parsing utilities.
- Parameters
cutoff (bool, optional) – If True, the excited levels are cut off according to the RIPL cutoff parameters. Defaults to False.
append_ame (bool, optional) –
- Returns
DataFrame
-
nucml.datasets.
load_ensdf_ground_states
()[source]¶ Loads the ENSDF file. Only ground state information.
- Returns
DataFrame
-
nucml.datasets.
load_ensdf_headers
()[source]¶ Loads ENSDF headers from RIPL .dat files.
- Returns
DataFrame
-
nucml.datasets.
load_ensdf_isotopic
(isotope, filetype='levels')[source]¶ Loads level or gamma records for a given isotope (e.g. U235).
- Parameters
isotope (str) – Isotope to query (e.g. u235, cl35, 239Pu).
filetype (str, optional) – Specifies if level or gamma records are to be extracted. Options include “levels” and “gammas”. Defaults to “levels”.
- Returns
DataFrame
-
nucml.datasets.
load_ensdf_ml
(cutoff=False, log_sqrt=False, log=False, append_ame=False, basic=-1, num=False, frac=0.3, scaling_type='standard', scaler_dir=None, normalize=True)[source]¶ EXPERIMENTAL (NOT MEANT FOR USE)
- Parameters
cutoff (bool, optional) – [description]. Defaults to False.
log_sqrt (bool, optional) – [description]. Defaults to False.
log (bool, optional) – [description]. Defaults to False.
append_ame (bool, optional) – [description]. Defaults to False.
basic (int, optional) – [description]. Defaults to -1.
num (bool, optional) – [description]. Defaults to False.
frac (float, optional) – [description]. Defaults to 0.3.
scaling_type (str, optional) – [description]. Defaults to “standard”.
scaler_dir ([type], optional) – [description]. Defaults to None.
normalize (bool, optional) – [description]. Defaults to True.
- Returns
[description]
- Return type
[type]
-
nucml.datasets.
load_evaluation
(isotope, MT, mode='neutrons', library='endfb8.0', mev_to_ev=True, mb_to_b=True, log=True, drop_u=True)[source]¶ Reads an evaluation file for a specific isotope, reaction channel, and evaluated library. Inspect the returned data: it comes from a local database built by an external source that extracted the values from ENDF with a script, and some reactions have been found to be missing. These can be added manually for future loading.
- Parameters
isotope (str) – Isotope to query (e.g. U233, Cl35).
MT (int) – Reaction channel ENDF code. Must be an integer (e.g. 1, 2, 3).
mode (str) – Type of projectile. Only “neutrons” and “protons” are supported for now.
library (str) – Evaluation library to query. Allowed options include endfb8.0, jendl4.0, jeff3.3, and tendl.2019.
mev_to_ev (bool) – If True, it converts the energy from MeV to eV.
mb_to_b (bool) – If True, it converts the cross sections from millibarns to barns.
log (bool) – If True, it applies the log10 to both the Energy and the Cross Section.
drop_u (bool) – Sometimes, evaluation files contain uncertainty values. If True, these features are removed.
- Returns
pandas DataFrame containing the ENDF datapoints.
- Return type
evaluation (DataFrame)
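The effect of the mev_to_ev, mb_to_b, and log flags on a single data point can be sketched as follows. This is a minimal illustration, not NucML's code: convert_point is a hypothetical helper, and applying the unit conversions before the logarithm is an assumption.

```python
import math


def convert_point(energy_mev, xs_mb, mev_to_ev=True, mb_to_b=True, log=True):
    """Hypothetical helper: apply the documented conversions to one data point."""
    energy = energy_mev * 1e6 if mev_to_ev else energy_mev  # MeV -> eV
    xs = xs_mb * 1e-3 if mb_to_b else xs_mb  # millibarns -> barns
    if log:
        # log10 is only defined for strictly positive values.
        energy, xs = math.log10(energy), math.log10(xs)
    return energy, xs


energy, xs = convert_point(2.0, 1500.0)
print(energy, xs)
```

With log=True the inputs must be strictly positive, so zero cross sections would have to be handled before the transform.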
-
nucml.datasets.
load_exfor
(log=False, low_en=False, basic=-1, num=False, frac=0.1, mode='neutrons', scaling_type='standard', scaler_dir=None, filters=False, max_en=20000000.0, mt_coding='one_hot', scale_energy=False, projectile_coding='one_hot', normalize=True, pedro=False, pedro_v2=False)[source]¶ Loads the EXFOR dataset in its various forms. This function helps load ML-ready EXFOR datasets for different particle-induced reactions or all of them.
- Parameters
log (bool, optional) – If True, the log of the Energy and Cross Section is taken. Defaults to False.
low_en (bool, optional) – If True, an upper limit in energy is applied given by the max_en argument. Defaults to False.
basic (int, optional) – Indicates how many features to load. -1 means all available features. Defaults to -1.
num (bool, optional) – If True, only numerical and relevant categorical features are loaded. Defaults to False.
frac (float, optional) – Fraction of the dataset for test set. Defaults to 0.1.
mode (str, optional) – Dataset to load. Options include neutrons, gammas, and protons. Defaults to “neutrons”.
scaling_type (str, optional) – Type of scaler to use for normalizing the dataset. Defaults to “standard”.
scaler_dir (str, optional) – Directory in which to store the trained scaler. Defaults to None.
filters (bool, optional) – If True, a variety of filters are applied that help discard irregular data. Defaults to False.
max_en (float, optional) – Maximum energy threshold by which the dataset is filtered. Defaults to 2.0E7.
mt_coding (str, optional) – Method used to process the MT reaction channel codes. Defaults to “one_hot”.
scale_energy (bool, optional) – If True, the energy will be normalized along with all other features. Defaults to False.
projectile_coding (str, optional) – Method used to process the type of projectile. Defaults to “one_hot”.
pedro (bool, optional) – Personal settings. Defaults to False.
- Raises
FileNotFoundError – If mode is all and one of the files is missing.
FileNotFoundError – If the selected mode file does not exist.
- Returns
A single DataFrame if num=False. Multiple DataFrames and objects if num=True.
- Return type
DataFrame
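The roles of low_en, max_en, and frac can be illustrated on plain Python rows. This is a rough sketch under several assumptions that the page does not confirm: the column name "Energy", an inclusive cutoff, and a random shuffle for the test split. cutoff_and_split is a hypothetical helper.

```python
import random


def cutoff_and_split(rows, low_en=False, max_en=2.0e7, frac=0.1, seed=0):
    """Hypothetical helper: apply the energy cutoff, then hold out a test set."""
    if low_en:
        # Keep only points at or below the energy threshold (assumed inclusive).
        rows = [r for r in rows if r["Energy"] <= max_en]
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seeded for reproducibility
    n_test = int(round(frac * len(rows)))
    return rows[n_test:], rows[:n_test]  # (train, test)


points = [{"Energy": float(e)} for e in (1e3, 1e5, 1e7, 3e7, 5e7)]
train, test = cutoff_and_split(points, low_en=True, frac=0.34)
print(len(train), len(test))  # 2 1
```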
nucml.general_utilities module¶
-
nucml.general_utilities.
check_if_files_exist
(files_list)[source]¶ Checks if all files in a list of filepaths exist.
- Parameters
files_list (list) – List of relative or absolute path-like strings to check for existence.
- Returns
True if all files exist, False if one or more do not exist.
- Return type
bool
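A check of this kind can be sketched with the standard library. This is an illustration of the described behavior, not NucML's implementation; all_files_exist is a hypothetical name.

```python
import os


def all_files_exist(files_list):
    """Return True only if every path in files_list is an existing file."""
    return all(os.path.isfile(path) for path in files_list)
```

In this sketch an empty list yields True, because all() of an empty iterable is True; whether NucML treats that case the same way is not stated here.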
-
nucml.general_utilities.
func
(x, c, d)[source]¶ Line equation function. Used to interpolate AME features.
- Parameters
x (int or float) – Input parameter.
c (int or float) – Intercept parameter.
d (int or float) – Weight parameter.
- Returns
Linear equation result.
- Return type
float
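A line with intercept c and weight d is conventionally written as c + d * x. The sketch below assumes that form (the page does not show the function body) and uses the hypothetical name line to show how such a line fills a value between two known points.

```python
def line(x, c, d):
    """Straight line with intercept c and weight (slope) d."""
    return c + d * x


print(line(2.0, 1.0, 3.0))  # 7.0

# Fit the line through two known points and evaluate it in between.
(x0, y0), (x1, y1) = (10, 4.0), (12, 8.0)
d = (y1 - y0) / (x1 - x0)
c = y0 - d * x0
print(line(11, c, d))  # 6.0
```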
-
nucml.general_utilities.
get_files_w_extension
(directory, extension)[source]¶ Gets a list of relative paths to files that match the given extension in the given directory.
- Parameters
directory (str) – Path-like string to the directory where the search will be conducted.
extension (str) – The extension for which to search files in the directory and all subdirectories (e.g. “.csv”).
- Returns
Relative paths to each encountered file with the given extension.
- Return type
list
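A recursive search like this can be sketched with os.walk. This is a minimal sketch, not NucML's code: find_files_with_extension is a hypothetical name, and returning paths relative to the searched directory in sorted order is an assumption.

```python
import os


def find_files_with_extension(directory, extension):
    """Collect paths, relative to `directory`, of files ending in `extension`."""
    matches = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(extension):
                full_path = os.path.join(root, name)
                matches.append(os.path.relpath(full_path, directory))
    return sorted(matches)
```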
-
nucml.general_utilities.
initialize_directories
(directory, reset=False)[source]¶ Creates and/or resets the given directory path.
- Parameters
directory (str) – Path-like string to directory to create and/or reset.
reset (bool, optional) – If True, the directory will be deleted and created again.
- Returns
None
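The create-and-reset behavior can be sketched with os and shutil. This illustrates the description above and is not NucML's implementation; init_directory is a hypothetical name.

```python
import os
import shutil


def init_directory(directory, reset=False):
    """Create `directory`; with reset=True, delete it first if it exists."""
    if reset and os.path.isdir(directory):
        shutil.rmtree(directory)  # removes the directory and all its contents
    os.makedirs(directory, exist_ok=True)
```

Because a reset deletes everything inside the directory, it is worth double-checking the path before passing reset=True.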
-
nucml.general_utilities.
load_obj
(file_path)[source]¶ Loads a saved pickled Python object.
- Parameters
file_path (str) – Path-like string to the object to be loaded.
- Returns
object
-
nucml.general_utilities.
parse_isotope
(isotope, parse_for='ENDF')[source]¶ This is an internal function that transforms element tags (e.g. U235) into formats appropriate for other internal functions.
- Parameters
isotope (str) – Isotope to format (e.g. U235, 35cl).
parse_for (str, optional) – What loader object is requesting the parsing. Options include “EXFOR” and “ENDF”. Defaults to “ENDF”.
- Returns
Formatted isotope identifier.
- Return type
str
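The page does not show the output formats produced for “EXFOR” or “ENDF”, so the sketch below covers only the first step such a parser needs: accepting tags in either order and any letter case. split_isotope is a hypothetical helper, not part of NucML.

```python
import re


def split_isotope(isotope):
    """Hypothetical helper: split a tag like 'U235' or '35cl' into (symbol, mass)."""
    tag = isotope.strip()
    symbol = re.search(r"[A-Za-z]+", tag)
    mass = re.search(r"\d+", tag)
    if symbol is None or mass is None:
        raise ValueError(f"Cannot parse isotope tag: {isotope!r}")
    return symbol.group().capitalize(), int(mass.group())


print(split_isotope("U235"))   # ('U', 235)
print(split_isotope("35cl"))   # ('Cl', 35)
print(split_isotope("239Pu"))  # ('Pu', 239)
```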
-
nucml.general_utilities.
parse_mt
(mt_number, mt_for='ENDF', one_hot=False)[source]¶ Universal ENDF reaction code parser. This internal function is used to parse and format the reaction integer code for internal functions used by NucML.
- Parameters
mt_number (int) – Reaction channel code as defined by ENDF/EXFOR.
mt_for (str, optional) – What loader object is requesting the parsing. Options include “EXFOR” and “ENDF”. Defaults to “ENDF”.
one_hot (bool, optional) – If mt_for=”EXFOR”, then this argument specifies if the MT code should be formatted for a one-hot encoded dataframe. Defaults to False.
- Returns
The formatted reaction channel code.
- Return type
str or int
-
nucml.general_utilities.
save_obj
(obj, saving_dir, name)[source]¶ Saves a python object with pickle in the saving_dir directory using name. Useful to quickly store objects such as lists or numpy arrays. Do not include the extension in the name. The function automatically adds the .pkl extension to all saved files.
- Parameters
obj (object) – Object to save. Can be a list, np.array, pd.DataFrame, etc.
saving_dir (str) – Path-like string where the object will be saved.
name (str) – Name of the object without extension.
- Returns
None
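The save and load round trip can be sketched with the standard pickle module. This is an illustration, not NucML's code: save_pickle and load_pickle are hypothetical names, and unlike save_obj the saving helper here returns the file path for convenience.

```python
import os
import pickle
import tempfile


def save_pickle(obj, saving_dir, name):
    """Pickle `obj` to <saving_dir>/<name>.pkl and return the file path."""
    file_path = os.path.join(saving_dir, name + ".pkl")
    with open(file_path, "wb") as handle:
        pickle.dump(obj, handle)
    return file_path


def load_pickle(file_path):
    """Load a pickled Python object from `file_path`."""
    with open(file_path, "rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    path = save_pickle([1, 2, 3], tmp, "numbers")
    print(load_pickle(path))  # [1, 2, 3]
```

Unpickling can execute arbitrary code, so only files from trusted sources should be loaded this way.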
nucml.processing module¶
-
nucml.processing.
impute_values
(df)[source]¶ Imputes feature values using linear interpolation element-wise. The passed dataframe must contain both the number of protons and mass number as “Z” and “A” respectively.
- Parameters
df (pd.DataFrame) – DataFrame in which to impute values. All missing values will be filled.
- Returns
New imputed DataFrame.
- Return type
pd.DataFrame
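Element-wise linear interpolation means that gaps are filled separately for each Z, using A as the independent variable. The sketch below shows the idea on plain dictionaries rather than a DataFrame. It is not NucML's implementation: impute_by_element and the feature name "mass" are hypothetical, and it fills only gaps that lie between two known values, whereas the function above is documented to fill all missing values.

```python
from collections import defaultdict


def impute_by_element(rows, feature):
    """Fill interior gaps (None) in `feature` by linear interpolation in A,
    separately for each Z. Rows are modified in place."""
    by_element = defaultdict(list)
    for row in rows:
        by_element[row["Z"]].append(row)
    for group in by_element.values():
        group.sort(key=lambda r: r["A"])
        known = [r for r in group if r[feature] is not None]
        # Interpolate between each pair of neighboring known values.
        for left, right in zip(known, known[1:]):
            slope = (right[feature] - left[feature]) / (right["A"] - left["A"])
            for row in group:
                if left["A"] < row["A"] < right["A"] and row[feature] is None:
                    row[feature] = left[feature] + slope * (row["A"] - left["A"])
    return rows


data = [
    {"Z": 8, "A": 16, "mass": 1.0},
    {"Z": 8, "A": 17, "mass": None},
    {"Z": 8, "A": 18, "mass": 3.0},
    {"Z": 9, "A": 19, "mass": None},  # no neighbors with the same Z: left as None
]
impute_by_element(data, "mass")
print(data[1]["mass"])  # 2.0
```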
-
nucml.processing.
normalize_features
(df, to_scale, scaling_type='standard', scaler_dir=None)[source]¶ Applies a transformer or normalizer to a set of specific features in the provided dataframe.
- Parameters
df (pd.DataFrame) – DataFrame to normalize/transform.
to_scale (list) – List of columns to apply the normalization to.
scaling_type (str) – Scaling or transformer to use. Options include “poweryeo”, “standard”, “minmax”, “maxabs”, “robust”, and “quantilenormal”. See the scikit-learn documentation for more information on each of these.
scaler_dir (str) – Path-like string to a previously saved scaler. If provided, this overrides any other parameter by loading the scaler from the provided path and using it to transform the provided dataframe. Defaults to None.
- Returns
Scikit-learn scaler object.
- Return type
object
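For the “standard” option, scaling subtracts each column's mean and divides by its population standard deviation, which is what scikit-learn's StandardScaler computes. The sketch below reproduces that arithmetic on plain dictionaries without scikit-learn. standard_scale is a hypothetical helper, and returning the fitted parameters stands in for returning the scaler object.

```python
from statistics import fmean, pstdev


def standard_scale(rows, to_scale):
    """Z-score the listed columns in place; return the fitted (mean, std) pairs."""
    params = {}
    for column in to_scale:
        values = [row[column] for row in rows]
        mean, std = fmean(values), pstdev(values)
        std = std or 1.0  # constant column: avoid division by zero
        for row in rows:
            row[column] = (row[column] - mean) / std
        params[column] = (mean, std)
    return params


rows = [{"Energy": 1.0, "MT": 1}, {"Energy": 3.0, "MT": 2}]
params = standard_scale(rows, ["Energy"])
print(rows)    # [{'Energy': -1.0, 'MT': 1}, {'Energy': 1.0, 'MT': 2}]
print(params)  # {'Energy': (2.0, 1.0)}
```

Only the columns named in to_scale change; keeping the fitted parameters lets the same transformation be applied to new data later.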