{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Generating Benchmark Files and Serpent Scripts\n", "\n", "It is important to validate trained models against criticality benchmarks. NucML provides a couple of scripts to help automate this tedious process. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:17:53.935202Z", "start_time": "2021-05-07T22:17:53.932201Z" } }, "outputs": [], "source": [ "# Prototype\n", "import sys\n", "# This allows us to import the nucml utilities\n", "sys.path.append(\"..\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:17:57.741286Z", "start_time": "2021-05-07T22:17:54.421117Z" } }, "outputs": [], "source": [ "import pandas as pd\n", "import os\n", "import logging\n", "logger = logging.getLogger()\n", "logger.setLevel(logging.CRITICAL)\n", "\n", "pd.set_option('display.max_columns', 500)\n", "pd.set_option('display.max_rows', 50)\n", "pd.options.mode.chained_assignment = None # default='warn'\n", "\n", "import nucml.datasets as nuc_data\n", "import nucml.ace.data_utilities as ace_utils\n", "import nucml.model.utilities as model_utils" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:17:58.062378Z", "start_time": "2021-05-07T22:17:58.059880Z" } }, "outputs": [], "source": [ "figure_dir = \"Figures/\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading Datasets\n", "\n", "In our work, several models were trained with Datasets 0-4. Since the models will be queried at the energy grid of the original ACE files, we need to load the original data to expand the energy grid for the isotopes of interest (among other processing steps)." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2021-04-09T02:20:33.992108Z", "start_time": "2021-04-09T02:09:46.504840Z" } }, "outputs": [], "source": [ "# LOADING DATASET\n", "df_b0, _, _, _, _, to_scale_b0, _ = nuc_data.load_exfor(pedro=True, basic=0, normalize=False)\n", "df_b1, _, _, _, _, to_scale_b1, _ = nuc_data.load_exfor(pedro=True, basic=1, normalize=False)\n", "df_b2, _, _, _, _, to_scale_b2, _ = nuc_data.load_exfor(pedro=True, basic=2, normalize=False)\n", "df_b3, _, _, _, _, to_scale_b3, _ = nuc_data.load_exfor(pedro=True, basic=3, normalize=False)\n", "df_b4, _, _, _, _, to_scale_b4, _ = nuc_data.load_exfor(pedro=True, basic=4, normalize=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading Decision Tree Results\n", "\n", "`NucML` will create a directory per model and create subdirectories for every criticality benchmark case. Therefore, we need to specify the directories where the model directories and subdirectories will be stored. " ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2021-04-09T02:33:29.671649Z", "start_time": "2021-04-09T02:33:29.668647Z" } }, "outputs": [], "source": [ "dt_ml_ace_dir_b0 = \"ml/DT_B0/\"\n", "dt_ml_ace_dir_b1 = \"ml/DT_B1/\"\n", "dt_ml_ace_dir_b2 = \"ml/DT_B2/\"\n", "dt_ml_ace_dir_b3 = \"ml/DT_B3/\"\n", "dt_ml_ace_dir_b4 = \"ml/DT_B4/\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Having defined the directories, we can read in the training results. In this example, I read the samples provided with the repository. 
" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:18:00.262284Z", "start_time": "2021-05-07T22:18:00.196997Z" } }, "outputs": [], "source": [ "# read in the training results\n", "results_b0 = pd.read_csv(\"../ML_EXFOR_neutrons/2_DT/dt_resultsB0.csv\").sort_values(by=\"max_depth\")\n", "results_b1 = pd.read_csv(\"../ML_EXFOR_neutrons/2_DT/dt_resultsB1.csv\").sort_values(by=\"max_depth\")\n", "results_b2 = pd.read_csv(\"../ML_EXFOR_neutrons/2_DT/dt_resultsB2.csv\").sort_values(by=\"max_depth\")\n", "results_b3 = pd.read_csv(\"../ML_EXFOR_neutrons/2_DT/dt_resultsB3.csv\").sort_values(by=\"max_depth\")\n", "results_b4 = pd.read_csv(\"../ML_EXFOR_neutrons/2_DT/dt_resultsB4.csv\").sort_values(by=\"max_depth\")\n", "\n", "results_b0 = results_b0[results_b0.normalizer == \"none\"]" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:19:24.592239Z", "start_time": "2021-05-07T22:19:24.588238Z" } }, "source": [ "Let us take a look at the columns included in the results:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:18:06.266233Z", "start_time": "2021-05-07T22:18:06.260235Z" } }, "outputs": [ { "data": { "text/plain": [ "Index(['id', 'max_depth', 'mss', 'msl', 'mt_strategy', 'normalizer',\n", " 'train_mae', 'train_mse', 'train_evs', 'train_mae_m', 'train_r2',\n", " 'val_mae', 'val_mse', 'val_evs', 'val_mae_m', 'val_r2', 'test_mae',\n", " 'test_mse', 'test_evs', 'test_mae_m', 'test_r2', 'model_path',\n", " 'training_time', 'scaler_path'],\n", " dtype='object')" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results_b0.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that there are a lot of performance metrics available. You can include more information in your own results files. 
**THE ONLY CONDITION IS THAT THE RESULTING DATAFRAME CONTAINS THE FOLLOWING COLUMNS:**\n", "\n", "- model_path\n", "- scaler_path" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:18:45.293352Z", "start_time": "2021-05-07T22:18:45.284851Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>model_path</th>\n", "      <th>scaler_path</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", 
"    <tr>\n", "      <th>52</th>\n", "      <td>E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL10_none...</td>\n", "      <td>E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL10_none...</td>\n", "    </tr>\n", 
"    <tr>\n", "      <th>47</th>\n", "      <td>E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL3_none_...</td>\n", "      <td>E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL3_none_...</td>\n", "    </tr>\n", 
"    <tr>\n", "      <th>45</th>\n", "      <td>E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL1_none_...</td>\n", "      <td>E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL1_none_...</td>\n", "    </tr>\n", 
"    <tr>\n", "      <th>43</th>\n", "      <td>E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS10_MSL7_none_...</td>\n", "      <td>E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS10_MSL7_none_...</td>\n", "    </tr>\n", 
"    <tr>\n", "      <th>41</th>\n", "      <td>E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS10_MSL5_none_...</td>\n", "      <td>E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS10_MSL5_none_...</td>\n", "    </tr>\n", 
"  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " model_path \\\n", "52 E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL10_none... \n", "47 E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL3_none_... \n", "45 E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL1_none_... \n", "43 E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS10_MSL7_none_... \n", "41 E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS10_MSL5_none_... \n", "\n", " scaler_path \n", "52 E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL10_none... \n", "47 E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL3_none_... \n", "45 E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS15_MSL1_none_... \n", "43 E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS10_MSL7_none_... \n", "41 E:\\ML_Models_EXFOR\\DT_B0\\DT60_MSS10_MSL5_none_... " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results_b0[['model_path', 'scaler_path']].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us extract a single path:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:23:08.229226Z", "start_time": "2021-05-07T22:23:08.225227Z" } }, "outputs": [ { "data": { "text/plain": [ "'E:\\\\ML_Models_EXFOR\\\\DT_B0\\\\DT60_MSS15_MSL10_none_one_hot_B0_v1\\\\DT60_MSS15_MSL10_none_one_hot_B0_v1.joblib'" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "example_filename = results_b0.model_path.values[0]\n", "example_filename" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NucML uses the following convention to extract the model's name:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:23:42.348340Z", "start_time": "2021-05-07T22:23:42.344842Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "First, extract the model filename: DT60_MSS15_MSL10_none_one_hot_B0_v1.joblib\n" ] } ], "source": [ "example_basename = os.path.basename(example_filename)\n", "print(\"First, extract the model filename: \", 
example_basename)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:23:50.868384Z", "start_time": "2021-05-07T22:23:50.864385Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Then we split the filename to remove the file extension: DT60_MSS15_MSL10_none_one_hot_B0_v1\n" ] } ], "source": [ "print(\"Then we split the filename to remove the file extension: \", example_basename.split(\".\")[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This name, `DT60_MSS15_MSL10_none_one_hot_B0_v1`, is what will be used to create the model directory." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generating Benchmark Files\n", "\n", "The next step is to select the benchmark of interest. For information on the available benchmarks and instructions on how to include your own, please read the `ML_Nuclear_Data/Benchmarks/inputs/README.md` file. The included benchmarks are formatted in a specific way for `NucML` to read them.\n", "\n", "In this case, we select `U233_MET_FAST_001` (the U-233 Jezebel criticality benchmark). When configuring NucML, the path to the benchmark folder is saved automatically, so only the name needs to be specified.\n", "\n", "Only those isotopes with a composition higher than 10% per benchmark component are replaced with ML-generated cross sections. The backend makes many assumptions about how unitarity is enforced. More information can be found in my thesis. The approach is by no means the best nor the worst; it was created as proof-of-concept work. The `generate_bench_ml_xs` function should be lab-specific. In other words, you should create your own processing step if possible. 
" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:37:58.107598Z", "start_time": "2021-05-07T22:37:58.105097Z" } }, "outputs": [], "source": [ "BENCHMARK_NAME = \"U233_MET_FAST_001\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "ExecuteTime": { "end_time": "2021-04-09T06:31:25.515909Z", "start_time": "2021-04-09T03:04:44.347972Z" }, "scrolled": true }, "outputs": [], "source": [ "ace_utils.generate_bench_ml_xs(df_b0, results_b0, BENCHMARK_NAME, to_scale_b0, dt_ml_ace_dir_b0, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b1, results_b1, BENCHMARK_NAME, to_scale_b1, dt_ml_ace_dir_b1, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b2, results_b2, BENCHMARK_NAME, to_scale_b2, dt_ml_ace_dir_b2, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b3, results_b3, BENCHMARK_NAME, to_scale_b3, dt_ml_ace_dir_b3, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b4, results_b4, BENCHMARK_NAME, to_scale_b4, dt_ml_ace_dir_b4, reset=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us see which directories were created:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:40:01.920873Z", "start_time": "2021-05-07T22:40:01.916374Z" } }, "outputs": [ { "data": { "text/plain": [ "['DT_B0', 'DT_B1', 'DT_B2', 'DT_B3', 'DT_B4']" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "os.listdir(\"ml/\")[:5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Indeed, all five Decision Tree directories were created successfully (directories for other models also exist from previous work, so only the first five entries are listed). 
Let us peek at the contents of the first directory and its subdirectories:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:40:20.960279Z", "start_time": "2021-05-07T22:40:20.956280Z" } }, "outputs": [ { "data": { "text/plain": [ "['DT100_MSS10_MSL1_none_one_hot_B0_v1',\n", " 'DT100_MSS10_MSL3_none_one_hot_B0_v1',\n", " 'DT100_MSS10_MSL5_none_one_hot_B0_v1',\n", " 'DT100_MSS10_MSL7_none_one_hot_B0_v1',\n", " 'DT100_MSS15_MSL1_none_one_hot_B0_v1']" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "os.listdir(\"ml/DT_B0/\")[:5]" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:40:25.984863Z", "start_time": "2021-05-07T22:40:25.980363Z" } }, "outputs": [ { "data": { "text/plain": [ "['U233_MET_FAST_001', 'U233_MET_FAST_002_001', 'U233_MET_FAST_002_002']" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "os.listdir(\"ml/DT_B0/DT100_MSS10_MSL1_none_one_hot_B0_v1/\")" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:40:36.801050Z", "start_time": "2021-05-07T22:40:36.796053Z" } }, "outputs": [ { "data": { "text/plain": [ "['acelib',\n", " 'converter.m',\n", " 'input',\n", " 'input.out',\n", " 'input.seed',\n", " 'input_res.m',\n", " 'ml_xs_csv',\n", " 'results.mat',\n", " 'sss_endfb7u.xsdata']" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "os.listdir(\"ml/DT_B0/DT100_MSS10_MSL1_none_one_hot_B0_v1/U233_MET_FAST_001/\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generating SERPENT Bash Script\n", "\n", "This is a completely experimental feature. You can pass in the directory that you want NucML to scan, and it will generate a single bash script to run all cases. 
" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "ExecuteTime": { "end_time": "2021-04-09T19:42:02.769358Z", "start_time": "2021-04-09T19:42:01.790369Z" } }, "outputs": [], "source": [ "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b0, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b1, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b2, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b3, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b4, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:42:36.520058Z", "start_time": "2021-05-07T22:42:36.515560Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['cd /mnt/c/Users/Pedro/Desktop/ML_Nuclear_Data/Benchmarks/ml/DT_B0/DT100_MSS10_MSL1_none_one_hot_B0_v1/U233_MET_FAST_001/\\n', 'sss2 -omp 10 input\\n', '/mnt/c/Program\\\\ Files/MATLAB/R2019a/bin/matlab.exe -nodisplay -nosplash -nodesktop -r \"run(\\'converter.m\\');exit;\" \\n', 'cd /mnt/c/Users/Pedro/Desktop/ML_Nuclear_Data/Benchmarks/ml/DT_B0/DT100_MSS10_MSL3_none_one_hot_B0_v1/U233_MET_FAST_001/\\n', 'sss2 -omp 10 input\\n', '/mnt/c/Program\\\\ Files/MATLAB/R2019a/bin/matlab.exe -nodisplay -nosplash -nodesktop -r \"run(\\'converter.m\\');exit;\" \\n', 'cd /mnt/c/Users/Pedro/Desktop/ML_Nuclear_Data/Benchmarks/ml/DT_B0/DT100_MSS10_MSL5_none_one_hot_B0_v1/U233_MET_FAST_001/\\n', 'sss2 -omp 10 input\\n', '/mnt/c/Program\\\\ Files/MATLAB/R2019a/bin/matlab.exe -nodisplay -nosplash -nodesktop -r \"run(\\'converter.m\\');exit;\" \\n', 'cd /mnt/c/Users/Pedro/Desktop/ML_Nuclear_Data/Benchmarks/ml/DT_B0/DT100_MSS10_MSL7_none_one_hot_B0_v1/U233_MET_FAST_001/\\n']\n" ] } ], "source": [ "with open(\"ml/DT_B0/U233_MET_FAST_001.sh\") as myfile:\n", " head = 
[next(myfile) for x in range(10)]\n", "print(head)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "ExecuteTime": { "end_time": "2021-05-07T22:44:04.927401Z", "start_time": "2021-05-07T22:44:04.922899Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cd /mnt/c/Users/Pedro/Desktop/ML_Nuclear_Data/Benchmarks/ml/DT_B0/DT100_MSS10_MSL1_none_one_hot_B0_v1/U233_MET_FAST_001/\n", "\n", "sss2 -omp 10 input\n", "\n", "/mnt/c/Program\\ Files/MATLAB/R2019a/bin/matlab.exe -nodisplay -nosplash -nodesktop -r \"run('converter.m');exit;\" \n", "\n", "cd /mnt/c/Users/Pedro/Desktop/ML_Nuclear_Data/Benchmarks/ml/DT_B0/DT100_MSS10_MSL3_none_one_hot_B0_v1/U233_MET_FAST_001/\n", "\n", "sss2 -omp 10 input\n", "\n", "/mnt/c/Program\\ Files/MATLAB/R2019a/bin/matlab.exe -nodisplay -nosplash -nodesktop -r \"run('converter.m');exit;\" \n", "\n", "cd /mnt/c/Users/Pedro/Desktop/ML_Nuclear_Data/Benchmarks/ml/DT_B0/DT100_MSS10_MSL5_none_one_hot_B0_v1/U233_MET_FAST_001/\n", "\n", "sss2 -omp 10 input\n", "\n", "/mnt/c/Program\\ Files/MATLAB/R2019a/bin/matlab.exe -nodisplay -nosplash -nodesktop -r \"run('converter.m');exit;\" \n", "\n", "cd /mnt/c/Users/Pedro/Desktop/ML_Nuclear_Data/Benchmarks/ml/DT_B0/DT100_MSS10_MSL7_none_one_hot_B0_v1/U233_MET_FAST_001/\n", "\n" ] } ], "source": [ "with open(\"ml/DT_B0/U233_MET_FAST_001.sh\", \"r\") as file: # \"r\" opens the file in read mode\n", "    for i in range(10):\n", "        line = next(file)\n", "        print(line)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the full path to the MATLAB executable is hard-coded in the script. You will probably need to change it, either by writing a script or by simply using \"Replace All\" in any code editor. MATLAB is used to convert the Serpent output into a `.mat` file, which makes it easier for analysis tools to read detector information.\n", "\n", "The next step is simply to run the script, and you are done! See the next notebook for information on how to gather and analyze the results. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### --------------------------------------------------------- PRIVATE SECTION\n", "\n", "Same material for other benchmarks. " ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "ExecuteTime": { "end_time": "2021-04-09T19:41:40.667968Z", "start_time": "2021-04-09T19:41:40.663970Z" } }, "outputs": [], "source": [ "BENCHMARK_NAME = \"U233_MET_FAST_002_001\"" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "ExecuteTime": { "end_time": "2021-04-09T12:51:32.531642Z", "start_time": "2021-04-09T06:31:25.527910Z" } }, "outputs": [], "source": [ "ace_utils.generate_bench_ml_xs(df_b0, results_b0, BENCHMARK_NAME, to_scale_b0, dt_ml_ace_dir_b0, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b1, results_b1, BENCHMARK_NAME, to_scale_b1, dt_ml_ace_dir_b1, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b2, results_b2, BENCHMARK_NAME, to_scale_b2, dt_ml_ace_dir_b2, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b3, results_b3, BENCHMARK_NAME, to_scale_b3, dt_ml_ace_dir_b3, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b4, results_b4, BENCHMARK_NAME, to_scale_b4, dt_ml_ace_dir_b4, reset=True)" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "ExecuteTime": { "end_time": "2021-04-09T19:41:43.076968Z", "start_time": "2021-04-09T19:41:42.110970Z" } }, "outputs": [], "source": [ "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b0, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b1, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b2, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b3, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b4, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "ExecuteTime": { "end_time": 
"2021-04-09T19:41:49.097969Z", "start_time": "2021-04-09T19:41:49.094969Z" } }, "outputs": [], "source": [ "BENCHMARK_NAME = \"U233_MET_FAST_002_002\"" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "ExecuteTime": { "end_time": "2021-04-09T19:21:07.005488Z", "start_time": "2021-04-09T12:51:32.533642Z" } }, "outputs": [], "source": [ "ace_utils.generate_bench_ml_xs(df_b0, results_b0, BENCHMARK_NAME, to_scale_b0, dt_ml_ace_dir_b0, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b1, results_b1, BENCHMARK_NAME, to_scale_b1, dt_ml_ace_dir_b1, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b2, results_b2, BENCHMARK_NAME, to_scale_b2, dt_ml_ace_dir_b2, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b3, results_b3, BENCHMARK_NAME, to_scale_b3, dt_ml_ace_dir_b3, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b4, results_b4, BENCHMARK_NAME, to_scale_b4, dt_ml_ace_dir_b4, reset=True)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "ExecuteTime": { "end_time": "2021-04-09T19:41:51.425969Z", "start_time": "2021-04-09T19:41:50.446970Z" } }, "outputs": [], "source": [ "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b0, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b1, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b2, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b3, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(dt_ml_ace_dir_b4, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## K-Nearest-Neighbors" ] }, { "cell_type": "code", "execution_count": 112, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T05:31:22.024924Z", "start_time": "2021-04-10T05:31:22.020933Z" } }, "outputs": [], "source": [ "knn_ml_ace_dir_b0 = \"ml/KNN_B0/\"\n", "knn_ml_ace_dir_b1 = \"ml/KNN_B1/\"\n", "knn_ml_ace_dir_b2 = 
\"ml/KNN_B2/\"\n", "knn_ml_ace_dir_b3 = \"ml/KNN_B3/\"\n", "knn_ml_ace_dir_b4 = \"ml/KNN_B4/\"" ] }, { "cell_type": "code", "execution_count": 113, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T05:31:42.793622Z", "start_time": "2021-04-10T05:31:42.772362Z" } }, "outputs": [], "source": [ "# results_b0 = pd.read_csv(\"../ML_EXFOR_neutrons/1_KNN/knn_results_B0.csv\").sort_values(by=\"id\")\n", "results_b1 = pd.read_csv(\"../ML_EXFOR_neutrons/1_KNN/knn_results_B1.csv\").sort_values(by=\"id\")\n", "results_b2 = pd.read_csv(\"../ML_EXFOR_neutrons/1_KNN/knn_results_B2.csv\").sort_values(by=\"id\")\n", "results_b3 = pd.read_csv(\"../ML_EXFOR_neutrons/1_KNN/knn_results_B3.csv\").sort_values(by=\"id\")\n", "results_b4 = pd.read_csv(\"../ML_EXFOR_neutrons/1_KNN/knn_results_B4.csv\").sort_values(by=\"id\")\n", "\n", "# results_b0[\"scale_energy\"] = results_b0.run_name.apply(lambda x: True if \"v2\" in x else False)\n", "# results_b0[\"Model\"] = results_b0.model_path.apply(lambda x: os.path.basename(os.path.dirname(x)))\n", "\n", "# results_b0 = results_b0[results_b0.normalizer == \"minmax\"]\n", "# results_b0 = results_b0[results_b0.scale_energy == True]\n", "# results_b0 = results_b0[results_b0.distance_metric == 'manhattan']" ] }, { "cell_type": "code", "execution_count": 114, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T05:31:44.463441Z", "start_time": "2021-04-10T05:31:44.460430Z" } }, "outputs": [], "source": [ "BENCHMARK_NAME = \"U233_MET_FAST_001\"" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T04:15:13.536830Z", "start_time": "2021-04-10T01:55:48.363619Z" } }, "outputs": [], "source": [ "# ace_utils.generate_bench_ml_xs(df_b0, results_b0, BENCHMARK_NAME, to_scale_b0, knn_ml_ace_dir_b0, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b1, results_b1, BENCHMARK_NAME, to_scale_b1, knn_ml_ace_dir_b1, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b2, results_b2, BENCHMARK_NAME, 
to_scale_b2, knn_ml_ace_dir_b2, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b3, results_b3, BENCHMARK_NAME, to_scale_b3, knn_ml_ace_dir_b3, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b4, results_b4, BENCHMARK_NAME, to_scale_b4, knn_ml_ace_dir_b4, reset=True)" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T04:15:55.349310Z", "start_time": "2021-04-10T04:15:55.216866Z" } }, "outputs": [], "source": [ "# ace_utils.generate_serpent_bash(knn_ml_ace_dir_b0, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(knn_ml_ace_dir_b1, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(knn_ml_ace_dir_b2, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(knn_ml_ace_dir_b3, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(knn_ml_ace_dir_b4, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)" ] }, { "cell_type": "code", "execution_count": 115, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T05:32:04.390877Z", "start_time": "2021-04-10T05:32:04.387877Z" } }, "outputs": [], "source": [ "BENCHMARK_NAME = \"U233_MET_FAST_002_001\"" ] }, { "cell_type": "code", "execution_count": 116, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T10:37:29.952710Z", "start_time": "2021-04-10T05:32:20.614116Z" } }, "outputs": [], "source": [ "# ace_utils.generate_bench_ml_xs(df_b0, results_b0, BENCHMARK_NAME, to_scale_b0, knn_ml_ace_dir_b0, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b1, results_b1[3:], BENCHMARK_NAME, to_scale_b1, knn_ml_ace_dir_b1, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b2, results_b2, BENCHMARK_NAME, to_scale_b2, knn_ml_ace_dir_b2, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b3, results_b3, BENCHMARK_NAME, to_scale_b3, knn_ml_ace_dir_b3, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b4, results_b4, BENCHMARK_NAME, to_scale_b4, knn_ml_ace_dir_b4, reset=True)" ] 
}, { "cell_type": "code", "execution_count": 117, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T10:37:29.989709Z", "start_time": "2021-04-10T10:37:29.954710Z" } }, "outputs": [], "source": [ "# ace_utils.generate_serpent_bash(knn_ml_ace_dir_b0, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(knn_ml_ace_dir_b1, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(knn_ml_ace_dir_b2, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(knn_ml_ace_dir_b3, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(knn_ml_ace_dir_b4, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)" ] }, { "cell_type": "code", "execution_count": 118, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T10:37:29.995709Z", "start_time": "2021-04-10T10:37:29.991709Z" } }, "outputs": [], "source": [ "BENCHMARK_NAME = \"U233_MET_FAST_002_002\"" ] }, { "cell_type": "code", "execution_count": 119, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T16:29:13.200690Z", "start_time": "2021-04-10T10:37:29.997711Z" } }, "outputs": [], "source": [ "# ace_utils.generate_bench_ml_xs(df_b0, results_b0, BENCHMARK_NAME, to_scale_b0, knn_ml_ace_dir_b0, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b1, results_b1, BENCHMARK_NAME, to_scale_b1, knn_ml_ace_dir_b1, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b2, results_b2, BENCHMARK_NAME, to_scale_b2, knn_ml_ace_dir_b2, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b3, results_b3, BENCHMARK_NAME, to_scale_b3, knn_ml_ace_dir_b3, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b4, results_b4, BENCHMARK_NAME, to_scale_b4, knn_ml_ace_dir_b4, reset=True)" ] }, { "cell_type": "code", "execution_count": 120, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T16:29:13.236687Z", "start_time": "2021-04-10T16:29:13.201674Z" } }, "outputs": [], "source": [ "# ace_utils.generate_serpent_bash(knn_ml_ace_dir_b0, BENCHMARK_NAME, 
benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(knn_ml_ace_dir_b1, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(knn_ml_ace_dir_b2, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(knn_ml_ace_dir_b3, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(knn_ml_ace_dir_b4, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## XGBoost" ] }, { "cell_type": "code", "execution_count": 121, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T16:29:13.242675Z", "start_time": "2021-04-10T16:29:13.237675Z" } }, "outputs": [], "source": [ "xgb_ml_ace_dir_b0 = \"ml/XGB_B0/\"\n", "xgb_ml_ace_dir_b1 = \"ml/XGB_B1/\"\n", "xgb_ml_ace_dir_b2 = \"ml/XGB_B2/\"\n", "xgb_ml_ace_dir_b3 = \"ml/XGB_B3/\"\n", "xgb_ml_ace_dir_b4 = \"ml/XGB_B4/\"" ] }, { "cell_type": "code", "execution_count": 122, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T16:29:13.308222Z", "start_time": "2021-04-10T16:29:13.244675Z" } }, "outputs": [], "source": [ "results_b0 = pd.read_csv(\"../ML_EXFOR_neutrons/3_XGB/xgb_resultsB0.csv\")\n", "results_b1 = pd.read_csv(\"../ML_EXFOR_neutrons/3_XGB/xgb_resultsB1.csv\")\n", "results_b2 = pd.read_csv(\"../ML_EXFOR_neutrons/3_XGB/xgb_resultsB2.csv\")\n", "results_b3 = pd.read_csv(\"../ML_EXFOR_neutrons/3_XGB/xgb_resultsB3.csv\")\n", "results_b4 = pd.read_csv(\"../ML_EXFOR_neutrons/3_XGB/xgb_resultsB4.csv\")" ] }, { "cell_type": "code", "execution_count": 123, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T16:29:13.313846Z", "start_time": "2021-04-10T16:29:13.310223Z" } }, "outputs": [], "source": [ "BENCHMARK_NAME = \"U233_MET_FAST_001\"" ] }, { "cell_type": "code", "execution_count": 124, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T16:48:22.933720Z", "start_time": "2021-04-10T16:29:13.316850Z" }, "scrolled": false }, "outputs": [], "source": [ "ace_utils.generate_bench_ml_xs(df_b0, 
results_b0, BENCHMARK_NAME, to_scale_b0, xgb_ml_ace_dir_b0, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b1, results_b1, BENCHMARK_NAME, to_scale_b1, xgb_ml_ace_dir_b1, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b2, results_b2, BENCHMARK_NAME, to_scale_b2, xgb_ml_ace_dir_b2, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b3, results_b3, BENCHMARK_NAME, to_scale_b3, xgb_ml_ace_dir_b3, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b4, results_b4, BENCHMARK_NAME, to_scale_b4, xgb_ml_ace_dir_b4, reset=True)" ] }, { "cell_type": "code", "execution_count": 125, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T16:48:22.973721Z", "start_time": "2021-04-10T16:48:22.934721Z" } }, "outputs": [], "source": [ "ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b0, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b1, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b2, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b3, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b4, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)" ] }, { "cell_type": "code", "execution_count": 126, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T16:48:22.977711Z", "start_time": "2021-04-10T16:48:22.974720Z" } }, "outputs": [], "source": [ "BENCHMARK_NAME = \"U233_MET_FAST_002_001\"" ] }, { "cell_type": "code", "execution_count": 127, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T17:21:03.641538Z", "start_time": "2021-04-10T16:48:22.978711Z" }, "scrolled": true }, "outputs": [], "source": [ "# ace_utils.generate_bench_ml_xs(df_b0, results_b0, BENCHMARK_NAME, to_scale_b0, xgb_ml_ace_dir_b0, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b1, results_b1, BENCHMARK_NAME, to_scale_b1, xgb_ml_ace_dir_b1, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b2, results_b2, BENCHMARK_NAME, 
to_scale_b2, xgb_ml_ace_dir_b2, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b3, results_b3, BENCHMARK_NAME, to_scale_b3, xgb_ml_ace_dir_b3, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b4, results_b4, BENCHMARK_NAME, to_scale_b4, xgb_ml_ace_dir_b4, reset=True)" ] }, { "cell_type": "code", "execution_count": 128, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T17:21:03.699539Z", "start_time": "2021-04-10T17:21:03.642527Z" } }, "outputs": [], "source": [ "# ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b0, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b1, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b2, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b3, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b4, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)" ] }, { "cell_type": "code", "execution_count": 129, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T17:21:03.703528Z", "start_time": "2021-04-10T17:21:03.701528Z" } }, "outputs": [], "source": [ "BENCHMARK_NAME = \"U233_MET_FAST_002_002\"" ] }, { "cell_type": "code", "execution_count": 130, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T17:53:40.597496Z", "start_time": "2021-04-10T17:21:03.704527Z" }, "scrolled": true }, "outputs": [], "source": [ "# ace_utils.generate_bench_ml_xs(df_b0, results_b0, BENCHMARK_NAME, to_scale_b0, xgb_ml_ace_dir_b0, reset=True)\n", "ace_utils.generate_bench_ml_xs(df_b1, results_b1, BENCHMARK_NAME, to_scale_b1, xgb_ml_ace_dir_b1, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b2, results_b2, BENCHMARK_NAME, to_scale_b2, xgb_ml_ace_dir_b2, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b3, results_b3, BENCHMARK_NAME, to_scale_b3, xgb_ml_ace_dir_b3, reset=True)\n", "# ace_utils.generate_bench_ml_xs(df_b4, results_b4, BENCHMARK_NAME, to_scale_b4, 
xgb_ml_ace_dir_b4, reset=True)" ] }, { "cell_type": "code", "execution_count": 131, "metadata": { "ExecuteTime": { "end_time": "2021-04-10T17:53:40.674500Z", "start_time": "2021-04-10T17:53:40.598498Z" } }, "outputs": [], "source": [ "# ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b0, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b1, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b2, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b3, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)\n", "# ace_utils.generate_serpent_bash(xgb_ml_ace_dir_b4, BENCHMARK_NAME, benchmark=BENCHMARK_NAME)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2021-03-10T19:28:36.079940Z", "start_time": "2021-03-10T19:28:36.076935Z" } }, "outputs": [], "source": [ "# all_serpent_files = []\n", "\n", "# for root, _, files in os.walk(\"ml/DT_B0\"):\n", "# for file in files:\n", "# if \"U233_MET_FAST_001_001\" in root:\n", "# all_serpent_files.append(os.path.abspath(os.path.join(root, file)))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 4 }