# Data Preprocess

## What is this folder

This folder contains the pre-processing code. We keep each pre-processing method in its own folder so the methods stay modular. This makes it easier to test different methods and reduces coupling between stages.

## Instructions

First, we apply the pre-processing by running the code in the desired folder. Using the `no_preprocess` directory as an example:

- `cd no_preprocess`
- Follow the instructions found in that sub-directory
- After the code runs, the processed file will be placed in `exports/preprocessed_data.csv`

We then run the data split code to create our k-fold splits:

- `cd` back to the `data_preprocess` directory
- `python split_data.py`

You will now have the datasets in `exports/dataset/group_{1,2,3,4,5}`
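
For readers unfamiliar with k-fold splitting, the sketch below illustrates the general idea behind the split step. It is not the contents of `split_data.py`. It assumes the pre-processed CSV has a header row and that each group holds one train/test pair. The function names `k_fold_split` and `write_groups` and the file names `train.csv` and `test.csv` are hypothetical, chosen only for illustration.

```python
import csv
import random
from pathlib import Path


def k_fold_split(rows, k=5, seed=0):
    """Shuffle rows and return a list of k (train, test) pairs."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    # Fold i takes every k-th row starting at offset i,
    # so fold sizes differ by at most one row.
    folds = [rows[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        # Training data is every row that is not in the held-out fold.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        splits.append((train, test))
    return splits


def write_groups(input_csv, output_dir, k=5, seed=0):
    """Read a CSV with a header row and write group_1..group_k under output_dir."""
    with open(input_csv, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    for i, (train, test) in enumerate(k_fold_split(rows, k, seed), start=1):
        group_dir = Path(output_dir) / f"group_{i}"
        group_dir.mkdir(parents=True, exist_ok=True)
        # train.csv and test.csv are assumed names, not taken from this repository.
        for name, part in (("train.csv", train), ("test.csv", test)):
            with open(group_dir / name, "w", newline="") as f:
                writer = csv.writer(f)
                writer.writerow(header)
                writer.writerows(part)
```

Under these assumptions, calling `write_groups("exports/preprocessed_data.csv", "exports/dataset")` from the `data_preprocess` directory would create `group_1` through `group_5`, with every row appearing in exactly one group's test set.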