Data Preprocess

What is this folder

This folder contains the code for pre-processing the data.

Each pre-processing method lives in its own folder. Keeping the methods modular makes it easier to test different methods and reduces coupling between stages.

Instructions

First, we apply the pre-processing by running code from the desired folder.

Using the no_preprocess directory as an example:

  • cd no_preprocess
  • Follow the instructions found in the sub-directory
  • After code execution, the processed file will be placed into exports/preprocessed_data.csv
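
Before running the split, it can help to confirm that the export exists and looks reasonable. The following is a minimal sketch, not part of this repository: `summarize_csv` is a hypothetical helper, and it assumes the export is a UTF-8 CSV with a single header row.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row.

    Hypothetical helper for a quick sanity check; it is not provided by
    the pre-processing code in this folder.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"expected pre-processed file at {path}")
    with path.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])          # first row is treated as the header
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count
```

Calling `summarize_csv("exports/preprocessed_data.csv")` returns the column names and the number of data rows, and raises `FileNotFoundError` if the pre-processing step has not produced the file.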

We then run the data split code to create our k-fold splits.

  • cd back to the data_preprocess directory
  • python split_data.py

You will now have the datasets in exports/dataset/group_{1,2,3,4,5}
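
For readers unfamiliar with k-fold splitting, the idea can be sketched as follows. This is an illustration under stated assumptions, not the contents of split_data.py: the function names, the fixed seed, the plain shuffled split, and the `train.csv` / `test.csv` file names are all hypothetical, and the real script may split differently (for example, by grouping related rows).

```python
import csv
import random
from pathlib import Path


def k_fold_split(rows, k=5, seed=42):
    """Yield (fold_number, train_rows, test_rows) for each of k folds."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded for reproducible splits
    # Fold i takes every k-th shuffled index starting at offset i, so the
    # folds are disjoint and together cover every row exactly once.
    folds = [indices[i::k] for i in range(k)]
    for fold_number, test_idx in enumerate(folds, start=1):
        held_out = set(test_idx)
        train = [rows[i] for i in indices if i not in held_out]
        test = [rows[i] for i in test_idx]
        yield fold_number, train, test


def write_groups(input_csv, output_dir, k=5, seed=42):
    """Write one directory per fold: output_dir/group_1 .. group_k.

    The file names train.csv and test.csv are assumptions made for this
    sketch; check split_data.py for the names it actually uses.
    """
    with Path(input_csv).open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    for fold_number, train, test in k_fold_split(rows, k, seed):
        group_dir = Path(output_dir) / f"group_{fold_number}"
        group_dir.mkdir(parents=True, exist_ok=True)
        for name, subset in (("train.csv", train), ("test.csv", test)):
            with (group_dir / name).open("w", newline="", encoding="utf-8") as out:
                writer = csv.writer(out)
                writer.writerow(header)
                writer.writerows(subset)
```

With `k=5`, each row appears in the test set of exactly one group and in the training set of the other four, which matches the five `group_N` directories described above.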