# Data Preprocess
## What is this folder
This folder contains the pre-processing code.
Each pre-processing method lives in its own sub-folder. This modular layout
makes it easier to test different methods and reduces coupling between stages.
## Instructions
First, apply the pre-processing by running the code in the desired folder.
Using the `no_preprocess` directory as an example:
- `cd no_preprocess`
- Follow the instructions found in the sub-directory
- After the code runs, the processed file will be placed in
  `exports/preprocessed_data.csv`

Next, run the data split code to create the k-fold splits:
- `cd` back to the `data_preprocess` directory
- `python split_data.py`

The datasets will now be in `exports/dataset/group_{1,2,3,4,5}`, that is, one
folder per fold (`group_1` through `group_5`).
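
For readers unfamiliar with k-fold splitting, the idea behind the five groups can be sketched as follows. This is a minimal illustration only, not the contents of `split_data.py`: the function name `k_fold_groups`, the seed, and the plain random shuffle are all assumptions, and the real script may split differently (for example, by grouping related rows together).

```python
import random


def k_fold_groups(n_rows, k=5, seed=42):
    """Return k (train_idx, test_idx) pairs that together cover range(n_rows).

    Hypothetical helper for illustration; not taken from split_data.py.
    """
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")

    # Shuffle row indices reproducibly.
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)

    # Deal the shuffled indices into k nearly equal folds.
    folds = [indices[i::k] for i in range(k)]

    groups = []
    for i in range(k):
        # Fold i is held out for testing; the other folds form the training set.
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        groups.append((train_idx, test_idx))
    return groups


if __name__ == "__main__":
    # Ten rows is an arbitrary size chosen for the demonstration.
    for number, (train_idx, test_idx) in enumerate(k_fold_groups(10), start=1):
        print(f"group_{number}: {len(train_idx)} train rows, {len(test_idx)} test rows")
```

Each row appears in the test set of exactly one group and in the training set of the other four, which is why five groups are produced for a 5-fold split.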