# Data Preprocess

## What is this folder

This folder contains the code for data pre-processing.

Each pre-processing method lives in its own folder. Keeping the methods modular
makes it easier to test different methods and reduces coupling between stages.

## Instructions

First, apply the pre-processing by running the code in the folder for the desired method.

Using the `no_preprocess` directory as an example:

- `cd no_preprocess`
- Follow the instructions found in the sub-directory
- After the code runs, the processed file is written to
  `exports/preprocessed_data.csv`
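
Before moving on to the split, it can be worth confirming that the export exists and is not empty. The snippet below is a minimal sketch of such a check and is not part of this repository: `summarize_csv` is a hypothetical helper, and it assumes the export is a CSV file with a header row.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row.

    Hypothetical helper for illustration; not part of this repository.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; run the pre-processing step first")
    with path.open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])          # first line is assumed to be the header
        row_count = sum(1 for _ in reader)  # remaining lines are data rows
    return header, row_count
```

For example, `summarize_csv("exports/preprocessed_data.csv")` returns the column names and the number of data rows. Adjust the path to match the directory you run it from.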

Next, run the data split script to create the k-fold splits:

- `cd` back to the `data_preprocess` directory
- `python split_data.py`

The datasets will now be in `exports/dataset/group_{1,2,3,4,5}`.
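
For readers unfamiliar with k-fold splitting, the sketch below shows one way a 5-fold split into `group_1` to `group_5` directories could be produced. It is an illustration only, not the contents of `split_data.py`: the function names, the random row-level shuffling, and the `train.csv` / `test.csv` file names are all assumptions.

```python
import csv
import random
from pathlib import Path


def kfold_split(rows, k=5, seed=0):
    """Yield (fold_number, train_rows, test_rows) for each of the k folds."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at offset i.
    folds = [indices[i::k] for i in range(k)]
    for n, test_idx in enumerate(folds, start=1):
        held_out = set(test_idx)
        train = [rows[j] for j in indices if j not in held_out]
        test = [rows[j] for j in test_idx]
        yield n, train, test


def write_groups(src, out_dir, k=5, seed=0):
    """Write out_dir/group_<n>/train.csv and test.csv for n in 1..k.

    The output file names are hypothetical; the real script may differ.
    """
    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    for n, train, test in kfold_split(rows, k, seed):
        group_dir = Path(out_dir) / f"group_{n}"
        group_dir.mkdir(parents=True, exist_ok=True)
        for name, part in (("train.csv", train), ("test.csv", test)):
            with open(group_dir / name, "w", newline="") as f:
                writer = csv.writer(f)
                writer.writerow(header)
                writer.writerows(part)
```

Under these assumptions, `write_groups("exports/preprocessed_data.csv", "exports/dataset")` would hold out each row in exactly one group's test set and include it in the training set of the other four.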
