Richard Wong
1f3970459f
Feat: introduced BERT-based binary classification
- abbreviations
- check_data
- exports
- no_preprocess
- rule_base_replacement
- .gitignore
- README.md
- split_data.py
README.md
Data Preprocess
What is this folder
This folder contains the files for pre-processing.
Each pre-processing method lives in its own folder. Keeping the methods modular makes it easier to test them independently and reduces coupling between stages.
Instructions
First, we apply the pre-processing by running code from the desired folder.
Using the no_preprocess directory as an example:
- cd no_preprocess
- Follow the instructions found in the sub-directory
- After code execution, the processed file will be placed into exports/preprocessed_data.csv
We then run the data split code to create our k-fold splits.
- cd back to the data_preprocess directory
- python split_data.py
You will now have the datasets in exports/dataset/group_{1,2,3,4,5}
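The contents of split_data.py are not shown here, so the following is only a minimal sketch of how a 5-fold split like the one above could be produced. The function name k_fold_splits, the fixed seed, the toy rows, and the train/test naming are all assumptions made for illustration; the actual script may shuffle, stratify, or name its outputs differently.

```python
import random


def k_fold_splits(rows, k=5, seed=42):
    """Yield (group_number, train_rows, test_rows) for each of k folds.

    Hypothetical helper: not taken from split_data.py.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for group, held_out in enumerate(folds, start=1):
        held_out_set = set(held_out)
        train = [rows[i] for i in indices if i not in held_out_set]
        test = [rows[i] for i in held_out]
        yield group, train, test


if __name__ == "__main__":
    # Toy rows standing in for the contents of exports/preprocessed_data.csv.
    rows = [{"text": f"sample {i}", "label": i % 2} for i in range(10)]
    for group, train, test in k_fold_splits(rows):
        print(f"group_{group}: {len(train)} train rows, {len(test)} test rows")
```

Each row is held out in exactly one group, so the five groups together cover the whole dataset once. Fixing the seed keeps the groups reproducible across runs.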