Data Import
What is this folder
This folder contains the scripts needed to import data from the remote database into local CSV files.
This folder contains the following files:
- select_db.py:
  - use this to pull the raw datasets data_mapping.csv and data_model_master_export.csv
- make_csv.py:
  - perform basic processing
  - produces the following files:
    - raw_data.csv: data_mapping.csv without some fields
    - data_mapping_mdm.csv: mdm subset of raw_data.csv
- make_figures sub-directory:
  - plot_class_token.ipynb: get number of thing-property combinations, and plot the histogram of thing-property counts along with the tag_description character counts
  - plot_count.ipynb: get counts of ship-data and platform-data
- exports sub-directory:
  - this folder stores the files that were produced from import
- outputs sub-directory:
  - this folder stores the exported figures from make_figures
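As a rough illustration of the kind of processing make_csv.py performs, the sketch below drops a few fields and then keeps an mdm subset. It is not the actual script: the column names (`notes`, `mdm`), the list of dropped fields, and the convention that mdm rows are flagged with the string `True` are all assumptions made for this example, as are the sample values. It uses only the standard-library csv module.

```python
import csv
import io

# Hypothetical names: the fields actually dropped by make_csv.py are not listed in this README.
DROPPED_FIELDS = {"notes"}
MDM_FLAG = "mdm"  # assumed column that marks mdm rows


def drop_fields(rows, dropped=DROPPED_FIELDS):
    """Return copies of the rows without the dropped fields."""
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


def mdm_subset(rows, flag=MDM_FLAG):
    """Keep only rows whose (assumed) flag column reads 'True'."""
    return [row for row in rows if row.get(flag) == "True"]


def to_csv_text(rows):
    """Serialise a list of dict rows to CSV text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Stand-in for data_mapping.csv, with made-up values.
sample = io.StringIO(
    "thing,property,mdm,notes\n"
    "Engine1,Speed,True,a\n"
    "Pump2,Flow,False,b\n"
)
raw_rows = drop_fields(list(csv.DictReader(sample)))  # analogue of raw_data.csv
mdm_rows = mdm_subset(raw_rows)  # analogue of data_mapping_mdm.csv
```

Keeping the two steps as separate functions mirrors the two output files: the second is derived from the first rather than from the original pull.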
Instructions
Check the following:
- Remember to activate your python environment
- Ensure that db_connection_info.txt is linked to this directory
  - e.g. ln -s /some/directory/db_connection_info.txt .
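The link check above can also be done programmatically before running anything. The helper below is a minimal sketch and is not part of this folder; the function name `connection_info_ready` is hypothetical, and it assumes only that the file is expected under the name db_connection_info.txt in the working directory.

```python
from pathlib import Path


def connection_info_ready(folder=".", name="db_connection_info.txt"):
    """Return True if the file, or a symlink pointing to it, resolves to a real file."""
    # Path.is_file() follows symlinks, so a dangling link returns False.
    return (Path(folder) / name).is_file()
```

Calling `connection_info_ready()` from this folder returns False both when the link is missing and when it points to a file that no longer exists, which are the two cases the ln -s step is meant to prevent.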
To import data, execute the following:
- cd into this folder.
- python select_db.py
- python make_csv.py
Exported files are written to the exports sub-directory, which keeps this folder clean.
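If the two commands are run often, they could be wrapped in a small driver. The following is only a sketch under the assumption that both scripts take no arguments and are run from inside this folder; `run_import` is a hypothetical helper, not an existing file here.

```python
import subprocess
import sys
from pathlib import Path


def run_import(folder, scripts=("select_db.py", "make_csv.py")):
    """Run each script in order from inside `folder`, stopping at the first failure."""
    folder = Path(folder)
    for script in scripts:
        # check=True raises CalledProcessError when a script exits non-zero,
        # so make_csv.py is never run after a failed pull.
        subprocess.run([sys.executable, script], cwd=folder, check=True)
```

Using `sys.executable` means the scripts run under whichever Python environment is currently activated, which matches the first item in the checklist.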