# Data Import
## What is this folder
This folder contains the scripts needed to import data from the remote database
into local CSV files.

It contains the following files and sub-directories:
- `select_db.py`:
  - pulls the raw datasets `data_mapping.csv` and
    `data_model_master_export.csv`
- `make_csv.py`:
  - performs basic processing
  - produces the following files:
    - `raw_data.csv`: `data_mapping.csv` without some fields
    - `data_mapping_mdm.csv`: mdm subset of `raw_data.csv`
- `make_figures` sub-directory:
  - `plot_class_token.ipynb`: gets the number of thing-property combinations, and
    plots the histogram of thing-property counts along with the `tag_description`
    character counts
  - `plot_count.ipynb`: gets the counts of ship-data and platform-data
- `exports` sub-directory:
  - stores the files produced by the import
- `outputs` sub-directory:
  - stores the figures exported from `make_figures`
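
As a rough illustration of the two outputs listed for `make_csv.py`, the sketch below drops a field from a small mapping table and then keeps an mdm subset. It is not the actual script. The sample rows, the `internal_note` column, the `mdm` flag column, and the rule that the string `True` marks an mdm row are all assumptions made for the example; only `tag_description` and the thing/property terminology come from this README.

```python
import csv
import io

# Hypothetical stand-in for `data_mapping.csv`; the real columns may differ.
SAMPLE_DATA_MAPPING = """thing,property,tag_description,mdm,internal_note
Engine,Speed,main engine rpm,True,a
Engine,Load,main engine load,False,b
Boiler,Pressure,boiler steam pressure,True,c
"""

# Assumption: the fields that the "raw" export leaves out.
DROPPED_FIELDS = {"internal_note"}


def drop_fields(rows, dropped):
    """Return copies of the rows without the dropped columns."""
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


def mdm_subset(rows):
    """Keep only rows flagged as mdm (assumed here to be the string 'True')."""
    return [row for row in rows if row["mdm"] == "True"]


def to_csv_text(rows):
    """Serialise a non-empty list of dict rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = list(csv.DictReader(io.StringIO(SAMPLE_DATA_MAPPING)))
raw_data = drop_fields(rows, DROPPED_FIELDS)  # analogue of raw_data.csv
data_mapping_mdm = mdm_subset(raw_data)       # analogue of data_mapping_mdm.csv

print(to_csv_text(data_mapping_mdm))
```

Deriving the subset from the already-trimmed rows mirrors the description above, where `data_mapping_mdm.csv` is a subset of `raw_data.csv` rather than of the original export.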
## Instructions
Check the following:
- Activate your Python environment
- Ensure that `db_connection_info.txt` is linked into this directory
  - e.g. `ln -s /some/directory/db_connection_info.txt .`

To import data, execute the following:
- `cd` into this folder
- `python select_db.py`
- `python make_csv.py`

The exported files are written to `exports`, which keeps this folder clean.
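
The checks and commands above could also be wrapped in a small helper, sketched here under some assumptions. The function names `missing_prerequisites` and `run_import` are hypothetical and are not part of this folder; the only names taken from this README are `db_connection_info.txt`, `select_db.py`, and `make_csv.py`.

```python
import subprocess
import sys
from pathlib import Path

# Files this README says must be present before importing.
REQUIRED = ["db_connection_info.txt", "select_db.py", "make_csv.py"]


def missing_prerequisites(folder):
    """Return the names of required files that are absent from `folder`."""
    # Path.exists() follows symlinks, so a dangling link counts as missing.
    return [name for name in REQUIRED if not (Path(folder) / name).exists()]


def run_import(folder):
    """Run both scripts in order, stopping at the first failure."""
    missing = missing_prerequisites(folder)
    if missing:
        raise FileNotFoundError(f"missing in {folder}: {', '.join(missing)}")
    for script in ("select_db.py", "make_csv.py"):
        # sys.executable reuses the currently active Python environment.
        subprocess.run([sys.executable, script], cwd=folder, check=True)
```

Calling `run_import(".")` from this folder would then be equivalent to the two `python` commands, with the added check that the connection file is actually reachable.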