# Data Import

## What is this folder

This folder contains the files needed to import data from the remote database
into local CSV files.

This folder contains the following files:

- `select_db.py`:
  - use this to pull the raw datasets `data_mapping.csv` and
    `data_model_master_export.csv`
- `make_csv.py`:
  - performs basic processing
  - produces the following files:
    - `raw_data.csv`: `data_mapping.csv` without some fields
    - `data_mapping_mdm.csv`: mdm subset of `raw_data.csv`
- `make_figures` sub-directory:
  - `plot_class_token.ipynb`: gets the number of thing-property combinations, and
    plots the histogram of thing-property counts along with the `tag_description`
    character counts
  - `plot_count.ipynb`: gets counts of ship-data and platform-data
- `exports` sub-directory:
  - this folder stores the files that were produced from import
- `outputs` sub-directory:
  - this folder stores the exported figures from `make_figures`
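
As a rough illustration of the kind of processing `make_csv.py` is described as doing, the sketch below drops a few columns and then filters rows, using only the standard library. The dropped column names (`created_at`, `updated_by`), the `is_mdm` flag column, the sample rows, and the helper names are all hypothetical; the actual script may select fields and define the mdm subset differently.

```python
import csv
import io

# Hypothetical column names, chosen only for illustration.
DROPPED_FIELDS = {"created_at", "updated_by"}
MDM_FLAG = "is_mdm"


def drop_fields(rows, dropped=DROPPED_FIELDS):
    """Return copies of the rows without the dropped columns."""
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


def mdm_subset(rows, flag=MDM_FLAG):
    """Keep only rows whose flag column holds the string 'True'."""
    return [row for row in rows if row.get(flag) == "True"]


def to_csv_text(rows):
    """Serialise dict rows to CSV text, taking the header from the first row."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# A tiny in-memory stand-in for `data_mapping.csv` (made-up rows).
sample = io.StringIO(
    "thing,property,tag_description,is_mdm,created_at,updated_by\n"
    "Engine1,Speed,main engine speed,True,2020-01-01,alice\n"
    "Pump2,Flow,pump flow rate,False,2020-01-02,bob\n"
)
raw_rows = drop_fields(list(csv.DictReader(sample)))  # analogous to raw_data.csv
mdm_rows = mdm_subset(raw_rows)  # analogous to data_mapping_mdm.csv
print(to_csv_text(mdm_rows))
```

Keeping the column removal and the subset filter as separate functions mirrors the two output files: the second is derived from the first rather than from the raw export.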

## Instructions

Check the following:

- Remember to activate your Python environment
- Ensure that the `db_connection_info.txt` is linked to this directory
  - e.g. `ln -s /some/directory/db_connection_info.txt .`
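
The link check above can also be done from Python before running the import. This is a minimal sketch, assuming the file only needs to exist in the current directory; the function name `check_connection_file` is hypothetical and is not part of the scripts in this folder.

```python
from pathlib import Path


def check_connection_file(directory=".", name="db_connection_info.txt"):
    """Return (ok, message) describing whether the connection file is usable."""
    path = Path(directory) / name
    if path.is_symlink() and not path.exists():
        # Dangling symlink: the link is present but its target is missing.
        return False, f"{path} is a broken symlink"
    if not path.exists():
        return False, f"{path} not found; link it with `ln -s`"
    return True, f"{path} found"


if __name__ == "__main__":
    ok, message = check_connection_file()
    print(message)
```

`Path.exists()` follows symlinks, so a link whose target has moved is reported separately from a file that was never linked.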

To import data, execute the following:

- `cd` into this folder.
- `python select_db.py`
- `python make_csv.py`

Exported files are written to `exports`, which helps to keep this folder clean.
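
If you prefer a single entry point, the two commands can be wrapped in a small runner. This is only a sketch: the `run_import` helper is hypothetical, and it assumes both scripts take no arguments and write their results into `exports`.

```python
import subprocess
import sys
from pathlib import Path


def run_import(folder=".", scripts=("select_db.py", "make_csv.py")):
    """Run each script in order inside `folder`, stopping at the first failure."""
    for script in scripts:
        # sys.executable reuses the interpreter of the active environment;
        # check=True raises CalledProcessError if a script exits with an error.
        subprocess.run([sys.executable, script], cwd=folder, check=True)
    exports = Path(folder) / "exports"
    # Report whatever ended up in `exports`, if the directory exists.
    return sorted(p.name for p in exports.iterdir()) if exports.is_dir() else []
```

Calling `run_import()` from this folder runs `select_db.py` and then `make_csv.py`, and returns the names of the files found in `exports` afterwards.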