Data Import
What is this folder
This folder contains the scripts needed to import data from the remote database into local CSV files.
This folder contains the following files:
- select_db.py:
  - use this to pull the raw datasets data_mapping.csv and data_model_master_export.csv
- make_csv.py:
  - perform basic processing
  - produces the following files:
    - raw_data.csv: data_mapping.csv without some fields
    - data_mapping_mdm.csv: mdm subset of raw_data.csv
- make_figures sub-directory:
  - plot_class_token.ipynb: get number of thing-property combinations, and plot the histogram of thing-property counts along with the tag_description character counts
  - plot_count.ipynb: get counts of ship-data and platform-data
- exports sub-directory:
  - this folder stores the files that were produced from import
- outputs sub-directory:
  - this folder stores the exported figures from make_figures
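
As a rough illustration of the make_csv.py step described above, the sketch below drops some columns from a mapping table and then keeps an mdm subset. The column names (internal_id, tag_description as a sample field, mdm), the sample rows, and the rule that the string "True" marks an mdm row are all assumptions made for illustration; the real script may use different fields and logic.

```python
import csv
import io

# Hypothetical column names; the real data_mapping.csv fields may differ.
DROPPED_FIELDS = {"internal_id"}
MDM_FLAG = "mdm"


def drop_fields(rows, dropped):
    """Return copies of rows without the columns named in `dropped`."""
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


def mdm_subset(rows, flag=MDM_FLAG):
    """Keep only rows whose flag column is the string 'True' (assumed convention)."""
    return [row for row in rows if row.get(flag) == "True"]


def to_csv_text(rows):
    """Serialise a list of dict rows to CSV text with a header line."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Tiny in-memory stand-in for data_mapping.csv (made-up rows).
sample = io.StringIO(
    "internal_id,tag_description,mdm\n"
    "1,engine speed,True\n"
    "2,cabin light,False\n"
)
mapping_rows = list(csv.DictReader(sample))

raw_data = drop_fields(mapping_rows, DROPPED_FIELDS)  # analogous to raw_data.csv
raw_data_mdm = mdm_subset(raw_data)                   # analogous to data_mapping_mdm.csv
print(to_csv_text(raw_data_mdm))
```

The two helper functions are kept separate so that the field-dropping step and the subset step can be checked independently.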
 
Instructions
Check the following:
- Remember to activate your Python environment
- Ensure that db_connection_info.txt is linked to this directory
  - e.g. ln -s /some/directory/db_connection_info.txt .
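
The link check above can also be done programmatically before running the import. The following is a minimal sketch, not part of the repository: the helper name check_connection_file is hypothetical, and it only tests that the file is reachable, not that its contents are valid.

```python
from pathlib import Path


def check_connection_file(folder=".", name="db_connection_info.txt"):
    """Return (ok, message) describing whether the connection file is reachable."""
    path = Path(folder) / name
    # Path.exists() follows symlinks, so a dangling link reports False here.
    if path.is_symlink() and not path.exists():
        return False, f"{path} is a broken symlink; re-create it with ln -s"
    if not path.exists():
        return False, f"{path} not found; link it into this directory first"
    return True, f"{path} found"


# Check the current directory and report the result.
ok, message = check_connection_file()
print(message)
```

Distinguishing a broken symlink from a missing file makes it easier to tell whether the link target has moved.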
 
To import data, execute the following:
- cd into this folder.
- python select_db.py
- python make_csv.py
Exported files are written to the exports sub-directory, which keeps this folder clean.
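
If the two scripts are run often, they could be chained from a small wrapper. The sketch below is one possible way to do this and is not part of the repository: the function name run_import and the runner parameter are hypothetical, and it assumes both scripts take no command-line arguments, as in the steps above.

```python
import subprocess
import sys

# Script order taken from the import steps above.
STEPS = ["select_db.py", "make_csv.py"]


def run_import(folder=".", runner=subprocess.run):
    """Run each script in order inside `folder`; stop at the first failure."""
    completed = []
    for script in STEPS:
        # With subprocess.run, check=True raises CalledProcessError on a
        # non-zero exit code, so make_csv.py never runs on a failed pull.
        runner([sys.executable, script], cwd=folder, check=True)
        completed.append(script)
    return completed
```

Calling run_import() from inside this folder would run the same two commands with the interpreter of the currently active environment. The runner parameter exists only so the function can be exercised without touching the database.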