Data Analysis
Pandas, NumPy, Matplotlib, scikit-learn
ELK: open-source stack consisting of Elasticsearch, Logstash, and Kibana
Tableau
- Analyze the data to provide business teams with data-driven insights
- Validate or invalidate experiments via data analysis
- Glean useful information from the data, often to increase sales or optimize drug dosage
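One way to validate or invalidate an experiment is a permutation test on the difference in means between a control and a treatment group. The sketch below uses only NumPy; the function name `permutation_test`, the sample values, and the choice of a permutation test (rather than, say, a t-test) are assumptions made for illustration.

```python
import numpy as np


def permutation_test(control, treatment, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = np.random.default_rng(seed)
    control = np.asarray(control, dtype=float)
    treatment = np.asarray(treatment, dtype=float)
    observed = treatment.mean() - control.mean()

    pooled = np.concatenate([control, treatment])
    n_control = control.size
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by permuting the pooled values.
        shuffled = rng.permutation(pooled)
        diff = shuffled[n_control:].mean() - shuffled[:n_control].mean()
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical measurements for two variants of an experiment.
control = [20.1, 22.3, 19.8, 21.0, 20.5, 19.9, 21.7, 20.8]
treatment = [23.4, 24.1, 22.8, 25.0, 23.9, 22.5, 24.6, 23.3]

observed, p_value = permutation_test(control, treatment)
print(f"difference in means: {observed:.4f}, p-value: {p_value:.4f}")
```

A small p-value suggests the observed difference is unlikely under random label assignment; it does not by itself say the effect is large enough to matter for the business.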
Find and remove duplicates (Microsoft Office): https://support.microsoft.com/en-us/office/find-and-remove-duplicates-00e35bea-b46a-4d5d-b28e-66a552dc138d
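The same duplicate cleanup can be done in Pandas. This is a minimal sketch; the column names and records are hypothetical.

```python
import pandas as pd

# Hypothetical customer records. Row 2 is an exact copy of row 1;
# row 4 repeats customer_id 1 but with a different plan.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 1],
    "email": ["a@example.com", "b@example.com", "b@example.com",
              "c@example.com", "a@example.com"],
    "plan": ["basic", "pro", "pro", "basic", "pro"],
})

# Flag rows that exactly repeat an earlier row.
n_exact_duplicates = int(df.duplicated().sum())

# Drop exact duplicates, keeping the first occurrence.
deduped = df.drop_duplicates()

# Drop rows that repeat a key column, even if other columns differ.
by_id = df.drop_duplicates(subset=["customer_id"], keep="first")

print(n_exact_duplicates)
print(deduped)
print(by_id)
```

Deduplicating on a key column silently discards the conflicting rows, so it is worth inspecting them first with `df[df.duplicated(subset=["customer_id"], keep=False)]`.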
Perform EDA (exploratory data analysis)
- Find the independent and dependent variables
- Independent variable is the cause
- Dependent variable is the effect (or outcome)
- Look for correlation between variables
- Are the variables categorical (example: blood type) or continuous (example: population)?
- Are the samples independent or is it a time series?
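The EDA checks above can be sketched in Pandas. The data here is synthetic and the column names (`dose`, `response`, `blood_type`) are hypothetical; lag-1 autocorrelation is used as one rough signal of time dependence, not a definitive test.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 200

# Synthetic data: dose is the independent variable, response the dependent one.
dose = rng.uniform(0, 10, n)
response = 2.0 * dose + rng.normal(0, 1, n)
blood_type = rng.choice(["A", "B", "AB", "O"], n)
df = pd.DataFrame({"dose": dose, "response": response, "blood_type": blood_type})

# Split columns into categorical and continuous by dtype.
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()
continuous_cols = df.select_dtypes(include="number").columns.tolist()

# Pearson correlation between the continuous variables.
corr = df[continuous_cols].corr()
print(corr)

# Frequency counts summarize a categorical variable.
print(df["blood_type"].value_counts())

# Independent samples should show lag-1 autocorrelation near zero;
# a random walk (a simple time series) shows it near one.
lag1_independent = df["response"].autocorr(lag=1)
random_walk = pd.Series(np.cumsum(rng.normal(0, 1, n)))
lag1_series = random_walk.autocorr(lag=1)
print(f"independent: {lag1_independent:.2f}, random walk: {lag1_series:.2f}")
```

Correlation alone does not establish which variable is the cause; that comes from how the data was collected.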
Spaghetti plot: one line per subject or series drawn on shared axes, to compare trajectories over time
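A spaghetti plot can be drawn with Matplotlib by plotting each subject's series on the same axes. The sketch below uses synthetic data; the subject count, axis labels, and output filename are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
weeks = np.arange(12)
n_subjects = 15

# Synthetic data: each subject gets its own baseline and slope, plus noise.
baselines = rng.normal(50, 5, n_subjects)
slopes = rng.normal(1.0, 0.5, n_subjects)
noise = rng.normal(0, 1, (n_subjects, weeks.size))
values = baselines[:, None] + slopes[:, None] * weeks + noise

fig, ax = plt.subplots(figsize=(8, 4))

# One thin, semi-transparent line per subject.
for subject_values in values:
    ax.plot(weeks, subject_values, color="gray", alpha=0.5, linewidth=1)

# Overlay the mean trajectory so the overall trend stays readable.
ax.plot(weeks, values.mean(axis=0), color="black", linewidth=2.5, label="mean")

ax.set_xlabel("week")
ax.set_ylabel("measurement")
ax.set_title("Spaghetti plot: one line per subject")
ax.legend()
fig.savefig("spaghetti_plot.png", dpi=150)
```

With many subjects the lines become hard to tell apart, which is why the individual lines are drawn faint and the mean is drawn bold.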
Mode.com
- Upload some data (beware: uploaded data becomes public)
- Make query
- Build notebook