The true power of analytics comes from looking at the relationships among multiple variables.
We will examine how two variables move together, using correlation.
We will also examine how to use one variable to predict another, using regression.
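As a rough illustration of the two ideas above, the sketch below computes a sample covariance, a Pearson correlation, and a simple least-squares line using only the Python standard library. The data (`hours`, `scores`) are made up for this example, and the helper names `covariance`, `correlation`, and `fit_line` are hypothetical; none of them come from the course materials.

```python
from statistics import mean

def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Pearson r: covariance rescaled by both standard deviations."""
    sd_x = covariance(x, x) ** 0.5
    sd_y = covariance(y, y) ** 0.5
    return covariance(x, y) / (sd_x * sd_y)

def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    slope = covariance(x, y) / covariance(x, x)
    intercept = mean(y) - slope * mean(x)
    return slope, intercept

# Made-up illustration data: hours studied and exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

r = correlation(hours, scores)
slope, intercept = fit_line(hours, scores)
print(f"r = {r:.3f}")
print(f"predicted score = {intercept:.1f} + {slope:.1f} * hours")
```

Correlation summarizes how strongly the two variables move together on a scale from -1 to 1, while the fitted line turns that relationship into a prediction rule for one variable given the other.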
Class Data Demos on Google Sheets: W4CovCorr, W5Regression
Complete the assigned readings, which are highlighted in yellow, before the next class meeting.
W4-1.1 Covariance and Correlation: PPT
- NFF Chapter 11: Correlation and linear regression (pdf)
- DA Worksheet #07: Correlation (Solution; Python Solution)
- The Importance of Being Causal (Harvard Data Science Review)
- When to Act on a Correlation, and When Not To (Harvard Data Science Review)
- Beware Spurious Correlations (Harvard Business Review)*
- [MINI] Anscombe's Quartet by Data Skeptic podcast
- Patterns of Trustworthy Experimentation: Pre-Experiment Stage (Microsoft)
- Accelerating Retention Experiments with Partially Observed Data (Yelp Engineering)
- Experimentation Works: The Surprising Power of Business Experiments (by Dr. Thomke)
- Correlation vs. Causation by Harvard psychologist Steven Pinker
W4-2.1 Regression: PPT
- Required Reading: Machine Learning Predicts Home Price
- DA Worksheet #08: Regression 1 of 2 (Solution; Python Solution)
- Zillow Prize (from 2018)
- Netflix Prize (2009)
- JASP software: Find correlation and regression tutorials here
- Simple Linear Regression (Penn State)
- Ordinary Least Squares Regression Explained Visually
- Reading: A Refresher on Regression Analysis (Harvard Business Review)*
- Khan Academy module on Inference about Slope
- [MINI] Ordinary Least Squares Regression by Data Skeptic podcast
- [MINI] Noise! by Data Skeptic podcast
- Ordinary Least Squares: Where It All Began by Quantitude Podcast
- The Question of When: The Oscars, Class Presentations, and Prom Dates
W4-3.1 Regression Part 2 of 2: PPT
- GS Logistic Regression Chapter (pages 61-65): Statistical Analysis in JASP - A Guide for Students
- DA Worksheet #09: Regression 2 of 2 (Solution)
- The Gambler Who Cracked the Horse-Racing Code
- A Visual Introduction to Linear Regression (by Amazon's Machine Learning University)
- [MINI] R-squared by Data Skeptic podcast
- F-Distribution Tables