# Key Themes

**Data and Modeling Overview** [Data and Modeling Overview]

- What it means to ‘model’ a phenomenon
- What is a variable
- What is a measurement
- Observation (direct) vs estimation (indirect)
- Interpolation as an instance of estimation (indirect)
- Introduce concepts such as sampling vs complete population-level observations
- Types of sampling (e.g., random, systematic, cluster)

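As a rough illustration of the sampling types listed above, the following Python sketch draws a simple random, a systematic, and a cluster sample from a made-up population of 100 numbered units. The function names, the population, and the cluster layout are assumptions chosen for illustration, not part of the curriculum itself.

```python
import random


def simple_random_sample(population, k, rng):
    """Pick k units so that every unit has the same chance of selection."""
    return rng.sample(population, k)


def systematic_sample(population, k, rng):
    """Pick every step-th unit after a random start (assumes 0 < k <= len(population))."""
    step = len(population) // k
    start = rng.randrange(step)
    return population[start::step][:k]


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(clusters, n_clusters)
    return [unit for cluster in chosen for unit in cluster]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(100))  # hypothetical population of 100 numbered units
clusters = [population[i:i + 10] for i in range(0, 100, 10)]  # 10 clusters of 10

random_sample = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
clustered = cluster_sample(clusters, 2, rng)

print("simple random:", sorted(random_sample))
print("systematic:   ", systematic)
print("cluster:      ", clustered)
```

Observing all 100 units would be a complete population-level observation; each function above instead observes only part of the population and differs in how that part is chosen.
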
**Basic Data Analysis**

- Summary Statistics [Data and Modeling Overview, Deep Dive into Data “Set”]
  - Measures of central tendency and their benefits/drawbacks
    - E.g., mean, median, mode
  - Measures of variability and their benefits/drawbacks
    - E.g., variance, range, standard deviation
  - Discuss the limitations of summary statistics with examples
    - E.g., Anscombe’s quartet
- Aggregations
  - When and why aggregations make sense or are necessary [Relating Data “Sets”, Mapping]
  - How to aggregate two data sets into one [Relating Data “Sets”]
  - How to aggregate data sets between levels of granularity [Mapping]
- Outliers [Plotting/Graphing]
  - Why we care about outliers
  - How to spot and handle outliers

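One way to make the drawbacks of the mean and the handling of outliers concrete is the sketch below. It uses Python's standard `statistics` module on made-up commute times and flags outliers with the 1.5 × IQR rule of thumb. The data, the helper name `iqr_outliers`, and the choice of rule are assumptions for illustration; other outlier rules are equally valid.

```python
import statistics


def iqr_outliers(values):
    """Return values outside the 1.5 * IQR fences (a common rule of thumb)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Made-up commute times in minutes; 95 is a deliberately extreme value.
commute_minutes = [12, 14, 15, 15, 16, 17, 18, 95]

mean_all = statistics.mean(commute_minutes)
median_all = statistics.median(commute_minutes)
outliers = iqr_outliers(commute_minutes)

# One possible way to "handle" outliers: set them aside and recompute.
trimmed = [v for v in commute_minutes if v not in outliers]
mean_trimmed = statistics.mean(trimmed)

print("mean:", mean_all, "median:", median_all)
print("outliers:", outliers)
print("mean without outliers:", round(mean_trimmed, 2))
```

The single extreme value pulls the mean far from the typical commute while the median barely moves, which is the kind of benefit/drawback trade-off the list above refers to. Whether dropping the value is appropriate depends on why it is extreme.
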
**Techniques for Data Visualization** [Plotting/Graphing]

- Plotting vs graphing
- Why we visualize data
- Types of plots
  - Scatter, line, map, bar, histogram, etc.
- How to visualize 1-D vs 2-D vs N-D data

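To show what a histogram does with 1-D data, here is a minimal text-only sketch that groups values into fixed-width bins and draws one `#` per value. The scores, the bin width, and the helper name `text_histogram` are made up for illustration; a real lesson would more likely use a plotting library.

```python
from collections import Counter


def text_histogram(values, bin_width):
    """Return one text row per non-empty bin, e.g. ' 70-79  | ###'."""
    # Map each value to the lower edge of its bin (a multiple of bin_width).
    counts = Counter((v // bin_width) * bin_width for v in values)
    rows = []
    for start in sorted(counts):
        end = start + bin_width - 1
        rows.append(f"{start:>3}-{end:<3} | {'#' * counts[start]}")
    return rows


scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 94]  # made-up exam scores
rows = text_histogram(scores, 10)
print("\n".join(rows))
```

This sketch assumes integer values and skips empty bins, so a gap in the data does not appear as an empty row.
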
**Techniques for Statistical Modeling**

- Distributions [Plotting/Graphing (Advanced Topic)]
  - Theoretical vs empirical (i.e., probability distributions vs frequency distributions)
  - Normal/Gaussian distribution
    - Properties of the normal distribution
    - When it is appropriate to model using the normal distribution
  - Exponential distribution
  - Other distributions
    - Bernoulli
    - Binomial
- Correlations [Correlations]
  - Linear vs quadratic vs exponential
  - Degree and direction of correlation
  - How to identify correlations
    - Visually (e.g., scatter plot)
    - Numerically (e.g., correlation coefficient)
  - Common pitfalls such as overgeneralization, Simpson’s paradox, etc.
  - Correlation vs. causation

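The numerical route to identifying correlations can be sketched with a hand-written Pearson correlation coefficient. The data below are the first two data sets of Anscombe's quartet: the first is roughly linear and the second follows a curve, yet their coefficients are nearly identical. The function and variable names are assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Anscombe's quartet, data sets I and II (they share the same x values).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y_linear = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y_curved = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)

print("r (data set I): ", round(r_linear, 3))
print("r (data set II):", round(r_curved, 3))
```

Because the coefficient only measures the degree and direction of a linear relationship, it cannot distinguish the two shapes here. This is one reason to pair the numerical check with a visual one such as a scatter plot.
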
**Data Science in the Real World** [Data and Modeling Overview]

- Lies, damned lies, and statistics
  - Manipulative visuals
  - Manipulative statistics
- Cognitive biases
  - Survivorship bias
  - False causality fallacy
  - Other biases
