Key Themes

  • Data and Modeling Overview [Data and Modeling Overview]
    • What it means to ‘model’ a phenomenon
    • What is a variable
    • What is a measurement
      • Observation (direct) vs estimation (indirect)
      • Interpolation as an instance of estimation (indirect)
      • Sampling vs complete population-level observation
        • Types of sampling (e.g., random, systematic, cluster)
  • Basic Data Analysis
    • Summary Statistics [Data and Modeling Overview, Deep Dive into Data “Set”]
      • Measures of central tendency and their benefits/drawbacks
        • E.g., Mean, median, mode
      • Measures of variability and their benefits/drawbacks
        • E.g., Variance, range, standard deviation
      • Discuss the limitations of summary statistics with examples
        • E.g., Anscombe’s quartet
    • Aggregations
    • Outliers [Plotting/Graphing]
      • Why we care about outliers
      • How to spot and handle outliers
  • Techniques for Data Visualization [Plotting/Graphing]
    • Plotting vs graphing
    • Why we visualize data
    • Types of plots
      • Scatter, line, map, bar, histogram, etc.
    • How to visualize 1-D vs 2-D vs N-D Data
  • Techniques for Statistical Modeling
    • Distributions [Plotting/Graphing (Advanced Topic)]
      • Theoretical vs empirical (i.e., probability distributions vs frequency distributions)
      • Normal/Gaussian distribution
        • Properties of the normal distribution
        • When it is appropriate to model using the normal distribution
      • Exponential distribution
      • Other distributions
        • Bernoulli
        • Binomial
    • Correlations [Correlations]
      • Linear vs quadratic vs exponential
      • Degree and direction of correlation
      • How to identify correlations
        • Visually (e.g., scatter plot)
        • Numerically (e.g., correlation coefficient)
      • Common pitfalls (e.g., overgeneralization, Simpson’s paradox)
      • Correlation vs causation
  • Data Science in the Real World [Data and Modeling Overview]
    • Lies, damned lies, and statistics
      • Manipulative visuals
      • Manipulative statistics
    • Cognitive biases
      • Survivorship bias
      • False causality fallacy
      • Other biases