Modules
All of the available course content can be found on this page, which is composed of two sections:
- Overviews summarizes the available modules and lists the modules in progress
- Tables of Contents provides the table of contents for each available module
If you are having trouble accessing this content, please fill out this form. If you would like to help us add more content, please reach out by filling out this form.
Overviews
Module 1: Data Overview
This module is meant to serve as the introductory class(es) in a data literacy course. It provides motivation for why data literacy matters, introduces key concepts such as measurements, variables, and models, and defines some basic descriptive statistics and data visualization techniques.
Module 2: Deep Dive into Data “Set”
Introduces students to the idea of a data “set” as well as some basic terms and statistical measures (e.g., mean and standard deviation) while highlighting important caveats and takeaways when drawing conclusions from these measures.
Module 3: Relating Data “Sets”
Introduces students to the idea of related data sets and describes how to merge data sets into one large data collection. Real data sets are analyzed in depth to demonstrate the benefits of building a nuanced, complete picture of the world.
Module 4: Plotting/Graphing
Introduces students to some common plots and graphs and discusses when to use each depending on the data and the context. It also defines and explores various distributions with analysis of their properties and examples from the real world.
Module 5: Correlations
Introduces students to the idea of correlation between two variables and describes how to observe, quantify, and label correlations, with a focus on linear correlations. Strategies for distinguishing real correlations from noise are provided, and common pitfalls, such as confusing correlation with causation, are discussed in depth.
Module 6: Mapping
Introduces students to the idea of geographic data and discusses how to visualize and aggregate geographic data.
Module 7: Time Series
Coming soon!
Tables of Contents
Module 1: Data Overview
- Lies, Damn Lies, and Statistics
- Why is Data Literacy Important?
- Manipulating Visuals
- Misleading Statistics
- Cognitive Biases
- Survivorship Bias
- False Causality Fallacy
- Other Biases
- Truth, Damn Truths, and Statistics
- Data Is Powerful
- So, What is Data, Anyway?
- Variables
- Measurements
- Direct Observation
- Estimation
- Interpolation and Extrapolation
- Population
- Sampling
- Random Sampling
- Systematic Sampling
- Convenience Sampling
- Cluster Sampling
- Models
- The Big Picture
- Communicating Data
- Descriptive Statistics
- Mode
- Median
- Mean
- Standard deviation
- Data Visualization
- Bar Chart
- Line Chart
- Scatter Plot
- Histogram
- Map Plot
Module 2: Deep Dive into Data “Set”
- Where Do Data Sets Come From?
- Examples of Datasets
- Exploring Datasets
- Variables & Dimensions
- Categorical vs Numeric Variables
- Number of Observations
- Exercise: Dice Rolls
- Missing Observations
- Summary Statistics
- Measures of Central Tendency
- One Measure to Rule Them All?
- Mean
- Median
- Mode
- One and Done?
- Three and Complete?
- Measures of Variability
- Range and Standard Deviation
- Assignments
- Explore, Analyze and Compare Two Datasets
Module 3: Relating Data “Sets”
- What Are Related Data Sets?
- Examples of Related Data Sets
- What About “Unrelated” Data Sets?
- Exercise: Finding Relationships
- How Do We Combine Related Data Sets?
- Visualizing Data Sets
- Examples of Tabular Data
- Basic Data Combinations
- Adding Rows
- Adding Columns
- More Complicated Combinations
- Exercise: Find the Combination
- Analyzing Combined Data
- Why Do We Combine Related Data Sets?
- Case Studies
- Blood Pressure vs. Age
- Simpson’s Paradox
- Global Covid Data
- Which Country is the “Best”?
- Exercise: Exploring New Places
- Exercise: Where to Live?
Module 4: Plotting/Graphing
- Plot vs. Graph
- Why Do We Plot?
- Common Plots
- Scatter
- Line
- Map
- Bar
- Histogram
- 1-D vs. 2-D vs. N-D
- Exercise: Exploring Plots and Graphs
- Outliers
- Where Do Outliers Come From?
- How Do We Handle Outliers?
- Advanced Topic: Introduction to Distributions
- Normal Distribution
- Definition
- Properties
- When is Data Normal?
- Exercise: Is It Normal?
- Exceptions to the Rule
- Exponential
- Other Distributions (Optional)
- Bernoulli
- Binomial
Module 5: Correlations
- What is Correlation?
- Linear, Quadratic, and Exponential Correlations
- Direction of Correlation: Positive and Negative
- Degree of Correlation
- Identifying Correlations
- Visual: Scatter Plot
- Line of Best Fit
- Numerical: Correlation Coefficient
- Examples of Correlated/Uncorrelated Variables
- Exercise: Spotting Correlations
- Exercise: Correlations in the Wild
- Common Pitfalls
- Lack of Data
- Simpson’s Paradox
- Important Outliers
- Weighted Correlation Coefficient
- Misleading Correlations
- Overgeneralization
- Correlation and Causation
- What is Causation?
- Correlation vs Causation
- What’s the Difference?
- Can We Have…
- Confounding Variable
- Case Studies
Module 6: Mapping
- Examples of Geographic Data
- Levels of Granularity
- Latitude/Longitude Coordinates
- Mini-Exercise: Getting Used to Lat/Lon
- Numerical vs. Categorical Granularity
- Two Places, One Name? One Place, Two Names?
- Change Over Time
- Change Across Borders
- Visualizing Geographic Data: Map Plots
- Interpreting Geographic Data
- Analyzing Geographic Data
- Converting between Levels of Granularity
- Mapping Step
- Aggregation Step
- Aggregation by Summing
- Aggregation by Averaging
- Aggregation by Something Else?
- Exercise: Aggregation by You
- Converting between Types of Granularity
Module 7: Time Series
Coming soon!
Page last updated: November 21, 2024