
Module 11: Pandas

  1. Videos
  2. No Textbook
  3. Homework (Due Sunday 11/28, late 11/29)
    1. Slicing and Dictionaries
  4. No Group Worksheet
  5. Lab 11 (Due Friday 12/3, late 12/12) – Note you have more time than usual for the late period

Videos

These videos are not from the Data8 material. They are a refresher on Python data structures, which are important to understand if you want to take CS216 and use the pandas library. The homework focuses on slicing and dictionaries, the topics most relevant to completing the lab.

  1. Sequences: Strings, List, Tuple (with slicing)
  2. Sequences: Strings, List, Tuple Part 2
  3. Sets and Dictionaries
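
As a quick illustration of the two homework topics, here is a minimal sketch of slicing and dictionaries in plain Python. The variable names and values are made up for this example and do not come from the videos or the lab.

```python
# Slicing uses seq[start:stop:step] and works on lists, strings, and tuples.
letters = ["a", "b", "c", "d", "e"]   # hypothetical example data
first_three = letters[:3]             # items at index 0, 1, 2
last_two = letters[-2:]               # the final two items
every_other = letters[::2]            # every second item, starting at index 0

word = "pandas"
reversed_word = word[::-1]            # a negative step walks backwards

point = (3, 4, 5)
tail = point[1:]                      # slicing a tuple returns a tuple

# A dictionary maps keys to values; here it counts how often each word appears.
counts = {}
for item in ["lab", "homework", "lab"]:
    counts[item] = counts.get(item, 0) + 1   # get() supplies 0 for unseen keys

print(first_three, last_two, every_other, reversed_word, tail, counts)
```

The same bracket notation carries over to pandas, where selecting rows and columns builds on these ideas.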

Supplement

pandas documentation

Module 10: Classification and Classifiers

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 11/14, late 11/15)
    1. Part 1: Classification
    2. Part 2: Classifiers
  4. Group Worksheet
  5. Lab 10 (Due Friday 11/19, late 11/21)

Videos

Part 1: Classification

  1. Introduction (14:24)
  2. Nearest Neighbor (15:08)
  3. Examples (Optional, 13:31)

Part 2: Classifiers

  1. Terminology (7:49)
  2. Dataset (11:59)
  3. Distance (15:30)
  4. Nearest Neighbors (21:04)
  5. Evaluation (13:34)
  6. Decision Boundaries (20:01)

Textbook

Module 9: Residuals and Regression Inference

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 11/7, late 11/8)
    1. Part 1: Residuals
    2. Part 2: Regression Inference
  4. Group Worksheet
  5. Lab 09 (Due Friday 11/12, late 11/14)

Videos

Part 1: Residuals

  1. Introduction (5:26)
  2. Regression Diagnostics (5:40)
  3. Properties of Residuals (7:55)
  4. Discussion Question (1:55)

Part 2: Regression Inference

  1. Regression Model (9:38)
  2. Prediction Variability (10:18)
  3. The True Slope (7:14)

Textbook

Module 8: Normal Curve, Correlation, Regression, and Least Squares

This module is one class period longer than usual to accommodate Exam 2, which is on Thursday 11/4.

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 10/24, late 10/25)
    1. Part 1: Normal Curve
    2. Part 2: Correlation
    3. Part 3: Regression
    4. Part 4: Least Squares
  4. Group Worksheet
  5. Lab 08 (Due Tuesday 11/2, late 11/7)

Videos

Part 1: Normal Curve

  1. Standard Units (14:08)
  2. SD and Bell Curves (8:52)
  3. Normal Distribution (8:41)
  4. Central Limit Theorem (19:28)

Part 2: Correlation

  1. Visualization (15:12)
  2. Calculation (19:54)
  3. Interpretation (11:21)

Part 3: Regression

  1. Prediction (11:53)
  2. Linear Regression (18:37)
  3. Regression to the Mean (6:33)
  4. Regression Equation (22:38)
  5. Interpreting the Slope (3:22)

Part 4: Least Squares

  1. Linear Regression Review (optional, 5:25)
  2. Discussion Question (optional, 5:09)
  3. Squared Error (9:55)
  4. Least Squares (6:15)

Textbook

Module 7: Causality, Confidence Intervals, Interpreting Confidence, and Center & Spread

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 10/17, late 10/18)
    1. Part 1: Causality
    2. Part 2: Confidence Intervals
    3. Part 3: Interpreting Confidence
    4. Part 4: Center and Spread
  4. Group Worksheet
  5. Lab 07 (Due Friday 10/22)

Videos

Part 1: Causality

  1. Introduction (7:29)
  2. Hypotheses (5:57)
  3. Test Statistic (3:09)
  4. Performing a Test (8:44)

Part 2: Confidence Intervals

  1. Percentiles (4:57)
  2. Estimation (9:29)
  3. Estimate Variability (7:22)
  4. The Bootstrap (21:10)

Part 3: Interpreting Confidence

  1. Applying the Bootstrap (11:05)
  2. Confidence Interval Pitfalls (5:54)
  3. Confidence Interval Tests (1:57)

Part 4: Center and Spread

  1. Introduction (16:27)
  2. Average and Median (8:35)
  3. Standard Deviation (12:49)
  4. Chebyshev’s Bounds (19:12)

Textbook

Module 6: Sampling, Simulation, Hypothesis Testing, Comparing Distributions, Decisions and Uncertainty, and A/B Testing

Note: Due to fall break, there will be no class Tuesday 10/5 and this module spans 1.5 weeks. There is also a little more content than usual, so plan accordingly.

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Tuesday 10/5, late 10/6)
    1. Part 1: Sampling and Simulation
    2. Part 2: Hypothesis Testing
    3. Part 3: Comparing Distributions
    4. Part 4: Decisions and Uncertainty
    5. Part 5: A/B Testing
  4. Group Worksheet
  5. Lab 06 (In containers, Due Friday 10/15)

Videos

Part 1a: Sampling

  1. Probability & Sampling (2:28)
  2. Sampling (6:47)

Part 1b: Simulation

  1. Distributions (3:48)
  2. Large Random Samples (5:25)
  3. Simulation (2:53)
  4. Statistics (7:25)

Part 2: Hypothesis Testing

  1. Assessing Models (3:12)
  2. A Model about Random Selection (13:58)
  3. A Genetic Model (15:44)
  4. Example (Optional, 3:59)

Part 3: Comparing Distributions

  1. Introduction (6:03)
  2. Total Variation Distance (12:54)
    1. Alternatively, you can read the following sections in Ch 11.2: "Comparison with Panels Selected at Random" and "A New Statistic: The Distance between Two Distributions". Your main goal is to understand what the total variation distance (TVD) statistic is.
  3. Assessment (Optional, 15:32)
  4. Summary (2:48)
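
For reference, the total variation distance between two categorical distributions is half the sum of the absolute differences between their proportions. The sketch below uses plain Python lists and made-up proportions. The function name and the data are assumptions for illustration, not the course's own code.

```python
def total_variation_distance(dist1, dist2):
    """Return half the sum of absolute differences between two distributions.

    Both arguments are sequences of proportions over the same categories,
    listed in the same order.
    """
    if len(dist1) != len(dist2):
        raise ValueError("distributions must cover the same categories")
    return sum(abs(p - q) for p, q in zip(dist1, dist2)) / 2


# Hypothetical proportions for three categories (each list sums to 1).
population = [0.5, 0.3, 0.2]
sample = [0.4, 0.4, 0.2]

tvd = total_variation_distance(population, sample)
print(round(tvd, 3))  # approximately 0.1 for these made-up numbers
```

A TVD of 0 means the two distributions are identical, and larger values mean they are further apart.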

Part 4: Decisions and Uncertainty

  1. Introduction and Terminology (10:31)
  2. Performing a Test (11:58)
    1. Alternatively, you can read the section The GSI’s Defense in Ch 11.3. The main point of this video is that deciding whether or not to reject the null hypothesis is not always straightforward. The histogram below shows the averages obtained by simulating a section’s average grade given the data. The red marks the average of a section that claims its average was lower than is consistent with the overall data. But is it? How small is small enough to reject the hypothesis?
      [Figure: histogram of simulated section averages, with the section’s observed average marked in red]
  3. Statistical Significance (11:06)
  4. An Error Probability (8:55)
  5. Origin of the Conventions (Optional, 4:08)
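
The kind of simulation described under Performing a Test can be sketched as follows: draw many random "sections" from the whole class, record each average, and see how often a simulated average is at least as low as the observed one. Everything here is hypothetical (the scores, the section size, the observed average, and the function names), and it uses only Python's standard library rather than the course's tables.

```python
import random


def simulate_section_averages(all_scores, section_size, repetitions, seed=0):
    """Average score of `repetitions` random sections drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        sum(rng.sample(all_scores, section_size)) / section_size
        for _ in range(repetitions)
    ]


def empirical_p_value(simulated_averages, observed_average):
    """Proportion of simulated averages at or below the observed average."""
    at_or_below = sum(1 for avg in simulated_averages if avg <= observed_average)
    return at_or_below / len(simulated_averages)


# Hypothetical class of 300 scores between 0 and 25 (not real course data).
all_scores = [(7 * i) % 26 for i in range(300)]
observed_average = 11.0  # hypothetical average of the section in question

simulated = simulate_section_averages(all_scores, section_size=27, repetitions=2000)
p_value = empirical_p_value(simulated, observed_average)
print(p_value)
```

Whether the resulting proportion counts as "small enough" depends on the cutoff chosen before running the test, which is the subject of the Statistical Significance video.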

Part 5a: A/B Testing

  1. Introduction (9:59)
  2. Hypotheses and Statistic (4:39)
  3. Performing the Test (15:59)

Part 5b: Deflategate Example (Optional)

  1. Deflategate Introduction (Optional, 12:59)
  2. Deflategate Testing (Optional, 11:09)

Textbook

Module 5: Iteration and Probability

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 9/26, late 9/27)
    1. Part 1: Iteration
    2. Part 2: Probability
  4. Group Worksheet
  5. Lab 05 (Due Friday 10/01)

Videos

Part 0: Table Examples (Optional)

  1. Table Method Review (Optional, 7:17)
  2. Discussion Question (6:20)
  3. Old Midterm Question (7:46)
  4. Advanced Where (9:00)

Part 1: Iteration

  1. Comparison (6:16)
  2. Predicates (2:08)
  3. Random Selection (4:55)
  4. Random Selection Discussion (4:15)
  5. Print (2:43)
  6. Control Statements (6:06)
  7. For Statements (10:09)

Part 2: Probability

  1. Monty Hall Problem (12:47)
  2. Probability (1:43)
  3. Multiplication Rule (3:45)
  4. Addition Rule (1:28)
  5. Probability Example (1:59)

Textbook

Module 4: Groups, Pivots, and Joins

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 9/19, late 9/20)
    1. Part 1a: Groups
    2. Part 1b: Pivot Tables
    3. Part 2: Joins
  4. Group Worksheet
  5. Lab 04 (In containers, Due Friday 9/24)

Videos

Part 1a: Groups

  1. One Attribute Group (14:39)
  2. Cross Classification (11:08)
  3. Example 1 (Optional, 5:04)

Part 1b: Pivot Tables

  1. Pivot Tables (13:09)
  2. Example 2 (Optional, 5:35)
  3. Comparing Distributions (12:00)

Part 2: Joins

  1. Joins (10:28)
  2. Bikes (9:41)
  3. Shortest Trips (4:59)
  4. Maps (9:36)

Textbook

Module 3: Census, Charts, & Functions

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 9/5, late 9/6)
    1. Homework 03 Part 1: Census & Charts
    2. Homework 03 Part 2: Histograms & Functions
  4. Group Worksheet
  5. Lab 03 (In Box folder if not in your container already)

Videos

Part 1: Census

  1. Census (6:59)
  2. Column Arithmetic (3:23)
  3. Accessing Values (6:04)
  4. Males and Females (7:13, Optional)

Part 2: Charts

  1. Line Graphs (11:40)
  2. Example 1 (4:31, Optional)
  3. Scatter Plots (7:28)
  4. Example 2 (6:15, Optional)
  5. How to Choose (2:15)
  6. Types of Data (3:58)
  7. Distributions (10:47)
  8. Example 3 (8:02)

Part 3: Histograms

  1. Area Principle (7:22)
  2. Binning (18:04)
  3. Example 1 (5:07, Optional)
  4. Drawing Histograms (13:06)
  5. Density (9:38)
  6. Example 2 (6:38, Optional)
  7. Example 3 (5:32, Optional)

Part 4a: Comparing Histograms

  1. Comparing Histograms (6:08)
  2. Comparing Histograms Discussion (2:48)

Part 4b: Functions

  1. Defining Functions (5:15)
  2. Defining Functions Discussion (8:04)
  3. Apply (3:30)
  4. Example Prediction (10:46)

Textbook

Module 2: Python, Tables, Expressions, & Strings

  1. Videos – Note some are optional
  2. Textbook links are supplemental
  3. Homework (Due Sunday 8/29. Can submit once late 9/4)
    1. Homework 01: Cause and Effect
    2. Homework 02 Part 1: Python & Tables
    3. Homework 02 Part 2: Expressions
    4. Homework 02 Part 3: Strings & Building Tables
  4. Group Worksheet
  5. Lab 02

Videos

Part 1a: Python

  1. Python (6:44)
  2. Names (10:24)
  3. Call Expressions (5:29)

Part 1b: Tables

  1. Tables (3:59)
  2. Select (6:21)
  3. Sorting (11:23)
  4. Bar Charts (12:00)

Part 2: Expressions

  1. Arithmetic (11:12)
  2. Arithmetic Question (2:45)
  3. Exponential Growth (8:57)
  4. Arrays (3:22)
  5. Columns (7:57)

Part 3a: Strings

  1. Creating Tables (5:53)
  2. Strings (9:28)
  3. String Exercise (0:50, Optional)
  4. Exercise Answer (1:39, Optional)

Part 3b: Minard’s Map (Optional)

  1. Minard’s Map (3:14, Optional)
  2. Minard’s Map Code (8:18, Optional)

Part 3c: Building Tables

  1. Lists (8:07)
  2. Take (3:50)
  3. Where (10:42)

Textbook (Supplemental)