# Module 11: Pandas

1. Videos
2. No Textbook
3. Homework (Due Sunday 11/28, late 11/29)
4. No Group Worksheet
5. Lab 11 (Due Friday 12/3, late 12/12) – Note you have more time than usual for the late period

# Videos

These videos are not from the Data8 material. They are a refresher on Python data structures, which are important to understand if you want to take CS216 and use the pandas library. The Homework focuses on slicing and dictionaries, the topics most relevant to completing the lab.
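As a quick illustration of the two Homework topics, here is a minimal sketch in plain Python. The `scores` list is made up for illustration, and the `lab_due` dictionary simply reuses lab due dates listed on this page. pandas uses similar bracket-based syntax for selecting data, and a dictionary of lists is one common way to build a DataFrame, so these basics should carry over to the lab.

```python
# Slicing: take part of a list with [start:stop:step]; the stop index is excluded.
scores = [88, 92, 79, 65, 95]   # hypothetical values, for illustration only
first_three = scores[:3]        # items at positions 0, 1, 2
last_two = scores[-2:]          # negative indices count from the end
every_other = scores[::2]       # a step of 2 skips every second item

# Dictionaries: map keys to values and look values up by key.
lab_due = {"Lab 09": "11/12", "Lab 10": "11/19"}
lab_due["Lab 11"] = "12/3"      # add a new key/value pair
lab10_date = lab_due["Lab 10"]  # look up by key; raises KeyError if the key is absent
fallback = lab_due.get("Lab 12", "not assigned")  # .get returns a default instead

print(first_three, last_two, every_other)
print(lab10_date, fallback)
```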

# Supplement

pandas documentation

# Module 10: Classification and Classifiers

1. Videos
2. Textbook (supplemental)
3. Homework (Due Sunday 11/14, late 11/15)
4. Group Worksheet
5. Lab 10 (Due Friday 11/19, late 11/21)

# Videos

## Part 1: Classification

1. Introduction (14:24)
2. Nearest Neighbor (15:08)
3. Examples (Optional, 13:31)

## Part 2: Classifiers

1. Terminology (7:49)
2. Dataset (11:59)
3. Distance (15:30)
4. Nearest Neighbors (21:04)
5. Evaluation (13:34)
6. Decision Boundaries (20:01)

# Module 9: Residuals and Regression Inference

1. Videos
2. Textbook (supplemental)
3. Homework (Due Sunday 11/7, late 11/8)
4. Group Worksheet
5. Lab 09 (Due Friday 11/12, late 11/14)

# Videos

## Part 1: Residuals

1. Introduction (5:26)
2. Regression Diagnostics (5:40)
3. Properties of Residuals (7:55)
4. Discussion Question (1:55)

## Part 2: Regression Inference

1. Regression Model (9:38)
2. Prediction Variability (10:18)
3. The True Slope (7:14)

# Module 8: Normal Curve, Correlation, Regression, and Least Squares

This module runs one class period longer than usual to accommodate Exam 2 on Thursday 11/4.

1. Videos
2. Textbook (supplemental)
3. Homework (Due Sunday 10/24, late 10/25)
4. Group Worksheet
5. Lab 08 (Due Tuesday 11/2, late 11/7)

# Videos

## Part 1: Normal Curve

1. Standard Units (14:08)
2. SD and Bell Curves (8:52)
3. Normal Distribution (8:41)
4. Central Limit Theorem (19:28)

## Part 2: Correlation

1. Visualization (15:12)
2. Calculation (19:54)
3. Interpretation (11:21)

## Part 3: Regression

1. Prediction (11:53)
2. Linear Regression (18:37)
3. Regression to the Mean (6:33)
4. Regression Equation (22:38)
5. Interpreting the Slope (3:22)

## Part 4: Least Squares

1. Linear Regression Review (Optional, 5:25)
2. Discussion Question (Optional, 5:09)
3. Squared Error (9:55)
4. Least Squares (6:15)

# Module 7: Causality, Confidence Intervals, Interpreting Confidence, and Center & Spread

1. Videos
2. Textbook (supplemental)
3. Homework (Due Sunday 10/17, late 10/18)
4. Group Worksheet
5. Lab 07 (Due Friday 10/22)

# Videos

## Part 1: Causality

1. Introduction (7:29)
2. Hypotheses (5:57)
3. Test Statistic (3:09)
4. Performing a Test (8:44)

## Part 2: Confidence Intervals

1. Percentiles (4:57)
2. Estimation (9:29)
3. Estimate Variability (7:22)
4. The Bootstrap (21:10)

## Part 3: Interpreting Confidence

1. Applying the Bootstrap (11:05)
2. Confidence Interval Pitfalls (5:54)
3. Confidence Interval Tests (1:57)

## Part 4: Center and Spread

1. Introduction (16:27)
2. Average and Median (8:35)
3. Standard Deviation (12:49)
4. Chebyshev’s Bounds (19:12)

# Module 6: Sampling, Simulation, Hypothesis Testing, Comparing Distributions, Decisions and Uncertainty, and A/B Testing

Note: Due to fall break, there will be no class Tuesday 10/5, so this module spans 1.5 weeks. It also has a little more content than usual, so plan accordingly.

1. Videos
2. Textbook (supplemental)
3. Homework (Due Tuesday 10/5, late 10/6)
4. Group Worksheet
5. Lab 06 (In containers, Due Friday 10/15)

# Videos

## Part 1a: Sampling

1. Probability & Sampling (2:28)
2. Sampling (6:47)

## Part 1b: Simulation

1. Distributions (3:48)
2. Large Random Samples (5:25)
3. Simulation (2:53)
4. Statistics (7:25)

## Part 2: Hypothesis Testing

1. Assessing Models (3:12)
2. A Model about Random Selection (13:58)
3. A Genetic Model (15:44)
4. Example (Optional, 3:59)

## Part 3: Comparing Distributions

1. Introduction (6:03)
2. Total Variation Distance (12:54)
   1. Alternatively, you can read the following sections in Ch 11.2: "Comparison with Panels Selected at Random" and "A New Statistic: The Distance between Two Distributions." Your main goal is to understand what the total variation distance (TVD) statistic is.
3. Assessment (Optional, 15:32)
4. Summary (2:48)
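To make the total variation distance (TVD) from Part 3 concrete, here is a minimal sketch. It assumes both distributions are given as lists of proportions over the same categories in the same order. The function name and the example proportions are hypothetical and are not taken from the videos or the textbook.

```python
def total_variation_distance(dist_a, dist_b):
    """Return half the sum of absolute differences between two distributions.

    Both arguments are sequences of proportions over the same categories,
    listed in the same order.
    """
    if len(dist_a) != len(dist_b):
        raise ValueError("distributions must cover the same categories")
    return sum(abs(a - b) for a, b in zip(dist_a, dist_b)) / 2


# Hypothetical proportions for three categories (each list sums to 1).
expected = [0.2, 0.3, 0.5]
observed = [0.3, 0.3, 0.4]

tvd = total_variation_distance(expected, observed)
print(round(tvd, 3))  # approximately 0.1
```

A TVD of 0 means the two distributions are identical, and larger values mean they differ more.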

## Part 4: Decisions and Uncertainty

1. Introduction and Terminology (10:31)
2. Performing a Test (11:58)
   1. Alternatively, you can read the section "The GSI’s Defense" in Ch 11.3. The main point of this video is that deciding whether or not to reject the null hypothesis is not always straightforward. Below is a histogram of the averages obtained by simulating a section’s average grade given the data. The red marks where a section is claiming that their average was lower than is consistent with the overall data. But is it? How small is small enough to reject the hypothesis?
3. Statistical Significance (11:06)
4. An Error Probability (8:55)
5. Origin of the Conventions (Optional, 4:08)
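The section-average question in Part 4 can be sketched as a small simulation. This is only an illustration under stated assumptions: the scores, the section size, and the observed section average below are all invented, and the variable and function names are hypothetical. The idea is to draw many random groups of the same size as the section from the whole class, record each group's average, and see how often a random group does as poorly as the section did.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: 300 scores between 0 and 25, and a section of 25 students.
class_scores = [random.randint(0, 25) for _ in range(300)]
section_size = 25
observed_section_average = 11.0  # invented value for illustration


def simulate_average(scores, size):
    """Average score of one group of `size` students drawn at random without replacement."""
    return sum(random.sample(scores, size)) / size


# Simulate the statistic under the null hypothesis that the section
# is like a random sample from the whole class.
simulated = [simulate_average(class_scores, section_size) for _ in range(5000)]

# Empirical p-value: the share of simulated averages at or below the observed one.
p_value = sum(avg <= observed_section_average for avg in simulated) / len(simulated)
print(p_value)
```

A small p-value suggests that the observed average would be unusual for a random group. How small is small enough is a convention (5% is a common cutoff), which is why the decision is not always clear-cut.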

## Part 5a: A/B Testing

1. Introduction (9:59)
2. Hypotheses and Statistic (4:39)
3. Performing the Test (15:59)

## Part 5b: Deflategate Example (Optional)

1. Deflategate Introduction (Optional, 12:59)
2. Deflategate Testing (Optional, 11:09)

# Module 5: Iteration and Probability

1. Videos
2. Textbook (supplemental)
3. Homework (Due Sunday 9/26, late 9/27)
4. Group Worksheet
5. Lab 05 (Due Friday 10/01)

# Videos

## Part 0: Table Examples (Optional)

1. Table Method Review (Optional, 7:17)
2. Discussion Question (6:20)
3. Old Midterm Question (7:46)

## Part 1: Iteration

1. Comparison (6:16)
2. Predicates (2:08)
3. Random Selection (4:55)
4. Random Selection Discussion (4:15)
5. Print (2:43)
6. Control Statements (6:06)
7. For Statements (10:09)

## Part 2: Probability

1. Monty Hall Problem (12:47)
2. Probability (1:43)
3. Multiplication Rule (3:45)
5. Probability Example (1:59)

# Module 4: Groups, Pivots, and Joins

1. Videos
2. Textbook (supplemental)
3. Homework (Due Sunday 9/19, late 9/20)
4. Group Worksheet
5. Lab 04 (In containers, Due Friday 9/24)

# Videos

## Part 1a: Groups

1. One Attribute Group (14:39)
2. Cross Classification (11:08)
3. Example 1 (Optional, 5:04)

## Part 1b: Pivot Tables

1. Pivot Tables (13:09)
2. Example 2 (Optional, 5:35)
3. Comparing Distributions (12:00)

## Part 2: Joins

1. Joins (10:28)
2. Bikes (9:41)
3. Shortest Trips (4:59)
4. Maps (9:36)

# Module 3: Census, Charts, & Functions

1. Videos
2. Textbook (supplemental)
3. Homework (Due Sunday 9/5, late 9/6)
4. Group Worksheet
5. Lab 03 (In Box folder if not in your container already)

# Videos

## Part 1: Census

1. Census (6:59)
2. Column Arithmetic (3:23)
3. Accessing Values (6:04)
4. Males and Females (7:13, Optional)

## Part 2: Charts

1. Line Graphs (11:40)
2. Example 1 (4:31, Optional)
3. Scatter Plots (7:28)
4. Example 2 (6:15, Optional)
5. How to Choose (2:15)
6. Types of Data (3:58)
7. Distributions (10:47)
8. Example 3 (8:02)

## Part 3: Histograms

1. Area Principle (7:22)
2. Binning (18:04)
3. Example 1 (5:07, Optional)
4. Drawing Histograms (13:06)
5. Density (9:38)
6. Example 2 (6:38, Optional)
7. Example 3 (5:32, Optional)

## Part 4a: Comparing Histograms

1. Comparing Histograms (6:08)
2. Comparing Histograms Discussion (2:48)

## Part 4b: Functions

1. Defining Functions (5:15)
2. Defining Functions Discussion (8:04)
3. Apply (3:30)
4. Example Prediction (10:46)

# Module 2: Python, Tables, Expressions, & Strings

1. Videos – Note some are optional
3. Homework (Due Sunday 8/29. Can submit once late 9/4)
4. Group Worksheet
5. Lab 02

# Videos

## Part 1a: Python

1. Python (6:44)
2. Names (10:24)
3. Call Expressions (5:29)

## Part 1b: Tables

1. Tables (3:59)
2. Select (6:21)
3. Sorting (11:23)
4. Bar Charts (12:00)

## Part 2: Expressions

1. Arithmetic (11:12)
2. Arithmetic Question (2:45)
3. Exponential Growth (8:57)
4. Arrays (3:22)
5. Columns (7:57)

## Part 3a: Strings

1. Creating Tables (5:53)
2. Strings (9:28)
3. String Exercise (0:50, Optional)