All posts by Dr Kristin Stephens-Martinez, Ph.D.

Exam 2

This post outlines what Exam 2 will be like. The format will be very similar to Exam 1. Anything that is different from Exam 1 ([DIFF]) or new ([NEW]) is marked as such.

Exam Logistics

  • [DIFF] The exam will cover up to and including Module 7. It will emphasize Modules 4-7, but will include Module 1-3 content as needed because this class’s material is cumulative.
  • The exam will be take-home. It is open book, open note, open internet, but closed to people.
    • This means you cannot communicate with a person while taking the exam, including asking someone through the Internet (like Stack Overflow) for help and receiving help.
  • [DIFF] Timeframe: It must be completed on Thursday 11/04 between 10:15 am (start of class) and 11:59 pm.
    • The exam will close at 11:59 pm regardless of when you started.
  • The exam has two parts: Multiple Choice and Jupyter Notebook.
    • You may take a break between each part.
    • Both parts are timed through Sakai.
  • The exam must be done individually. It is a violation of class policy if you collaborate in any way with another person (in or not in the class) on the exam. You can only talk to the teaching staff about the exam.
  • Protect the integrity of the exam and your exam submission.
    • Do not talk to anyone about the exam during the exam period.
    • Take your exam in a secure location where no one can bother you.
    • Take your exam in a place where you will not be distracted or tempted to talk to someone.
  • The exam has randomized elements in it so no one’s exam will be identical to another person’s.
  • If you have a question during the exam, ask it as a private new message on the class forum, or on Zoom if a teaching staff member is on call at that time.
    • We will do our best to always have someone checking the forum. However, we cannot promise that someone will answer your question instantly.
    • Prof. Stephens-Martinez will be in the class Zoom during class time and in her office hours Zoom during her office hours, which are immediately after Thursday’s class.
    • [DIFF] David has office hours Thursday 12:30-1:30 pm ET.
    • The exam is tested for readability, so the wording should be straightforward.
  • [DIFF] There is no mock exam.

Multiple Choice Questions (30 minutes)

  • You will have 30 minutes to complete this part.
  • It will be a Sakai Quiz (like the homework).
  • You can submit only once.
  • You will not see your score until after the testing period is over.

[DIFF] Jupyter Notebook (45 minutes)

  • [DIFF] You will have 45 minutes to complete this part.
  • You will get your Jupyter Notebook zip file inside a Sakai Quiz that is not the multiple-choice part.
  • You will submit it on Gradescope.
  • [NEW] We strongly recommend you submit to Gradescope multiple times, such as after each question.
  • You can rely on the Sakai Quiz timer to tell you how much time you have left.
  • We will use your logged start time in Sakai to track if you submitted on Gradescope on time.
  • You do not need to do anything with Sakai after you retrieve your zip file from the quiz.
  • During your testing period, you can submit as many times as you want to Gradescope. We will take your last submission.
  • The autograder will tell you if your values are the correct type, but not necessarily if they are the correct value. There are hidden tests. Your score will only be revealed after we have finished all grading, including the manual grading part.

How to Prepare

See Exam 1’s post.

Module 8: Normal Curve, Correlation, Regression, and Least Squares

This module is 1 class period longer than usual to accommodate Exam 2, which is on Thursday 11/4.

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 10/24, late 10/25)
    1. Part 1: Normal Curve
    2. Part 2: Correlation
    3. Part 3: Regression
    4. Part 4: Least Squares
  4. Group Worksheet
  5. Lab 08 (Due Tuesday 11/2, late 11/7)

Videos

Part 1: Normal Curve

  1. Standard Units (14:08)
  2. SD and Bell Curves (8:52)
  3. Normal Distribution (8:41)
  4. Central Limit Theorem (19:28)

Part 2: Correlation

  1. Visualization (15:12)
  2. Calculation (19:54)
  3. Interpretation (11:21)

Part 3: Regression

  1. Prediction (11:53)
  2. Linear Regression (18:37)
  3. Regression to the Mean (6:33)
  4. Regression Equation (22:38)
  5. Interpreting the Slope (3:22)

Part 4: Least Squares

  1. Linear Regression Review (optional, 5:25)
  2. Discussion Question (optional, 5:09)
  3. Squared Error (9:55)
  4. Least Squares (6:15)

Textbook

Module 7: Causality, Confidence Intervals, Interpreting Confidence, and Center & Spread

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 10/17, late 10/18)
    1. Part 1: Causality
    2. Part 2: Confidence Intervals
    3. Part 3: Interpreting Confidence
    4. Part 4: Center and Spread
  4. Group Worksheet
  5. Lab 07 (Due Friday 10/22)

Videos

Part 1: Causality

  1. Introduction (7:29)
  2. Hypotheses (5:57)
  3. Test Statistic (3:09)
  4. Performing a Test (8:44)

Part 2: Confidence Intervals

  1. Percentiles (4:57)
  2. Estimation (9:29)
  3. Estimate Variability (7:22)
  4. The Bootstrap (21:10)

Part 3: Interpreting Confidence

  1. Applying the Bootstrap (11:05)
  2. Confidence Interval Pitfalls (5:54)
  3. Confidence Interval Tests (1:57)

Part 4: Center and Spread

  1. Introduction (16:27)
  2. Average and Median (8:35)
  3. Standard Deviation (12:49)
  4. Chebyshev’s Bounds (19:12)

Textbook

Project 2

The zip file will be in the class Box folder in the Project folder. You will submit this as a group on Gradescope. This covers up to module 6. It is due Friday 10/29, late to Sunday 10/31.

To work collaboratively, you can choose to use Google Colab. Put the file in your Google Drive and share it with your group. When you open the file, it will open in Google Colab. You all should be able to work on the notebook at the same time. However, editing the same cell at the same time may not work. You may notice that the file locations for the data are over the internet, rather than local. This change makes working with Colab easier because Colab does not hold onto the data files between uses.

Group Plan 2

It’s time for round 2 of groups!

Some notes from the group reflection:

  • 77% found the group contracts useful or maybe useful.
  • 85% said they would want to do it again or maybe do it again.
  • 44% didn’t think the group contracts needed to change.
  • Common themes for change included:
    • Remove/change the roles section
    • Add a section for what happens if something happens so someone cannot contribute as originally planned
    • More flexibility in what goes in the contract
    • Have a better plan on what, when, and how the group will work together.

Therefore, I’m going to require a group plan but not provide a template. There is still the group contract template in the prior post if your group wants to use it as a starting point. It’s a plan rather than a contract. This change of framing is to refocus how you all will use the document.

The following needs to be in your plan:

  1. Names of all team members
  2. Optional: Team name
    1. For inspiration, there are many team name generators on the internet.
  3. How you will communicate
  4. When you will work together
  5. Where you will work together (including a potential meeting outside of class)
  6. How you will work together
  7. Proposal for what to do if something happens to a team member and they cannot finish their planned work.
    1. Contacting Prof. Stephens-Martinez can be part of this proposal

Module 6: Sampling, Simulation, Hypothesis Testing, Comparing Distributions, Decisions and Uncertainty, and A/B Testing

Note: Due to fall break, there will be no class Tuesday 10/5 and this module spans 1.5 weeks. This means there is also a little more content than usual, so plan accordingly.

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Tuesday 10/5, late 10/6)
    1. Part 1: Sampling and Simulation
    2. Part 2: Hypothesis Testing
    3. Part 3: Comparing Distributions
    4. Part 4: Decisions and Uncertainty
    5. Part 5: A/B Testing
  4. Group Worksheet
  5. Lab 06 (In containers, Due Friday 10/15)

Videos

Part 1a: Sampling

  1. Probability & Sampling (2:28)
  2. Sampling (6:47)

Part 1b: Simulation

  1. Distributions (3:48)
  2. Large Random Samples (5:25)
  3. Simulation (2:53)
  4. Statistics (7:25)

Part 2: Hypothesis Testing

  1. Assessing Models (3:12)
  2. A Model about Random Selection (13:58)
  3. A Genetic Model (15:44)
  4. Example (Optional, 3:59)

Part 3: Comparing Distributions

  1. Introduction (6:03)
  2. Total Variation Distance (12:54)
    1. Alternatively, you can read the following sections in Ch 11.2: Comparison with Panels Selected at Random and A New Statistic: The Distance between Two Distributions. Your main goal is to understand what the total variation distance (TVD) statistic is.
  3. Assessment (Optional, 15:32)
  4. Summary (2:48)
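
As a rough illustration of the TVD statistic mentioned above, here is a minimal sketch in plain Python. It assumes both distributions are given as lists of proportions over the same categories in the same order. The function name and the example proportions are made up for illustration and are not taken from the class materials.

```python
def total_variation_distance(dist_a, dist_b):
    """Return the TVD between two distributions over the same categories.

    Each argument is a sequence of proportions that sums to 1, with the
    categories listed in the same order in both sequences.
    """
    if len(dist_a) != len(dist_b):
        raise ValueError("distributions must cover the same categories")
    # Add up the absolute differences between matching proportions,
    # then halve the total.
    return sum(abs(a - b) for a, b in zip(dist_a, dist_b)) / 2


# Illustrative proportions only (hypothetical data).
expected = [0.15, 0.18, 0.12, 0.54, 0.01]
observed = [0.26, 0.08, 0.08, 0.54, 0.04]

tvd = total_variation_distance(expected, observed)
print(round(tvd, 2))  # 0.14
```

A TVD of 0 means the two distributions are identical. Larger values mean they are further apart.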

Part 4: Decisions and Uncertainty

  1. Introduction and Terminology (10:31)
  2. Performing a Test (11:58)
    1. Alternatively, you can read Ch 11.3’s section The GSI’s Defense. The main point of this video is to see that sometimes it’s not straightforward whether or not to reject the null hypothesis. Below is a histogram of the averages when simulating a section’s average grade given the data. The red is where a section is claiming that their average was lower than is consistent with the overall data. But is it? How small is small enough to reject the hypothesis?
      [Figure: histogram of simulated section averages; the section’s claimed average is shown in red]
  3. Statistical Significance (11:06)
  4. An Error Probability (8:55)
  5. Origin of the Conventions (Optional, 4:08)
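
The question raised under Performing a Test (how small is small enough?) is usually answered with an empirical p-value. Below is a minimal sketch of that idea. It assumes the null hypothesis is that the section is like a random sample drawn without replacement from the whole class. The function name, the scores, the section size, and the observed average are all made up for illustration. The video and textbook use their own data and tools.

```python
import random
import statistics


def empirical_p_value(all_scores, section_size, observed_average,
                      repetitions=5000, seed=0):
    """Estimate the chance of a section average at or below the observed one.

    Under the null hypothesis, a section is a random sample (without
    replacement) of `section_size` scores from `all_scores`.
    """
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    simulated_averages = [
        statistics.mean(rng.sample(all_scores, section_size))
        for _ in range(repetitions)
    ]
    at_or_below = sum(1 for avg in simulated_averages
                      if avg <= observed_average)
    return at_or_below / repetitions


# Hypothetical class: 17 students at each score from 5 through 25.
scores = [s for s in range(5, 26) for _ in range(17)]

# Hypothetical section of 27 students with an observed average of 13.5.
p_value = empirical_p_value(scores, section_size=27, observed_average=13.5)
print(p_value)
```

The p-value is then compared to a cutoff chosen before running the test. The conventional cutoffs are discussed in the Statistical Significance video.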

Part 5a: A/B Testing

  1. Introduction (9:59)
  2. Hypotheses and Statistic (4:39)
  3. Performing the Test (15:59)

Part 5b: Deflategate Example (Optional)

  1. Deflategate Introduction (Optional, 12:59)
  2. Deflategate Testing (Optional, 11:09)

Textbook

Group Reflection 01

With Project 1 almost over, it is time to reflect on how working in your group went. You will find the reflection form on Gradescope starting on 9/28 (the day after the project is due) and it is due Friday 10/01, 11:59 pm, with a late submission accepted until 10/03, 11:59 pm.

The purpose of the reflection is to help you, your future group, and Prof. Stephens-Martinez to better understand how to make groups better and whether the group contract was a worthwhile exercise.

Module 5: Iteration and Probability

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 9/26, late 9/27)
    1. Part 1: Iteration
    2. Part 2: Probability
  4. Group Worksheet
  5. Lab 05 (Due Friday 10/01)

Videos

Part 0: Table Examples (Optional)

  1. Table Method Review (Optional, 7:17)
  2. Discussion Question (6:20)
  3. Old Midterm Question (7:46)
  4. Advanced Where (9:00)

Part 1: Iteration

  1. Comparison (6:16)
  2. Predicates (2:08)
  3. Random Selection (4:55)
  4. Random Selection Discussion (4:15)
  5. Print (2:43)
  6. Control Statements (6:06)
  7. For Statements (10:09)

Part 2: Probability

  1. Monty Hall Problem (12:47)
  2. Probability (1:43)
  3. Multiplication Rule (3:45)
  4. Addition Rule (1:28)
  5. Probability Example (1:59)

Textbook

Module 4: Groups, Pivots, and Joins

  1. Videos
  2. Textbook (supplemental)
  3. Homework (Due Sunday 9/19, late 9/20)
    1. Part 1a: Groups
    2. Part 1b: Pivot Tables
    3. Part 2: Joins
  4. Group Worksheet
  5. Lab 04 (In containers, Due Friday 9/24)

Videos

Part 1a: Groups

  1. One Attribute Group (14:39)
  2. Cross Classification (11:08)
  3. Example 1 (Optional, 5:04)

Part 1b: Pivot Tables

  1. Pivot Tables (13:09)
  2. Example 2 (Optional, 5:35)
  3. Comparing Distributions (12:00)

Part 2: Joins

  1. Joins (10:28)
  2. Bikes (9:41)
  3. Shortest Trips (4:59)
  4. Maps (9:36)

Textbook

Exam 1

This post outlines what Exam 1 will be like. Because the format is likely new to most people, we will have a mock exam during class on Tuesday 9/14 so you can learn what the process and format will be like.

Exam Logistics

  • The exam will cover Modules 1 through 3 inclusive.
  • The exam will be take-home. It is open book, open note, open internet, but closed to people.
    • This means you cannot communicate with a person while taking the exam, including asking someone through the Internet (like Stack Overflow) for help and receiving help.
  • Timeframe: It must be completed on Thursday 9/16 between 10:15 am (start of class) and 11:59 pm.
    • The exam will close at 11:59 pm regardless of when you started.
  • The exam has two parts: Multiple Choice and Jupyter Notebook.
    • You may take a break between each part.
    • Both parts are timed through Sakai.
  • The exam must be done individually. It is a violation of class policy if you collaborate in any way with another person (in or not in the class) on the exam. You can only talk to the teaching staff about the exam.
  • Protect the integrity of the exam and your exam submission.
    • Do not talk to anyone about the exam during the exam period.
    • Take your exam in a secure location where no one can bother you.
    • Take your exam in a place where you will not be distracted or tempted to talk to someone.
  • The exam has randomized elements in it so no one’s exam will be identical to another person’s.
  • If you have a question during the exam, ask it as a private new message on the class forum, or on Zoom if a teaching staff member is on call at that time.
    • We will do our best to always have someone checking the forum. However, we cannot promise that someone will answer your question instantly.
    • Prof. Stephens-Martinez will be in the class Zoom during class time and in her office hours Zoom during her office hours, which are immediately after Thursday’s class.
    • David has office hours Thursday 5-6 pm ET.
    • The exam is tested for readability, so the wording should be straightforward.

Multiple Choice Questions (30 minutes)

  • You will have 30 minutes to complete this part.
  • It will be a Sakai Quiz (like the homework).
  • You can submit only once.
  • You will not see your score until after the testing period is over.

Jupyter Notebook (30 minutes)

  • You will have 30 minutes to complete this part.
  • You will get your Jupyter Notebook zip file inside a Sakai Quiz that is not the multiple-choice part.
  • You will submit it on Gradescope.
  • You can rely on the Sakai Quiz timer to tell you how much time you have left.
  • We will use your logged start time in Sakai to track if you submitted on Gradescope on time.
  • You do not need to do anything with Sakai after you retrieve your zip file from the quiz.
  • During your testing period, you can submit as many times as you want to Gradescope. We will take your last submission.
  • The autograder will tell you if your values are the correct type, but not necessarily if they are the correct value. There are hidden tests. Your score will only be revealed after we have finished all grading, including the manual grading part.

How to Prepare

  • First, do all of your assigned work. The homework and labs are there to help you learn the material. The exam is to check if you actually learned it. Therefore, if you did the homework and labs and ensured you understood it, you will do fine on the exam.
  • We will make copies of all of the homework for studying. You can submit to these an unlimited number of times.
    • The best practice is to do the homework without looking at your notes.
    • Anything you got wrong is information on what you need to work on and focus on studying.
  • The class Box folder has unsolved versions of Lab 2 and 3. You can download and do these again.
    • By redoing them without looking at your prior solution, you can find out what you are struggling with. Any time you can’t easily rewrite the answer, that is telling you what you need to study more on.
  • If you are struggling with something, ask on the class forum or go to office hours. You can also answer questions on the class forum yourself. Articulating an answer is a great way to check your learning!
  • [Optional] Work on Project 1. It spans modules 1-3, which is what the exam will cover. However, it is not due until 9/24, so you should not feel compelled to finish it before the exam.

Mock Exam Logistics

Please come to class. This way, if any hiccups occur, the teaching staff will be available to help you figure them out.

If you have an SDAO accommodation, you should see that accommodation reflected in this mock exam. If you do not, notify Prof. Stephens-Martinez immediately.