# Module 2B: Probability

1. Prepare (soft due Th 9/2, hard due M 9/13)
    1. Content below
    2. Sakai quizzes
2. Group Worksheet (soft due F 9/3, hard due M 9/13)
3. Practice (due M 9/13)
4. Perform (due M 9/27)

## Content

2B.A – Foundations of Probability (52 min.)

1. Outcomes, Events, Probabilities (15 min.)
2. Joint and Conditional Probability (11 min.)
3. Marginalization and Bayes’ Theorem (15 min.)
4. Random Variables and Expectations (11 min.)

2B.B – Distributions of Random Variables (46 min.)

1. Distributions, Means, Variance (19 min.)
2. Monte Carlo Simulation (15 min.)
3. Central Limit Theorem (12 min.)

## Optional Supplements

OpenIntro Statistics, an excellent free textbook co-authored by Duke faculty, is available online. You can pay a suggested but adjustable price for a tablet-friendly PDF, or get the regular PDF for free. For this module, the following optional readings may be particularly helpful supplements:

• Chapter 3: Probability. This provides more information on many of the topics from the above videos in Foundations of Probability.
• Chapter 4: Distributions of random variables. This provides much more information about particular classic distributions than is provided in 2B.B.1.
• Chapter 5.1: Point estimates and sampling variability. This provides more information on some of the topics from 2B.B.2-3.

In addition, documentation is available online for the two pseudorandom number generation and sampling libraries in Python that we mentioned.
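
The listing above does not name the two libraries. Assuming they are Python's built-in `random` module and NumPy's `numpy.random` (an assumption, not something stated here), a minimal sketch of drawing samples with each might look like this:

```python
import random

import numpy as np

# Assumption: the two libraries are the standard-library `random` module
# and NumPy's `numpy.random`; this page does not name them.

# Standard library: a seeded, private generator keeps results reproducible.
py_rng = random.Random(0)
# randint includes both endpoints, so this simulates a six-sided die.
py_rolls = [py_rng.randint(1, 6) for _ in range(10_000)]

# NumPy: the Generator API; `integers` excludes the upper bound by default.
np_rng = np.random.default_rng(0)
np_rolls = np_rng.integers(1, 7, size=10_000)

# Monte Carlo estimate of the expected value of a fair die (exact value: 3.5).
py_mean = sum(py_rolls) / len(py_rolls)
np_mean = float(np_rolls.mean())
print(py_mean, np_mean)
```

Both estimates should land close to 3.5, which ties the sampling tools back to the expectation and Monte Carlo topics in 2B.A.4 and 2B.B.2.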

# Module 2A: Numpy & Pandas

1. Prepare (soft due Tu 8/31, hard due M 9/13)
    1. Content below
    2. Sakai quizzes
2. Group Worksheet (soft due W 9/1, hard due M 9/13)
3. Practice (due M 9/13)
4. Perform (due M 9/27)

## Content

2A.A – Numpy (1 hour)

1. Why Numpy (8 min.)
2. Numpy Array Basics (15 min.)
3. Numpy Universal Functions (20 min.)
4. Numpy Axis (14 min.)

2A.B – Pandas (45 min.)

1. Why Pandas (7 min.)
2. Pandas Series (19 min.)
3. Pandas Dataframe (21 min.)

# Module 1: What is Data Science, Anaconda, Python, & Jupyter

1. Prepare (soft due Th 8/26, hard due M 8/30)
    1. Content below
    2. See Sakai for quiz
    3. Install Anaconda
2. Group Worksheet (soft due F 8/27, hard due M 8/30)
3. Practice (due M 8/30) (Solution)
4. No Perform

## Content

1.A – What is Data Science? (in class or see recording)

1.B – Python3 (12 min.)

1. Python vs. Java (3 min.)
2. Data Types (2 min.)
3. Iteration, Functions, Classes (7 min.)

1.C – Python for Data Science

1. Anaconda and Jupyter (10 min.)
2. Jupyter Notebook Demo (11 min.)