## Flipping Coins and Independence

An experimenter has two fair coins and one biased coin. The biased coin lands on heads with probability 3/4.

The experimenter randomly selects one of the three coins and flips it until they get heads.

Let \(A\) be the event that the experimenter flipped the biased coin.

Let \(B\) be the event that it took the experimenter an even number of flips to get heads.

Are events \(A\) and \(B\) independent?

## Working Batteries

A warehouse stores batteries. Most of the batteries work properly, but about 0.1% are faulty.

If a company orders 500 batteries, what is the probability that fewer than 3 will be faulty? Do this problem three ways:

- Find the probability exactly.
- Use a Poisson approximation to estimate.
- Use a normal approximation to estimate.

A company needs 10,000 working batteries. How many batteries should the company order from the warehouse in order to be 99.7% certain that they will receive at least 10,000 working batteries?
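For checking the hand calculations in the first part, all three approaches can be computed numerically. This is a sketch using only the Python standard library, not part of the problem statement:

```python
import math

n, p = 500, 0.001               # order size and fault rate from the problem

# Exact: P(X < 3) for X ~ Binomial(n, p)
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3))

# Poisson approximation with rate lam = n*p
lam = n * p
poisson = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(3))

# Normal approximation (with a continuity correction): P(X <= 2.5)
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
normal = 0.5 * (1 + math.erf((2.5 - mu) / (sigma * math.sqrt(2))))

print(exact, poisson, normal)
```

Note how close the Poisson approximation is here and how far off the normal approximation is; with \(p\) this small, the Poisson regime applies.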

## Biased coin

You have a biased coin, but you don’t know what the bias is. Let \(p\) be the actual probability of getting heads on a single coin flip, \(p=\mathbb{P}(\text{Heads})\).

- Suppose \(p=0.8\). What is the probability of observing between 76 and 84 heads out of 100 flips of the coin?
- Suppose you flip the coin 100 times and observe 80 heads. What is the 95% confidence interval for \(p\)?

## August Birthdays

About 9% of birthdays (in the US) are in August. A researcher samples 10,000 people from the US and asks for their birthdays. Estimate the probability that between 850 and 950 of those people were born in August.
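A sketch of one way to do the normal-approximation arithmetic, for checking the final number (the continuity correction is a choice, not something the problem mandates):

```python
import math

n, p = 10_000, 0.09
mu, sigma = n * p, math.sqrt(n * p * (1 - p))   # mean 900, sd ~28.6

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# P(850 <= X <= 950) with a continuity correction
prob = phi((950.5 - mu) / sigma) - phi((849.5 - mu) / sigma)
print(prob)
```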

## Ball in Boxes

Suppose you have three boxes, \(Box_1,Box_2,Box_3\), such that \(Box_i\) contains \(i\) white balls and one black ball.

You will select one ball from the boxes. Here are two schemes you could use for selection:

- Select one box uniformly at random. Pull one ball from that box. Or,
- Dump all the balls into one box. Mix them up. Pull out one ball.

Are these two schemes probabilistically equivalent?

Suppose instead of selecting a box uniformly at random, you select \(Box_i\) with probability \(p_i\). Find values for \(p_1, p_2,\) and \(p_3\) that would make this new scheme probabilistically equivalent to the second scheme.

## Mean and Mode of Binomial

Consider a binomial\((10,p)\) distribution. If \(p\) is chosen uniformly at random from the interval \((0,1)\), what is the probability that the mode (the most likely value) of the binomial distribution will be less than its mean?

## Shaq Free Throws

Over his career, Shaquille O’Neal made about 53% of his free throws. Assume his probability of making a single free throw is 53%. Suppose Shaq shot a round of 20 free throws and you’re told he made 15 of them.

- What is the likelihood he made the first free throw, given that he made 15?
- What is the likelihood he made at least 1 out of his first 5 free throws, given that he made 15?

## Loaded Dice

You have a pair of fair dice and a pair of loaded dice. But you forgot which pair is which. You do remember that when you bought the loaded dice, the company that makes them claimed the dice would land on a sum of 7 approximately 1/3 of the time.

- You choose one of the pairs at random and roll it once. You get a sum of 7. What is the likelihood that you picked the loaded dice?
- You choose one of the pairs at random and roll the pair three times. You get exactly one sum of 7. What is the likelihood that you picked the loaded dice?
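A Monte Carlo sanity check for the second bullet (a sketch, not the intended pen-and-paper Bayes calculation; it assumes the fair pair sums to 7 with probability 1/6 and the loaded pair with probability 1/3):

```python
import random

random.seed(0)
p_fair, p_loaded = 1 / 6, 1 / 3
trials = 200_000

hits = loaded_hits = 0
for _ in range(trials):
    loaded = random.random() < 0.5                  # pick a pair at random
    p7 = p_loaded if loaded else p_fair
    sevens = sum(random.random() < p7 for _ in range(3))
    if sevens == 1:                                  # condition on exactly one 7
        hits += 1
        loaded_hits += loaded

posterior = loaded_hits / hits
print(posterior)  # compare with Bayes' rule applied by hand
```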

## Outcome space

Let \(\Omega\) be an outcome space with 16 outcomes. \(A\) and \(B\) are events inside of \(\Omega\). Event \(A\) has 10 outcomes and event \(B\) has 10 outcomes.

- Determine all the possible values of \(\# (A\cap B).\)
- Determine all the possible values of \(\# (A\cup B).\)
- Determine all the possible values of \(\#(A^c\cup B^c).\)
- Determine all the possible values of \(\# (A^c\cap B^c).\)

## Dice rolling addition rule

You roll a fair 6-sided die 3 times. What is the likelihood of getting exactly one 4, exactly one 5, or exactly one 6?

## Classroom Surveys

A researcher is collecting data from 10 high school classrooms. Each classroom contains 30 people. The researcher asks each student to fill out a survey. Suppose each student has about a 40% chance of completing the survey (independent of other students). What is the probability that at least 4 classrooms have at least 15 students who complete the survey?

## Coin flipping game

Your friend challenges you to a game in which you flip a fair coin until you get heads. If you flip an even number of times, you win. Let \(A\) be the event that you win. Let \(B\) be the event that you flip the coin 3 or more times. Let \(C\) be the event that you flip the coin 4 or more times.

- Compute \(\mathbb{P}(A)\).
- Are \(A\) and \(B\) independent?
- Are \(A\) and \(C\) independent?

## Repeated Quiz Questions

Each week you get multiple attempts to take a two-question quiz. For each attempt, two questions are pulled at random from a bank of 100 questions. For a single attempt, the two questions are distinct.

- If you attempt the quiz 5 times, what is the probability that within those 5 attempts, you’ve seen at least one question two or more times?
- How many times do you need to attempt the quiz to have a greater than 50% chance of seeing at least one question two or more times?
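A Monte Carlo sketch for the first bullet, assuming each attempt draws its 2 distinct questions uniformly from the bank, independently across attempts:

```python
import random

random.seed(1)

def p_repeat(attempts, trials=100_000, bank=100):
    """Estimate P(some question is seen twice or more within `attempts`)."""
    hits = 0
    for _ in range(trials):
        seen = set()
        repeat = False
        for _ in range(attempts):
            for q in random.sample(range(bank), 2):  # 2 distinct questions
                if q in seen:
                    repeat = True
                seen.add(q)
        hits += repeat
    return hits / trials

est = p_repeat(5)
print(est)
```

Raising `attempts` until the estimate crosses 0.5 gives a numerical check on the second bullet as well.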

## Dice Rolling Events

Consider rolling a fair 6-sided die twice. Let \(A\) be the event that the first roll is less than or equal to 3. Let \(B\) be the event that the second roll is less than or equal to 3. Find an event \(C\) in the same outcome space as \(A\) and \(B\) with \(0<\mathbb{P}(C)<1\) and such that \(A\), \(B\) and \(C\) are mutually independent, or show that no such event exists.

## Cognitive Dissonance Among Monkeys

Assume that each monkey has a strong preference between red, green, and blue M&M’s. Further, assume that the possible orderings of the preferences are equally distributed in the population. That is to say, each of the 6 possible orderings (R>G>B, R>B>G, B>R>G, B>G>R, G>B>R, G>R>B) is found with equal frequency in the population. Lastly, assume that when presented with two M&M’s of different colors, a monkey always eats the M&M with the color it prefers.

In an experiment, a random monkey is chosen from the population and presented with a Red and a Green M&M. In the first round, the monkey eats the one based on their personal preference between the colors. The remaining M&M is left on the table and a Blue M&M is added so that there are again two M&M’s on the table. In the second round, the monkey again chooses to eat one of the M&M’s based on their color preference.

- What is the chance that the red M&M is not eaten in the first round?
- What is the chance that the green M&M is not eaten in the first round?
- What is the chance that the blue M&M is not eaten in the second round?

[Mattingly 2022]

## Which deck is rigged?

Two decks of cards are sitting on a table. One deck is a standard deck of 52 cards. The other deck (called the rigged deck) also has 52 cards, but has had 4 of the 13 Hearts replaced by Diamonds. (Recall that a standard deck has 4 suits: Diamonds, Hearts, Spades, and Clubs. Normally there are 13 of each suit.)

- What is the probability one chooses 4 cards from the rigged deck and gets exactly 2 diamonds and no hearts?
- What is the probability one chooses 4 cards from the standard deck and gets exactly 2 diamonds and no hearts?
- You randomly choose one of the decks and draw 4 cards, obtaining exactly 2 diamonds and no hearts.
  - What is the probability you chose the cards from the rigged deck?
  - What is the probability you chose the cards from the standard deck?
- If you had to guess which deck was used, which would you guess: the standard or the rigged?

## Getting your feet wet numerically

Simulate the following stochastic differential equations:

- \[ dX(t) = - \lambda X(t) dt + dW(t) \]
- \[ dY(t) = - \lambda Y(t) dt + Y(t) dW(t) \]

by using the following Euler type numerical approximation

- \[X_{n+1} = X_n - \lambda X_n h + \sqrt{h} \eta_n\]
- \[Y_{n+1} = Y_n - \lambda Y_n h + \sqrt{h} Y_n\eta_n\]

where \(n=0,1,2,\dots\) and \(h >0\) is a small number that gives the numerical step size. That is to say, we consider \( X_n \) as an approximation of \(X(t)\) and \( Y_n \) as an approximation of \(Y(t)\), each with \(t=h n\). Here the \(\eta_n\) are a collection of mutually independent random variables, each with a Gaussian distribution with mean zero and variance one. (That is, \( N(0,1) \).)

Write code to simulate the two equations using the numerical method suggested. Plot some trajectories. Describe how the behavior changes for different choices of \(\lambda\). Can you conjecture where it changes? Compare and contrast the behavior of the two equations.

Tell your story with pictures.
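A minimal standard-library starter for the scheme above (a sketch to build on; numpy and matplotlib would be the natural tools for the actual plotting, which is left out here):

```python
import math
import random

def simulate(lam, h=0.01, steps=1000, x0=1.0, multiplicative=False, seed=0):
    """Euler approximation: X_{n+1} = X_n - lam*X_n*h + sqrt(h)*eta_n,
    or, with multiplicative=True, Y_{n+1} = Y_n - lam*Y_n*h + sqrt(h)*Y_n*eta_n."""
    rng = random.Random(seed)
    x, path = x0, [x0]
    for _ in range(steps):
        eta = rng.gauss(0.0, 1.0)  # eta_n ~ N(0, 1), independent across steps
        noise = math.sqrt(h) * (x * eta if multiplicative else eta)
        x = x - lam * x * h + noise
        path.append(x)
    return path

X = simulate(lam=1.0)                       # approximates dX = -lam X dt + dW
Y = simulate(lam=1.0, multiplicative=True)  # approximates dY = -lam Y dt + Y dW
print(X[-1], Y[-1])
```

Try several values of \(\lambda\), vary the seed to see different trajectories, and overlay a few paths of each equation on the same axes.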

## Match play golf problem

This problem is motivated by the new format for the PGA match play tournament. The 64 golfers are divided into 16 pools of four players. On the first three days, each golfer plays one 18-hole match against each of the other three in his pool. If a match is tied after 18 holes, play continues until there is a winner, so every match produces a winner. Assume each match is equally likely to be won by either player, independently of the other matches. At the end of these three days the possible records of the golfers in a pool could be: (a) 3-0, 2-1, 1-2, 0-3; (b) 3-0, 1-2, 1-2, 1-2; (c) 2-1, 2-1, 1-2, 1-2; (d) 2-1, 2-1, 2-1, 0-3. What are the probabilities for each of these four possibilities?

## Handing back tests

A professor randomly hands back tests in a class of \(n\) people, paying no attention to the names on the papers. Let \(N\) denote the number of people who got their own test back. Let \(D\) denote the number of pairs of people who got each other’s tests. Let \(T\) denote the number of groups of three people, none of whom got their own test back, but who among the three of them have each other’s tests. Find:

- \(\mathbf{E} (N)\)
- \(\mathbf{E} (D)\)
- \(\mathbf{E} (T)\)

## Up by two

Suppose two teams play a series of games, each producing a winner and a loser, until one team has won two more games than the other. Let \(G\) be the number of games played until this happens. Assuming your favorite team wins each game with probability \(p\), independently of the results of all previous games, find:

- \(P(G=n) \) for \(n=2,3,\dots\)
- \(\mathbf{E}(G)\)
- \(\mathrm{Var}(G)\)

[Pitman, p. 220, #18]

## Population

A population contains \(X_n\) individuals at time \(n=0,1,2,\dots\) . Suppose that \(X_0\) is distributed as \(\mathrm{Poisson}(\mu)\). Between time \(n\) and \(n+1\) each of the \(X_n\) individuals dies with probability \(p\) independent of the others. The population at time \(n+1\) is comprised of the survivors together with a random number of new immigrants who arrive independently in numbers distributed according to \(\mathrm{Poisson}(\mu)\).

- What is the distribution of \(X_n\) ?
- What happens to this distribution as \(n \rightarrow \infty\)? Your answer should depend on \(p\) and \(\mu\). In particular, what is \( \mathbf{E} X_n\) as \(n \rightarrow \infty\)?

[Pitman, p. 236, #18]

## Weights of Pennies

The distribution of weights of US pennies is approximately normal with a mean of 2.5 grams and a standard deviation of 0.03 grams.

(a) What is the probability that a randomly chosen penny weighs less than 2.4 grams?

(b) Describe the sampling distribution of the mean weight of 10 randomly chosen pennies.

(c) What is the probability that the mean weight of 10 pennies is less than 2.4 grams?

(d) Sketch the two distributions (population and sampling) on the same scale.

[From OpenIntro Statistics, Second Edition, Problem 4.39]
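Parts (a)–(c) can be checked numerically with the standard normal CDF; a sketch using only the standard library (`phi` is a helper defined here, not part of the problem):

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sd = 2.5, 0.03
p_single = phi((2.4 - mu) / sd)          # (a): one penny under 2.4 g
se = sd / math.sqrt(10)                  # (b): sampling distribution is N(2.5, se^2)
p_mean = phi((2.4 - mu) / se)            # (c): mean of 10 pennies under 2.4 g
print(p_single, se, p_mean)
```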

## Benford’s Law

Assume that the population in a city grows exponentially at rate \(r\). In other words, the number of people in the city, \(N(t)\), grows as \(N(t)=C e^{rt}\), where \(C<10^6\) is a constant.

1. Determine the time interval \(\Delta t_1\) during which \(N(t)\) will be between 1 and 2 million people.

2. For \(k=1,\dots,9\), determine the time interval \(\Delta t_k\) during which \(N(t)\) will be between \(k\) and \(k+1\) million people.

3. Calculate the total time \(T\) it takes for \(N(t)\) to grow from 1 to 10 million people.

4. Now pick a time \(\hat t \in [0,T]\) uniformly at random, and use the above results to derive the following formula (also known as Benford’s law): $$p_k=\mathbb P\big(N(\hat t) \in [k, k+1] \text{ million}\big)=\log_{10}(k+1)-\log_{10}(k).$$
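As a quick numerical sanity check on the formula (not part of the derivation the problem asks for), the nine probabilities \(p_k\) sum to 1:

```python
import math

# Benford probabilities p_k = log10(k+1) - log10(k), for k = 1..9
p = [math.log10(k + 1) - math.log10(k) for k in range(1, 10)]
print(p[0], sum(p))  # the sum telescopes to log10(10) - log10(1) = 1
```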

## A modified Wright-Fisher Model

Consider the ODE

\[ \dot x_t = x_t(1-x_t)\]

and the SDE

\[dX_t = X_t(1-X_t) dt + \sqrt{X_t(1-X_t)} dW_t\]

- Argue that \(x_t\) cannot leave the interval \([0,1]\) if \( x_0 \in (0,1)\).
- What is the behavior of \(x_t\) as \(t \rightarrow\infty\) if \( x_0\in (0,1)\)?
- Can the diffusion \(X_t\) exit the interval \((0,1)\)? Prove your claims.
- What do you think happens to \(X_t\) as \(t \rightarrow \infty\)? Argue as best you can to support your claim.

## No Explosions from Diffusion

Consider the following ODE and SDE:

\[\dot x_t = x^2_t \qquad x_0 >0\]

\[d X_t = X^2_t dt + \sigma |X_t|^\alpha dW_t\qquad X_0 >0\]

where \(\alpha >0\) and \(\sigma >0\).

- Show that \(x_t\) blows up in finite time.
- Find the values of \(\sigma\) and \(\alpha\) so that \(X_t\) does not explode (off to infinity).

[From Klebaner, ex. 6.12]

## Cox–Ingersoll–Ross model

The following SDE has been suggested as a model for interest rates:

\[ dr_t = a(b-r_t)dt + \sigma \sqrt{r_t} dW_t\]

for \(r_t \in \mathbf R\), \(r_0 > 0\), and constants \(a\), \(b\), and \(\sigma\).

- Find a closed form expression for \(\mathbf E( r_t)\).
- Find a closed form expression for \(\mathrm{Var}(r_t)\).
- Characterize the values of the parameters \(a\), \(b\), and \(\sigma\) such that \(r=0\) is an absorbing point.
- What is the nature of the boundary at \(0\) for other values of the parameters?

## SDE Example: quadratic geometric BM

Show that the solution \(X_t\) of

\[ dX_t=X_t^2 dt + X_t dB_t\]

where \(X_0=1\) and \(B_t\) is a standard Brownian motion has the representation

\[ X_t = \exp\Big( \int_0^t X_s ds -\frac12 t + B_t\Big)\]

## Practice with Ito and Integration by parts

Define

\[ X_t =X_0 + \int_0^t B_s dB_s\]

where \(B_t\) is a standard Brownian Motion. Show that \(X_t\) can also be written

\[ X_t=X_0 + \frac12 (B^2_t -t)\]

## Discovering the Bessel Process

Let \(W_t=(W^{(1)}_t,\dots,W^{(n)}_t) \) be an \(n\)-dimensional Brownian motion, with the \( W^{(i)}_t\) independent standard one-dimensional Brownian motions and \(n \geq 2\).

Let

\[X_t = \|W_t\| = \Big(\sum_{i=1}^n (W^{(i)}_t)^2\Big)^{\frac12}\]

be the norm of the Brownian motion. Even though the norm is not differentiable at zero, we can still apply Ito’s formula, since Brownian motion in dimension two or more never visits the origin at positive times.

- Use Ito’s formula to show that \(X_t\) satisfies the Ito process

\[ dX_t = \frac{n-1}{2 X_t} dt + \sum_{i=1}^n \frac{W^{(i)}_t }{X_t} dW^{(i)}_t \]

- Using the Levy-Doob Theorem, show that

\[Z_t =\sum_{i=1}^n \int_0^t \frac{W^{(i)}_s }{X_s} dW^{(i)}_s \]

is a standard Brownian Motion.

- In light of the above discussion, argue that \(X_t\) and \(Y_t\) have the same distribution if \(Y_t\) is defined by

\[ dY_t = \frac{n-1}{2 Y_t} dt + dB_t\]

where \(B_t\) is a standard Brownian Motion.

Take a moment to reflect on what has been shown. \(W_t\) is an \(\mathbf R^n\)-valued Markov process. However, there is no guarantee that the one-dimensional process \(X_t\) will again be a Markov process, much less a diffusion. The above calculation shows that the distribution of \(X_{t+h}\) is determined completely by \(X_t\). In particular, \(X_t\) solves a one-dimensional SDE. We were sure that \(X_t\) would be an Ito process, but we had no guarantee that it could be written as a single closed SDE (namely, that the coefficients would be functions of \(X_t\) only and not of the details of the \(W^{(i)}_t\)’s).

## One dimensional stationary measure

Consider the one dimensional SDE

\[dX_t = f(X_t) dt + g(X_t) dW_t\]

which we assume has a unique global in time solution. For simplicity let us assume that there is a positive constant \(c\) so that \( 1/c < g(x)<c\) for all \(x\) and that \(f\) and \(g\) are smooth.

A stationary measure for the problem is a probability measure \(\mu\) so that if the initial distribution \(X_0\) is distributed according to \(\mu\) and independent of the Brownian Motion \(W\) then \(X_t\) will be distributed as \(\mu\) for any \(t \geq 0\).

If the functions \(f\) and \(g\) are “nice”, then the distribution at time \(t\) has a density with respect to Lebesgue measure (“\(dx\)”). That is to say, there is a function \(p_x(t,y)\) so that for any \(\phi\)

\[\mathbf E_x \phi(X_t) = \int_{-\infty}^\infty p_x(t,y)\phi(y) dy\]

More generally, if the initial condition \(X_0\) has density \(\phi\), then the density \(p_\phi(t,y)\) of \(X_t\) solves the following equation

\[\frac{\partial p_\phi}{\partial t}(t,y) = (L^* p_\phi)(t,y)\]

with \( p_\phi(0,y) = \phi(y)\), where \(\phi\) is the density with respect to Lebesgue measure of the initial distribution (the pdf of \(X_0\)).

\(L^*\) is the formal adjoint of the generator \(L\) of \(X_t\) and is defined by

\[(L^*\phi)(y) = - \frac{\partial\ }{\partial y}( f \phi)(y) + \frac12 \frac{\partial^2\ }{\partial y^2}( g^2 \phi)(y) \]

Since we want \(p_\phi(t,y)\) not to change when it is evolved forward with the above equation, we want \( \frac{\partial p_\phi}{\partial t}=0\), or in other words

\[(L^* p_\phi)(t,y) =0\]

- Let \(F\) be such that \(-F' = f/g^2\). Show that \[ \rho(y)=\frac{K}{g^2(y)}\exp\Big( - 2F(y) \Big)\] is an invariant density, where \(K\) is a normalization constant which ensures that \[\int \rho(y) dy =1.\]

- Find the stationary measure for each of the following SDEs:

\[dX_t = (X_t - X^3_t) dt + \sqrt{2} dW_t\]

\[dX_t = - F'(X_t) dt + \sqrt{2} dW_t\]

- Assuming that the formula derived above makes sense more generally, compare the invariant measures of

\[ dX_t = -X_t\, dt + dW_t\]

and

\[ dX_t = -\mathrm{sign}(X_t)\, dt + \frac{1}{\sqrt{|X_t|}} dW_t\]

- Again proceeding formally, assuming everything is well defined and makes sense, find the stationary density of

\[dX_t = - 2\,\frac{\mathrm{sign}(X_t)}{|X_t|} dt + \sqrt{2}\, dW_t\]