Author Archives: Jonathan Mattingly
Random Digit
Let \(D_i\) be a random digit chosen uniformly from \(\{0,1,2,3,4,5,6,7,8,9\}\). Assume that the \(D_i\) are independent.
Let \(X_i\) be the last digit of \(D_i^2\). So if \(D_i=9\) then \(D_i^2=81\) and \(X_i=1\). Define \(\bar X_n\) by
\[\bar X_n = \frac{X_1 + \cdots+X_n}{n}\]
- Predict the value of \(\bar X_n \) when \(n\) is large.
- Find the number \(\epsilon\) such that for \(n=10,000\) the chance that your prediction is off by more than \(\epsilon\) is about 1/200.
- Find approximately the least value of \(n\) such that your prediction of \(\bar X_n\) is correct to within 0.01 with probability at least 0.99.
- If you just had to predict the first digit of \(\bar X_{100}\), what digit should you choose to maximize your chance of being correct, and what is that chance ?
[Pitman p206, #30]
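A quick Monte Carlo sketch (Python, purely illustrative and not part of the original problem) that can be used to check a prediction for \(\bar X_n\):

```python
import random

# Last digits of the squares 0^2, ..., 9^2 are 0,1,4,9,6,5,6,9,4,1.
last_digits = [(d * d) % 10 for d in range(10)]
exact_mean = sum(last_digits) / 10  # long-run prediction for X-bar

def sample_mean(n, rng):
    """Average of n independent copies of X_i = last digit of D_i^2."""
    return sum((rng.randrange(10) ** 2) % 10 for _ in range(n)) / n

rng = random.Random(0)  # seeded for reproducibility
estimate = sample_mean(10_000, rng)
```

For \(n=10{,}000\) the sample mean lands close to the exact mean of the ten possible last digits.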
Games with Black and White Balls
Consider the following gambling game for two players, Black and White. Black puts \(b\) black balls and White puts \(w\) white balls in a box. Black and White take turns at drawing randomly from the box, with replacement between draws until either Black wins by drawing a black ball or White wins by drawing a white ball. Suppose Black gets to draw first.
- Calculate \(\mathbf{P}(\text{Black wins})\) and \(\mathbf{P}(\text{White wins})\) in terms of \(p=b/(b+w)\).
- What value of \(p\) makes the game fair (equal chances of winning) ?
- Is the game ever fair ?
- What is the least total number of balls in the game, \(b+w\), such that neither player has more than a \(51\%\) chance of winning ?
[Pitman P219, #13]
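A simulation sketch (Python, illustrative) of one game; the value \(2/3\) checked below for \(p=\tfrac12\) comes from a first-step analysis and can be used to sanity-check a derived formula for \(\mathbf{P}(\text{Black wins})\):

```python
import random

def play(p, rng):
    """Return True if Black wins one game; Black draws first, draws are
    with replacement, so each draw is black with probability p."""
    while True:
        if rng.random() < p:        # Black draws a black ball: Black wins
            return True
        if rng.random() < 1 - p:    # White draws a white ball: White wins
            return False            # otherwise the round repeats

def black_win_freq(p, trials, rng):
    return sum(play(p, rng) for _ in range(trials)) / trials

rng = random.Random(1)
freq = black_win_freq(0.5, 100_000, rng)  # first-step analysis gives 2/3 here
```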
The Coupon-Collector
Suppose that there are \(N\) different types of coupons. Each box contains one coupon. The type of the coupon is chosen uniformly among \(\{1,2,\cdots,N\}\).
- If we open \(k\) boxes, what is the expected number of different coupons that we will find ? What is the limit of this quantity as \(k \rightarrow \infty\) ?
- Let \(T_k\), for \(k=1,\cdots,N\), be the number of boxes needed to obtain the \(k\)th unique type of coupon. Clearly \(T_1=1\) . For future reference define \(\tau_1=1\) and \(\tau_k=T_k- T_{k-1}\) for \(k=2,3,\cdots\).
- What is the distribution of \(\tau_k\) ?
- What is the expected value of \(T_N\) ? What is it approximately for \(N\) large ?
- What is \(\mathbf{Var}(T_N)\) ?
- Show that \(\mathbf{SD}(T_N) < cN \) for some constant \(c>0\).
- Use Chebyshev’s inequality to show that for large \(N\), \(T_N\) differs from \(N\log(N)\) by at most a small multiple of \(N\) with high probability.
- (***) What is the distribution of \( (T_N- N\log(N) \,)/N\) as \(N\rightarrow \infty\) ? Hint: It is not normal !
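A simulation sketch (Python, illustrative) comparing the empirical mean of \(T_N\) with \(N H_N = N(1+\tfrac12+\cdots+\tfrac1N)\), the standard coupon-collector value, which behaves like \(N\log N\) for large \(N\):

```python
import random

def collect_time(N, rng):
    """Number of boxes opened until all N coupon types have been seen."""
    seen, boxes = set(), 0
    while len(seen) < N:
        seen.add(rng.randrange(N))  # each box holds a uniform coupon type
        boxes += 1
    return boxes

N = 20
harmonic = sum(1 / k for k in range(1, N + 1))
exact_mean = N * harmonic  # E[T_N] = N * H_N
rng = random.Random(2)
est = sum(collect_time(N, rng) for _ in range(5000)) / 5000
```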
Memorylessness and the Geometric Distribution
Let \(X\) be a random variable with range \(\{0,1,2,3,\dots\}\) which is geometrically distributed with success probability \(p\).
- Show that for every \(n \geq 0\) and \(j\geq 0\),
\[ \mathbf{P}(X-n=j \mid X\geq n) = \mathbf{P}(X=j)\]
- If \(X\) is the time to failure of a machine, then \(\{X\geq n\}\) is the event that the machine has not failed by time \(n\). Why is the above property called memorylessness ?
- Show that the geometric distribution is the only distribution with range equal to \(\{0,1,2,3,\dots\}\) with this property.
Algebras and Conditioning
Consider two draws with replacement from a box containing 1 red ball and 3 blue balls. Let \(X\) be the number of red balls drawn. Let \(Y\) be 1 if the two balls are the same color and 0 otherwise. Let \(Z_i\) be the random variable which returns 1 if the \(i\)-th ball is red.
- What is the sample space ?
- Write down the algebra of all events on this sample space.
- What is the algebra of events generated by \(X\) ?
- What is the algebra of events generated by \(Y\) ?
- What is the algebra of events generated by \(Z_1\) ?
- What is the algebra of events generated by \(Z_2\) ?
- Which random variables are determined by another of the random variables ? Why ? How is this reflected in the algebras ?
- (*) Which pairs of random variables are independent ? How is this reflected in the algebras ?
Only Pairwise Independence
Let \(X_1\) and \(X_2\) be two independent tosses of a fair coin. Let \(Y\) be the random variable equal to 1 if exactly one of those coin tosses resulted in heads, and 0 otherwise. For simplicity, let 1 denote heads and 0 tails.
- Write down the joint probability mass function for \((X_1,X_2,Y)\).
- Show that \(X_1\) and \(X_2\) are independent.
- Show that \(X_1\) and \(Y\) are independent.
- Show that \(X_2\) and \(Y\) are independent.
- The three variables are mutually independent if for any \(x_1 \in Range(X_1), x_2\in Range(X_2), y \in Range(Y)\) one has
\[ \mathbf{P}(X_1=x_1,X_2=x_2,Y=y) = \mathbf{P}(X_1=x_1) \mathbf{P}(X_2=x_2) \mathbf{P}(Y=y) \]
Show that \(X_1,X_2,Y\) are not mutually independent.
Expected Value and Mean Error
Let \(X\) be a random variable with \(\mu_1=\mathbf{E}(X)\) and \(\mu_2=\mathbf{E}(X^2)\). For any number \(a\) define the mean squared error
\[J(a)=\mathbf{E}\big[(X-a)^2\big] \]
and the absolute error
\[K(a)=\mathbf{E}\big[|X-a|\big] \]
- Write \(J(a)\) in terms of \(a\), \(\mu_1\), and \(\mu_2\).
- Use the above answer to calculate \(\frac{d J(a)}{d\, a}\) .
- Find the \(a\) which solves \(\frac{d J(a)}{d\, a}=0\). Comment on this answer in light of the name “Expected Value” and argue that it is actually a minimum.
- Assume that \(X\) only takes values \(\{x_1,x_2,\dots,x_n\}\). Use the fact that
\[ \frac{d\ }{d a} |x-a| = \begin{cases} -1 & \text{if \(a < x\)}\\
1 & \text{if \(a > x\)}\end{cases}
\]
to show that as long as \(a \not\in \{x_1,x_2,\dots,x_n\}\) one has
\[ \frac{d K(a)}{d\, a} =\mathbf{P}(X<a) - \mathbf{P}(X>a)\]
- Now show that if \( a \in (x_k,x_{k+1})\) then \(\mathbf{P}(X<a) - \mathbf{P}(X>a) = 2\mathbf{P}(X \leq x_k) - 1\).
- The median is any point \(a\) so that both \(\mathbf{P}(X\leq a) \geq \frac12 \) and \(\mathbf{P}(X\geq a) \geq\frac12\). Give an example where the median is not unique. (That is to say, there is more than one such \(a\).)
- Use the above calculations to show that if \(a\) is any median (not equal to one of the \(x_k\)), then it solves \(\frac{d K(a)}{d\, a} =0\) and that it is a minimizer.
Three Valued Random Variable
Show that the distribution of a random variable \(X\) with possible values of 0,1 and 2 is determined by \(\mu_1=\mathbf{E}(X)\) and \(\mu_2=\mathbf{E}(X^2)\), by finding a formula for \(\mathbf{P}(X=x)\) in terms of \(\mu_1\) and \(\mu_2\).
[Pitman p. 184. #20]
Indicator Functions
Let \(A\) and \(B\) be independent events. Let \(\mathbf{1}_A\) and \(\mathbf{1}_B\) be the associated indicator random variables.
- Describe the distribution of \(\mathbf{1}_A+ \mathbf{1}_B \) in terms of \(\mathbf{P}(A)\) and \(\mathbf{P}(B)\).
- Calculate \(\mathbf{E}(\mathbf{1}_A+ \mathbf{1}_B )\).
- Describe the distribution of \((\mathbf{1}_A+ \mathbf{1}_B )^2\) in terms of \(\mathbf{P}(A)\) and \(\mathbf{P}(B)\).
- Calculate \(\mathbf{E}\big( (\mathbf{1}_A+ \mathbf{1}_B )^2 \big)\).
[Partially inspired by Pitman p182, #10]
Joint Distributions of Uniforms
Let \(X\) and \(Y\) be independent, each uniformly distributed on \(\{1,2,\dots,n\}\). Find:
- \(\mathbf{P}( X=Y)\)
- \(\mathbf{P}( X < Y)\)
- \(\mathbf{P}( X>Y)\)
- \(\mathbf{P}( \max(X,Y)=k )\) for \(k=1,\dots,n\)
- \(\mathbf{P}( \min(X,Y)=k )\) for \(k=1,\dots,n\)
- \(\mathbf{P}( X+Y=k )\) for \(k=2,\dots,2n\)
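Since the sample space is finite, all six answers can be checked by brute-force enumeration over the \(n^2\) equally likely pairs. A sketch for \(n=6\) (Python, exact arithmetic via `fractions`):

```python
from fractions import Fraction
from itertools import product

n = 6
pairs = list(product(range(1, n + 1), repeat=2))  # all n^2 equally likely outcomes
total = Fraction(len(pairs))

p_eq   = Fraction(sum(x == y for x, y in pairs)) / total          # P(X = Y)
p_lt   = Fraction(sum(x < y for x, y in pairs)) / total           # P(X < Y)
p_max3 = Fraction(sum(max(x, y) == 3 for x, y in pairs)) / total  # P(max = 3)
p_sum7 = Fraction(sum(x + y == 7 for x, y in pairs)) / total      # P(X + Y = 7)
```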
Blocks of Bernoulli Trials
In \(n+m\) independent \(\text{Bernoulli}(p)\) trials, let \(S_n\) be the number of successes in the first \(n\) trials and \(T_m\) the number of successes in the last \(m\) trials.
- What is the distribution of \(S_n\) ? Why ?
- What is the distribution of \(T_m\) ? Why ?
- What is the distribution of \(S_n+T_m\) ? Why ?
- Are \(S_n\) and \(T_m\) independent ? Why ?
[Pitman p. 159, # 10]
Approximation: Rare vs Typical
Let \(S\) be the number of successes in 25 independent trials with probability \(\frac1{10}\) of success on each trial. Let \(m\) be the most likely value of \(S\).
- Find \(m\).
- Find \(\mathbf{P}(S=m)\) correct to 3 decimal places.
- What is the normal approximation to \(\mathbf{P}(S=m)\) ?
- What is the Poisson approximation to \(\mathbf{P}(S=m)\) ?
- Repeat the first part of the question with the number of trials equal to 2500 rather than 25. Would the normal or Poisson approximation give a better approximation in this case ?
- Repeat the first part of the question with the number of trials equal to 2500 rather than 25 and the probability of success \(\frac1{1000}\) rather than \(\frac1{10}\). Would the normal or Poisson approximation give a better approximation in this case ?
[Pitman p122 # 7]
Lottery
Suppose that each week you buy a ticket in a lottery which has a chance \(\frac1{100}\) of winning. If you do this every week for a year (52 weeks), approximately what is the chance of getting exactly \(k\) wins, for \(k=0,1,2,3\) ?
[Pitman p122, # 5]
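A numerical sketch (Python, illustrative) comparing the exact \(\text{Binomial}(52, \tfrac1{100})\) probabilities with the Poisson approximation of mean \(52/100\) for \(k=0,1,2,3\):

```python
from math import comb, exp, factorial

n, p = 52, 1 / 100
lam = n * p  # mean number of wins in a year

def binom_pmf(k):
    """Exact probability of exactly k wins in n weeks."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k):
    """Poisson approximation with mean lam = n * p."""
    return exp(-lam) * lam**k / factorial(k)

exact  = [binom_pmf(k) for k in range(4)]
approx = [poisson_pmf(k) for k in range(4)]
```

The two lists agree to within about a quarter of a percentage point, which is why the Poisson approximation is the standard answer here.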
Balls in a Box: Counting
A box contains 20 red balls and 30 black balls. Four balls are chosen without replacement. What is the chance that:
- all balls are red
- exactly three balls are red
- the first red ball appears on the last draw.
- the first two balls are the same color
Poker Hands: counting
Assume that all poker hands are equally likely. The total number of hands is
\[\binom{52}{5}\]
Find the probability of being dealt each of the following:
- a straight flush (all cards of the same suit and in order)
- a regular straight (but not a flush)
- two of a kind
- four of a kind
- two pairs (but not four of a kind)
- a full house (a pair and three of a kind)
In all cases, we mean exactly the hand stated. For example, four of a kind does not count as 2 pairs and a full house does not count as a pair or three of a kind.
Airline Overbooking
An airline knows that over the long run, 90% of passengers who reserve seats for a flight show up. On a particular flight with 300 seats, the airline sold 324 reservations.
- Assuming that passengers show up independently of each other, what is the chance that the flight will be overbooked ?
- Suppose that people tend to travel in groups. Would that increase or decrease the probability of overbooking ? Explain your answer.
- Redo the calculation in the first question assuming that passengers always travel in pairs. Are your answers to all three questions consistent ?
[Pitman p. 109, #9]
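A numerical sketch for the first part (Python, illustrative): with 324 independent \(\text{Bernoulli}(0.9)\) passengers, the flight is overbooked when more than 300 show up.

```python
from math import comb

n, p, seats = 324, 0.9, 300

def binom_pmf(k):
    """Probability that exactly k of the n reserved passengers show up."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Overbooked means strictly more passengers than seats show up.
p_overbooked = sum(binom_pmf(k) for k in range(seats + 1, n + 1))
```

The mean number of arrivals is \(291.6\) with standard deviation \(5.4\), so the exact tail probability is close to the normal approximation of about 5%.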
Standard Normal Tail Bound
As usual define
\[\Phi(z) = \int_{-\infty}^z \phi(x)\, dx \quad\text{where} \quad \phi(x)=\frac{1}{\sqrt{2\pi}} e^{-\frac12 x^2}\]
Sometimes it is useful to have an estimate of \(1-\Phi(z)\) which rigorously bounds it from above (since we cannot write \(\Phi(z)\) in closed form).
Follow these steps to prove that
\[ 1- \Phi(z) < \frac{\phi(z)}{z}\,.\]
First argue that
\[ 1- \Phi(z) < \int^{\infty}_z \frac{x}{z}\phi(x) dx\,.\]
Then evaluate the integral on the right hand side to obtain the bound.
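A numerical check of the bound (Python, illustrative), approximating the upper tail \(1-\Phi(z)\) by numerically integrating \(\phi\) over \([z, z+20]\):

```python
from math import exp, pi, sqrt

def phi(x):
    """Standard normal density."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def upper_tail(z, steps=200_000, width=20.0):
    """Midpoint Riemann-sum approximation of the integral of phi from z to
    infinity (the tail beyond z + width is negligible)."""
    h = width / steps
    return sum(phi(z + (i + 0.5) * h) for i in range(steps)) * h

tails = {z: upper_tail(z) for z in (0.5, 1.0, 2.0, 3.0)}
```

At every tested \(z\) the tail sits strictly below \(\phi(z)/z\), as the bound asserts.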
Basic Random Walk
Consider the following “game”: A marker is placed on the real line at the point zero. On each turn a coin is flipped which has a 1 printed on one side and a -1 printed on the other. If the 1 side lands face up, the marker is moved one unit in the positive direction, while if the -1 side lands face up, the marker is moved one unit in the negative direction. If the coin has probability \(p\) of landing with the 1 side face up, answer the following questions:
- Let \(p=\frac12\). After 10000 turns if you had to pick one site to find the marker which would you choose ?
- Again let \(p=\frac12\). What is the approximate chance that the marker is farther than 100 units from this most likely point after 10000 turns ? What is the approximate chance that the marker is farther than 300 units from this most likely point after 10000 turns ?
- Repeat the above questions with \(p=\frac{9}{10}\).
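A simulation sketch (Python, illustrative): after \(n\) turns the position is \(S_n = 2\cdot(\text{number of +1 steps}) - n\), so its statistics can be checked empirically.

```python
import random

def walk_position(n, p, rng):
    """Position after n steps of the +1/-1 walk with P(+1) = p."""
    plus = sum(rng.random() < p for _ in range(n))
    return 2 * plus - n

rng = random.Random(3)
n, trials = 10_000, 500
positions = [walk_position(n, 0.5, rng) for _ in range(trials)]
mean_pos = sum(positions) / trials
# SD of S_n is sqrt(n) = 100 here, so |S_n| > 100 is roughly P(|Z| > 1).
far_100 = sum(abs(x) > 100 for x in positions) / trials
```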
The chance of being English
English and American spellings are rigour and rigor, respectively. An English-speaking guest staying at a Paris hotel writes the word and chooses a letter at random from his spelling. The letter turns out to be a vowel (that is, any of e, a, i, o, u). If 40% of the English-speaking guests are American and 60% are English, what is the probability that the writer is American ?
[Ross, p. 107 #29]
Chance of an Accident.
An insurance company has 50% urban and 50% rural customers. Every year each urban customer has an accident with probability \(\mu\) and each rural customer has an accident with probability \(\lambda\). Assume that the chance of an accident is independent from year to year and from customer to customer. That is, conditioned on being urban or rural, the chances of having an accident in different years are independent.
A customer is randomly chosen. Let \(A_n\) be the event that this customer has an accident in year \(n\). Let \(U\) denote the event that this customer is urban and \(R\) the event that the customer is rural.
- Find \( \mathbf{P}(A_2|A_1) \).
- Are \(A_1\) and \(A_2\) independent in general ? Are there any conditions under which they are, if not in general ?
- Show that \(\mathbf{P}(A_2|A_1) \geq \mathbf{P}(A_2) \).
To answer this question it is useful to know that for any positive \(a\) and \(b\), one has \( (a+b)^2 < 2(a^2 +b^2)\) as long as \(a \neq b\). In the case \(a = b\), one has of course \( (a+b)^2 = 2(a^2 +b^2)\). To prove this inequality, first show that \( (a+b)^2 +(a-b)^2= 2(a^2 +b^2)\) and then use the fact that \( (a-b)^2 > 0 \).
- Find the probability that a driver has an accident in the 3rd year given that they had one in the 1st and 2nd years.
- Find the probability that a driver has an accident in the \(n\)-th year given that they had one in all of the previous years. What is the limit as \(n \rightarrow \infty\) ?
- Find the probability that a driver is an urban driver given that they had an accident in two successive years.
Stuffing Envelopes
You write a stack of thank-you cards for people who gave you presents for your birthday. You address all of the envelopes, but before you can stuff them you are called away. A friend, trying to help, sees the stack of cards and stuffs them into the envelopes. Unfortunately they did not realize that each card was personalized and just stuck them into the envelopes randomly. Assuming there were \(n\) cards and \(n\) envelopes, let \(X_{n}\) be the number of cards in the correct envelope.
- Find \(\mathbf{E} (X_n)\).
- Show that the variance of \(X_n\) is the same as \(\mathbf{E} (X_n)\).
- Is there any common distribution which has the above statistics ?
- (**) Show that
\[ \lim_{n\rightarrow \infty} \mathbf{P}(X_n =m) = \frac{1}{e\, m!}\]
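A simulation sketch (Python, illustrative): shuffle \(n\) cards into \(n\) envelopes and count the fixed points of the resulting random permutation; the statistics asked about above (mean 1, variance 1, Poisson(1) limit) can be checked empirically.

```python
import random

def fixed_points(n, rng):
    """Number of cards landing in their own envelope for a uniform shuffle."""
    perm = list(range(n))
    rng.shuffle(perm)
    return sum(i == perm[i] for i in range(n))

rng = random.Random(4)
n, trials = 50, 20_000
counts = [fixed_points(n, rng) for _ in range(trials)]
mean = sum(counts) / trials
var = sum((c - mean) ** 2 for c in counts) / trials
```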
State space of the NCAA Basketball tournament (in the old days)
- Until 2000, the NCAA men’s basketball tournament had 64 teams. They played each other in 6 rounds, with the winner of each game moving on to play the winner of another game in a pre-specified order. If we assume the initial table of “who plays who”, as well as who each winner plays, is specified, give a concise description of the state space of all possible outcomes. How many elements does the space have ?
- (*) If there are now \(2^n\) competitors and \(n\) rounds, answer the same questions as before.
[ Inspiration [GS2] p 1, # 3]
De Morgan’s Laws
Let \( A\), \(B\), and \(C_i\) be events.
- Show that \( (A \cup B )^c= A^c \cap B^c\).
- Show that \( (A \cap B)^c = A^c \cup B^c \).
- \[ \Big( \bigcup_{i=1}^n C_i \Big)^c = \bigcap_{i=1}^n C_i^c\]
- \[ \Big( \bigcap_{i=1}^n C_i \Big)^c = \bigcup_{i=1}^n C_i^c\]
- (**) Argue that the previous two statements hold even when \(n=\infty\).
Is it True (Event Manipulation)?
Which of the following are identically true ? For those that are not, say when they are true.
- \(A \cup (B\cap C) = (A\cup B) \cap (A \cup C)\)
- \(A \cap (B\cap C)= (A \cap B) \cap C\)
- \( (A \cup B) \cap C = A \cup (B\cup C) \)
- \( A \setminus (B\cap C) = (A\setminus B) \cup (A \setminus C)\)
[GS2, p.1 # 5]
Areas and circles
Let \(C_1\) be the circle of radius 2 centered at the origin \((0,0)\). Let \(C_2\) be the circle of radius 1 centered at the point \((1,0)\).
- Draw a picture matching the above description and shade the region inside of \(C_1\) but outside of \(C_2\).
- What is the area of the region you shaded ?
- What is the ratio of the shaded area to the area inside \(C_1\) ?
- What is the ratio of the area inside \(C_2\) to the area inside \(C_1\) ?
Algebra of events
Consider the sample space created by two coin flips. We will denote the outcome of first a head and then a tail by \((H,T)\). The event that there is exactly one head will be written \(\{(H,T), (T,H)\}\). Let \(\mathcal{A}\) denote the set of all events on this sample space.
- Write down all of the events in \(\mathcal{A}\). How many are there ?
- Pick a few elements of \(\mathcal{A}\) and show that the rules of being an algebra are in fact satisfied. Namely, if \(A,B \in \mathcal{A}\) then \( A \cup B \in \mathcal{A}\) and \(A^c \in \mathcal{A}\). (Here \(A^c\) is the complement of \(A\), which is simply every outcome which is not in \(A\).)
- Why did I not ask you to verify by hand this property for all of the sets in \(\mathcal{A}\) ? Hint: can you tell me a lower bound on the number of such conditions you would have to check ?
- (*) Using the definition of an algebra, show that the empty set \(\emptyset\) is in \(\mathcal{A}\).
- (*) Using the definition of an algebra, show that if \(A, B \in\mathcal{A}\) then so must be \(A \cap B\).
- (*) Argue that if \(A_1, A_2,\dots,A_k \in\mathcal{A}\) then so are
\[ A_1\cup A_2\cup\cdots\cup A_k=\bigcup_{j=1}^k A_j\]
\[ A_1\cap A_2\cap\cdots\cap A_k=\bigcap_{j=1}^k A_j\]
Arithmetic sum
Show that
\[\sum_{j=1}^n j =\frac{1}{2}n(n+1)\]
Hint: notice that \(1+n=n+1\), \(2+(n-1)=n+1\), \(3+(n-2)=n+1\) and so on. How many such pairings exist ?
Estimating differences of independent draws
- Show that if \(X\) and \(Y\) are independent random variables, then
\[\mathrm{Var}(X- Y)=\mathrm{Var}(X+Y)\]
- Let \(D_1\) and \(D_2\) represent two draws at random with replacement from a population, with \(\mathbf{E}(D_1)=10\) and \(\mathbf{SD}(D_1)=2\). Find a number \(c\) so that
\[\mathbf{P}(|D_1 -D_2| < c) \geq .99\]
From [Pitman p. 203, #15]
Putting expectations together
Suppose \(\mathbf{E}(X^2)=3\), \(\mathbf{E}(Y^2)=4\) and \(\mathbf{E}(XY)=2\). What is \(\mathbf{E}[(X+Y)^2]\) ?
Expectation and dice rolls
A standard 6 sided die is rolled three times.
- What is the expected value of the first roll ?
- What is the expected value of the sum of the three rolls ?
- What is the expected number of twos appearing in the three rolls ?
- What is the expected number of sixes appearing in the three rolls ?
- What is the expected number of odd numbers ?
Based on [Pitman, p. 182 #3]
Indicator functions and expectations
Let \(A\) and \(B\) be independent events and let \(\mathbf{1}_A\) and \(\mathbf{1}_B\) be the associated indicator functions. Answer the following questions in terms of \(\mathbf{P}(A)\) and \(\mathbf{P}(B)\).
- Describe the distribution of \( \mathbf{1}_A\).
- What is \(\mathbf{E} \mathbf{1}_A\) ?
- Describe the distribution of \((\mathbf{1}_A +\mathbf{1}_B)^2\).
- What is \(\mathbf{E}(\mathbf{1}_A +\mathbf{1}_B)^2 \) ?
Candidate: Confidence interval
A pollster wishes to know the percentage \(p\) of people in a population who intend to vote for a particular candidate. How large must a random sample with replacement be in order to be at least 95% sure that the sample percentage is within one percentage point of \(p\) ?
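A numerical sketch (Python, illustrative): by the normal approximation the sample percentage has standard error \(\sqrt{p(1-p)/n} \leq 0.5/\sqrt{n}\), with the worst case at \(p=\tfrac12\), so one solves \(1.96 \cdot 0.5/\sqrt{n} \leq 0.01\). The quantile 1.96 is the usual two-sided 95% normal value.

```python
from math import ceil, sqrt

z = 1.96       # two-sided 95% normal quantile
margin = 0.01  # one percentage point, as a proportion

# Worst-case (p = 1/2) sample size for a 95% margin of one percentage point.
n_required = ceil((z * 0.5 / margin) ** 2)
```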
Dice rolls: Explicit calculation of max/min
Let \(X_1\) and \(X_2\) be the number obtained on two rolls of a fair die. Let \(Y_1=\max(X_1,X_2)\) and \(Y_2=\min(X_1,X_2)\).
- Display the joint distribution tables for \( (X_1,X_2)\).
- Display the joint distribution tables for \( (Y_1,Y_2)\).
- Find the distribution of \(X_1X_2\).
Combination of [Pitman, p. 159 #4 and #5]
Blocks of Bernoulli Trials
In \(n+m\) independent \(\text{Bernoulli}(p)\) trials, let \(S_n\) be the number of successes in the first \(n\) trials and \(T_m\) the number of successes in the last \(m\) trials.
- What is the distribution of \(S_n\) ? Why ?
- What is the distribution of \(T_m\) ? Why ?
- What is the distribution of \(S_n+T_m\) ? Why ?
- Are \(S_n\) and \(T_m\) independent ? Why ?
- Are \(S_n\) and \(T_{m+1}\) independent ? Why ?
- Are \(S_{n+1}\) and \(T_{m}\) independent ? Why ?
Based on [Pitman, p. 159, #10]
Balls in boxes
A box contains 1000 balls, of which 2 are black and the rest are white.
- Which of the following is most likely to occur in 1000 draws with replacement from the box: fewer than 2 black balls, exactly 2 black balls, or more than 2 black balls ?
- If two series of 1000 draws are made at random from the box, what, approximately, is the chance that they produce the same number of black balls ?