Is bold play optimal in football?

It has been 18 months since my last blog post. At that point I was very angry about Trump’s mishandling of the covid pandemic and the fact that people wouldn’t wear masks, while on other days I was saying goodbye to two former colleagues who were mentors and good friends. Not much has changed: now I am angry about people who won’t get vaccinated, and I spend my time sticking pins into my Aaron Rodgers voodoo doll hoping that a covid outbreak on his team will keep him from winning the Super Bowl.

To calm myself I have decided to do some math and relax. It is a well-known result (but not an easy one for an impatient person to find on the internet) that if you are playing a game that is biased against you, bold play is optimal. Specifically, if you want to reach a fixed target amount of money when playing such a game, then the optimal strategy is to bet all the money you have until you reach the point where winning would take you beyond your goal, and then bet only enough to reach your goal.

For a concrete example, suppose (i) you have $1 and your wife wants you to take her to brunch at the Cheesecake Factory, which will cost you $64, and (ii) you want to win the necessary amount of money by betting on black at roulette, where you win $1 with probability 18/38 and lose $1 with probability 20/38. A standard calculation, which I’ll omit since it is not very easy to type in Microsoft Word (see Example 1.45 in my book Essentials of Stochastic Processes), shows that the probability of success is 1.3116 x 10^-4. In contrast, the strategy of starting with $1 and “letting it ride” in the hope that you can win six times in a row has probability (18/38)^6 = 0.01130. This is 86 times as large as the previous answer, but still a very small probability.
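For readers who want to check the arithmetic, here is a minimal Python sketch of both computations: the gambler’s ruin formula from Example 1.45 and the bold-play (let it ride) probability.

# Check of the two numbers above, assuming p = 18/38 per $1 even-money bet.
p = 18 / 38          # probability of winning one bet on black
q = 1 - p            # probability of losing
r = q / p            # ratio that appears in the gambler's ruin formula

start, goal = 1, 64

timid = (r**start - 1) / (r**goal - 1)   # P(reach $64 before $0) betting $1 at a time
bold = p ** 6                            # win six consecutive double-or-nothing bets

print(f"timid play: {timid:.4e}")        # about 1.31e-04
print(f"bold play:  {bold:.5f}")         # about 0.01130
print(f"ratio:      {bold / timid:.1f}") # about 86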

Consider now the NFL football game between the Green Bay Packers and the Baltimore Ravens played on Sunday, December 19, 2021. After trailing 31-17, the Ravens scored two touchdowns to bring the score to 31-30. To try to win the game without going to overtime they went for a two-point conversion, failed, and lost the game. Consulting google, I find that, surprisingly, 49.4% of two-point conversions are successful versus 94.1% of one-point kicks. In the game under consideration the two-point conversion would not necessarily have won the game: there were about 45 seconds left on the clock and Green Bay had one time out, so there was some chance (say 30%) that Green Bay could use passes completed near the sideline to get within range for a field goal and win the game 34-32. Rounding 49.4% to 50%, going for two gives the Ravens a win probability of 0.5 x 0.7 = 35%. With a one-point conversion their win probability is 0.94 x 0.7 x p, where p is the probability of winning in overtime. If p = 1/2 this is 33%. However, if the 8-6 Ravens felt that the 11-3 Packers had a probability significantly bigger than 1/2 of winning in overtime, then the right decision was to go for two points.
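Here is a small Python sketch of the comparison. The 30% chance that Green Bay answers with a winning field goal and the overtime win probability of 1/2 are the assumed values discussed above.

# The Ravens' decision with the numbers quoted in the text.
def win_going_for_two(p_two=0.5, p_gb_fg=0.3):
    # succeed on the conversion AND Green Bay fails to get the field goal
    return p_two * (1 - p_gb_fg)

def win_kicking(p_kick=0.941, p_gb_fg=0.3, p_ot=0.5):
    # make the kick, Green Bay fails to get the field goal, then win in overtime
    return p_kick * (1 - p_gb_fg) * p_ot

print(win_going_for_two())                        # 0.35
print(win_kicking())                              # about 0.33

# Kicking is only better if the Ravens' overtime win probability exceeds this:
print(win_going_for_two() / (0.941 * (1 - 0.3)))  # about 0.53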

A second example is provided by the Music City Bowl game between Tennessee and Purdue, played December 30, 2021. After a fourth quarter in which each team scored two touchdowns in a short amount of time (including a two-point conversion by Purdue to tie the score), each had 45 points. The pre-overtime coin flip determined that Tennessee would try to score first (starting, as usual, from the opponent’s 25 yard line). Skipping over the nail-biting excitement that makes football fun to watch, we fast-forward to Tennessee with fourth down and goal on the 1 yard line. The timid thing to do would be to kick the field goal, which succeeds with probability essentially 1. In this case, if Purdue

(i) scores a touchdown (with probability p7) Tennessee (or T for short) loses

(ii) kicks a field goal (with probability p3), they go to a second overtime period

(iii) does not score (with probability p0) T wins

Using symmetry (in a second overtime each team wins with probability 1/2), the probability T wins is p0 + p3/2 = 1 – p7 – p3/2.

Case 1. If T fails to score (which is what happened in the game), then Purdue will win with high probability, since they only need a field goal. In the actual game, three running plays brought the ball 8 yards closer and then the kicker made a fairly routine field goal.

Case 2. If T scores (with probability q), then Purdue must score a touchdown, an event of probability P7 > p7, so the probability T wins when they try to score a touchdown is q[(1 – P7) + P7/2].

There are a few too many unknowns here, but if we equate p7 with the probability of scoring a touchdown when the team is in the red zone (inside the 20), then the top 10 ranked teams all have probabilities > 0.8. If we take q = 0.5 and set p7 = P7 = 0.8, then the probability T wins by going for the touchdown is 0.3, versus 1 – p7 – p3/2 = 0.2 – p3/2 if they kick the field goal, which is 0.15 if p3 = 0.1 (i.e., half the time Purdue does not score a touchdown they get a field goal).
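Putting the two options side by side, here is a short Python sketch using the formulas just derived and the assumed values q = 0.5, P7 = p7 = 0.8, and p3 = 0.1.

def win_prob_kick_fg(p7, p3):
    # T kicks the field goal; Purdue then scores a TD (T loses), a FG (coin flip
    # by symmetry), or nothing (T wins).
    p0 = 1 - p7 - p3
    return p0 + p3 / 2                 # = 1 - p7 - p3/2

def win_prob_go_for_td(q, P7):
    # T goes for it: with probability q they score, then win unless Purdue answers
    # with a TD (probability P7), which sends the game to another (fair) period.
    # Ignores the small chance T wins after failing on fourth down, as in the text.
    return q * ((1 - P7) + P7 / 2)

print(win_prob_kick_fg(p7=0.8, p3=0.1))    # 0.15
print(win_prob_go_for_td(q=0.5, P7=0.8))   # 0.30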

Admittedly, like a student on an exam faced with a question they don’t know the answer to, I have laid down a blizzard of equations in hopes of a better score on the problem. But in simple terms, since p7 is close to 1, the Tennessee coach could skip all the math, assume Purdue would go on to score a touchdown on their possession, and realize that he needed his team to score a touchdown, which regrettably they did not.

Like many posts I have written, the story ended differently than I initially thought but you have to follow the math where it takes you.

A National Face Mask Law Could End the Pandemic

How do I know this? Because I read an article in the April 2020 issue of the Atlantic Monthly that explained the “real reason to wear a mask.”

https://www.theatlantic.com/health/archive/2020/04/dont-wear-mask-yourself/610336/

Medical workers use masks and other PPE to prevent ingress, the transmission of outside particles to the wearer. Ordinary individuals, however, should wear masks to prevent egress. A key transmission route of COVID-19 is the droplets that fly out of our mouths when we cough, sneeze, or even just speak. The purpose of wearing a mask is to keep you from transmitting the virus to the people around you.

To develop this article, the magazine assembled an interdisciplinary team of 19 experts and looked at a range of mathematical models and other research. They wrote a scientific paper that was published online:

https://www.preprints.org/manuscript/202004.0203/v1

The conclusion was that if 80% of people wore masks that were 60% efficient (easily achievable with cloth masks), the basic reproduction number R0 for the epidemic would drop below 1 and the epidemic would die out. A graphic shows the possible combinations of mask-wearing percentages and mask efficiencies that would achieve this goal.
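The paper works with a more detailed model, but a back-of-the-envelope sketch shows why numbers like these are plausible. Here I assume masks reduce both outgoing and incoming transmission, so that R_eff is roughly R0(1 – efficiency x coverage)^2; the value R0 = 2.4 is my assumption, not a number taken from the article.

# Back-of-the-envelope sketch, NOT the model from the paper: masks with efficiency e,
# worn by a fraction c of people, cut both egress and ingress, so
# R_eff ~ R0 * (1 - e*c)**2.  R0 = 2.4 is an assumed value.
def r_eff(r0, coverage, efficiency):
    return r0 * (1 - coverage * efficiency) ** 2

print(r_eff(2.4, coverage=0.8, efficiency=0.6))   # about 0.65 < 1
print(r_eff(2.4, coverage=0.5, efficiency=0.6))   # about 1.18 > 1: not enough wearers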

I admit that the time scale over which things will happen is somewhat of a guess. Not much reduction will be seen in the first week, since many infected people have yet to show symptoms. As the graphic shows, the reduction will depend on the percentage of people complying with the order and the quality of the face masks, which should be much better now than when the article was initially published. On the other hand, large numbers of people congregating in bars without wearing face masks could negate the effort.

The effectiveness of masks in containing the virus is not just a theoretical result. There are a number of spectacular examples of success. In Hong Kong only four deaths due to COVID-19 have been recorded since the beginning of the pandemic. Hong Kong health authorities credit their citizens’ near-universal mask wearing as a key factor. Similarly, Taiwan ramped up mask production early on and distributed masks to the population, mandating their use in public transit and recommending their use in public places, a suggestion that has been widely complied with. Their death toll has been 6, and the schools have been open since early February.

While other countries have been smart, the US has not. Thanks in no small part to Trump’s decision not to wear a mask and to hold large rallies where very few people wore them, the issue has become politicized. Recently the governor of Georgia sued the mayor of Atlanta to stop her from imposing a mask order. Each weekend in Raleigh, hundreds of young people crowd into restaurants and bars on Glenwood South and there is not a mask in sight, a situation that occurs in many parts of the country. This behavior stems from the perception that young people rarely get sick, and that if they do get infected the symptoms are mild. However, in recent weeks one third of the newly infected have been under the age of 30.

As Dr. Fauci has said when Trump has allowed him to be on TV, large gatherings in which face masks are not worn can lead to transmission of the virus from one asymptomatic person to another. It is difficult to determine the extent to which this occurs, but contact tracing data from North Carolina show that 50% of symptomatic cases are caused by contact with an asymptomatic individual. Another sign of the invisible epidemic is that the CDC estimates there have been 10 times as many cases as have been verified by a COVID-19 test.

Trump has recently worn a mask, and at his coronavirus briefing on Tuesday, July 21, uttered the words that everyone should wear a mask when they are in a situation where social distancing is impossible. The history of the pandemic in America shows that people will not voluntarily do the right thing. It must be mandatory. The president could dramatically improve his chances of being re-elected by signing an executive order making masks mandatory.

I hate to point the president to a road to re-election, but I do not want to see 90,000 more people die. The IHME web site

https://covid19.healthdata.org/united-states-of-america

projects 224,500 deaths by election day, while CDC data show that 140,000 have occurred as of July 21. To get re-elected, Trump must first stop lying about the pandemic. The US has 5% of the world’s population, but the fraction of the world’s deaths that have occurred here is 140,000/617,000 = 22.7%, more than 4 times our share. The US does not have the lowest death rate in the world.

The US cannot reopen its economy or send students back to school five days a week with a pandemic raging in the streets. The crisis needs to be stopped now. It seems unlikely that states will go back into lockdown, so making masks mandatory is our only hope. If hospitalizations continue to spiral out of control (and they are NOT caused by our high level of testing), then the death toll could easily go higher than projected. In April, when stay-at-home orders and other control measures were in place, the IHME projected death toll was roughly 70,000. This means that the premature re-opening of the economy has cost 150,000 lives. If we had followed the lead of Europe and dramatically reduced the number of cases before opening up the country, things would be much better now, but that opportunity is gone. We need to act now to prevent a complete disaster.

 

Pooled Tests for COVID-19

When one is dealing with a disease that occurs at low frequency in the population and one has a large number of people to test, it is natural to do group testing. A fixed number of samples, say 10, are mixed together. If the combined sample is negative, we know all the individuals in it are negative. But if a group tests positive, then all the samples in the group have to be retested individually.

If the groups are too small then not much work is saved. If the groups are too large then there are too many positive group tests. To find the optimal group size, suppose there are a total of N individuals, the group size is k, and 1% of the population has the disease. The number of group tests that must be performed is N/k. The probability a group tests positive is (approximately) k/100. If this happens then we need k more tests. Thus we want to minimize

(N/k)(1 + k^2/100) = N/k + Nk/100

Differentiating, we want –N/k^2 + N/100 = 0, or k = 10. In the concrete case N = 1000, the number of tests is 200.

Note: the probability a group test is positive is actually p = 1 – (1 – 1/100)^k, but this makes the optimization very messy. When k = 10, 1 + kp = 1.956, so the answer does not change by very much.
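Here is a short Python check of the calculation, using both the k/100 approximation and the exact group-positive probability from the note.

# Expected number of tests for simple pooling with 1% prevalence.
def expected_tests_approx(N, k, prev=0.01):
    return (N / k) * (1 + k * (k * prev))        # N/k group tests plus retests

def expected_tests_exact(N, k, prev=0.01):
    p = 1 - (1 - prev) ** k                      # exact P(group is positive)
    return (N / k) * (1 + k * p)

N = 1000
print(expected_tests_approx(N, 10))              # 200.0
print(expected_tests_exact(N, 10))               # about 195.6, i.e. (N/k) * 1.956
best_k = min(range(2, 40), key=lambda k: expected_tests_exact(N, k))
print(best_k)                                    # 11, essentially the same as k = 10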

Recent work reported in Nature on July 10, 2020 shows that the number of tests needed can be reduced substantially if the individuals are divided into groups in two different ways for group testing before one has to begin testing individuals. To visualize the set-up, consider a k by k matrix with one individual in each cell. We group test the rows and group test the columns. An individual who tests negative in either test can be eliminated. The number of k by k squares is N/k^2. For each square there are 2k tests that are always performed. Each of the k^2 individuals in the square has both of their group tests come back positive with probability (k/100)^2. These events are NOT independent, but that does not matter in computing the expected number of tests

(N/k^2)(2k + k^4/10,000) = 2N/k + Nk^2/10,000

Differentiating, we want –2N/k^2 + 2Nk/10,000 = 0, or k = (10,000)^(1/3) = 21.54. In the concrete case N = 1000 the expected number of tests is 139.
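The same kind of check for the two-way scheme, again with the k/100 approximation:

# Expected tests when rows and columns of each k-by-k square are group tested.
def expected_tests_two_way(N, k, prev=0.01):
    squares = N / k**2
    per_square = 2 * k + (k**2) * (k * prev) ** 2   # 2k group tests plus retests
    return squares * per_square

N = 1000
print(expected_tests_two_way(N, 21.54))   # about 139
print(expected_tests_two_way(N, 10))      # 210: squares too small
print(expected_tests_two_way(N, 30))      # about 157: squares too large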

Practical Considerations:

One could do fewer tests by eliminating the negative rows before testing the columns, but the algorithm used here allows all the tests to be done at once, avoiding the need to wait for the first-round results to come back before the second round is done.

Larger group sizes make it harder to detect the virus if there is only one positive individual in the group. In the Nature article, Sigrun Smola of the Saarland University Medical Center in Homburg is quoted as saying she doesn’t recommend grouping more than 30 individuals in one test. Others claim that it is possible to identify the virus when there is one positive individual out of 100.

Ignoring the extra work in creating the group samples, the method described above reduces the cost of testing by 86%. The price of $9 per test quoted in the article would be reduced to $1.26, so this could save a considerable amount of money for a university that has to test 6000 undergraduates several times in one semester.

In May, officials in Wuhan used a method of this type to test 2.3 million samples in two weeks.

References

Mutesa, L., et al. (2020) A strategy for finding people infected with SARS-CoV-2: optimizing pooled testing at low prevalence. arXiv:2004.14934

Mallapaty, Smriti (2020) The mathematical strategy that could transform coronavirus testing. Nature News, July 10. https://www.nature.com/articles/d41586-020-02053-6

 

China is NOT to blame for the COVID-19 epidemic in the US

As President, Donald has told tens of thousands of lies. In many cases, he can hide behind the silence of his loyal supporters. However, when it comes to the coronavirus epidemic the details are on TV, in the press, and in publicly available databases for all the world to see.

One of his most egregious lies is that China is to blame for the epidemic. A May 20 story in USA Today says “As the political rhetoric blaming China for the pandemic escalates, law enforcement officials and human rights advocates have seen an increasing number of hate crimes and incidents of harassment and discrimination against Asian Americans.” Trump has fanned these flames at his rallies, referring to the virus as the “Kung flu.”

One of the most incredible lies (i.e., too extraordinary and improbable to be believed) is that the coronavirus was made in a laboratory in Wuhan. To protect this lie, the White House directed the National Institutes of Health to cancel funding for a project studying how coronaviruses spread from bats to people. The NIH typically only cancels an active grant when there is scientific misconduct or improper financial behavior, neither of which occurred in this case. The PI on the grant, Dr. Peter Daszak, is President of EcoHealth Alliance, a US-based organization that conducts research and outreach programs on global health, conservation, and international development. His research has been instrumental in identifying and predicting the origins and impact of emerging diseases, which is very important for avoiding future pandemics.

Early Spread.  A special report published on July 5 in the New York Times gives new information about the early days of the epidemic. In mid-February the official case count was 15 but there is evidence of 2000 other infections. Given what we now know about the spread of the disease, it is natural to guess that many of these cases were asymptomatic. However, as explained in the paper cited in the next paragraph, part of the discrepancy was due to the fact that testing done before March 4, 2020 was only done for symptomatic patients who had recently traveled internationally.

The idea that the coronavirus was widespread in the US in January 2020 was discussed in news stories about Alessandro Vespignani’s work. These appeared on the Northeastern web site in April, but the paper has only recently appeared on medRxiv: Jessica T. Davis et al., Estimating the establishment of local transmission and the cryptic phase of the COVID-19 pandemic in the US. Their conclusions are based on the use of a rather complicated individual-based, stochastic and spatial epidemic model called GLEAM (GLobal Epidemic and Mobility Model) that divides the global population into 3200 subpopulations. See PNAS 106 (2009), 21484-21489 and J. Computational Science 1 (2010), 132-145 for more details.

Origins of the virus in the US.  Recently, two genetic sequencing studies published online in Science have investigated the origins of the coronavirus in the US. A.S. Gonzalez-Reiche et al., published on May 29, 2020, studied introductions and early spread of SARS-CoV-2 in the New York City area. Phylogenetic analysis of 84 distinct SARS-CoV-2 genomes from samples taken February 29 – March 18 provided evidence for multiple, independent introductions. Phylogenetic analysis of full-length genome sequences suggested that the majority of the introductions came from Europe and other parts of the United States.

A medRxiv preprint by M.T. Maurano et al., which reports on the analysis of 864 SARS-CoV-2 sequences, reached the same conclusion: comparisons to global viral sequences showed that the early strains were most likely linked to cases from Europe.

Deng et al., published on June 8, 2020, studied the introduction of SARS-CoV-2 into Northern California. They studied 36 patients spanning 9 counties and the Grand Princess cruise ship, using a method they call MSSPE (Metagenomic Sequencing with Spiked Primer Enrichment) to assemble genomes directly from clinical samples. Phylogenetic analysis described in detail in the paper indicated that 14 were associated with the Washington State WA1 lineage, 10 with a Santa Clara outbreak cluster (SCC1), 3 with a Solano County cluster, 5 with lineages circulating in Europe, and only 4 with lineages from Wuhan. This precision comes from the fact that, as of March 20, 2020 when this work was done, there were 789 worldwide genomes in the GISAID database. This wealth of data is possible because coronaviruses are unsegmented single-stranded RNA viruses that are about 30 kilobases in length.

The results in the last three paragraphs demonstrate that most lineages came from Europe, not China. In hindsight, the fact that Europe was the primary source of the coronavirus in the US is not surprising: travel from China was banned February 2, but travel from Europe was only ended on March 13.

I have concentrated on the science.  I’ll leave it to you to decide if you want the Senate to vote for Thom Tillis’ 18-point plan in May “to hold China accountable” for what he says is its role in the coronavirus pandemic.

Tom Liggett 1944-2020

Tom Liggett passed away May 12, 2020 in Los Angeles at the age of 76. According to Wikipedia, Tom Liggett moved at the age of two with his missionary parents to Latin America, where he was educated in Buenos Aires (Argentina) and San Juan (Puerto Rico). He graduated from Oberlin College in 1965, where he was influenced towards probability by Samuel Goldberg, an ex-student of William Feller. He went to graduate school at Stanford, taking classes with Kai Lai Chung, and writing a PhD thesis in 1969 with advisor Samuel Karlin. Karlin had 44 students including my adviser Don Iglehart, which means I should call him Uncle Tom.

Tom’s first really impressive result was a 1971 paper with Mike Crandall on nonlinear semigroups. At about the time this paper was written, Chuck Stone showed Tom a copy of Frank Spitzer’s 1970 paper on interacting particle systems, saying “I think you’ll find something interesting in this.” The rest, as they say, is history. In 1972 Tom wrote a paper proving the existence of interacting particle systems using the Hille-Yosida theorem for linear semigroups. His St. Flour lecture notes, published in 1977, helped spread the word about the field to a broad audience of probabilists.

His 1985 book, which has been cited more than 5000 times, helped grow interacting particle systems into a lively and vibrant area. By the time he wrote his 1999 book, the field had grown so large that he concentrated on only three examples: the contact process, the voter model, and the simple exclusion process. His books and papers are known for their clear and elegant proofs, though for those of us who are not as smart as he was they can take some effort to digest. A fuller account of Tom’s research can be found in the July 2008 article in the IMS Bulletin on his induction into the National Academy of Sciences.

Tom was an Associate Editor of the Annals of Probability from 1979 to 1984 and its Editor from 1985 to 1987. He lectured at the International Congress of Mathematicians in 1986, gave the Wald Lectures in 1996, and was a Guggenheim fellow in 1997-1998. At UCLA he was administrative vice chair 1978-1981, chair 1991-1994, and undergraduate vice chair 2004-2006. I departed from UCLA in 1985, but I remember Tom telling me once that you know you are doing a good job as chair if EVERYONE is mad at you.

Tom had only nine Ph.D. students: Norman Matloff (1975), Diane Schwartz (1975), Enrique Andjel (1980), Dayue Chen (1989), Xijian Liu (1991), Shirin Handjani (1993), Amber Puha (1998), Paul Jung (2003), and Alexander van den Berg-Rodes (2011). Despite the small number of children, his family tree goes deep in Brazil. As they say in the Bible, Andjel begat Pablo Ferrari, who begat Fabio Machado, and together they account for a total of 34 descendants. The sociology of how students choose their advisers is mysterious, but as Amber Puha wrote in the report on Tom’s 75th birthday party (June/July 2019 issue of the IMS Bulletin), she found him to be an ideal adviser.

Over his career Tom mentored numerous postdoctoral fellows and young researchers. A glance at his publications since 2000 (https://www.math.ucla.edu/~tml/post2000pubs.html) shows he had too many collaborators for me to list here. While I was at UCLA we talked a lot about math, but we did not do much joint work since we had very different styles. It takes me months or years to pursue an idea to solve a problem, but Tom seemed to go from idea to solution and completed paper in a few days. One paper on which I could work at his speed was On the shape of the limit set in Richardson’s model. I was on my way to the Friday afternoon probability seminar at USC when Tom popped out of his office and said “it has a flat edge.”

Two of our other joint papers come from his solving a problem that I was working on with someone else, which I believe is a common occurrence on his publication list. We no longer have access to his agile mind, but you can still see it at work in 106 papers and his two books on interacting particle systems.

Moneyline Wagering

With the orange jackass (aka widdle Donnie) first declaring the coronavirus a hoax, then telling people to go ahead and go to work if you are sick, and only today tweeting that the people at the CDC are amazed at how much he knows about covid-19, it is time to have some fun.

Tonight (March 7) at 6PM in Cameron Indoor Stadium, the Blue Devils, with a conference record of 14-5, will take on the UNC Tar Heels, who are 6-13 and will end up in last place if they lose. If you go online to look at the odds for tonight’s Duke-UNC game, you find the curious-looking lines

Duke -350

UNC 280

What this means is that you have to bet $350 on Duke to win $100, while if you bet $100 on UNC you win $280.

Let p be the probability Duke wins.

For the bet on Duke to be fair we need 100p – 350(1-p) = 0 or p = 7/9 = 0.7777

For the bet on UNC to be fair we need -100p + (1-p)280= 0 or p = 0.7368

If 0.7368 < p < 0.7777 both bets are unfavorable.

This suggests that the a priori probability Duke wins is about 3/4.

Another way of looking at this situation is through the money. If a fraction x of people bet on Duke then

When Duke wins the average winnings are 100x – 100(1-x)

When UNC wins the average winnings are -350 x + 280 (1-x)

Setting these equal gives 200 x + 630 x = 100 + 280 or x = 38/83 = 0.4578

If this fraction of people bet on Duke, then the average payoff from either wager is –700/83 = –$8.43, and the people who are offering the wager don’t care who wins.
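Both calculations fit in a few lines of Python; the only inputs are the two moneylines quoted above.

# Break-even probabilities implied by the moneylines, and the fraction of
# (equal-sized) bets on Duke that balances the book.
duke_risk, duke_win = 350, 100     # bet $350 on Duke to win $100
unc_risk, unc_win = 100, 280       # bet $100 on UNC to win $280

p_from_duke_line = duke_risk / (duke_risk + duke_win)   # 7/9   = 0.778
p_from_unc_line = unc_win / (unc_win + unc_risk)        # 14/19 = 0.737
print(p_from_duke_line, p_from_unc_line)

# Fraction x betting on Duke so the book pays out the same either way:
# 100x - 100(1-x) = -350x + 280(1-x)  =>  830x = 380
x = 380 / 830
print(x)                         # 0.4578
print(100 * x - 100 * (1 - x))   # about -8.43 per bettor if Duke wins
print(-350 * x + 280 * (1 - x))  # about -8.43 per bettor if UNC wins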

Harry Kesten 1931-2019

 

Harry Kesten at Cornell in 1970 and in his later years

On March 29, 2019 Harry Kesten lost a decade-long battle with Parkinson’s disease. His passing is a sad event, so I would like to find solace in celebrating his extraordinary career. In addition I hope you will learn a little more about his work by reading this.

Harry was born in Duisburg, Germany on November 19, 1931. His parents escaped from the Nazis in 1933 and moved to Amsterdam. After studying in Amsterdam, he was a research assistant at the Mathematical Center there until 1956, when he came to Cornell. He received his Ph.D. in 1958 at Cornell University under the supervision of Mark Kac.

In his 1958 thesis on symmetric random walks, he showed that the spectral radius equals the exponential decay rate of the probability of return to 0, and that the latter is strictly less than 1 if and only if the group is non-amenable. This work has been cited 206 times and is his second most cited publication (according to MathSciNet). Harry was an instructor at Princeton University for one year and at the Hebrew University for two years before returning to Cornell, where he spent the rest of his career. While in Israel, he and Furstenberg wrote their classic paper on products of random matrices.

In the 1960s, he wrote a number of papers that proved sharp or very general results on random walks, branching processes, etc. One of the most famous of these is the 1966 Kesten-Stigum theorem, which shows that the normalized branching process Z_n/m^n (where m is the mean number of offspring) has a nontrivial limit if and only if the offspring distribution satisfies E(X log+ X) < ∞.  In 1966 he also proved a conjecture of Erdős and Szüsz about the discrepancy between the number of rotations of a point on the unit circle hitting an interval and its length. Foreshadowing his work in physics, he showed in 1963 that the number σ_n of self-avoiding walks of length n satisfies σ_{n+2}/σ_n → μ^2, where μ is the connective constant.

Harry’s almost 200 papers have been cited 3781 times by 2329 authors. However, these statistics underestimate his impact. In baseball terms, Harry was a closer: when he wrote a paper about a topic, his results often eliminated the need for future work on it. One of Harry’s biggest weaknesses is that he was too smart. When most of us are confronted with a problem, we need to try different approaches to find a route through the woods to a solution. Harry simply got on his bulldozer and drove over all obstacles. He needed 129 pages in the Memoirs of the AMS to answer the question “Which processes with stationary independent increments hit points?”, a topic he spoke about at the International Congress at Nice in 1970.

In 1984 Harry gave lectures on first passage percolation at the St. Flour Probability Summer School. The subject dates back to Hammersley’s 1966 paper and was greatly advanced by Smythe and Wierman’s 1978 book. However, Harry’s notes attracted a number of people to work on the subject, and it has continued to be a very active area; see 50 Years of First Passage Percolation by Auffinger, Damron, and Hanson for more details. You can buy this book from the AMS or download it from the arXiv. I find it interesting that Harry lists only six papers on his Cornell web page. Five have already been mentioned. The sixth is On the speed of convergence in first-passage percolation, Ann. Appl. Probab. 3 (1993), 296-338.

Harry worked in a large number of areas. There is not enough space for a systematic treatment, so I will just tease you with a list of titles: Sums of stationary sequences cannot grow slower than linearly. Random difference equations and renewal theory for products of random matrices. Subdiffusive behavior of a random walk on a random cluster. Greedy lattice animals. How long are the arms in DLA? If you want to try to solve a problem Harry couldn’t, look at his papers on Diffusion Limited Aggregation.

In the late 1990s, Maury Bramson and I organized a conference in honor of Harry’s 66 2/3rd birthday. (We missed 65 and didn’t want to wait for 70.) A distinguished collection of researchers gave talks, and many contributed to a volume of papers in his honor called Perplexing Problems in Probability. The 21 papers in the volume provide an interesting snapshot of research at the time. If you want to know more about Harry’s first 150 papers, you can read my 32-page summary of his work that appears in that volume.

According to the Mathematics Genealogy Project, Harry supervised 17 Cornell Ph.D. students who received their degrees between 1962 and 2003. Maury Bramson and Steve Kalikow were part of the Cornell class of 1977 that included Larry Gray and David Griffeath, who worked with Frank Spitzer. (Fortunately, I graduated in 1976!) Yu Zhang followed in Harry’s footsteps and made a number of contributions to percolation and first passage percolation. I’ll let you use google to find out about the work of Kenji Ichihara, Antal Jarai, Sungchul Lee, Henry Matzinger, and David Tandy.

Another “broader impact” of Harry’s work came from his collaborations with a long list of distinguished co-authors: Vladas Sidoravicius (12 papers), Ross Maller (10), Frank Spitzer (8), Geoffrey Grimmett (7), Yu Zhang (7), Itai Benjamini (6), J.T. Runnenberg (5), Roberto Schonmann (4), Rob van den Berg (4), … I wrote 4 papers with him, all of which were catalyzed by an interaction with another person. In response to a question asked by Larry Shepp, we wrote a paper about an inhomogeneous percolation which was a precursor to work by Bollobas, Janson, and Riordan. Making money from fair games, joint work with Harry and Greg Lawler, arose from a letter A. Spataru wrote to Frank Spitzer. I left it to Harry and Greg to sort out the necessary conditions.

Harry wrote 3 papers with two very different versions of Jennifer Chayes. With a leather-jacketed Cornell postdoc, her husband Lincoln Chayes, Geoff Grimmett and Roberto Schonmann, he studied “The correlation length for the high density phase.” With the manager of the Microsoft research group, her husband Christian Borgs, and Joel Spencer he wrote two papers, one on the birth of the infinite component in percolation and another on conditions implying hyperscaling.

As you might guess from my narrative, Kesten received a number of honors. He won the Brouwer Medal in 1981; named after L.E.J. Brouwer, it is The Netherlands’ most prestigious award in mathematics. In 1983 he was elected to the National Academy of Sciences. In 1986 he gave the IMS’ Wald Lectures. In 1994 he won the Polya Prize from SIAM. In 2001 he won the AMS’ Steele Prize for lifetime achievement.

Being a devout orthodox Jew, Harry never worked on the Sabbath. On Saturdays in Ithaca, I would often drive past him taking a long walk on the aptly named Freese Road, lost in thought. Sadly Harry is now gone, but his influence on the subject of probability will not be forgotten.

Jonathan Mattingly’s work on Gerrymandering

My last two posts were about a hurricane and a colonoscopy, so I thought it was time to write about some math again.

For the last five years Mattingly has worked on a problem with important political ramifications: what would a typical set of congressional districts (say the 13 districts in North Carolina) look like if they were chosen at “random,” subject to the restrictions that they contain roughly equal numbers of voters, are connected, and minimize the splitting of counties? The motivation for this question can be explained by looking at the current congressional districts in North Carolina. The tiny purple snake is District 12. It begins in Charlotte, goes up I-40 to Greensboro, and then wiggles around to take in other nearby cities, producing a district with a large percentage of Democrats.

To explain the key idea of gerrymandering, suppose, to keep the arithmetic simple, that a state has 2000 Democrats and 2000 Republicans. If there are four districts and we divide the voters as follows

District    Republicans    Democrats
1               600            400
2               600            400
3               600            400
4               200            800

then the Republicans will win in 3 districts out of 4. The same trick extends easily to create 12 districts where the Republicans win 9. With a little more imagination and the help of a computer, one can produce the outcome of the 2016 election in North Carolina, in which 10 Republicans and 3 Democrats were elected, despite the fact that the split between the parties is roughly 50-50.

The districts in the North Carolina map look odd, and the 7th district in Pennsylvania (nicknamed “Goofy kicking Donald Duck”) looks ridiculous, but this is not proof of malice.

Mattingly, with a group of postdocs, graduate students, and undergraduates, has developed a statistical approach to this subject. To explain it, we will consider a simple problem that can be analyzed using material taught in a basic probability or statistics class. A company has a machine that produces cans of tomatoes. On average a can contains a pound of tomatoes (16 ounces), but the machine is not very precise, so the weight has a standard deviation (a statistical measure of the “typical deviation” from the mean) of 0.2 ounces. If we assume the weight of tomatoes follows the normal distribution, then 68% of the time the weight will be between 15.8 and 16.2 ounces. To see if the machine is working properly, an employee samples 16 cans and finds an average weight of 15.7 ounces.

To see if something is wrong we ask the question: if the machine were working properly, what is the probability that the average weight would be 15.7 ounces or less? The standard deviation of one observation is 0.2, but the standard deviation of the average of 16 observations is 0.2/(16)^(1/2) = 0.05. The observed average is 0.3 below the mean, or 6 standard deviations. Consulting a table of the normal distribution or using a calculator, we see that if the machine were working properly, an average of 15.7 or less would occur with probability less than 1/10,000.
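Here is a minimal Python check of this calculation, using the standard normal tail computed from the complementary error function.

# Standard error of the mean of 16 cans and the one-sided normal tail probability
# for an average of 15.7 ounces or less.
from math import sqrt, erfc

mu, sigma, n = 16.0, 0.2, 16
se = sigma / sqrt(n)               # 0.05 ounces
z = (15.7 - mu) / se               # -6 standard errors (up to rounding)
tail = 0.5 * erfc(-z / sqrt(2))    # P(Z <= z) for a standard normal

print(se, z)                       # 0.05 and about -6
print(tail)                        # about 1e-9, far below 1/10,000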

To approach gerrymandering, we ask a similar question: if the districts were drawn without looking at party affiliation, what is the probability that 3 or fewer Democrats would be elected? This is a more complicated problem, since one must generate a random sample from the collection of districtings with the desired properties. To do this, Mattingly’s team has developed methods to explore the space of possibilities by making successive small changes in the maps. Using this approach, one has to make a large number of changes before the map is “independent” of the starting one. In a typical analysis they generate 24,000 maps. They found that, using the randomly generated maps and retallying the votes, 3 or fewer Democrats were elected in fewer than 1% of the scenarios. The next graphic shows results for the 2012 and 2016 maps and one drawn by judges.

Mattingly has also done analyses of congressional districts in Wisconsin and Pennsylvania, and has helped lawyers prepare briefs for cases challenging voting maps. His research has been cited in many decisions, including the ruling by a three-judge panel in August 2018 that the NC congressional districts were unconstitutional. For more details see the Quantifying Gerrymandering blog

https://sites.duke.edu/quantifyinggerrymandering/author/jonmduke-edu/

Articles about Mattingly’s work have appeared in

(June 26, 2018) Proceedings of the National Academy of Sciences 115 (2018), 6515–6517

(January 17, 2018)  Nature 553 (2018), 250

(October 6, 2017) New York Times  https://www.nytimes.com/2017/10/06/opinion/sunday/computers-gerrymandering-wisconsin.html

The last article is a good (or perhaps I should say bad) example of what can happen when your work is written about in the popular press. The article, written by Jordan Ellenberg, is, to stay within the confines of polite conversation, simply awful. Here I will confine my attention to its two major sins.

  1. Ellenberg refers several times to the Duke team but never mentions them by name. I guess our not-so-humble narrator does not want to share the spotlight with the people who did the hard work. The three people who wrote the paper are Jonathan Mattingly, professor and chair of the department, Greg Herschlag, a postdoc, and Robert Ravier, one of our better grad students. The paper went from nothing to fully written in two weeks in order to get ready for the court case. However, thanks to a number of late nights they were able to present clear evidence of gerrymandering. It seems to me that they deserve to be mentioned in the article, and it should have mentioned that the paper was available on the arXiv, so people could see for themselves.
  2. The last sentence of the article says “There will be many cases, maybe most of them, where it’s impossible, no matter how much math you do, to tell the difference between innocuous decision making and a scheme – like Wisconsin’s – designed to protect one party from voters who might prefer the other.” OMG. With many anti-gerrymandering lawsuits being pursued across the country, why would a “prominent” mathematician write that in most cases math cannot be used to detect gerrymandering?

Two faces of Hurricane Florence

The title is supposed to evoke images of a woman who is a soccer mom by day and a serial-killing hooker by night. The coastal part of North Carolina saw the second side of Florence. In Durham, which is 150 miles from Wilmington, we mostly saw her softer side.

In the days leading up to Sunday, September 9, the storm was hyped by the Weather Channel and the local news stations. Florence started as a category 1 in the eastern half of the Atlantic Ocean, but it was predicted that as she reached the warmer water of the Atlantic she would grow to category 4 and smash into the coast somewhere in the Carolinas (and she did). The message of a coming disaster was reinforced by our neighborhood listserv. People who lived through Hurricane Fran in 1996 saw many trees go down and lived a week or more without power, and they were not anxious for a repeat performance. Emails were filled with long lists of things we should do in order to prepare.

Monday, September 10. At my house we had our semi-annual inspection of the heating/cooling system. It is hard to get work done while the technician is in the house, so I googled “How do you prepare for a hurricane?” The answers were similar to what I had seen on the listserv, with one new funny one: remove all of the coconuts from your trees, since they can become cannonballs in a storm. When the inspection was finished about 11:30 (with no repairs needed!), I went to the grocery store with my little list: water and nonperishable foods. The store was a zoo and water was flying off the shelves, but I came home with 48 quart bottles of Dasani, canned fruit, canned soup, bread, peanut butter, energy bars, etc.

Tuesday, September 11 was the 17th anniversary of a day that changed the world. Donald and Melania visited the new memorial to the plane that went down in Pennsylvania. I came down with a cold and stayed home from school. By this time, there was more weather than news on the news. I recall hearing one weatherman pontificate “at this point we are as good with predictions 72 hours out as we used to be at 24 hours.” Then he showed a track that had the eye of a category 1 storm over Durham about five days later. Tuesday afternoon Duke announced that all classes were cancelled as of 5PM Wednesday, with no classes on Thursday and Friday. UNC and NC State also closed, and in a sign of looming disaster cancelled their football games. For UNC this was a blessing. Having lost to Cal and to East Carolina in the two previous weeks, they were happy to escape another loss at the hands of the University of Central Florida.

Wednesday, September 12. The morning weather forecast announced a dramatic change in the track of the storm. It was now predicted to turn left after landfall, hang out at the beach for a couple of days, and then head off to Atlanta. The new storm track was great news for Durham but not for Myrtle Beach, South Carolina, which was given a short deadline to evacuate. Wednesday was more or less a normal day. I met with a graduate student at 11, had lunch, taught my class, and announced the shifted schedule for the homework and exams due to the cancellation of class on Friday.

Thursday, September 13 was the calm before the storm. My wife and I took a walk around the neighborhood in the morning, went out for a Mexican lunch at La Hacienda at the northern edge of Chapel Hill, and grilled some chicken for dinner. (In NC, barbecuing refers to cooking a large piece of meat over a slow fire until you can pull it apart with your hands. Grilling gets the job done in 15-20 minutes.) Feeling more confident about the future, we scaled back from cooking four chicken breasts, to have leftovers that could be eaten over the next few days, to making only two.

Friday, September 14. Florence made landfall at Wilmington as a category 2 hurricane, which is definitely more than half as strong as a category 4. It was amusing to see several weathermen competing to see who could be filmed in the eye of the hurricane, where the winds suddenly drop to 0. A woman on the neighborhood listserv described it as an eerie experience. Most of the other things that happened along the coast were not at all funny. New Bern is about 30 miles from the coast, but it sits on the banks of the Neuse River. Rain plus hurricane winds resulted in flooding that left hundreds of people (who stupidly chose not to evacuate) in need of rescue. Up in Durham things were much more sedate. There was very little wind or rain. However, since the weatherman had told us to shelter in place, we stayed in for most of the day, as did most of our neighbors.

Saturday, September 15 was more or less a repeat of Friday. Twisting a line from The Big Bang Theory in which Penny is talking about her relationship with Leonard: “This is a new boring kind of hurricane.” I don’t know what normal people do during a hurricane, but it is a great chance to get some work done. On Friday I read a couple of papers from the arXiv and thought up a new problem for one of my grad students to work on. Saturday I decided I would use the lull to finally finish up the 5th edition of Probability: Theory and Examples, so I sifted through emails I had saved from readers and corrected some typos. Not everything I do is math. I watched the third round of the Evian Championship, the fifth major of the LPGA season. In keeping with more manly pursuits, I watched parts of Duke’s 40-27 victory over Baylor in Waco, Texas. The win was remarkable because Duke’s starting quarterback, a junior who might make it to the pros, was out. In addition, I watched Texas beat the USC Trojans 37-14. It’s not just that my son is a CS professor in Austin. Having been at UCLA for nine years, when Peter Carroll was there, I love to see USC lose.

Sunday, September 16. Duke’s severe weather policy (which covers not only the university but also the hospitals) ended at 7AM, so we figured that we had reached the end of the hurricane. We got up and went to the Hope Valley Diner for breakfast at about 7:30. When we first started going there it was called Rick’s, but the owner got tired of having a restaurant named after her ex-husband. In keeping with the intellectual climate of Durham, our usual waitress is a second-year medical student at UNC Greensboro. The weather was light rain, as it had been for the last couple of days, so we went to the Southpoint Mall to get out of the house. Being in a Christian region, almost all the stores don’t open until noon. At the mall Susan bought some Crocs, and I bought a couple of books that I read before going to sleep. However, mostly we walked around like many of the families who came there with their small kids.

Monday, September 17. Our hopes of nicer weather were quickly dashed. It was raining very hard when we woke up. Curiously, the flow was now from SW to NE, in contrast with the last few days of SE to NW. Almost immediately there was a tornado alert on TV, accompanied by its loud obnoxious noise, and a robophone call to tell us of the event, which came a few minutes after Channel 14 told us that the warning had expired. The tornado warning came from an area well north of us, so we weren’t really worried when soon there was a second one even further north. This was too much excitement for Duke, so they cancelled classes until noon, irritating people who had traveled through awful weather to get to their 8:30AM classes. Soon after the despair of facing another day of rain set in, the sun came out and Susan and I took a walk.

Tuesday and Beyond. Unfortunately the end of the rain does not bring the end of the misery for people near the coast, as we learned with Hurricane Matthew. Many of the rivers there have large basins (of attraction) and will only crest Wednesday or Thursday. The Cape Fear River will reach 60+ feet compared to its usual 20. But don’t worry: Trump will be coming soon to inspect the damage. When he came he clumsily read from a prepared statement saying that soon we will be getting lots and lots of money. I guess his advisers didn’t tell him that the state is so heavily gerrymandered that he will probably see 10 Republican congressmen elected from 13 districts.

A Tale of Two Colonoscopies

I’ve had two colonoscopies, one on January 11, 2007 and one on March 13, 2018. In the spirit of an English writing assignment, I will compare and contrast the two experiences. In addition I will offer some advice that I think will be useful for those who have yet to have the experience. Since I am a math professor, you should not view this as medical advice.

The main event, “Cleaning the Area for Viewing,” has not changed much. On Day -1, you only have clear fluids (see below for the definition). At 3:00PM you take some Dulcolax (a stool softener). At 5:00PM you begin to drink a mixture of 64 oz of Gatorade and one 255-gram bottle of Miralax: eight ounces every 15 minutes until it is gone. In 2007, I was amused to see that the directions on the container said to “never under any circumstances take more than one capful.” In 2018, the laxative comes in a brightly colored plastic container, which brags that it contains 14 daily doses.

At about 6PM the party gets started, and regular trips to the bathroom continue until about midnight, when I was brave enough to try to go to sleep. In 2007 there was nothing new to do the next day, except drink clear liquids, stopping two hours before the procedure. In 2018, there is a 10-ounce bottle of Magnesium Citrate to be drunk four hours before the procedure. Fortunately, this corresponds to the standard dose, and it gets its magic done in less than 2 hours.

What is a clear liquid? In 2007 the list included coffee and tea (no milk or cream). In 2018 these items were gone, leaving water, soft drinks, Gatorade, fruit juices without pulp, chicken or beef broth, plain jello, and popsicles (no sherbet or fruit bars). In short, “any fluid you can see through that has no pulp” is acceptable, as long as it is not RED or PURPLE, for obvious reasons. A colonoscopy is not a test on which you want to get a false positive!

Clear liquids are there to keep you hydrated, but also to give you enough calories to get through the day. (See the disclaimer above.) In this regard, popsicles and jello are worthless since they have 10-30 calories. Thinking it might be some sort of substitute for coffee in the morning, I tried some canned chicken broth warmed in the microwave, but after I had a few sips I noticed that the can said it had 30 calories per serving, 30 of which were from fat. The white grape juice at 150 calories for 8 ounces was sickeningly sweet, but a good source of calories, as was some 80-calorie lemonade (which had no pulp but tasted like plastic) and, to a lesser extent, non-diet soda.

The biggest change in the prep routine came from the rules for a restricted diet on Days -5 to -2. In 2007 the rule was just: do not eat nuts, seeds, popcorn, or corn. By 2018 the list had gotten huge. No non-tender meats, gristle, hot dogs, salami, or cold cuts. No raw vegetables or salads, no artichokes, asparagus, broad beans, broccoli, Brussels sprouts, cabbage, cauliflower, mushrooms, onions, peas, sauerkraut, spinach, summer or winter squash, tomatoes, or zucchini. No raw fruit (except for bananas), canned fruit, dried fruit, berries, melons, cranberry sauce, avocado, or coconut. No bread with whole wheat, etc., etc.

In short, you can eat tender cooked fish, poultry, and meat, served with green beans, cooked carrots, beets, apple sauce, ripe bananas, and cooked fruit (peaches, pears, apricots, and apples) if the skin has been removed. The exclusions listed above will test your culinary creativity. Only refined pasta is allowed, but I interpreted this to mean that Stouffer’s Fettuccini Alfredo was OK. While cauliflower was off the list, I figured that it was OK to eat mashed cauliflower from the frozen foods aisle, which Oprah peddles as a low-calorie alternative to mashed potatoes. Rachael Ray’s recipe for cooked carrots is popular according to her internet site, but it went over like a Lead Zeppelin.

To try to end this rant on a happy note, let me talk about Day 0. As many veterans of the procedure will tell you, after going through the prep on Day -1, and now a low-fiber diet on Days -5 to -2, the procedure itself is not bad at all. One of the reasons for this is that they give you something that makes you forget the whole thing. I once made a joke at a conference in Canada that this amnesia makes the procedure more fun than a faculty meeting. After the talk, a faculty member from York came up and told me that in Canada they don’t give you that drug. Damn socialized medicine.

I think that the drug they give you has changed. In 2007 it was something like Rohypnol (aka roofies, the date rape drug). My wife Susan took one of her friends, Toni, to get her colonoscopy. One of the first things Toni said after the procedure was that Susan should see the movie Awakenings. Then a few minutes later she said it again, and then again, and again. In 2007, this made me very anxious about the procedure. I was afraid that after it was over I would suddenly wax philosophical about a woman I had met who had a “balcony you could do Shakespeare from.”

In 2018 they gave me Fentanyl. Yes, that is the opioid you have heard about in the news, the one more deadly than heroin, but here the nurse was giving me the injection. Wikipedia says it “is an opioid that is used as a pain medication and together with other medications for anesthesia. It has a rapid onset and effects generally last less than an hour or two.” It had the desired effect during the procedure, but when I left the office I was clear-headed enough to give Susan driving directions to get home from a very unfamiliar part of Route 54, where curiously 234 and 249 are on the same side of the street.

Hopefully, reliving my experiences has been amusing and has told novices more about what to expect. This time the post has a bit of a point, or to be precise, a small question for doctors: “The addition of the four days of pre-prep undoubtedly makes diagnoses more accurate, but does that justify the time spent on a very restrictive and unpleasant low-fiber diet?” Couldn’t we compromise on two days, if I promise that everyone in the country will follow the directions?