A 0-1 law for eclipses

A 0-1 law in probability is a result saying that in certain situations (for example, the asymptotic behavior of sums of independent and identically distributed random variables, or the short time behavior of Brownian motion) all events are trivial, i.e., have probability 0 or 1.

Yesterday I learned that this law applies to eclipses. For months we had been told that on August 21, 2017 in Durham there would be a solar eclipse that at its peak, at 2:45PM, would cover 93% of the sun. That turns out to be about as exciting as being 93% pregnant or having 93% of a proof. The shade of the trees in our front yard seemed a little darker but the sky never did. It turns out that having 7% of the sun exposed is more than enough to be able to see well.

About a month ago I ordered “eclipse glasses” from Amazon, so I could look at the sun without burning my retinas. However, as I learned about a week ago, the glasses were advertised as ISO 12312-2 certified, but they were not. Amazon was the one who told me, and they sent me a refund, but I ended up without glasses. On the big day I made myself a pinhole viewer by sticking the point of a pencil through a note card. When I held it out I did see a light spot on the ground that looked like a circle with a piece missing, but then I wondered if that was because my hole was not round. However, soon after that moment of doubt, I noticed a large number of crescent shaped spots of light on the ground. In short, the overlaps between leaves in the trees made hundreds of pinhole cameras. For the long version see:

https://petapixel.com/2012/05/21/crescent-shaped-projections-through-tree-leaves-during-the-solar-eclipse/

For a few minutes I wandered around looking at the light shapes on my driveway and in the street in front of my house before I got bored and went in, leaving my neighbors to wonder, no doubt, what I was doing wandering around in the street holding my smart phone.

In summary, when 2024 rolls around and the eclipse goes from Texas to Maine, either get yourself to where the eclipse is total or take off for Myrtle Beach where the moon will not block the sun and hotel rooms will be discounted.

Rereading Thurston: What is a proof?

As regular readers of this blog can guess, the inspiration for writing this column came from yet another referee’s report which complained that in my paper “the style of writing is too informal.” Fortunately for you, the incident reminded me of an old article written by Bill Thurston and published in the Bulletin of the AMS in April 1994 (volume 30, pages 161-177) and that will be my main topic.

The background to our story begins with the conjecture Poincaré made in 1904, which states that every simply connected, closed 3-manifold is homeomorphic to the 3-sphere (i.e., the boundary of the ball in four-dimensional space). As many of you already know, after nearly a century of effort by mathematicians, Grigori Perelman presented a proof of the conjecture in three papers made available in 2002 and 2003 on arXiv. He later turned down a Fields Medal in 2006 and a $1,000,000 prize from the Clay Mathematics Institute in 2010.

Twenty years before the events in the last paragraph, Thurston stated his geometrization conjecture. It is an analogue of the uniformization theorem for two-dimensional surfaces, which states that every connected Riemann surface can be given one of three geometries (Euclidean, spherical, or hyperbolic). Roughly, the geometrization conjecture states that every closed 3-manifold can be decomposed in a canonical way into pieces that each have one of eight types of geometric structure.

In the 1980s Thurston published a proof in the special case of “Haken manifolds.” In a July 1993 article in the Bulletin of the AMS (volume 29, pages 1-13)  Arthur Jaffe and Frank Quinn criticized his work as “A grand insight delivered with beautiful but insufficient hints. The proof was never fully published. For many investigators this unredeemed claim became a roadblock rather than an inspiration.”

This verbal salvo was launched in the middle of an article that asked the question “Is speculative mathematics dangerous? Recent interactions between physics and mathematics pose the question with some force: traditional mathematical norms discourage speculation, but it is the fabric of theoretical physics.” They go on to criticize work being done in string theory, conformal field theory, topological quantum field theory, and quantum gravity.  It seems to me that some of these subjects saw spectacular successes in the 21st century, but pursuing that further would take me away from my main topic.

In what follows I will generally use Thurston’s own words, but will edit them for the sake of brevity. His 17-page article is definitely worth reading in full. He begins his article by saying “It would NOT be good to start with the question

How do mathematicians prove theorems?

To start with this would be to project two hidden assumptions: (1) that there is uniform, objective and firmly established theory and practice of mathematical proof, and (2) that progress made by mathematicians consists of proving theorems.”

Thurston goes on to say “I prefer: How do mathematicians advance human understanding of mathematics?”

“Mathematical knowledge can be transmitted amazingly fast within a sub-field. When a significant theorem is proved, it often (but not always) happens that the solution can be communicated in a matter of minutes from one person to another. The same proof would be communicated and generally understood in an hour talk. It would be the subject of a 15 or 20 page paper which could be read and understood in a few hours or a day.

Why is there such a big expansion from the informal discussion to the talk to the paper? One-on-one, people use gestures, draw pictures and make sound effects. In talks, people are more inhibited and more formal. In papers, people are still more formal. Writers translate their ideas into symbols and logic, and readers try to translate back.

People familiar with ways of doing things in a subfield recognize various patterns of statements or formulas as idioms or circumlocutions for certain concepts or mental images. But to people not already familiar with what’s going on, the same patterns are not very illuminating; they are often even misleading.”

Turning to the topic in our title, Section 4 is called “What is a proof?” Thurston’s philosophy here is much different from what I was taught in college. At Emory you were not allowed to quote a result unless you understood its proof.

“When I started as a graduate student at Berkeley … I didn’t really understand what a proof was. By going to seminars, reading papers and talking to other graduate students I gradually began to catch on. Within any field there are certain theorems and certain techniques that are generally known and generally accepted. When you write a paper you refer to these without proof. You look at other papers and see what facts they quote without proof, and what they cite in their bibliography. Then you are free to quote the same theorem and cite the same references. Many of the things that are generally known are things for which there may be no written source. As long as people in the field are comfortable that the idea works, it doesn’t need to have a formal written source.”

“At first I was highly suspicious of this process. I would doubt whether a certain idea was really established. But I found I could ask people, and they could produce explanations or proofs, or else refer me to other people or to written sources. When people are doing mathematics, the flow of ideas and the social standard of validity is much more reliable than formal documents. People are not very good at checking formal correctness of proofs, but they are quite good at detecting potential weaknesses or flaws in proofs.”

There is much more interesting philosophy in the paper but I’ll skip ahead to Section 6 on “Personal Experiences.” There Thurston recounts his work on the theory of foliations. He says that the results he proved were documented in a conventional, formidable mathematician’s style, but they depended heavily on readers who shared certain background and certain insights. This created a high entry barrier. Many graduate students and mathematicians were discouraged by how hard it was to learn and understand the proofs of the key theorems.

Turning to the geometrization theorem: “I’d like to spell out more what I mean when I say I proved the theorem. I meant that I had a clear and complete flow of ideas, including details, that withstood a great deal of scrutiny by myself and others. My proofs have turned out to be quite reliable. I have not had trouble backing up claims or producing details for things I have proven. However, there is sometimes a huge expansion factor in translating from the encoding in my own thinking to something that can be conveyed to someone else.”

Thurston goes on to explain that his result went against the trends in topology for the preceding 30 years and it took people by surprise. He gave many presentations to groups of mathematicians but “at the beginning, the subject was foreign to almost everyone … the infrastructure was in my head, not in the mathematical community.” At the same time he began writing notes on the geometry and topology of 3-manifolds. The mailing list for these notes grew to about 1200 people. People ran seminars based on his notes and gave him lots of feedback. Much of it ran something like “Your notes are inspiring and beautiful, but I have to tell you that in our seminar we spent 3 weeks working out the details of …”

Thurston’s description of the impact his work had on other fields is in sharp contrast to Jaffe and Quinn’s assessment. To see who was right I turned to Wikipedia, which says “The geometrization theorem has been called Thurston’s Monster Theorem, due to the length and difficulty of the proof. Complete proofs were not written up until almost 20 years later (which would be 2002, almost 10 years after Jaffe-Quinn). The proof involves a number of deep and original insights which have linked many apparently disparate fields to 3-manifolds.”

Thurston was an incredible genius. He wrote only 73 papers but they have been cited 4424 times by 3062 different people. His career took him from Princeton to Berkeley, where he was director of MSRI for several years, then to Davis, and he ended his career at Cornell, 2003-2012. I never really met him but I could sense the impact he had on the department. Sadly he died at the age of 65 of metastatic melanoma. A biography and reminiscences can be found at

http://www.ams.org/notices/201511/rnoti-p1318.pdf

A Rainy Monday in Austin


After two great days of hiking, sightseeing, eating and drinking with my younger son (a CS assistant professor at UT Austin) and his girlfriend, who works for MyFitnessPal, the 98 degree heat was replaced by a steady rain. Trapped inside our hotel room, my wife read the New York Times and did a crossword puzzle, while I wrote a couple of referee’s reports on papers that were worse than the weather.

While it is not fun to be forced inside by the rain, it is a good time to reflect on what I’ve seen while visiting Austin. Saturday afternoon we went to the LBJ museum on the UT campus. He served as president for five years after JFK was assassinated in 1963. Before that he was elected to the House of Representatives in 1937 and to the Senate in 1948.

His War on Poverty helped millions of Americans rise above the poverty line during his administration. Civil rights bills that he signed into law banned racial discrimination in public facilities, interstate commerce, the workplace, and housing. The Voting Rights Act prohibited certain laws southern states used to disenfranchise African Americans. With the passage of the Immigration and Nationality Act of 1965, the country’s immigration system was reformed, encouraging greater immigration from regions other than Europe. In short, the Republican agenda times -1.

On Sunday afternoon, we went to the Bullock Texas State History Museum. The most interesting part for me was the story of Texas in the early 1800s. In 1821 Mexico won its independence from Spain, and Texas became part of Mexico. Between 1821 and 1836 an estimated 38,000 settlers, on the promise of 4,000 acres per family for a small fee, trekked from the United States into the territory. The Mexican government grew alarmed at the immigration threatening to engulf the province. Military troops were moved to the border to enforce the policy, but illegal immigrants crossed the border easily. Hopefully the parallel with the current situation ends there, since there were revolts in Texas in 1832, leading to war with Mexico in 1835, and to the independence of Texas in 1836.

My third fun fact is a short one: Austin City Limits was a TV show for 40 years before it became a music festival. I haven’t seen either one, but Austin is a great place to visit.

Duke grads vote on Union

According to the official press release: “Of the 1,089 ballots cast, 691 voted against representation (“NO”) by SEIU and 398 for representation by SEIU (“YES”). There were, however, 502 ballots challenged based on issues of voter eligibility. Because the number of challenged ballots is greater than the spread between the “YES” and “NO” votes, the challenges could determine the outcome and will be subject to post-election procedures of the NLRB.”

The obvious question is: what is the probability that the challenged ballots would change the outcome of the election? If the NO votes lose 397 of the 502 challenged ballots and the YES votes lose the other 105, the count becomes 294 NO, 293 YES, and any larger loss for the NO side flips the election. A fraction 0.6345 of the votes cast were NO. We should treat this as an urn problem, but to get a quick answer we can suppose the number of YES votes lost is Binomial(502, 0.3655). In the old days I would have to trot out Stirling’s formula and compute for an hour to get the answer, but now all I have to do is type into my vintage TI-83 calculator

Binomcdf(502, 0.3655, 105) = 2.40115 × 10^-14

i.e., the probability that 105 or fewer of the 502 lost ballots are YES votes, which is essentially the probability that the challenges change the outcome.
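For readers without a TI-83, here is a sketch of the same computation in Python, using the Binomial(502, 0.3655) model described above:

```python
from math import comb

# P(at most 105 of the 502 challenged ballots are YES votes),
# modeling each lost ballot as a YES vote with probability 0.3655
n, p = 502, 0.3655
prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(106))
print(prob)  # about 2.4e-14
```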

Regular readers of this blog will remember that I made a similar calculation to show that there was a very small probability that the 62,500 provisional ballots would change the outcome of the North Carolina election, since before they were counted Cooper had a 4772 vote lead over McCrory. If we flip 62,500 coins then the standard deviation of the change in the number of votes is

(62,500 × 1/4)^(1/2) = 125

So McCrory would need 33,636 votes = 2,386 above the mean = 19.09 standard deviations. However, as later results showed, this reasoning was flawed: Cooper’s lead grew to more than 10,000 votes. This is due to the fact that, as I learned later, provisional ballots have a greater tendency to be Democratic, while absentee ballots tend to be Republican.

Is this all just #fakeprobability? Let’s turn to a court case, DeMartini versus Power. In a close election in a small town, 2,656 people voted for candidate A compared to 2,594 who voted for candidate B, a margin of victory of 62 votes. An investigation of the election found that 136 of the people who voted should not have. Since this is more than the margin of victory, should the election results be thrown out even though there was no evidence of fraud on the part of the winner’s supporters?

In my wonderful book Elementary Probability for Applications, this problem is analyzed from the urn point of view. Since I was much younger when I wrote the first version of its predecessor in 1993, I wrote a program to add up the probabilities and got 7.492 × 10^-8. That computation supported the Court of Appeals decision to overturn a lower court ruling that voided the election in this case. If you want to read the decision you can find it at

http://law.justia.com/cases/new-york/court-of-appeals/1970/27-n-y-2d-149-0.html
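The urn computation is short enough to redo. With 2,656 votes for A, 2,594 for B, and 136 ballots removed at random, the result ties or reverses only if at least 99 of the removed ballots were votes for A (removing x A-votes and 136 − x B-votes gives a margin of 198 − 2x, which is ≤ 0 exactly when x ≥ 99). A sketch in Python:

```python
from math import comb

# Urn model: 5,250 votes in the urn (2,656 for A, 2,594 for B);
# 136 invalid ballots are drawn out at random.
# The outcome ties or reverses only if >= 99 of them were for A.
a_votes, b_votes, removed = 2656, 2594, 136
total = a_votes + b_votes
p = sum(comb(a_votes, x) * comb(b_votes, removed - x)
        for x in range(99, removed + 1)) / comb(total, removed)
print(p)
```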

Jordan Ellenberg don’t know stat

A couple of nights ago I finished John Grisham’s The Rogue Lawyer, so I started reading Jordan Ellenberg’s “How Not to Be Wrong: The Power of Mathematical Thinking.” The cover says “a math-world superstar unveils the hidden beauty and logic of the world and puts math’s power in our hands.”

The book was only moderately annoying until I got to page 65. There he talks about statistics on brain cancer deaths per 100,000. The top states according to his data are South Dakota, Nebraska, Alaska, Delaware, and Maine. At the bottom are Wyoming, Vermont, North Dakota, Hawaii and the District of Columbia.

He writes “Now that is strange. Why should South Dakota be the brain cancer center and North Dakota nearly tumor free? Why would you be safe in Vermont but imperiled in Maine?”

“The answer: … The five states at the top have something in common, and the five states at the bottom do too. And it’s the same thing: hardly anyone lives there.” There follows a discussion of flipping coins and the fact that frequencies have more random variation when the sample size is small, but he never stops to see if this is enough to explain the observation.

My intuition told me it did not, so I went and got some brain cancer data.

https://www.statecancerprofiles.cancer.gov/incidencerates/

In the next figure the x-axis is population size, plotted on a log scale to spread out the points, and the y-axis is the five year average rate per year per 100,000 people. Yes, there is less variability as you move to the right, and little Hawaii is way down there, but there are also some states toward the middle that are on the top edge. The next plot shows 99% confidence intervals versus state size. I used 99% rather than 95% since there are 49 data points (nothing for Nevada for some reason).

[Figure: brain_cancer_fig1]

In the next figure the horizontal line marks the average, 6.6. The squares are the upper end points of the confidence intervals. When one falls below the line, this suggests that the rate for that state is significantly lower than the national average. From left to right: Hawaii, New Mexico, Louisiana and California. When the little diamond marking the lower end of the confidence interval is above the line, we suspect that the rate for that state is significantly higher than the mean. There are eight states in that category: New Hampshire, Iowa, Oregon, Kentucky, Wisconsin, Washington, New Jersey, and Pennsylvania.

[Figure: brain_cancer_fig2]
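Intervals like these can be sketched with a normal approximation to a Poisson rate; the function below is illustrative only (the populations fed to it are made up, and the cancer.gov site may compute its intervals differently):

```python
from math import sqrt

def rate_ci99(rate_per_100k, population, years=5, z=2.576):
    # 99% CI for a rate per 100,000 person-years, normal approximation:
    # the observed count is roughly Poisson with mean rate * person_years / 1e5
    person_years = population * years
    se = sqrt(rate_per_100k * 1e5 / person_years)
    return rate_per_100k - z * se, rate_per_100k + z * se

# hypothetical populations: a small state and a large one, same rate
print(rate_ci99(6.6, 1.4e6))   # wide interval for the small state
print(rate_ci99(6.6, 39e6))    # much narrower for the large state
```

This makes the book's point quantitative: at the same underlying rate, the small state's interval is several times wider, so its observed rate wanders much further from 6.6.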

So yes, there are 12 significant deviations from the mean (versus the 5 we would expect if all 49 states had mean 6.6), but they are not the ones at the top or the bottom of the list, and the variability of the sample mean has nothing to do with the explanation. So Jordan, welcome to the world of APPLIED math, where you have to look at data to test your theories. Don’t feel bad: the folks in the old Chemistry building at Duke will tell you that I don’t know stat either. For a more professional look at the problem see

http://www.stat.columbia.edu/~gelman/research/published/allmaps.pdf

North Carolina Gubernatorial Election

Tuesday night, after the 4.7 million votes had been counted from all 2,704 precincts, Roy Cooper had a 4,772 vote lead over Pat McCrory. Since there could be as many as 62,500 absentee and provisional ballots, it was decided to wait until these were counted to declare a winner. The question addressed here is: what is the probability that these votes will change the outcome?

To do the calculation we need to make an assumption: the additional votes are similar to the overall population, so they are like flipping coins. In order to change the outcome of the election, Cooper would have to get fewer than 31,250 – 4,772/2 = 28,864 votes. The standard deviation of the number of heads in 62,500 coin flips is (62,500 × 1/4)^(1/2) = 125, so this represents 19.09 standard deviations below the mean.

One could be brave and use the normal approximation. However, all semester while I have been teaching Math 230 (Elementary Probability), people have been asking: why do this when we can just use our calculator?

Binomcdf(62500, 0.5, 28864) = 1.436 × 10^-81

In contrast, if we use the normal approximation with the tail bound (which I found impossible to type using the equation editor) we get 1.533 × 10^-81.
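The normal tail is easy to reproduce; Python's math.erfc handles values this far out without underflowing. A sketch of the calculation above:

```python
from math import erfc, sqrt

# McCrory needs 33,636 of the 62,500 remaining ballots,
# i.e. 2,386 above the mean of 31,250, with sd = 125.
mean, sd = 62500 * 0.5, sqrt(62500 * 0.25)
z = (33636 - mean) / sd               # about 19.09 standard deviations
p = 0.5 * erfc(z / sqrt(2))           # normal upper-tail probability
print(z, p)
```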

We can’t take this number too seriously, since the probability our assumption is wrong is larger than that, but it suggests that we will likely have a new governor and House Bill 2 will soon be repealed.

Teaching Statistics using Donald Trump

Recently, in Pennsylvania, Donald Trump said “The only way they can beat me in my opinion, and I mean this 100 percent, if in certain sections of the state they cheat.” He never said how he determined that. If it is on the basis of the people he talked to as he campaigned, he had a very biased sample.

At about the time of Trump’s remarks, there was a poll showing 50% voting for Clinton, 40.6% for Trump, and the others undecided or not stating an opinion. Let’s look at the poll result through the eyes of an elementary statistics class. We are not going to give a tutorial on that subject here, so if you haven’t had the class, you’ll have to look online or ask a friend.

Suppose we have 8.2 million marbles (representing the registered voters in PA) in a really big bowl. Think of one of those dumpsters they use to haul away construction waste. Suppose we reach in and pick out 900 marbles at random, which is the size of a typical Gallup poll. For each blue Hillary Clinton marble we add 1 to our total, for each red Donald Trump marble we subtract 1, and for each white undecided marble we add 0.

The outcomes of the 900 draws are independent. To simplify the arithmetic, we note that since our draws only take the values -1, 0, and 1, they have variance less than 1. Thus when we add up the 900 results and divide by 900, the standard deviation of the average is at most (1/900)^(1/2) = 1/30. By the normal approximation (central limit theorem), about 95% of the time the result will be within 2/30 = 0.0667 of the true mean. In the poll results above the average is 0.5 – 0.406 = 0.094, so by Statistics 101 reasoning we are 95% confident that there are more blue marbles than red marbles in the “bowl.”
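The marble-drawing argument is easy to check numerically. Here is a sketch that computes the 95% margin and simulates one poll with the stated proportions:

```python
import random

# Each draw scores +1 (Clinton), -1 (Trump) or 0 (undecided);
# a single draw has variance at most 1, so the average of
# 900 draws has standard deviation at most 1/30.
se_bound = (1 / 900) ** 0.5
print(2 * se_bound)          # 95% margin of error, about 0.0667

random.seed(2016)
draws = random.choices([1, -1, 0], weights=[0.500, 0.406, 0.094], k=900)
print(sum(draws) / 900)      # one simulated poll average
```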

That analysis is oversimplified in at least two ways. First of all, when you draw a marble out of the bowl you get to see what color it is. If you ask a person who they are going to vote for, they may not tell you the truth. It is for this reason that the use of exit polls has been discontinued. If you ask people how they voted when they leave the polling place, what you estimate is the fraction of blue voters among those willing to talk to you, not the fraction of people who voted blue. A second problem with our analysis is that people will change their opinions over time.

A much more sophisticated analysis of polling data can be found at FiveThirtyEight.com, specifically at http://projects.fivethirtyeight.com/2016-election-forecast/ There, if you hover your mouse over Pennsylvania (today is August 16), you find that Hillary has an 89.3% chance of winning Pennsylvania versus Donald Trump’s 10.7%, which is about the same as the predictions for the overall winner of the election.

The methodology used is described in detail at

http://fivethirtyeight.com/features/a-users-guide-to-fivethirtyeights-2016-general-election-forecast/

In short, they use a weighted average of the results of about 10 polls, with weights based on how well the polls have done in the past. In addition, they are conservative in the early going, since surprises can occur.

Nate Silver, the founder of FiveThirtyEight, burst onto the scene in 2008 when he correctly predicted how 49 of the 50 states would vote in that year’s presidential election. In 2012, while CNN was noting Obama and Romney were tied at 47% of the popular vote, he correctly predicted that Obama would receive more than 310 electoral votes and easily win the election.

So Donald, based on the discussion above, I can confidently say that no cheating is needed for you to lose Pennsylvania. Indeed, at this point in time, it would take a miracle for you to win it.

The odds of a perfect bracket are roughly a billion to 1

This time of year it is widely quoted that the odds of picking a perfect bracket are 9.2 quintillion to one. In scientific notation that is 9.2 × 10^18, or if you like writing out all the digits it is 9,223,372,036,854,775,808 to 1. That number is 2^63; its reciprocal is the chance that you succeed if you flip a coin to make every pick.
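The arithmetic is a one-liner in Python, assuming 63 coin-flip games:

```python
# 63 games, each picked by a coin flip: 2^63 equally likely brackets
n = 2 ** 63
print(f"{n:,}")  # 9,223,372,036,854,775,808
```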

If you know a little you can do much better than this, say by taking into account the fact that a 16 seed has never beaten a 1 seed. In a story widely quoted last year, “Duke math professor Jonathan Mattingly calculated the odds of picking all 32 games correctly is actually one in 2.4 trillion.” He doesn’t give any details, but I don’t know why I should trust a person who doesn’t know there are 63 games in the tournament.

Using a different approach, DePaul mathematician Jay Bergen calculated the odds at one in 128 billion. His YouTube video from four years ago https://www.youtube.com/watch?v=O6Smkv11Mj4 is entertaining but light on details.

Here I will argue that the odds are closer to one billion to 1. The key to my calculation of the probability of a perfect bracket is to use data from the outcomes of the first round games in 20 years of 64-team NCAA tournaments. The columns give the matchup, the win totals for the two seeds, and the fraction of games won by the pick recommended below:

1-16    80-0     1.0000
2-15    76-4     0.9500
3-14    67-13    0.8375
4-13    64-16    0.8000
5-12    54-26    0.6750
6-11    56-24    0.7000
7-10    48-32    0.6000
8-9     37-43    0.5375

From this we see that if we pick the 9 seed to “upset” the 8, but in all other cases pick the higher seed, then we will pick all 8 games correctly with probability 0.09699, or about 0.1, compared to the 1/256 chance you would have by guessing.

Not having data for the other seven games, I will make the rash but simple assumption that the probability of picking those seven games correctly is also 0.1. Combining our two estimates, we see that the probability of perfectly predicting a regional tournament is 0.01. All four regional tournaments can then be done with probability 10^-8. There are three games left to pick the champion from the final four. If we simply guess at this point we have a 1 in 8 chance, and a final answer of about 1 in a billion.
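Putting the pieces together (the 0.1 for the later rounds is the rash assumption just described):

```python
from math import prod

# historical first-round win frequencies, picking the 9 over the 8
first_round = [1.0, 0.95, 0.8375, 0.8, 0.675, 0.7, 0.6, 0.5375]
p_first = prod(first_round)          # about 0.097
p_region = p_first * 0.1             # assume 0.1 for the other 7 games
p_bracket = p_region ** 4 * (1 / 8)  # four regions, then guess the last 3
print(p_first, p_bracket)            # roughly a 1-in-a-billion chance
```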

To argue that this number is reasonable, let’s take a look at what happened in the 2015 bracket challenge. 320 points are up for grabs in each round: 10 points for each of the 32 first round games (the play-in or “first four” games are ignored), 20 for each of the 16 second round games, and so on until picking the champion gives you 320 points. The top ranked bracket had

27 × 10 + 14 × 20 + 8 × 40 + 4 × 80 + 2 × 160 + 1 × 320 = 1830 points out of 1920.

This person missed 5 first round and 2 second round games. There are a number of other people with scores of 1800 or more, so it is not too far-fetched to believe that if the number of entries were increased by a factor of 2^7 = 128 we might have a perfect bracket. The last calculation is a little dubious, but if the true odds were 2.4 trillion to one or even 128 billion to 1, it is doubtful that one of 11 million entrants would get this close.

With some more work one could collect data on how often an i-th seed beats a j-th seed when they meet in a regional tournament, or perhaps you could convince ESPN to say how many of its 11 million entrants managed to pick a regional tournament correctly. But that is too much work for a lazy person like myself on a beautiful day during Spring Break.

Probability and the Florida Lottery

The usual probability story in this context is something like the following. A New Jersey woman, Evelyn Adams, won the lottery twice within a span of four months, raking in a total of 5.4 million dollars. She won the jackpot for the first time on October 23, 1985 in the Lotto 6/39, in which you pick 6 numbers out of 39. Then she won the jackpot in the new Lotto 6/42 on February 13, 1986. Lottery officials calculated the probability of this as roughly one in 17.1 trillion, which is the probability that one preselected person wins the lottery on two preselected dates.

That number shrinks dramatically when one realizes that (i) somebody won the October 23, 1985 lottery, (ii) we would have been equally impressed if this had happened twice within a one year period (100 twice-weekly drawings), and (iii) many people who play the lottery buy more than one ticket. Taking these three things into account, the probability ends up being about 1/200, and it shrinks further if we take into account the number of states with lotteries. For more examples of things that aren’t as surprising as they seem, look at http://www.math.duke.edu/~rtd/Talks/Emory.pdf.

A recent paper on the arXiv (1503.02902) by Rich Arratia, Skip Garibaldi, Lawrence Mower, and Philip B. Stark tells a different type of story. In Florida’s Play 4 game you pick a four digit number like 3782, and if all four digits match you win $5000. The fact that this event has probability 1/10000, and hence a ticket nets you $0.50 on average, says either that (i) people can’t think, or (ii) they have utility functions that value a large sum disproportionately more than the $1 they use to play the game.

Some people, however, are very good at winning this gamble. An individual that we will call LJ has won 57 times. Now that by itself is not proof of guilt. If he bought 570,000 tickets he would end up with about this many wins. However, that seems a little unlikely. If he bought only 250,000 tickets, the probability of 57 wins is 1.22 × 10^-8. (Exercise for the reader.)
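For readers who want a head start on the exercise, here is a sketch of the binomial computation (exact combinatorics via math.comb; the tail probability of 57 or more wins is of the same order of magnitude):

```python
from math import comb

# Probability of exactly 57 wins in 250,000 plays of a game
# that pays off with probability 1/10,000 per ticket
n, k, p = 250_000, 57, 1e-4
pmf = comb(n, k) * p**k * (1 - p)**(n - k)
print(pmf)  # on the order of 1e-8
```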

Arratia et al. give a very nice calculation that shows something funny must be going on. Skipping the math, the bottom line is that if the 19 million people who live in Florida all sold their houses, took the $175,000 in proceeds (the average house value), and bought lottery tickets (reinvesting the winnings) until they ran out of money, the probability that someone would win 57 times or more is 1 in a million.

How did LJ get so lucky? There are three common schemes. (i) A clerk can scratch the ticket with a pin, revealing enough of the bar code to scan it and see if it is a winner. (ii) Sometimes a customer will ask the clerk if the ticket was a winner. If so, the clerk may lie and keep the money himself. (iii) Sometimes the winner may be an illegal immigrant, or owe child support or back taxes, and will sell the ticket to an aggregator who pays half price for it and later claims the prize. This is a good scheme for people who want to launder money.

It would be nice if I could tell you that probability helped catch a criminal but at least it wasn’t involved in a miscarriage of justice like Sally Clark experienced. She was convicted of murder based on the calculation that the odds were 73 million to 1 against two of her children dying of what is called cot death in the UK.

Two movies about Alan Turing

The Imitation Game (IG) is a great movie which has brought a lot of attention to Alan Turing but if you like it, you should also watch the 2013 film Codebreaker (CB), which can be streamed on Netflix. Remarkably these two films give almost disjoint accounts of his life. I guess at this point I should give a SPOILER ALERT that I am about to describe some pivotal events in his life, which are revealed in the movie. A couple of the revelations might spoil your enjoyment but you have been warned.

CB spends a fair amount of time on Turing’s work on computability. The movie even shows a copy of Turing’s 1936 paper “On Computable Numbers, with an Application to the Entscheidungsproblem (Decision Problem),” which settled a question posed by David Hilbert in 1928. It goes into some detail describing what a Turing machine is. As many of you know, he proved that such a machine would be capable of performing any conceivable mathematical computation if it were representable as an algorithm. The movie doesn’t go on to mention that Turing showed that the halting problem for Turing machines is undecidable, but that’s OK, since Von Neumann acknowledged that the central concept of the modern computer was due to this paper. Not a bad result for an undergraduate at King’s College.

IG spends a lot of time telling the story of Turing’s role in decrypting messages sent by the Enigma machine. At Bletchley Park, Turing built an electromechanical machine, the bombe, that could help break Enigma more effectively than the Polish bomba kryptologiczna, from which its name was derived.

The bombe searched for the correct settings of an Enigma message (i.e. rotor order, rotor settings and plugboard settings), using a suitable crib: a fragment of probable plaintext. A brute force search was not practical: there were on the order of 10^19 possible settings, or 10^22 for the four-rotor U-boat variant. Instead, for each candidate rotor setting, the bombe performed a chain of logical deductions based on the crib, implemented electrically. It detected when a contradiction had occurred, ruled out that setting, and moved on to the next. According to the movie, a breakthrough came when they realized that the Germans sent out a weather report each day at 6AM that ended with the phrase Heil Hitler.
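The bombe’s deduction chains are far too intricate for a blog post, but the crib idea, ruling out any setting whose output contradicts the probable plaintext, can be sketched with a toy cipher. The Caesar shift below is my own stand-in and bears no resemblance to Enigma’s rotors; WETTERBERICHT is just German for “weather report”:

```python
def decrypt(ciphertext, shift):
    # Toy stand-in for "one rotor setting": a simple Caesar shift.
    return "".join(chr((ord(c) - ord("A") - shift) % 26 + ord("A"))
                   for c in ciphertext)

def search(ciphertext, crib):
    """Try every 'setting'; rule out any whose decryption contradicts
    the crib, i.e. does not contain the probable plaintext."""
    return [s for s in range(26) if crib in decrypt(ciphertext, s)]

cipher = decrypt("WETTERBERICHT", -3)   # encrypt by shifting +3
print(search(cipher, "WETTER"))          # prints [3]: only the true setting survives
```

A known daily phrase is exactly this kind of crib: it lets the machine discard almost every setting after a few cheap consistency checks instead of trying to read the whole message under each one.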

Both movies mention the fact that in 1941, Turing proposed marriage to Hut 8 co-worker Joan Clarke (played by Keira Knightley in IG), a fellow mathematician and cryptanalyst, but their engagement was short-lived. After admitting his homosexuality to his fiancée, who was reportedly “unfazed” by the revelation, Turing decided that he could not go through with the marriage. (In real life Joan Clarke was somewhat less attractive.)

Despite some scenes of Turing running long distances (in CB if I recall correctly), neither movie mentions that while working at Bletchley, Turing, who was a talented long-distance runner, occasionally ran the 40 miles to London when he was needed for high-level meetings. He was capable of world-class marathon times: he tried out for the 1948 British Olympic team, and his marathon time was only 11 minutes slower than British silver medalist Thomas Richards’ Olympic race time of 2 hours 35 minutes.

Going back in time, a third key event in Turing’s life occurred at Sherborne School, which Turing entered in 1926 at the age of 13. At Sherborne, Turing formed an important friendship with fellow pupil Christopher Morcom, which provided inspiration for Turing’s future endeavours. However, the friendship was cut short by Morcom’s death in February 1930 from complications of bovine tuberculosis, contracted after drinking infected cow’s milk some years previously. This event shattered Turing’s religious faith and he became an atheist.

Neither movie has anything to say about his groundbreaking paper “The Chemical Basis of Morphogenesis,” published in 1952, which put forth his ideas about pattern formation in development. However, both movies cover the fact that in January 1952, Turing, then 39, started a relationship with Arnold Murray, a 19-year-old unemployed man. A burglary brought the police to Turing’s house; in the course of the investigation they discovered the relationship, and the matter of the missing valuable watch was forgotten.

Homosexual acts were criminal in the UK at that time. Turing was convicted and given a choice between imprisonment and probation, the latter conditional on his agreement to undergo hormonal treatment designed to reduce libido. He accepted treatment via injections of stilboestrol, a synthetic estrogen (CB shows you the bottle of tablets). The treatment rendered Turing impotent and caused gynaecomastia, the growth of female breasts.

On 8 June 1954, Turing’s housekeeper found him dead. He had died the previous day. A post-mortem examination established that the cause of death was cyanide poisoning. When his body was discovered, an apple lay half-eaten beside his bed, and although the apple was not tested for cyanide, it was speculated that this was the means by which a fatal dose was consumed.

CB spends more time on the impact of the estrogen therapy than IG, which has one brief scene with Turing and Joan Clarke one year after his conviction, in which he shows tremors in his movements. CB makes the point that the hormones did more than stop his sex drive; they also affected his ability to think. This part of the story is told in IG through a conversation between Turing and a policeman, which explains the title The Imitation Game. In “Computing Machinery and Intelligence,” Turing addressed the problem of artificial intelligence and proposed an experiment that became known as the Turing test, an attempt to define a standard for a machine to be called “intelligent.” The idea was that a computer could be said to “think” if a human interrogator could not tell it apart, through conversation, from a human being.

The achievements listed above do not exhaust all the extraordinary things Turing did in his 41 years. IG portrays him as an overbearing individual who could not understand other people’s feelings. Today we would say he was on the autism spectrum. CB tells the story of a brilliant man who just happened to be gay. Independent of which of these (if either) is true, IG says in its closing moments that cracking the Enigma Code shortened the war by two years and SAVED 14 MILLION LIVES.

Given this and his impressive intellectual achievements, the decision to chemically castrate Turing, which led to his death, was insane, as is the fact that it took until 2009 for the British Government to apologize. With an important decision on gay marriage looming in the Supreme Court, this is an important example to keep in mind. When homophobes and bigots quote the Bible to justify that marriage is only allowed between one man and one woman, we should ask “What would Jesus do?”

For more on Turing you could buy the book Alan Turing: The Enigma, written by Andrew Hodges (with a foreword by Douglas Hofstadter), or visit www.turing.org.uk/, maintained by Hodges.