Jonathan Mattingly’s work on Gerrymandering

My last two posts were about a hurricane and a colonoscopy, so I thought it was time to write about some math again.

For the last five years Mattingly has worked on a problem with important political ramifications: what would a typical set of congressional districts (say the 13 districts in North Carolina) look like if they were chosen at “random” subject to the restrictions that they contain a roughly equal number of voters, are connected, and minimize the splitting of counties? The motivation for this question can be explained by looking at the current congressional districts in North Carolina. The tiny purple snake is district 12. It begins in Charlotte, goes up I40 to Greensboro, and then wiggles around to contain other nearby cities, producing a district with a large percentage of Democrats.

To explain the key idea of gerrymandering, suppose, to keep the arithmetic simple, that a state has 2000 Democrats and 2000 Republicans. If there are four districts and we divide voters

District    Republicans    Democrats
1           600            400
2           600            400
3           600            400
4           200            800

then Republicans will win in 3 districts out of 4. The last solution extends easily to create 12 districts where the Republicans win 9. With a little more imagination and the help of a computer one can produce the outcome of the 2016 election in North Carolina, in which 10 Republicans and 3 Democrats were elected, despite the fact that the split between the parties is roughly 50-50.
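The arithmetic of the four-district example can be checked in a few lines. This is only a sketch of the tally, using the numbers from the table; the names `districts`, `seats_won` and `statewide_totals` are made up for the illustration.

```python
# Vote counts copied from the four-district example in the text.
districts = {
    1: {"R": 600, "D": 400},
    2: {"R": 600, "D": 400},
    3: {"R": 600, "D": 400},
    4: {"R": 200, "D": 800},
}

def seats_won(districts):
    """Count how many districts each party carries."""
    seats = {"R": 0, "D": 0}
    for votes in districts.values():
        winner = max(votes, key=votes.get)  # party with the most votes
        seats[winner] += 1
    return seats

def statewide_totals(districts):
    """Add up each party's votes over all districts."""
    totals = {"R": 0, "D": 0}
    for votes in districts.values():
        for party, count in votes.items():
            totals[party] += count
    return totals

seats = seats_won(districts)
totals = statewide_totals(districts)
print(seats, totals)
```

The statewide vote is an even split, yet one party takes three of the four seats, because the other party's voters are packed into district 4.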

The districts in the North Carolina map look odd, and the 7th district in Pennsylvania (named Goofy kicks Donald Duck) looks ridiculous, but this is not proof of malice.

Mattingly, with a group of postdocs, graduate students, and undergraduates, has developed a statistical approach to this subject. To explain this we will consider a simple problem that can be analyzed using material taught in a basic probability or statistics class. A company has a machine that produces cans of tomatoes. On the average the can contains a pound of tomatoes (16 ounces), but the machine is not very precise, so the weight has a standard deviation (a statistical measure of the “typical deviation” from the mean) of 0.2 ounces. If we assume the weight of tomatoes follows the normal distribution then 68% of the time the weight will be between 15.8 and 16.2 ounces. To see if the machine is working properly an employee samples 16 cans and finds an average weight of 15.7 ounces.

To see if something is wrong we ask the question: if the machine was working properly then what is the probability that the average weight would be 15.7 ounces or less? The standard deviation of one observation is 0.2 but the standard deviation of the average of 16 observations is 0.2/(16)^{1/2} = 0.05. The observed average is 0.3 below the mean, or 6 standard deviations. Consulting a table of the normal distribution or using a calculator we see that if the machine was working properly then an average of 15.7 or less would occur with probability less than 1/10,000.
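For readers who would rather not consult a table, here is a small sketch of the same calculation, working in ounces throughout. The standard deviation of the average is 0.2/(16)^{1/2} = 0.05, so an average of 15.7 sits 6 standard deviations below 16. The variable names are mine, and the normal tail is computed from the standard library's complementary error function.

```python
import math

mean = 16.0        # target weight in ounces
sd_single = 0.2    # standard deviation of one can, in ounces
n = 16             # number of cans sampled
observed = 15.7    # observed average weight, in ounces

# The standard deviation of an average of n independent observations
# is the single-observation value divided by sqrt(n).
sd_mean = sd_single / math.sqrt(n)

# Number of standard deviations between the observed average and the mean.
z = (observed - mean) / sd_mean

# P(Z <= z) for a standard normal Z, written with the complementary error function.
p_value = 0.5 * math.erfc(-z / math.sqrt(2))

print(sd_mean, z, p_value)
```

The tail probability comes out far smaller than 1/10,000, so the bound quoted above is a very conservative one.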

To approach gerrymandering, we ask a similar question: if the districts were drawn without looking at party affiliation, what is the probability that we would have 3 or fewer Democrats elected? This is a more complicated problem since one must generate a random sample from the collection of districts with the desired properties. To do this Mattingly’s team has developed methods that explore the space of possibilities by making successive small changes in the maps. Using this approach one has to make a large number of changes before one has a map that is “independent.” In a typical analysis they generate 24,000 maps. They found that using the randomly generated maps and retallying the votes, ≤3 Democrats were elected in fewer than 1% of the scenarios. The next graphic shows results for the 2012, 2016 maps and one drawn by judges.
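To give a feel for the idea of sampling maps by successive small changes, here is a toy sketch. It is emphatically not the Duke team's method: the “state” is a made-up line of 40 equal-sized precincts, a district is a block of consecutive precincts, the only constraint is that each district has between 8 and 12 precincts, and a “small change” moves one boundary by one precinct. The vote counts, the function names, and the “enacted” seat count are all hypothetical.

```python
import random

def dem_seats(cuts, dem, rep):
    """Count districts in which Democrats outpoll Republicans (a tie is not a win).
    cuts is a sorted list of boundaries that includes 0 and len(dem)."""
    seats = 0
    for lo, hi in zip(cuts, cuts[1:]):
        if sum(dem[lo:hi]) > sum(rep[lo:hi]):
            seats += 1
    return seats

def valid(cuts, min_size, max_size):
    """Every district must contain between min_size and max_size precincts."""
    return all(min_size <= hi - lo <= max_size for lo, hi in zip(cuts, cuts[1:]))

def sample_plans(start_cuts, dem, rep, min_size, max_size, n_samples, thin, rng):
    """Random walk on plans: move one interior boundary by one precinct,
    keep the move only if the new plan is valid, and record the Democratic
    seat count every `thin` steps so that recorded plans are less correlated."""
    cuts = list(start_cuts)
    counts = []
    for step in range(n_samples * thin):
        i = rng.randrange(1, len(cuts) - 1)   # pick an interior boundary
        proposal = list(cuts)
        proposal[i] += rng.choice((-1, 1))
        if valid(proposal, min_size, max_size):
            cuts = proposal
        if (step + 1) % thin == 0:
            counts.append(dem_seats(cuts, dem, rep))
    return counts

rng = random.Random(0)
n_precincts = 40
dem = [rng.randint(40, 60) for _ in range(n_precincts)]  # hypothetical votes
rep = [100 - d for d in dem]
start = [0, 10, 20, 30, 40]                              # four equal districts

counts = sample_plans(start, dem, rep, 8, 12, n_samples=2000, thin=50, rng=rng)

enacted_seats = 1  # hypothetical seat count under the map being challenged
frac = sum(c <= enacted_seats for c in counts) / len(counts)
print(frac)
```

The final number plays the role of the “fewer than 1%” figure above: it is the fraction of sampled plans that give Democrats no more seats than the map under scrutiny. Real districts live on a two dimensional map with many more constraints, which is where the hard work lies.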

Mattingly has also done analyses of congressional districts in Wisconsin and Pennsylvania, and has helped lawyers prepare briefs for cases challenging voting. His research has been cited in many decisions including that of the three judge panel who ruled in August 2018 that the NC congressional districts were unconstitutional. For more details see the Quantifying Gerrymandering blog:

https://sites.duke.edu/quantifyinggerrymandering/author/jonmduke-edu/

Articles about Mattingly’s work have appeared in

(June 26, 2018) Proceedings of the National Academy of Sciences 115 (2018), 6515–6517

(January 17, 2018)  Nature 553 (2018), 250

(October 6, 2017) New York Times  https://www.nytimes.com/2017/10/06/opinion/sunday/computers-gerrymandering-wisconsin.html

The last article is a good (or perhaps I should say bad) example of what can happen when your work is written about in the popular press. The article, written by Jordan Ellenberg, is, to stay within the confines of polite conversation, simply awful. Here I will confine my attention to its two major sins.

  1. Ellenberg refers several times to the Duke team but never mentions them by name. I guess our not-so-humble narrator does not want to share the spotlight with the people who did the hard work. The three people who wrote the paper are Jonathan Mattingly, professor and chair of the department, Greg Herschlag, a postdoc, and Robert Ravier, one of our better grad students. The paper went from nothing to fully written in two weeks in order to get ready for the court case. However, thanks to a number of late nights they were able to present clear evidence of gerrymandering. It seems to me that they deserve to be mentioned in the article, and it should have mentioned that the paper was available on the arXiv, so people could see for themselves.
  2. The last sentence of the article says “There will be many cases, maybe most of them, where it’s impossible, no matter how much math you do, to tell the difference between innocuous decision making and a scheme – like Wisconsin’s – designed to protect one party from voters who might prefer the other.” OMG. With many anti-gerrymandering lawsuits being pursued across the country, why would a “prominent” mathematician write that in most cases math cannot be used to detect gerrymandering?

Two faces of Hurricane Florence

The title is supposed to evoke images of a woman who is a soccer mom by day and serial killing hooker by night. The coastal part of the state of North Carolina saw the second side of Florence. In Durham, which is 150 miles from Wilmington, we mostly saw her softer side.

In the days leading up to Sunday September 9, the storm was hyped by the Weather Channel and the local news stations. Florence started as category 1 in the eastern half of the Atlantic Ocean but it was predicted that as she reached the warmer weather in the Atlantic, she would grow to category 4 and smash into the coast somewhere in the Carolinas (and she did). The message of a coming disaster was reinforced by our neighborhood listserv. People who lived through Hurricane Fran in 1996 saw many trees go down and lived a week or more without power, and they were not anxious for a repeat performance. Emails were filled with long lists of things we should do in order to prepare.

Monday September 10. At my house we had our semi-annual inspection of the heating/cooling system. It is hard to get work done while the technician is in the house, so I googled “How do you prepare for a hurricane?” The answers were similar to what I had seen on the listserv, with one new funny one: remove all of the coconuts from your trees, since they can become cannonballs in a storm. When the inspection was finished about 11:30 (with no repairs needed!), I went to the grocery store with my little list: water and nonperishable foods. The store was a zoo and water was flying off the shelves, but I came home with 48 quart bottles of Dasani, canned fruit, canned soup, bread, peanut butter, energy bars, etc.

Tuesday September 11 was the 17th anniversary of a day that changed the world. Donald and Melania visited the new memorial to the plane that went down in Pennsylvania. I came down with a cold and stayed home from school. By this time, there was more weather than news on the news. I recall hearing one weatherman pontificate “at this point we are as good with predictions 72 hours out as we used to be at 24 hours.” Then he showed a track that had the eye of a category 1 storm over Durham about five days later. Tuesday afternoon Duke announced that all classes were cancelled as of 5PM Wednesday, with no classes on Thursday and Friday. UNC and NCState also closed and in a sign of looming disaster cancelled their football games. For UNC this was a blessing. Having lost to Cal and to East Carolina in the two previous weeks, they were happy to escape from another loss at the hands of U of Central Florida.

Wednesday September 12. The morning weather forecast announced a dramatic change in the track of the storm. It was now predicted to turn left after landfall, hang out at the beach for a couple of days and then head off to Atlanta. The new storm track was great news for Durham but not for Myrtle Beach, South Carolina, which was given a short deadline to evacuate. Wednesday was more or less a normal day. I met with a graduate student at 11, had lunch, taught my class, and announced the shifted schedule for the homework and exams due to the cancellation of class on Friday.

Thursday September 13 was the calm before the storm. My wife and I took a walk around the neighborhood in the morning, went out to a Mexican lunch at La Hacienda at the northern edge of Chapel Hill, and grilled some chicken for dinner. (In NC barbecuing refers to cooking a large piece of meat over a slow fire until you can pull it apart with your hands. Grilling gets the job done in 15-20 minutes.) Feeling more confident about the future we scaled back from cooking four chicken breasts, to have leftovers that could be eaten over the next few days, to making only two.

Friday September 14. Florence made landfall at Wilmington as a category 2 hurricane, which is definitely more than half as strong as a category 4. It was amusing to see several weathermen competing to see who could be filmed in the eye of the hurricane, where the winds suddenly drop to 0. A woman on the neighborhood listserv described it as an eerie experience. Most of the other things that happened along the coast were not at all funny. New Bern was about 30 miles from the coast, but it was on the banks of the Neuse River. Rains plus hurricane winds resulted in flooding that left hundreds of people (who stupidly chose not to evacuate) in need of rescue. Up in Durham things were much more sedate. There was very little wind or rain. However, since the weatherman had told us to shelter in place, we stayed in for most of the day, as did most of our neighbors.

Saturday September 15 was more or less a repeat of Friday. Twisting a line from Big Bang in which Penny is talking about her relationship with Leonard: “This is a new boring kind of hurricane.” I don’t know what normal people do during a hurricane, but it is a great chance to get some work done. On Friday I read a couple of papers from the arXiv and thought up a new problem for one of my grad students to work on. Saturday I decided I would use the lull to finally finish up the 5th edition of Probability: Theory and Examples, so I sifted through emails I had saved from readers and corrected some typos. Not everything I do is math. I watched the third round of the Evian Championship, the fifth major of the LPGA season. More in keeping with manly pursuits, I watched parts of Duke’s 40-27 victory over Baylor in Waco, Texas. The win was remarkable because Duke’s starting quarterback, a junior who might make it to the pros, was out. In addition, I watched Texas beat the USC Trojans, 37-14. It’s not just that my son is a CS professor in Austin. Having been at UCLA for nine years, when Pete Carroll was there, I love to see USC lose.

Sunday September 16. Duke’s severe weather policy (which covers not only the university but also the hospitals) ended at 7AM, so we figured that we had reached the end of the hurricane. We got up and went to the Hope Valley Diner for breakfast at about 7:30. When we first started going there it was called Rick’s. However, the owner got tired of having a restaurant named after her ex-husband. In keeping with the intellectual climate of Durham, our usual waitress is a second year medical student at UNC Greensboro. Weather was light rain as it had been for the last couple of days, so we went to the Southpoint Mall to get out of the house. Since this is a Christian region, almost all the stores don’t open until noon. At the mall Susan bought some Crocs, and I bought a couple of books that I read before going to sleep. However, mostly we walked around like many of the families who came there with their small kids.

Monday September 17. Our hopes of nicer weather were quickly dashed. It was raining very hard when we woke up. Curiously the direction of the flow was now from SW to NE in contrast with the last few days of SE to NW. Almost immediately there was a tornado alert on TV accompanied by its loud obnoxious noise, and a robophone call to tell us of the event, which came a few minutes after channel 14 told us that the warning had expired. The tornado warning came from an area well north of us, so we weren’t really worried when soon there was a second one even further north. This was too much excitement for Duke, so they cancelled classes until noon, irritating people who traveled through awful weather to get to their 8:30AM classes. Soon after the despair of facing another day of rain set in, the sun came out and Susan and I took a walk.

Tuesday and Beyond. Unfortunately the end of the rain does not bring the end of the misery for people near the coast, as we learned with Hurricane Matthew. Many of the rivers there have large basins (of attraction). Many will only crest Wednesday or Thursday. The Cape Fear River will reach 60+ feet compared to its usual 20. But don’t worry. Trump will be coming soon to inspect the damage. When he came he clumsily read from a prepared statement, that soon we will be getting lots and lots of money. I guess his advisers didn’t tell him that the state is so heavily gerrymandered that he will probably see 10 Republican congressmen elected from 13 districts.

A Tale of Two Colonoscopies

I’ve had two colonoscopies: one on January 11, 2007 and one on March 13, 2018. In the spirit of an English writing assignment, I will compare and contrast the two experiences. In addition I will offer some advice that I think will be useful for those who have yet to have the experience. Since I am a math professor you should not view this as medical advice.

The main event “Cleaning the Area for Viewing” has not changed much. On Day -1, you only have clear fluids (see below for definition). At 3:00 PM you take some DulcoLax (stool softener). At 5:00 PM you begin to drink from a mixture of 64 oz of Gatorade and one 255 gram bottle of Miralax. Eight ounces every 15 minutes until it is gone. In 2007, I was amused to see that the directions on the can said to “never under any circumstances take more than one capful.” In 2018, the laxative comes in a brightly colored plastic container, which brags that it contains 14 daily doses.

At about 6PM the party gets started, and regular trips to the bathroom continue until about midnight, when I was brave enough to try to go to sleep. In 2007 there was nothing new to do the next day, except to drink clear liquids stopping two hours before the procedure. In 2018, there is a 10 ounce bottle of Magnesium Citrate to be drunk four hours before the procedure. Fortunately, this corresponds to the standard dose and it gets its magic done in less than 2 hours.

What is a clear liquid? In 2007 the list included Coffee and Tea (no milk or cream). In 2018 these items were gone leaving water, soft drinks, Gatorade, fruit juices without pulp, chicken or beef broth, plain jello, popsicles (no sherbet or fruit bars). In short “any fruit you can see through and has no pulp” is acceptable as long as it is not RED or PURPLE for obvious reasons. A colonoscopy is not a test on which you want to get a false positive!

Clear liquids are to keep you hydrated, but also to give you enough calories to get through the day. (See disclaimer above.) In this regard, popsicles and jello are worthless since they have 10-30 calories. Thinking it might be some sort of substitute for coffee in the morning, I tried some canned chicken broth warmed in the microwave. But after I had a few sips I noticed that the can said it had 30 calories per serving, 30 of which were from fat. The white grape juice at 150 calories for 8 ounces was sickeningly sweet, but a good source of calories, as was some 80 calorie lemonade (which had no pulp but tasted like plastic) and to a lesser extent non-diet soda.

The biggest change in the prep routine came from the rules for a restricted diet on days -5 to -2. In 2007 the rule was just do not eat nuts, seeds, popcorn and corn. By 2018 this list has gotten huge. No non-tender meats, gristle, hot dogs, salami, cold cuts. No raw vegetables or salads, no artichokes, asparagus, broad beans, broccoli, Brussels sprouts, cabbage, cauliflower, mushrooms, onions, peas, sauerkraut, spinach, summer or winter squash, tomatoes, zucchini. No raw fruit (except for bananas), canned fruit, dried fruit, berries, melons, cranberry sauce, avocado, coconut. No bread with whole wheat, etc., etc.

In short, you can eat tender cooked fish, poultry, and meat, served with green beans, cooked carrots, beets, apple sauce, ripe bananas, and cooked fruit (peaches, pears, apricots, and apples) if the skin has been removed. The exclusions listed above will test your culinary creativity. Only refined pasta is allowed but I interpreted this to mean that Stouffer’s Fettuccini Alfredo was OK. While cauliflower was off the list I figured that it was OK to eat mashed cauliflower from the frozen foods aisle, which Oprah peddles as a low calorie alternative to mashed potatoes. Rachael Ray’s recipe for cooked carrots is popular according to her internet site but went over like a Lead Zeppelin.

To try to end this rant on a happy note, let me talk about Day 0. As many veterans of the procedure will tell you, after going through the prep on Day -1, and now a low fiber diet on Days -5 to -2, the procedure is not bad at all. One of the reasons for this is that they give you something that makes you forget the whole thing.  I once made a joke at a conference in Canada that this amnesia makes the procedure more fun than a faculty meeting. After the talk, a faculty member from York came up and told me that in Canada they don’t give you that drug. Damn socialized medicine.

I think that the drug they give you now has changed. In 2007 it was something like Rohypnol (aka roofies, the date rape drug). My wife Susan took one of her friends, Toni, to get her colonoscopy. One of the first things Toni said after the procedure was that Susan should see the movie Awakenings. Then a few minutes later she said it again, and then again, and again. In 2007, this made me very anxious about the procedure. I was afraid that after it was over I would suddenly wax philosophical about a woman I had met who had a “balcony you could do Shakespeare from.”

In 2018 they gave me Fentanyl. Yes, that is the opioid you have heard about in the news that is more deadly than heroin, but the nurse was giving me the injection. Wikipedia says it “is an opioid that is used as a pain medication and together with other medications for anesthesia. It has a rapid onset and effects generally last less than an hour or two.” It had the desired effect during the procedure, but when I left the office I was clear headed enough to give Susan driving directions to get home from a very unfamiliar part of route 54, where curiously 234 and 249 are on the same side of the street.

Hopefully, reliving my experiences has been amusing and told novices more about what to expect. This time the post has a bit of a point or, to be precise, a small question for doctors: “The addition of the four days of pre-prep undoubtedly makes diagnoses more accurate but does that justify the time spent on a very restrictive and unpleasant low-fiber diet?” Couldn’t we compromise on two days, if I promise that everyone in the country will follow the directions?

Abelian Sand Pile Model

Today is January 7, 2018. I am tired of Trump bragging that he is a “very stable genius.” Yes he made a lot of money (or so he says) but he doesn’t know what genius looks like. Today’s column is devoted to the work of Wesley Pegden (and friends) on the Abelian Sand Pile Model. Why this topic? Well, he is coming to give a talk on Thursday in the probability seminar.

This system was introduced in 1988 by Bak, Tang, and Wiesenfeld (Phys Rev A 38, 364). The simplest version of the model takes place on a square subset of the two dimensional integer lattice. Grains of sand are dropped at random. When the number of grains at a point is ≥ 4, the pile topples and one grain is sent to each neighbor. This may cause other sites to topple, setting off an avalanche.

The word Abelian refers to the property that the state after n grains have landed is independent of the order in which they are dropped. The reason that physicists are interested is that the system “self-organizes itself into a critical state” in which avalanche sizes have a power law. The Abelian sand pile has been extensively studied, and there are connections to many branches of mathematics, but for that you’ll have to go to the Wikipedia page or to the paper “What is … a sandpile?” written by Lionel Levine and Jim Propp which appeared in the Notices of the AMS 57 (2010), 976-979.
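The toppling rule and the Abelian property are easy to try out on a small square. The sketch below assumes, as is standard for the finite model, that grains pushed off the edge of the square are lost; the grid size, the random seed and the function names are my own choices for the illustration.

```python
import random

def stabilize(grid):
    """Topple every site holding >= 4 grains until no such site remains.
    Grains sent off the edge of the square are lost."""
    n = len(grid)
    unstable = [(r, c) for r in range(n) for c in range(n) if grid[r][c] >= 4]
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < 4:
            continue  # stale entry: this site has already been dealt with
        grid[r][c] -= 4
        if grid[r][c] >= 4:
            unstable.append((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n and 0 <= cc < n:
                grid[rr][cc] += 1
                if grid[rr][cc] >= 4:
                    unstable.append((rr, cc))
    return grid

def drop_all(size, drops):
    """Drop grains one at a time at the listed sites, stabilizing after each."""
    grid = [[0] * size for _ in range(size)]
    for r, c in drops:
        grid[r][c] += 1
        stabilize(grid)
    return grid

rng = random.Random(1)
drops = [(rng.randrange(5), rng.randrange(5)) for _ in range(200)]
shuffled = drops[:]
rng.shuffle(shuffled)  # same grains, different order

final_a = drop_all(5, drops)
final_b = drop_all(5, shuffled)
print(final_a == final_b)
```

Dropping the same 200 grains in a shuffled order leads to the same final configuration, which is exactly what the word Abelian promises.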

In a 2013 article in the Duke Math Journal [162, 627-642] Wesley Pegden and Charles Smart studied what happened when you put n grains of sand at the origin on the infinite d-dimensional lattice and let the system go until it reaches its final state. They used PDE techniques to show that when space is scaled by n^{1/d} then the configuration converges weakly to a limit, i.e., integrals against a test function converge. As Fermat once said the proof won’t fit in the margin, but in a nutshell what they do is use viscosity solution theory to identify the continuum limit of the least action principle of Fey–Levine–Peres (J. Stat. Phys. 138 (2010), 143-159). A picture is worth several hundred words.
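The single-source experiment can be run on a laptop in dimension d = 2. The sketch below only computes the final configuration for n grains at the origin; it says nothing about the scaling limit itself. Storing heights in a dictionary stands in for the infinite lattice, and toppling a site several times at once is a shortcut that the Abelian property permits. The function name is mine.

```python
from collections import defaultdict

def single_source(n):
    """Put n grains at the origin of the 2-d integer lattice and stabilize.
    Returns a dict mapping each occupied site to its final height."""
    heights = defaultdict(int)
    heights[(0, 0)] = n
    stack = [(0, 0)]
    while stack:
        x, y = stack.pop()
        h = heights[(x, y)]
        if h < 4:
            continue
        k = h // 4  # topple k times at once; each of these topplings is legal
        heights[(x, y)] = h - 4 * k
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            site = (x + dx, y + dy)
            heights[site] += k
            if heights[site] >= 4:
                stack.append(site)
    return {site: h for site, h in heights.items() if h > 0}

pile = single_source(1000)
radius = max(abs(x) for x, _ in pile)
print(len(pile), radius)
```

No sand is lost on the infinite lattice, so the heights still add up to n, and the final pile inherits the symmetries of the square lattice. Printing the heights as a picture for larger n gives a rough version of the patterns that the theorem describes after rescaling by n^{1/2}.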

 

In a 2016 article in Geometric and Functional Analysis, Pegden teamed up with Lionel Levine (now at Cornell) to study the fractal structure of the limit. The solution is somewhat intricate involving solutions of PDE and Apollonian triangulations that generalize Apollonian circle packings.

A 0-1 law for eclipses

A 0-1 law in probability is a result that says in certain situations, for example, when we consider the asymptotic behavior of sums of independent and identically distributed random variables or the short time behavior of Brownian motion, then all events are trivial, i.e., have probability 0 or 1.

Yesterday I learned that the law applies to eclipses. For months we have been told that on August 21, 2017 in Durham there would be a solar eclipse that at its peak at 2:45PM would cover 93% of the sun. That turns out to be about as exciting as being 93% pregnant or having 93% of a proof. The shade of the trees in our front yard seemed a little darker but the sky never did. Turns out that having 7% of the sun exposed is more than enough to be able to see well.

About a month ago I ordered “eclipse glasses” from Amazon, so I could look at the sun without burning my retinas. However, as I learned about a week ago, the glasses were advertised as ISO 12312-2 certified, but they were not. Amazon was the one who told me and they sent me a refund, but I ended up without glasses. On the big day I made myself a pinhole viewer by sticking the point of a pencil through a note card. When I held it out I did see a light spot on the ground that looked like a circle with a piece missing, but then I wondered if it was due to the fact that my hole was not round. However, soon after that moment of doubt, I noticed that there were a large number of crescent shaped light objects on the ground. In short, the overlaps between leaves in the trees made hundreds of pinhole cameras. For the long version see:

https://petapixel.com/2012/05/21/crescent-shaped-projections-through-tree-leaves-during-the-solar-eclipse/

For a few minutes I wandered around looking at the light shapes on my driveway and in the street in front of my house before I got bored and went in, leaving my neighbors to wonder no doubt what I was doing wandering around in the street holding my smart phone.

In summary, when 2024 rolls around and the eclipse goes from Texas to Maine, either get yourself to where the eclipse is total or take off for Myrtle Beach where the moon will not block the sun and hotel rooms will be discounted.

Rereading Thurston: What is a proof?

As regular readers of this blog can guess, the inspiration for writing this column came from yet another referee’s report which complained that in my paper “the style of writing is too informal.” Fortunately for you, the incident reminded me of an old article written by Bill Thurston and published in the Bulletin of the AMS in April 1994 (volume 30, pages 161-177) and that will be my main topic.

The background to our story begins with the conjecture Poincaré made in 1904, which states that every simply connected, closed 3-manifold is homeomorphic to the 3-sphere (i.e., the boundary of the ball in four dimensional space). As many of you already know, after nearly a century of effort by mathematicians, Grigori Perelman presented a proof of the conjecture in three papers made available in 2002 and 2003 on arXiv. He later turned down a Fields Medal in 2006 and a $1,000,000 prize from the Clay Mathematics Institute in 2010.

Twenty years before the events in the last paragraph Thurston stated his geometrization conjecture. It is an analogue of the uniformization theorem for two-dimensional surfaces, which states that every connected Riemann surface can be given one of three geometries (Euclidean, spherical, or hyperbolic). Roughly, the geometrization conjecture states that every closed three manifold can be decomposed in a canonical way into pieces that each have one of eight types of geometric structure.

In the 1980s Thurston published a proof in the special case of “Haken manifolds.” In a July 1993 article in the Bulletin of the AMS (volume 29, pages 1-13)  Arthur Jaffe and Frank Quinn criticized his work as “A grand insight delivered with beautiful but insufficient hints. The proof was never fully published. For many investigators this unredeemed claim became a roadblock rather than an inspiration.”

This verbal salvo was launched in the middle of an article that asked the question “Is speculative mathematics dangerous? Recent interactions between physics and mathematics pose the question with some force: traditional mathematical norms discourage speculation, but it is the fabric of theoretical physics.” They go on to criticize work being done in string theory, conformal field theory, topological quantum field theory, and quantum gravity.  It seems to me that some of these subjects saw spectacular successes in the 21st century, but pursuing that further would take me away from my main topic.

In what follows I will generally use Thurston’s own words but will edit them for the sake of brevity. His 17 page article is definitely worth reading in full. He begins his article by saying “It would NOT be good to start with the question

How do mathematicians prove theorems?

To start with this would be to project two hidden assumptions: (1) that there is uniform, objective and firmly established theory and practice of mathematical proof, and (2) that progress made by mathematicians consists of proving theorems.”

Thurston goes on to say “I prefer: How do mathematicians advance human understanding of mathematics?”

“Mathematical knowledge can be transmitted amazingly fast within a sub-field. When a significant theorem is proved, it often (but not always) happens that the solution can be communicated in a matter of minutes from one person to another. The same proof would be communicated and generally understood in an hour talk. It would be the subject of a 15 or 20 page paper which could be read and understood in a few hours or a day.

Why is there such a big expansion from the informal discussion to the talk to the paper? One-on-one people use gestures, draw pictures and make sound effects. In talks people are more inhibited and more formal. In papers people are still more formal. Writers translate their ideas into symbols and logic, and readers try to translate back.

People familiar with ways of doing things in a subfield recognize various patterns of statements or formulas as idioms or circumlocution for certain concepts or mental images. But to people not already familiar with what’s going on, the same patterns are not very illuminating; they are often even misleading.”

Turning to the topic in our title, Section 4 is called “What is a proof?” Thurston’s philosophy here is much different from what I was taught in college. At Emory you are not allowed to quote a result unless you understand its proof.

“When I started as a graduate student at Berkeley … I didn’t really understand what a proof was. By going to seminars, reading papers and talking to other graduate students I gradually began to catch on. Within any field there are certain theorems and certain techniques that are generally known and generally accepted. When you write a paper you refer to these without proof. You look at other papers and see what facts they quote without proof, and what they cite in their bibliography. Then you are free to quote the same theorem and cite the same references. Many of the things that are generally known are things for which there may be no written source. As long as people in the field are comfortable the idea works, it doesn’t need to have a formal written source.”

“At first I was highly suspicious of this process. I would doubt whether a certain idea was really established. But I found I could ask people, and they could produce explanations or proofs, or else refer me to other people or to written sources. When people are doing mathematics, the flow of ideas and the social standard of validity is much more reliable than formal documents. People are not very good in checking formal correctness of proofs, but they are quite good at detecting potential weaknesses or flaws in proofs.”

There is much more interesting philosophy in the paper but I’ll skip ahead to Section 6 on “Personal Experiences.” There Thurston recounts his work on the theory of foliations. He says that the results he proved were documented in a conventional formidable mathematician’s style but they depended heavily on readers who shared certain background and certain insights. This created a high entry barrier. Many graduate students and mathematicians were discouraged that it was hard to learn and understand the proofs of key theorems.

Turning to the geometrization theorem: “I’d like to spell out more what I mean when I say I proved the theorem. I meant that I had a clear and complete flow of ideas, including details, that withstood a great deal of scrutiny by myself and others. My proofs have turned out to be quite reliable. I have not had trouble backing up claims or producing details for things I have proven. However, there is sometimes a huge expansion factor in translating from the encoding in my own thinking to something that can be conveyed to someone else.”

Thurston goes on to explain that his result went against the trends in topology for the preceding 30 years and it took people by surprise. He gave many presentations to groups of mathematicians but “at the beginning, the subject was foreign to almost everyone … the infrastructure was in my head, not in the mathematical community.” At the same time he began writing notes on the geometry and topology of 3-manifolds. The mailing list for these notes grew to about 1200 people. People ran seminars based on his notes and gave him lots of feedback. Much of it ran something like “Your notes are inspiring and beautiful, but I have to tell you that in our seminar we spent 3 weeks working out the details of …”

Thurston’s description of the impact his work had on other fields is in sharp contrast to Jaffe and Quinn’s assessment. To see who was right I turned to Wikipedia, which says “The geometrization theorem has been called Thurston’s Monster Theorem, due to the length and difficulty of the proof. Complete proofs were not written up until almost 20 years later.” (That would be 2002, almost 10 years after Jaffe-Quinn.) It continues: “The proof involves a number of deep and original insights which have linked many apparently disparate fields to 3-manifolds.”

Thurston was an incredible genius. He wrote only 73 papers but they have been cited 4424 times by 3062 different people. His career took him from Princeton to Berkeley, where he was director of MSRI for several years, then to Davis, and finally to Cornell, where he spent 2003-2012. I never really met him but I could sense the impact he had on the department. Sadly he died at the age of 65 as a result of metastatic melanoma. A biography and reminiscences can be found at

http://www.ams.org/notices/201511/rnoti-p1318.pdf

A Rainy Monday in Austin

 

After two great days of hiking, sightseeing, eating and drinking with my younger son (a CS assistant professor at UT Austin) and his girlfriend, who works for My Fitness Pal, the 98-degree heat was replaced by a steady rain. Trapped inside our hotel room, my wife read the New York Times and did a crossword puzzle, while I wrote a couple of referee’s reports on papers that were worse than the weather.

While it is not fun to be forced inside by the rain, it is a good time to reflect on what I’ve seen while visiting Austin. Saturday afternoon we went to the LBJ museum on the UT campus. Johnson served as president for five years after JFK was assassinated in 1963. Before that he was elected to the House of Representatives in 1937 and to the Senate in 1948.

His War on Poverty helped millions of Americans rise above the poverty line during his administration. Civil rights bills that he signed into law banned racial discrimination in public facilities, interstate commerce, the workplace, and housing. The Voting Rights Act prohibited certain laws southern states used to disenfranchise African Americans. With the passage of the Immigration and Nationality Act of 1965, the country’s immigration system was reformed, encouraging greater immigration from regions other than Europe. In short, the Republican agenda times -1.

On Sunday afternoon, we went to the Bullock Texas State History Museum. The most interesting part for me was the story of Texas in the early 1800s. In 1821 Mexico won its independence from Spain, and Texas became part of Mexico. Between 1821 and 1836 an estimated 38,000 settlers, on the promise of 4,000 acres per family for a small fee, trekked from the United States into the territory. The Mexican government grew alarmed at the immigration threatening to engulf the province. Military troops were moved to the border to enforce the policy, but illegal immigrants crossed the border easily. Hopefully the parallel with the current situation ends there, since there were revolts in Texas in 1832, leading to war with Mexico in 1835, and to the independence of Texas in 1836.

My third fun fact is a short one: Austin City Limits was a TV show for 40 years before it became a music festival. I haven’t seen either one, but Austin is a great place to visit.

Duke grads vote on Union

According to the official press release: “Of the 1,089 ballots cast, 691 voted against representation (“NO”) by SEIU and 398 for representation by SEIU (“YES”). There were, however, 502 ballots challenged based on issues of voter eligibility. Because the number of challenged ballots is greater than the spread between the “YES” and “NO” votes, the challenges could determine the outcome and will be subject to post-election procedures of the NLRB.”

The obvious question is: what is the probability this would change the outcome of the election? If the NOs lose 397 votes, and hence the YESes lose 105, on the recount the outcome will be 294 NO, 293 YES. A fraction 0.6345 of the votes were NO. We should treat this as an urn problem, but to get a quick answer you can suppose the YES votes lost are Binomial(502, 0.3655). In the old days I would have had to trot out Stirling’s formula and compute for an hour to get the answer, but now all I have to do is type into my vintage TI-83 calculator

binomcdf(502, 0.3655, 105) = 2.40115 × 10^-14

i.e., this is the probability that 105 or fewer YES votes are lost.
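
For readers without a TI-83, the calculation can be cross-checked with a few lines of code. This is a minimal sketch in Python, assuming (as in the quick answer above) that the YES votes among the 502 challenged ballots are Binomial(502, 0.3655); the helper names `binom_pmf` and `binom_cdf` are mine, not from any library.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(n, p, j) for j in range(k + 1))

no_votes, yes_votes, challenged = 691, 398, 502
p_yes = round(yes_votes / (no_votes + yes_votes), 4)  # fraction of YES votes, 0.3655

# Suppose 105 of the challenged ballots were YES and the other 397 were NO.
yes_lost = 105
no_left = no_votes - (challenged - yes_lost)
yes_left = yes_votes - yes_lost

exactly_105 = binom_pmf(challenged, p_yes, yes_lost)   # P(exactly 105 YES lost)
at_most_105 = binom_cdf(challenged, p_yes, yes_lost)   # P(105 or fewer YES lost)

print(no_left, yes_left)
print(exactly_105, at_most_105)
```

The cumulative probability comes out at about 2.4 × 10^-14, while the probability of exactly 105 is a bit more than half of that. Either way the conclusion is the same: under this model the challenged ballots have essentially no chance of flipping the result.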

Regular readers of this blog will remember that I made a similar calculation to show that there was a very small probability that the 62,500 provisional ballots would change the outcome of the North Carolina election, since before they were counted Cooper had a 4772 vote lead over McCrory. If we flip 62,500 coins then the standard deviation of the change in the number of votes is

(62,500 × 1/4)^(1/2) = 125

So McCrory would need 33,636 votes = 2386 above the mean = 19.09 standard deviations. However, as later results showed, this reasoning was flawed: Cooper’s lead grew to more than 10,000 votes. This is due to the fact that, as I learned later, provisional ballots have a greater tendency to be Democratic while absentee ballots tend to be Republican.

Is this all just #fakeprobability? Let’s turn to a court case, De Martini versus Power. In a close election in a small town, 2,656 people voted for candidate A compared to 2,594 who voted for candidate B, a margin of victory of 62 votes. An investigation of the election found that 136 of the people who voted in the election should not have. Since this is more than the margin of victory, should the election results be thrown out even though there was no evidence of fraud on the part of the winner’s supporters?

In my wonderful book Elementary Probability for Applications, this problem is analyzed from the urn point of view. Since I was much younger when I wrote the first version of its predecessor in 1993, I wrote a program to add up the probabilities and got 7.492 × 10^-8. That computation supported the Court of Appeals decision to overturn a lower court ruling that voided the election in this case. If you want to read the decision you can find it at

http://law.justia.com/cases/new-york/court-of-appeals/1970/27-n-y-2d-149-0.html
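
The urn calculation is easy to sketch today. The Python below is my own reconstruction, not the 1993 program: it assumes the 136 invalid votes are a uniformly random subset of the 5,250 ballots and sums the hypergeometric probabilities of the event that removing them erases A's lead (a tie is counted as a change). The function name `prob_removed_from_a` is hypothetical.

```python
from fractions import Fraction
from math import comb

votes_a, votes_b, invalid = 2656, 2594, 136
total = votes_a + votes_b

def prob_removed_from_a(a):
    """P(exactly a of the invalid votes were cast for A), drawing without replacement."""
    return Fraction(comb(votes_a, a) * comb(votes_b, invalid - a),
                    comb(total, invalid))

# Removing a votes from A and (invalid - a) from B leaves a margin of
# (votes_a - a) - (votes_b - (invalid - a)) = 198 - 2a.
# Find the smallest a for which A no longer leads.
threshold = next(a for a in range(invalid + 1)
                 if (votes_a - a) - (votes_b - (invalid - a)) <= 0)

p_change = sum(prob_removed_from_a(a) for a in range(threshold, invalid + 1))
print(threshold, float(p_change))
```

At least 99 of the 136 invalid votes would have to come from A, far more than the roughly 69 one expects, so the sum is tiny. Whether it reproduces the figure quoted above to all digits depends on exactly which event the original program summed (for example, whether a tie counts), which the post does not spell out.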

Jordan Ellenberg don’t know stat

A couple of nights ago I finished John Grisham’s Rogue Lawyer, so I started reading Jordan Ellenberg’s “How Not to Be Wrong: The Power of Mathematical Thinking.” The cover says “a math-world superstar unveils the hidden beauty and logic of the world and puts math’s power in our hands.”

The book was only moderately annoying until I got to page 65. There he talks about statistics on brain cancer deaths per 100,000. The top states according to his data are South Dakota, Nebraska, Alaska, Delaware, and Maine. At the bottom are Wyoming, Vermont, North Dakota, Hawaii and the District of Columbia.

He writes “Now that is strange. Why should South Dakota be brain cancer center and North Dakota nearly tumor free? Why would you be safe in Vermont but imperiled in Maine?”

“The answer: … The five states at the top have something in common, and the five states at the bottom do too. And it’s the same thing: hardly anyone lives there.” There follows a discussion of flipping coins and the fact that frequencies have more random variation when the sample size is small, but he never stops to see if this is enough to explain the observation.

My intuition told me it did not, so I went and got some brain cancer data.

https://www.statecancerprofiles.cancer.gov/incidencerates/

In the next figure the x-axis is population size, plotted on a log scale to spread out the points, and the y-axis is the five-year average rate per year per 100,000 people. Yes, there is less variability as you move to the right, and little Hawaii is way down there, but there are also some states toward the middle that are on the top edge. The next plot shows 99% confidence intervals versus state size. I used 99% rather than 95% since there are 49 data points (nothing for Nevada for some reason).

brain_cancer_fig1

In the next figure the horizontal line marks the average 6.6. The squares are upper end points of the confidence intervals. When they fall below the line, this suggests that the mean is significantly lower than the national average. From left to right: Hawaii, New Mexico, Louisiana and California. When the little diamond marking the lower end of the confidence interval is above the line, we suspect that the rate for that state is significantly higher than the mean. There are eight states in that category: New Hampshire, Iowa, Oregon, Kentucky, Wisconsin, Washington, New Jersey, and Pennsylvania.

brain_cancer_fig2

So yes, there are 12 significant deviations from the mean (versus the roughly 0.5 we would expect if all 49 states had mean 6.6), but they are not the ones at the top or the bottom of the list, and the variability of the sample mean has nothing to do with the explanation. So Jordan, welcome to the world of APPLIED math, where you have to look at data to test your theories. Don’t feel bad: the folks in the old Chemistry building at Duke will tell you that I don’t know stat either. For a more professional look at the problem see

http://www.stat.columbia.edu/~gelman/research/published/allmaps.pdf
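
To get a feel for how much of the spread small populations can explain, here is a rough sketch in Python. It is not the calculation behind the figures above: it assumes case counts are Poisson, uses a normal approximation for the 99% interval, and the two population sizes are hypothetical round numbers standing in for a small and a large state.

```python
from math import sqrt

Z_99 = 2.576   # two-sided 99% normal quantile
RATE = 6.6     # cases per 100,000 people per year (the average quoted above)
YEARS = 5      # rates are five-year averages

def ci_half_width(population):
    """Half-width of a 99% interval for the annual rate per 100,000 (Poisson counts)."""
    person_years = population * YEARS
    expected_cases = RATE / 100_000 * person_years
    se_rate = sqrt(expected_cases) / person_years * 100_000
    return Z_99 * se_rate

small, large = 600_000, 39_000_000   # hypothetical populations, not data
hw_small, hw_large = ci_half_width(small), ci_half_width(large)
print(round(hw_small, 2), round(hw_large, 2))
```

Under these assumptions the interval is about ±1.2 for the small state and about ±0.15 for the large one. A rate one unit away from 6.6 is unremarkable in a tiny state but clearly significant in a big one, which is why the significant deviations need not be the states at the top and bottom of the list.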

North Carolina Gubernatorial Election

Tuesday night, after the 4.7 million votes had been counted from all 2704 precincts, Roy Cooper had a 4772 vote lead over Pat McCrory. Since there could be as many as 62,500 absentee and provisional ballots, it was decided to wait until these were counted to declare a winner. The question addressed here is: What is the probability that these votes will change the outcome?

To do the calculation we need to make an assumption: the additional votes are similar to the overall population, so they are like flipping coins. In order to change the outcome of the election Cooper would have to get fewer than 31,250 - (4772)/2 = 28,864 votes. The standard deviation of the number of heads in 62,500 coin flips is (62,500 × 1/4)^(1/2) = 125, so this represents 19.09 standard deviations below the mean.

One could be brave and use the normal approximation. However, all this semester while I have been teaching Math 230 (Elementary Probability) people have been asking: why do this when we can just use our calculator?

binomcdf(62500, 0.5, 28864) = 1.436 × 10^-81

In contrast, if we use the normal approximation with the tail bound (which I found impossible to type using equation editor) we get 1.533 × 10^-81.
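
Both numbers can be checked without a calculator. The sketch below, in Python, assumes the same fair-coin model with n = 62,500 ballots. For the exact value it sums binomial coefficients with integer arithmetic. For the approximation it assumes the "tail bound" is the standard inequality P(Z > z) ≤ φ(z)/z, where φ is the standard normal density, with z rounded to 19.09 as in the text. The post does not say which bound was used, so that part is a guess.

```python
from math import exp, pi, sqrt

n, lead = 62_500, 4772
k_max = n // 2 - lead // 2   # the 28,864 threshold from the text

# Exact P(X <= k_max) for X ~ Binomial(n, 1/2):
# add up C(n, 0), ..., C(n, k_max) as integers, then divide by 2^n.
coeff, total = 1, 0
for k in range(k_max + 1):
    total += coeff
    coeff = coeff * (n - k) // (k + 1)   # C(n, k+1) from C(n, k)
exact = total / 2**n

# Normal approximation with the tail bound phi(z) / z.
sd = sqrt(n / 4)                  # standard deviation of the number of heads
z = round((lead / 2) / sd, 2)     # standard deviations below the mean, rounded
bound = exp(-z * z / 2) / (z * sqrt(2 * pi))

print(k_max, sd, z)
print(exact, bound)
```

Using the unrounded z = 19.088 gives a slightly larger bound, but in either case the approximation agrees with the exact answer to within about ten percent, even 19 standard deviations out in the tail.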

We can’t take this number too seriously, since the probability our assumption is wrong is larger than that, but it suggests that we will likely have a new governor and House Bill 2 will soon be repealed.