Marginal Box-Plots: Summarizing what is Typical

One of the principal visualizations we have used to explore and communicate our results is the marginal box-plot. Marginal box-plots were one of the principal graphics presented in Redistricting: Drawing the line, Evaluating Partisan Gerrymandering in Wisconsin, and the testimony in Common Cause v. Rucho.

The box-plots give a way to visually spot anomalous properties in a given redistricting plan by summarizing the structure of a typical plan drawn without overt partisan considerations. For example, they can help identify which districts have been packed or cracked by showing which districts have many more or many fewer votes for a certain party than expected. The marginal box-plot gives a baseline against which a given map should be compared.

Two prototypical examples of marginal box-plots are given below. They summarize what we would expect from a redistricting of North Carolina into 13 congressional districts, viewed through the lens of the actual votes cast in the 2012 and 2016 congressional elections.

Box-plot summary of districts ordered from most Republican to most Democratic, for the congressional voting data from 2012 (left) and 2016 (right).

The Ensemble of Redistricting Plans

The marginal box-plots summarize the typical partisan makeup of the districts under consideration using a particular set of votes. In the case above, we are considering the 13 congressional districts using the 2012 and 2016 congressional votes. As described in Quantifying Gerrymandering in NC, we use an MCMC algorithm to sample a non-partisan probability measure placed on the space of redistricting plans. In this case, we generated an ensemble of just over 24,000 redistricting plans.

Constructing the Marginal Box Plot

For each redistricting plan, we determine the partisan makeup of the congressional delegation using the same fixed collection of votes and the districts dictated by the given map. Thus, for each redistricting map, we have not only the winner in each district but also the fraction of votes obtained by each party. Choosing one party, say the Democratic Party, we record the percentage of votes cast for that party in each district. We then sort these fractions in increasing order. This produces a vector of 13 numbers, one for each of the congressional districts. The first entry gives the percentage of Democratic votes in the most Republican district for that map. Similarly, the last entry in the vector gives the percentage of Democratic votes in the most Democratic district.
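The construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the analysis: the function name `ordered_dem_fractions`, the precinct-level inputs, and the use of two-party vote shares are all assumptions made for the example.

```python
def ordered_dem_fractions(dem_votes, rep_votes, assignment, n_districts):
    """Return each district's Democratic vote fraction, sorted ascending.

    dem_votes, rep_votes: per-precinct vote counts (hypothetical inputs).
    assignment: district index (0 .. n_districts - 1) of each precinct,
                i.e. the redistricting plan being evaluated.
    """
    dem = [0] * n_districts
    total = [0] * n_districts
    for d, r, k in zip(dem_votes, rep_votes, assignment):
        dem[k] += d
        total[k] += d + r
    # Assumption: two-party share. Sorting puts the most Republican
    # district first and the most Democratic district last.
    return sorted(dem[k] / total[k] for k in range(n_districts))


# Toy data: 6 precincts grouped into 3 districts (not real election data).
dem_votes = [60, 40, 30, 20, 80, 70]
rep_votes = [40, 60, 70, 80, 20, 30]
assignment = [0, 0, 1, 1, 2, 2]
vector = ordered_dem_fractions(dem_votes, rep_votes, assignment, 3)
print(vector)
```

Because the votes are held fixed and only `assignment` changes from plan to plan, calling this once per plan in the ensemble would yield one ordered vector per plan.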

We then consider the marginal distributions of these 13-dimensional vectors. We organize the marginals into 13 box-and-whisker plots. Each box-and-whisker plot consists of a box that contains 50% of the data, a central line at the median, and whiskers that extend to the maximum and minimum values, with ticks at the 97.5%, 90%, 10% and 2.5% quantiles.
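As a rough sketch of how these per-rank summaries might be computed, the Python below takes the ordered vectors of an ensemble and summarizes each coordinate separately. The helper names are hypothetical, the quantile uses simple linear interpolation, and the box is assumed to span the 25% to 75% quantiles, which is one common way to enclose 50% of the data.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def marginal_summaries(ordered_vectors):
    """Summarize each rank position (coordinate) across the ensemble."""
    n_ranks = len(ordered_vectors[0])
    summaries = []
    for j in range(n_ranks):
        column = sorted(v[j] for v in ordered_vectors)
        summaries.append({
            "min": column[0],
            "q2.5": quantile(column, 0.025),
            "q10": quantile(column, 0.10),
            "q25": quantile(column, 0.25),   # assumed lower edge of the box
            "median": quantile(column, 0.50),
            "q75": quantile(column, 0.75),   # assumed upper edge of the box
            "q90": quantile(column, 0.90),
            "q97.5": quantile(column, 0.975),
            "max": column[-1],
        })
    return summaries


# Toy ensemble: 5 plans, 2 districts each (illustrative numbers only).
ordered_vectors = [
    [0.30, 0.60], [0.35, 0.55], [0.40, 0.65], [0.32, 0.70], [0.38, 0.58],
]
summaries = marginal_summaries(ordered_vectors)
print(summaries[0]["median"], summaries[1]["median"])
```

Each dictionary in `summaries` holds the numbers needed to draw one box-and-whisker plot, so the list as a whole corresponds to one marginal box-plot figure.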

Identifying Outliers

We now use the box-plot to evaluate the maps drawn by the NC General Assembly for the 2012 and 2016 elections, as well as a plan determined by a bipartisan panel of retired judges in the Beyond Gerrymandering Project. To do so, we plot the ordered collection of vote fractions of each map on top of the marginal box-plots.

Box-plot summary of districts ordered from most Republican to most Democratic, for the congressional voting data from 2012 (left) and 2016 (right). We compare our statistical results with the three redistricting plans of interest.

This allows us to see in which districts there are anomalously many or anomalously few votes for a given party.
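The visual comparison can also be expressed numerically. The sketch below is an illustration under stated assumptions, not the procedure used in the analysis: for each rank position it measures what share of the ensemble lies at or beyond the plan's value and flags the position when that share falls under a chosen tail threshold. The function name and the 2.5% default (borrowed from the outer tick marks) are assumptions.

```python
def flag_anomalous_ranks(plan_vector, ordered_vectors, tail=0.025):
    """Flag rank positions where a plan sits in the extreme tail of the ensemble.

    plan_vector: the plan's Democratic fractions, sorted ascending.
    ordered_vectors: one sorted vector per ensemble plan.
    Returns {rank (1-based): "high" or "low"}.
    """
    n = len(ordered_vectors)
    flags = {}
    for j, value in enumerate(plan_vector):
        share_at_or_above = sum(v[j] >= value for v in ordered_vectors) / n
        share_at_or_below = sum(v[j] <= value for v in ordered_vectors) / n
        if share_at_or_above < tail:
            flags[j + 1] = "high"   # more Democratic votes than almost every plan
        elif share_at_or_below < tail:
            flags[j + 1] = "low"    # fewer Democratic votes than almost every plan
    return flags


# Synthetic ensemble of 100 plans with 3 districts (not real data).
ordered_vectors = [
    [0.30 + 0.001 * i, 0.50 + 0.001 * i, 0.70 + 0.001 * i] for i in range(100)
]
plan_vector = [0.35, 0.45, 0.85]
flags = flag_anomalous_ranks(plan_vector, ordered_vectors)
print(flags)
```

In this toy case the second position is flagged as low and the third as high, the numerical analogue of a plotted point falling below or above the whiskers.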

Packing and Cracking

Looking at the box-plots, one sees that the 1st to 3rd most Democratic districts seem to have an abnormally high number of Democrats compared to the typical plan when viewed through the marginals. Similarly, the 8th, 9th and 10th most Republican districts (the 4th to 6th most Democratic districts) seem to have abnormally few Democrats in them.

To explore this, we tabulate the total number of Democrats in these two groupings of districts and compare it with what is typical in the ensemble. We find that, when using either the 2012 votes or the 2016 votes, none of the plans within the ensemble have as many Democratic votes in the three most Democratic districts as the NC2012 and NC2016 plans do. Similarly, when considering both sets of votes, none of the plans in the ensemble have as few Democratic votes in the 8th, 9th and 10th districts (ordered from most Republican to most Democratic) as the NC2012 and NC2016 plans do.
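A minimal sketch of this tabulation follows. It assumes that each plan is represented by its per-district Democratic vote counts, already ordered from most Republican to most Democratic; the function names and the toy numbers are hypothetical.

```python
def group_total(ordered_dem_counts, ranks):
    """Sum Democratic votes over the given 1-based rank positions."""
    return sum(ordered_dem_counts[r - 1] for r in ranks)


def share_at_least(plan_counts, ensemble_counts, ranks):
    """Share of ensemble plans with at least as many Democratic votes in the group."""
    target = group_total(plan_counts, ranks)
    hits = sum(group_total(c, ranks) >= target for c in ensemble_counts)
    return hits / len(ensemble_counts)


def share_at_most(plan_counts, ensemble_counts, ranks):
    """Share of ensemble plans with at most as many Democratic votes in the group."""
    target = group_total(plan_counts, ranks)
    hits = sum(group_total(c, ranks) <= target for c in ensemble_counts)
    return hits / len(ensemble_counts)


# Toy example with 4 districts per plan (illustrative numbers only).
ensemble_counts = [
    [100, 150, 200, 250],
    [110, 140, 210, 240],
    [90, 160, 190, 260],
    [105, 155, 205, 235],
]
plan_counts = [80, 120, 180, 320]

packed = share_at_least(plan_counts, ensemble_counts, ranks=(3, 4))
cracked = share_at_most(plan_counts, ensemble_counts, ranks=(1, 2))
print(packed, cracked)
```

A share of zero for the most Democratic group means no ensemble plan concentrates as many Democratic votes there, and a share of zero for the other group means no ensemble plan has as few, which is the pattern described in the text.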

One can then localize these observations to specific districts to infer local harm and standing to bring a claim. Ordering the districts from most Republican to most Democratic, the 8th, 9th and 10th districts in the NC2012 map correspond to Districts 8, 9 and 7 in the NC2012 map labeling. Similarly, the 11th, 12th and 13th districts correspond to Districts 4, 1 and 12 in the NC2012 map labeling. Analogously, in the NC2016 map the 8th, 9th and 10th (resp. 11th, 12th and 13th) districts correspond to Districts 9, 2 and 13 (resp. 12, 4 and 1) in the actual NC2016 map labeling. This was a fundamental part of the expert testimony on the Quantifying Gerrymandering group's work presented in Common Cause v. Rucho.
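Mapping a rank position back to a district label amounts to remembering which district produced each sorted entry. A small sketch, with a hypothetical function name and made-up fractions rather than the actual North Carolina figures:

```python
def rank_to_district(dem_fraction_by_district):
    """Map each 1-based rank (most Republican first) to its district label."""
    ordered_labels = sorted(
        dem_fraction_by_district, key=dem_fraction_by_district.get
    )
    return {rank: label for rank, label in enumerate(ordered_labels, start=1)}


# Made-up Democratic vote fractions keyed by district label.
dem_fraction_by_district = {"A": 0.60, "B": 0.30, "C": 0.45}
lookup = rank_to_district(dem_fraction_by_district)
print(lookup)
```

With such a lookup, a rank flagged as anomalous in the box-plot can be tied to a named district on the enacted map.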
