This post is co-authored by Daniel Laurison and Dan Hirschman.
There has been a compelling pie chart circulating on Facebook and Twitter, showing the percentage of Americans who voted for Trump or Clinton, who didn’t vote, or who weren’t eligible. (Dan even went so far as to include the image in this past week’s Sunday Morning Sociology link round-up, contributing to that circulation.) The problem is … well, there are a couple of problems. First of all, the chart mostly circulated without an associated story or link, just some vague source info that couldn’t be traced back to any explanation of what the pie chart really meant. The closest to the original we could easily find was this, where the chart is reproduced (as below) with no contextual information.
But the second, related, problem was bigger, and was driving both of us nuts: the mystery of the denominator. The image shows that only about 41% of Americans voted, but the turnout estimates we’d seen said 55% to 60% of eligible voters voted. More surprisingly, it showed that nearly 29% of Americans weren’t eligible to vote. We know felon disenfranchisement is a problem (see Uggen et al.’s work here), and of course that there are immigrants in the US who aren’t citizens. But those two populations together aren’t anywhere near a third or even a quarter of the US population.
We both guessed that the 28.6% must include kids, but usually we don’t think of children as “ineligible” to vote in the same way that disenfranchised felons are. So we didn’t think the chart was right, or at least we were sure it was confusing, and so we made our own. The file with sources and links to those sources is available here.
For our first chart, we attempted to replicate the WaPo chart, and the two are quite similar. Our denominator is the total US population as estimated by the Census Bureau for November 8, 2016, plus the 4.7 million US citizens eligible to vote but living outside the country: about 328 million people in total. As far as we can tell, the numbers for Clinton and Trump votes in the WaPo image were from before all the votes were counted (approximately mid-November, we later learned). But the most important takeaway is that a big portion of those “ineligible to vote” in the first image are in fact children: about 22% of the total. Of the Washington Post’s 28.6% ineligible, only about 6 percentage points are adults who were not eligible to vote.
Next, here’s the chart that many people thought they were seeing, one whose denominator is the voting-age population of US residents and/or citizens. 7.6% of US-resident adults are not eligible to vote. That’s about 19.5 million people. Most of them are residents of the US who are not citizens (immigrants who have not naturalized); the rest, about 3.2 million people or 1.27% of voting-age adults, are disenfranchised felons: people who are currently in prison, on probation or parole, or have fully served their sentences, in states that take away their voting rights. (These estimates come from the United States Elections Project, run by political scientist Michael McDonald, one of the sources for the original WaPo chart.)
Finally, here’s the breakdown for just eligible voters. Nearly 42% of those eligible to vote did not. We can’t know from these sorts of numbers exactly who those people are or why they didn’t vote, but there is a lot of research on who votes and who doesn’t, so we can make an educated guess that the non-voters are on average younger, less educated, and less well-off than the voters for either party. We also know that new, restrictive voter ID laws had some effect in turning away eligible voters, disproportionately affecting minority voters.
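To make the effect of the denominator concrete, here is a minimal Python sketch that recomputes the same four groups against the three denominators used in the charts above. The counts are rounded, illustrative figures in millions that we back-derived from the percentages quoted in this post (the number of voters in particular is a placeholder chosen so that eligible turnout lands near 58%); they are not the values in our source file, and the names `counts`, `shares`, and `charts` are made up for this example.

```python
# Illustrative counts in millions, back-derived from the rounded percentages
# in the post. These are assumptions for the sketch, not the source-file data.
counts = {
    "voted": 137.5,                  # placeholder: about 58% of eligible voters
    "eligible, did not vote": 99.0,  # placeholder: the rest of the eligible
    "ineligible adults": 19.5,       # non-citizens plus disenfranchised felons
    "children": 72.0,                # about 22% of the roughly 328 million total
}


def shares(groups, exclude=()):
    """Return each group's percentage of the total, after dropping `exclude`."""
    kept = {name: n for name, n in groups.items() if name not in exclude}
    denominator = sum(kept.values())
    return {name: 100 * n / denominator for name, n in kept.items()}


# Same people, three denominators.
charts = {
    "everyone": shares(counts),
    "voting-age adults": shares(counts, exclude={"children"}),
    "eligible voters": shares(counts, exclude={"children", "ineligible adults"}),
}

for title, chart in charts.items():
    print(title)
    for name, pct in chart.items():
        print(f"  {name}: {pct:.1f}%")
```

With these placeholder counts, the same group of voters comes out at roughly 42% of everyone, 54% of voting-age adults, and 58% of eligible voters, which is the whole mystery of the original chart in three lines of output.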
What lessons can we learn from this? First, in fractions, the denominator makes a big difference! Second, after a bit more digging, we found what seems to be the first instance where the chart appeared in the Washington Post, in this article, which explains the chart in more detail (including noting that children make up most of the ineligible category). But the chart had become de-linked from the metadata, so to speak. So another takeaway is that you have to be very careful when you create and circulate data visualizations. You can’t assume that the visualization will remain in its original context, especially when it has viral appeal and touches on a subject people think they understand. Third and finally, if you want to suck up a lot of geeky social scientist time, post some nearly accurate but slightly misleading numbers with poor sourcing, and there goes our evening.