Color-balancing vote margins and vote totals in the US election map

cs702 · on Nov 3, 2020

Fantastic work.

The final map for the US election in 2016 is both the most accurate and the easiest to understand I've seen so far:

https://stemlounge.com/content/images/2019/10/muddy_america_...

In this map, saturation and lightness indicate vote density and color indicates the winning party, in each county.

IMHO, this is the kind of map that should be used by every media outlet to show election results.

Anon4Now · on Nov 3, 2020

Depends on the purpose. If you want to show the nuances, then, yes, it is effective. But, if I didn't know the results of the election, I don't think I could even take a guess looking at that map.

I don't believe there is a Grand Unified Theory of Election Maps. Use different maps to convey different facets of information.

jefftk · on Nov 3, 2020

If you want the map to show you the results of the election, then a county by county map is not useful anyway. Instead you'd want to know who won each state, and how many electoral votes each state counts for (or just color the whole country by the winner).

nostrademons · on Nov 3, 2020

> But, if I didn't know the results of the election, I don't think I could even take a guess looking at that map.

Like he said, "most accurate and easiest to understand". ;-)

boublepop · on Nov 4, 2020

Looking at the typical blue/red state map would not allow you to know the outcome either. You might feel that way, but only because it’s always accompanied by a direct display of the outcome above or below it.

This is a better representation of the map part, not an attempt at conveying aggregated result.

jpadkins · on Nov 3, 2020

you can really see the urban blue centers surrounded by grey suburbs surrounded by red exurbs. East and mid atlantic show this in several metro areas. US is not red state / blue state, but blue core / red rings.

throwaway0a5e · on Nov 3, 2020

You even see the popular destinations the people from the urban areas retire to.

dmurray · on Nov 3, 2020

But the county-level results aren't what's important in the election. For showing results, it makes sense to show all of Texas in red as soon as it's confirmed to have gone 50.1% to Trump, or whatever.

This map is useful in visualizing demographics and expected voter distribution. It's more valuable for showing election predictions than newsworthy results.

munificent · on Nov 4, 2020

> it makes sense to show all of Texas in red as soon as it's confirmed to have gone 50.1% to Trump, or whatever.

That only makes sense if a state's area is the same as the number of electors it has, which isn't the case.

If you want to show an electoral college map, you'd show each state in red or blue, with lightness based on the number of electors the state provides.

But, obviously, the intent of a county-level map is not to show electors, it's to show something about the demographic makeup of voter preferences.

cs702 · on Nov 4, 2020

> But, obviously, the intent of a county-level map is not to show electors, it's to show something about the demographic makeup of voter preferences.

Exactly: The map is meant to show "where the votes came from" without distorting any shapes.

mygo · on Nov 3, 2020

thank you =)

nathancahill · on Nov 3, 2020

Thanks for using gray rather than purple.

cryptofistMonk · on Nov 3, 2020

I wonder how much difference it would make if vote "density" was used to determine the saturation, rather than total number of votes.

By my thinking, a geographically large county with 500,000 votes appears much more significant than a smaller county with the same number of votes in this map, and adjusting for density could potentially correct that?

vilhelm_s · on Nov 3, 2020

I was thinking the same thing. According to Wikipedia, the county areas vary by three orders of magnitude, from San Bernardino County, California (51,947 km^2) to Kalawao County, Hawaii (31 km^2), so it seems it can be very misleading.

For example, North Carolina and California have similar population densities (80 people/km^2 versus 95 people/km^2), but on the "muddy" map North Carolina looks almost unpopulated while California is very emphasised---I guess this is because North Carolina is divided into smaller counties.

I guess in the case of San Bernardino it's a giant grey rectangle, so that particular one doesn't shift the red-blue impression very much, but still... :)

mygo · on Nov 3, 2020

something like votes per square mile may address this. Looks like there are some follow-up maps to be made =)

jrussino · on Nov 4, 2020

I agree - this was my one minor nit-pick after reading the post as well.

I'd love to see muddy maps for other past presidential elections if you can get a hold of the data (and this one too, once all the votes have been counted).

Really cool work, and well-explained!

vilhelm_s · on Nov 4, 2020

I experimented a bit with the source code at https://github.com/tuttlepower/VotingMap , you can add an area correction by changing the line

    fillOpacity: (us_votes[i].total_votes / 59828),

to something like

    fillOpacity: (us_votes[i].total_votes / (59828 * 50 * (feature.properties.AREALAND / 5.195e10))),

But I think in practice, this effect is swamped by the "upper fence" effect described in the original post. In other words, the scale is very far from linear anyway: by changing the magic number 59828 above you can make the map look dramatically different. I should really calculate what the (Q3 + 1.5 * IQR) value is, but above I just put in a "* 50" to make the overall impression of the map similar.

The resulting image: https://imgur.com/3K8Wwan
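A rough sketch of how the area correction and a recomputed upper fence could fit together, instead of the hand-tuned "* 50". The input shape (`totalVotes`, `areaLand`) and the helper names are hypothetical and not taken from the VotingMap repository, and the quantile convention (linear interpolation between sorted values) is only one of several in common use, so the fence may differ slightly from the original post's.

```javascript
// Quantile by linear interpolation between sorted values (one common convention).
function quantile(sortedValues, p) {
  const pos = (sortedValues.length - 1) * p;
  const lower = Math.floor(pos);
  const upper = Math.ceil(pos);
  const weight = pos - lower;
  return sortedValues[lower] * (1 - weight) + sortedValues[upper] * weight;
}

// Tukey's upper fence: Q3 + 1.5 * IQR.
function upperFence(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  return q3 + 1.5 * (q3 - q1);
}

// Opacity per county from votes per unit area, clamped to [0, 1].
// `counties` is a hypothetical array of { totalVotes, areaLand } objects.
function densityOpacities(counties) {
  const densities = counties.map((c) => c.totalVotes / c.areaLand);
  const fence = upperFence(densities);
  // Guard against a degenerate fence (e.g. all densities are zero).
  return densities.map((d) => (fence > 0 ? Math.min(d / fence, 1) : 0));
}
```

Because the fence is derived from the density values themselves, switching units (square metres, square miles) rescales both numerator and denominator and leaves the opacities unchanged.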

mygo · on Nov 4, 2020

yep the upper fence for the vertical scale is derived from the vote totals statistics. If the set of vote totals is now in votes per square mile, then there would likely be a different upper fence. Might be worth the number crunching to figure out what that number is.

BTW just to double-check, I'm assuming that in this given fillOpacity formula, that a fillOpacity > 1 just resolves to 1. So like fillOpacity : min(votesPerSquareMile/upperFence, 1).

vilhelm_s · on Nov 4, 2020

Yeah, the Esri fillOpacity property just becomes an SVG property, and the SVG specification requires that it gets clamped to [0,1] before rendering.

In addition to computing the quartiles correctly, I also realized you should probably use (AREALAND+AREAWATER) rather than just AREALAND... in the above image the great lakes counties look suspiciously overemphasized. :)

mygo · on Nov 5, 2020

let me know how it turns out! Do you have a link to your fork?

Symbiote · on Nov 3, 2020

I've seen maps showing population rather than area in several places in recent British elections.

Here's a recent example: https://odileeds.org/projects/hexmaps/constituencies/

And here, where the first graphic is from 1895 and uses this approach: https://www.geog.ox.ac.uk/research/transformations/gis/paper... ... and Figure 31 (p28) has an American example.

mygo · on Nov 3, 2020

Yeah, I'm on the same page. An idea for a follow-up map has been to do votes per thousand. I'm curious to find out what that resulting map would look like

notacoward · on Nov 3, 2020

My thoughts exactly. Giving the same color to the same population over different areas seems like a mistake, or at least a missed opportunity to provide more useful information.

brandmeyer · on Nov 3, 2020

EC-weighted, too.

contravariant · on Nov 3, 2020

That opens one hell of a can of worms though. Sure weighting by EC sounds reasonable, but frankly you'd really want to have it such that it looks mostly blue when the blue party wins and mostly red when the red party wins, which is nigh impossible because that's an incredibly nonlinear process.

Much better to optimize to make it look mostly red/blue when the red/blue party wins the popular vote, as that's actually a linear process.

brandmeyer · on Nov 3, 2020

> mostly blue when the blue party wins and mostly red when the red party wins, which is nigh impossible because that's an incredibly nonlinear process

The 2016 neutralizing map (right below the purple map) does this. I think it more closely matches people's perceptions about how their community aligns politically, too.

Literally white-washing (well, hue-desaturating) less populous areas out communicates something different. If you want to communicate impact on election outcome, then you just need to weight the vote per person based on people per elector instead of totaling the voting population in each area.

contravariant · on Nov 4, 2020

Technically white-wash is a more accurate term for it than desaturating, though personally I just view it as decreasing opacity against a white background.

Is people per elector the right measure for voting power though? There is an argument to be made (successfully in some cases [1]) that voting power is inversely proportional to the square root of the population. And of course the house seats are distributed in a different way, which minimizes the relative differences in voters per house seat between states [2].

Point being, voting power is a tricky thing to determine.

[1]: https://en.wikipedia.org/wiki/Penrose_method

[2]: https://en.wikipedia.org/wiki/Huntington%E2%80%93Hill_method
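To make the difference between the two weightings concrete, here is a small sketch comparing the per-voter weight implied by "electors per person" with the square-root rule behind the Penrose method. The populations and elector counts are made-up round numbers for illustration, not real state data, and the function names are hypothetical.

```javascript
// Per-voter weight if each vote is scaled by electors per person.
function weightByElectors(population, electors) {
  return electors / population;
}

// Per-voter weight under the square-root argument:
// an individual's voting power falls off as 1 / sqrt(population).
function weightBySquareRoot(population) {
  return 1 / Math.sqrt(population);
}

// Hypothetical states (illustrative numbers only).
const smallState = { population: 1000000, electors: 4 };
const largeState = { population: 16000000, electors: 34 };

// How much more a small-state vote counts than a large-state vote.
const electorRatio =
  weightByElectors(smallState.population, smallState.electors) /
  weightByElectors(largeState.population, largeState.electors);
const squareRootRatio =
  weightBySquareRoot(smallState.population) /
  weightBySquareRoot(largeState.population);
```

With these made-up inputs the two measures disagree by roughly a factor of two about how much extra weight the small state's voters get, which is the sense in which voting power is tricky to pin down.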

cryptofistMonk · on Nov 3, 2020

Yes, I was just thinking that, if you really wanted to control for the relative power of votes in different counties.

cryptofistMonk · on Nov 3, 2020

I guess I'm really talking about the "lightness", not saturation

an_opabinia · on Nov 3, 2020

A red-and-blue map is more like a brand logo for election news. The thumbnail for the Facebook or Twitter story. The saturated red and blue colors have an almost astrological meaning to people, it's got nothing to do with information.

Maps people struggle with this, they're always using maps to try to visualize a piece of data when almost always a short table would be better.

Data graphics people are themselves a subset of a family of wonks that spend 50% of their day rehashing the same tired stories, and the other 50% lamenting how innumerate people are.

Did you ever consider that maybe the reason the maps are stupid is because they're stupid as a whole, not because there's something wrong with the reader or the designer?

vharuck · on Nov 3, 2020

In my experience of reporting data to non-data-savvy people, I've learned people love maps. Maybe because they like finding their home state/county/city. For whatever reason, they love maps.

I've also learned the hardest part of my job is getting people to read and remember the data. Replacing their preconceived notions is difficult. I'll use any cheap trick (short of showing data inaccurately) to do this. So I make them the maps they love. I use the colors they expect. If I show tables, they're used by experts for further analysis and ignored by the people who actually make decisions.

>The saturated red and blue colors have an almost astrological meaning to people, it's got nothing to do with information.

That meaning is information. Is it the best thing for a nuanced data graphic? No. But showing a nuanced graphic is like handing somebody Principia Mathematica when they ask what 1 + 1 is. Sure, it has the answer. And sure, it's great for people with mathematical skill. But it's overkill most of the time.

cycomanic · on Nov 3, 2020

I can tell you that in pretty much every presentation context a graph (no matter what graph) beats a table. We are much more accustomed to visually processing things than through numbers. I mean take the example of the given map (and let's keep it to the "simple" winner takes all map), the table for this map would definitely be so big that you could not easily tell which party won more counties. With the map it would be one glance.

Yes, as the author points out, it does not tell you anything about the number of people in the county, but trying to convey information in graphics well is difficult (and when people get it right it can be truly amazing, e.g. Rosling's talks, which really visualised public health effects and enabled many people to understand them for the first time). However, everyone who regularly has to present data to people should invest time in learning how to graph and visualise their data.

mygo · on Nov 3, 2020

I kind of see where an_opabinia is coming from. Being able to tell which candidate won more counties could be accomplished via a 2x2 table.

But it would miss out on other information, which I think people are interested in exploring as well. The geospatial relationships are better presented in a map, and are preserved in a choropleth map in a way that hex maps and cartogram maps would distort.

jeffbee · on Nov 3, 2020

If you are going to insist on showing counties, why not also show boroughs of Alaska, and at their realistic sizes? If you do this the usual "OMG red land" depiction of 2016 gets flipped on its head, because the vast and empty regions of Alaska voted for Clinton.

If you want to argue against showing the boroughs of Alaska because nobody lives there, then we need to talk about all of those empty divisions of the lower 48, too.

mygo · on Nov 3, 2020

Thanks for the comment. Yep Alaska was a tough one. Although Alaska is separated by boroughs, their voting data is not. Researchers come up with estimates for how each borough likely voted, but there can be different estimates depending on the methods used. Because of this, it felt more accurate to go with the concrete data that we had, even if it was not at the borough level. Although, I think there is merit to the argument that borough estimates should be used, even if they may not be accurate down to the last recorded vote.

jeffbee · on Nov 3, 2020

I think your map appropriately deemphasizes Alaska according to its near-complete absence of voters. My remark was addressed more toward the organizations who insist on going with the bright-red/bright-blue county map.

pwinnski · on Nov 3, 2020

Pretty beautiful, without the shape distortions that usually plague population-based electoral maps.

Land doesn't vote, people do.

gpm · on Nov 3, 2020

This is cool.

I suspect that the value (white to dark scale) should be proportionate to votes/square km (or even better votes/pixel, since the projection isn't perfect at preserving area) instead of just number of votes though. In the current formula a huge county with 1000 votes is as dark as a tiny county with 1000 votes, but is visually much larger.

st1ck · on Nov 4, 2020

Exactly. It'd only make sense when 2 counties, each with area S and N votes, would have the same color as a county with area 2 * S and 2 * N votes.
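That consistency condition can be checked with a tiny sketch. The caps and the values of N and S below are arbitrary illustrative numbers and the function names are hypothetical; the point is only that a density-based scale is unchanged when two identical counties are merged, while a total-based scale is not.

```javascript
// Opacity driven by raw vote count (the behaviour being criticized).
function opacityByTotal(votes, voteCap) {
  return Math.min(votes / voteCap, 1);
}

// Opacity driven by votes per unit area.
function opacityByDensity(votes, area, densityCap) {
  return Math.min(votes / area / densityCap, 1);
}

// Arbitrary illustrative values: N votes on area S.
const N = 1000;
const S = 50;

// One small county versus the merged county with 2N votes on area 2S.
const smallByDensity = opacityByDensity(N, S, 100);
const mergedByDensity = opacityByDensity(2 * N, 2 * S, 100);
const smallByTotal = opacityByTotal(N, 10000);
const mergedByTotal = opacityByTotal(2 * N, 10000);
```

Here `smallByDensity` and `mergedByDensity` come out equal, while `mergedByTotal` is twice `smallByTotal`, so the merged county would be drawn darker under the total-based rule despite describing the same voters on the same land.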

mygo · on Nov 4, 2020

agreed. on it =)

vmception · on Nov 3, 2020

I had used similar data to speculate on the minimum number of people needed to move from the cities to the various counties in order to make the whole country a single party, in theory. A reality is that one party has the numbers of people necessary while the other does not.

I was curious as to how much that would cost so it would be clear whether a super PAC could fund it. It would involve housing people long enough to be eligible to be registered in that state.

But then the pandemic happened and people did it on their own. We'll see!

lopmotr · on Nov 3, 2020

One party has the numbers they need because they need those numbers to be competitive in the current system and have made whatever concessions they needed to to achieve them. If the demographics changed then the parties would adapt so that again they both just barely have what they need. The parties would adapt by campaigning to different groups of voters or changing their policies to attract the other party's voters. It's a waste of resources and concessions to gain more voters than are useful for winning.

liminal · on Nov 3, 2020

Bivariate color ramps are usually a disaster, but they work beautifully here. Nice job!

tantalor · on Nov 3, 2020

> The statistical upper fence can be calculated using the formula Q3 + 1.5 * IQR

What is Q3? IQR?

chongli · on Nov 3, 2020

Third quartile plus 1.5 times the interquartile range (Q3 - Q1). The Q1, Q2, Q3 quartiles correspond to the 25th, 50th, and 75th percentiles. Q2 is also known as the median.
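As a worked example of those definitions, the sketch below computes the quartiles, the IQR, and the upper fence for a small made-up list of county vote totals. The numbers are hypothetical, and the percentile convention (linear interpolation on the sorted data) is an assumption; other conventions give slightly different quartiles.

```javascript
// Percentile by linear interpolation on a sorted copy of the data.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const pos = (sorted.length - 1) * (p / 100);
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Quartiles, interquartile range, and Tukey's upper fence.
function fenceSummary(values) {
  const q1 = percentile(values, 25);
  const q2 = percentile(values, 50); // the median
  const q3 = percentile(values, 75);
  const iqr = q3 - q1;
  return { q1, q2, q3, iqr, upperFence: q3 + 1.5 * iqr };
}

// Hypothetical vote totals for nine counties, one of them very large.
const summary = fenceSummary([
  2000, 4000, 6000, 8000, 10000, 12000, 14000, 16000, 900000,
]);
// summary: q1 = 6000, q2 = 10000, q3 = 14000, iqr = 8000, upperFence = 26000
```

The single 900000 entry barely affects the fence, which is why the fence works as a cap: anything above it is treated as an outlier and drawn at full intensity.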

s17n · on Nov 3, 2020

It's a nice attempt, but capping the county population at 60000 means the map is effectively ignoring the majority of the population (if you consider the number of "ignored people" to be the sum over all counties of the county's actual population minus 60000)
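The "ignored people" measure described here is easy to state in code: sum the excess over the cap across counties and divide by the total. This is only a sketch with made-up county sizes and a hypothetical function name; it says nothing about the real figure for the US.

```javascript
// Fraction of the total count that lies above the per-county cap,
// i.e. sum(max(0, count - cap)) / sum(count).
function clippedShare(counts, cap) {
  const total = counts.reduce((sum, c) => sum + c, 0);
  const clipped = counts.reduce((sum, c) => sum + Math.max(0, c - cap), 0);
  return total === 0 ? 0 : clipped / total;
}

// Hypothetical county sizes: three at or below the cap, one far above it.
const exampleShare = clippedShare([10000, 30000, 60000, 600000], 60000);
```

In this toy input one large county accounts for all of the clipped amount, 540000 out of 700000, which illustrates how a handful of very large counties can dominate the measure.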

aabhay · on Nov 3, 2020

It’s only capping the saturation, so that counties with > 60000 residents are shown in the brightest color. The color itself though still varies from red to blue based on percentage vote margin. In my mind, this is a great tradeoff, with the only exception that dark grey counties (large counties with a roughly 50:50 split) are way more important than they look since humans tend to de-emphasize greys. I would have preferred a brown or something. Purple is not used as it has an imbalanced perceptual effect.
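A minimal sketch of the two-channel idea described above: the margin picks the color and the capped total picks the intensity. The red-to-grey-to-blue interpolation, the style property names, and the two-party simplification are all assumptions made for illustration; the original post's exact color math may differ.

```javascript
const RED = [255, 0, 0];
const GREY = [128, 128, 128];
const BLUE = [0, 0, 255];

// Linear interpolation between two RGB triples, t in [0, 1].
function lerpColor(from, to, t) {
  return from.map((v, i) => Math.round(v + (to[i] - v) * t));
}

// margin in [-1, 1]: -1 is all red-party votes, +1 is all blue-party votes.
// An even split maps to grey rather than purple.
function colorForMargin(margin) {
  return margin < 0
    ? lerpColor(GREY, RED, -margin)
    : lerpColor(GREY, BLUE, margin);
}

// Style for one county; third-party votes are ignored in this sketch.
function countyStyle(blueVotes, redVotes, voteCap) {
  const total = blueVotes + redVotes;
  const margin = total === 0 ? 0 : (blueVotes - redVotes) / total;
  const [r, g, b] = colorForMargin(margin);
  return {
    fillColor: `rgb(${r}, ${g}, ${b})`,
    fillOpacity: Math.min(total / voteCap, 1), // the cap only limits intensity
  };
}
```

Under this scheme a county with 600000 votes and one with 60000 votes at the same margin get identical styles, while a county at 15000 votes is drawn at a quarter of the intensity, which is the tradeoff being debated in this subthread.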

s17n · on Nov 3, 2020

Right, so basically everybody above the 60000 cap is ignored. Basically the difference between a county with 30000 and 60000 is a lot less important than the difference between a county with 60000 and 600000 - this map is truncating most of the population.

drc500free · on Nov 3, 2020

Really nice work, but also quite misleading. The "real" map isn't interesting enough, so 97% of votes are excluded from the most populous liberal counties using an artificially low ceiling.

cs702 · on Nov 3, 2020

As the author of the OP explains, those counties are shown with 100% saturation. Moreover, the author also shows a map without the upper fence[a] and explains why he thinks that map is misleading.

[a] https://stemlounge.com/content/images/2019/10/muddy_no_upper...

tene · on Nov 3, 2020

I've failed to find any text in the article that discusses why that map is misleading.

There's a section that discusses why it would be visually misleading to use a logarithmic scale, but I can't find anything that justifies the use of the statistical upper fence.

I'd be very interested if anyone could comment on why the statistical upper fence is considered less misleading for visualizing election results.

mygo · on Nov 4, 2020

It’s kind of like the exposure on a photograph where there are some extremely bright spots, as bright as the sun. If you set the exposure to not clip those bright spots, you won’t be able to see the rest of the photo. Due to not being able to find a scale that faithfully reproduced the entire dynamic range (without visually equating population mountains to population plains), some clipping had to occur; a compromise. We could clip the few outliers using a standard statistical convention and retain the highs, mids, and lows that make up the rest of the image, or keep the statistical outliers and end up clipping pretty much everything else as a result. I included both renditions.

You’re correct that I didn’t say the overexposed graph is misleading. Neither graph is wrong, they both tell truths.

An article could probably be written about the one you’re inquiring about, that discusses how Los Angeles County, Cook County, and ~eight other counties are unlike any other county in the US in terms of population. Completely in a league of their own. As distant as they are, they may be more alike with each other than the counties surrounding them.

But there is also a lot of information to be gleaned from the photograph with the upper fence that leaves the non-outliers within gamut.

LeifCarrotson · on Nov 3, 2020

As the article describes, the 'artificial' ceiling is the standard quantity known in statistics as the 'upper fence':

https://en.wikipedia.org/wiki/Outlier#Tukey's_fences

I agree that it would be nice to have a scaled version of the top few counties - maybe an area-corrected inset of the most populous counties over the Pacific, Atlantic, Gulf, or Lake Michigan.

Would Cook county fit in the lake, or would it need to overlay Canada? It's half the population of the state of Michigan...

Regardless, hoping this gets updated for 2020.

vmception · on Nov 3, 2020

Now do one with tooltips on hover or tap so we can see what the margin is!

mygo · on Nov 3, 2020

I've got just the thing. An Ohio State economics student made this adaptation: https://tuttlepower.github.io/VotingMap/?fbclid=IwAR0McVXHOI...

vmception · on Nov 3, 2020

Nice, and to move the goal post just a little further, what is the minimum number of people in the minimum number of counties where people from the populous blue cities could move to flip every state blue.

Only saying it this way because the population of blue is larger.

mygo · on Nov 4, 2020

Since electoral college votes are dispersed at a state level, it doesn't matter if those blue votes come from blue counties, so long as they go from a blue state to a red state.

Unless you're talking about the displacement that would be required for every county to be colored blue?

vmception · on Nov 4, 2020

I’m talking about flipping every state blue since most states aren’t “red” by a large majority or however that state happens to count