I really have no idea where this idea that the polls failed comes from. There were only two bad calls this cycle and every other outcome was within the margin of error. It was pretty much the same case in 2016, when people who had no idea what they were talking about suddenly decided they were certain polling had failed because they couldn't grasp the concepts of margin of error and sample size.
Polls can only guess about turnout and then try to work backwards from there. The turnout estimates were wrong, but not shockingly so, and as a consequence a lot of polls ended up having the result land at the far end of their margin of error. Nothing went wrong. Polling is hard. Get over this idea that you can have some sort of certainty regarding an election until we actually hold the election.
If you dig into the margins they aren't looking very good, even if they got a fair number of eventual winners correct.
Predicted vs Actual (FiveThirtyEight's averages vs NYTimes' current tally; Biden's margin is positive)
PA: +4.7 vs +0.5 (-4.2)
FL: +2.5 vs -3.4 (-5.9)
TX: -1.5 vs -5.9 (-4.4)
OH: -0.6 vs -8.1 (-7.5)
IA: -1.5 vs -8.2 (-6.7)
NC: +1.7 vs -1.4 (-3.1)
WI: +8.3 vs +0.6 (-7.7)
GA: +0.9 vs +0.1 (-0.8)
MI: +8.0 vs +2.6 (-5.4)
AZ: +2.6 vs +0.6 (-2.0)
NV: +6.2 vs +2.0 (-4.2)
They pretty consistently overpredicted Biden's margin by about 4-7% in almost all of the swing states. Even if that's within, or close to, the margin of error, there's a systemic issue if it's happening in almost every state.
At some point, reporting on polls when they have this bad of a track record serves no journalistic purpose and just confuses the public. Like, the discussions people were having in the days before the election about Biden's strength barely resembled reality.
I respect FiveThirtyEight and their work, and I do think they're intellectually honest and generally speaking good at their jobs. But they can only be as good as their sources, and when the sources are this terrible, those sources shouldn't be reported on, and certainly not given the statistical and scientific sheen of authenticity 538 gives them. Like, that site is starting to have a net negative influence on the world, and when that is the case with a journalistic institution, what are you even doing?
One of the things that's annoying about 538 is that they don't admit they're wrong. The excuse they keep repeating is "Trump had a 1 in 10 chance of winning. That's about the same chance as rain in LA, but it does rain in LA!" What 538 seems to have forgotten in their condescension is that we are not looking to them to explain the odds. We are looking to them to provide an accurate probability estimate. Their defense is completely ridiculous. I follow betting odds, which had Biden/Trump at a 60/40 split before election day - what's the point of polling when you can look at people putting money on the line and let the incentives work for you?
I'm not sure that's quite fair. They do write some excellent retrospectives and admit when they made mistakes, at least more so than almost any other news outlet I've seen:
I have noticed them getting a bit more defensive recently, which is irritating, but I think they are honest. There's only so much "garbage in, garbage out" they can compensate for. If this ends up at +74 EV, which is where it looks to be heading, it'll be a "Z = -0.6" prediction (not actually normally distributed, but it's the easy calculation), meaning about 27% of the outcomes were more favorable to Trump. It's not great, but I appreciate them giving a realistic model.
I don't see him posting about how Betfair was way more accurate than his polls.
I also get that Nate Silver doesn't poll himself, but he aggregates polling information, assigns grades, and mixes it into his formula. I just have zero trust in it. It's not like we get to test his predictions often: a sample size of 1 every 4 years.
Biden got out to around $4-$5 during election night. I wouldn’t put much weight in the betting when it appeared to fail to account for the expected composition of postal votes. Trump should have been at best a slight favorite before the effect of the postals started showing in the counts.
And why do you think they failed to provide an accurate probability estimate?
If I tell you that the odds of your dice roll being a 6 is only 16%, and you roll a 6, does that mean I failed to provide an accurate probability estimate?
The only way to judge the accuracy of a probabilistic model is by quantifying the weighted error across numerous predictions, not by looking at two yes/no outcomes. This is the exact point they are trying to get across, which you wrote off as annoying.
You do raise an interesting point about the betting markets - I too am very curious about whether they have a better track record than people like Nate Silver. Looking forward to someone analyzing their relative accuracy over a large sample of independent elections.
I saw predictit.org swing from pro Biden the morning of the election, to Trump in the afternoon, to Biden the next morning, so it seemed they were as confused as anyone.
They also didn't seem very aware that mail votes (leaning Biden) would likely be counted after in person votes (leaning Trump), which 538 had been predicting would cause a temporary pro-Trump lean for ages.
If you roll that same die 10 times and you get a 6 each time, then we can say that your probability estimate was wrong. But that's what happened with the polling of these states. The fact that the estimates consistently overestimated the Biden vote points towards a bias in the model used to estimate likely voters. If this were just a matter of the margin of error at work, each pull at the lever (each state) should be randomly distributed around the estimate. But that wasn't the case.
> If you roll that same die 10 times and you get a 6 each time, then we can say that your probability estimate was wrong. But that's what happened with the polling of these states.
The 538 model is explicitly based on the assumption that the state-polling errors are correlated to one another to some extent. I.e., if Trump performs better than expected in OH, he will likely perform better than expected in FL as well. This is why they rated Trump's chances as 0.1, not 0.1^8. Given how polling works, this is exactly the right assumption to make.
Hence my earlier comment that if you want to evaluate the accuracy of 538 or any other model, you need to evaluate it across numerous different elections/events, over an extended period of time. Not a single day of elections in a single country.
This is kind of missing the point. The issue is the bias in the state-level polls. 538's model is to predict the winner of the election based on state polling data from other sources. The state level predictions are just averages of polls presumably weighted by quality. But this averaging cannot remove bias in the polls if that same bias is in many polls.
538 is a poll aggregator, not a pollster. If the polls are systemically wrong, which is what is being alleged above, there is <expletive> all that 538 can do to fix that.
Yes, I’ve been following 538 for a while now. They also grade each pollster.
The point is - 538’s input is a bunch of polls, their output is a prediction. Whether they do the polls or aggregate them is not relevant - they’re analysts whose job is to provide accurate estimates.
You can’t shift the blame onto inaccurate underlying polls - Nate has said time and again that they look at many aspects in their estimates, not just polls.
They look at many aspects at the beginning of the race. By the end, these other aspects, the “fundamentals”, are purposefully dialed down to 0 and all that remains is an aggregate of polls. This is based on the theory that polls should be more accurate the closer you get to Election Day, because voters have less time to change their minds. If those polls are systemically wrong, then there is nothing 538 can do to fix that; it’s a literal garbage-in-garbage-out moment.
Looking at trends, which is easy to do on real clear politics, you could have seen the Senate was going to be extremely close, and the presidency, while not the blowout everyone expected thanks to 538, would still favor Biden. Here's one forecast that was arguably closer than 538: https://www.270towin.com/
The polls did tighten at the end, but you can't just look at a snapshot (even near the end) to account for the trend. The polls narrowed in some of these states, but the final results were still within the MoE.
You can go state by state [1] to see which polls were more or less accurate, within the MoE, compared to the final result and the trends in each state. 538 gave Biden a better chance of winning 400+ EVs [2] than of winning with the 306 he's likely to end up with. It's this distribution that could have done a better job of accounting for severe polling errors in some states.
538 did do CYA posts [3], and while carrying forward the 2016 error in Ohio seems right, that post still projects a win of 335+ EVs. "Optimistic for Biden" is the kind way of describing the final 538 projections; "considerably more wrong than the individual polls" is more accurate. If their distribution in [2] had been better, I would be more willing to give them a pass.
They provided the probability distribution. The fact that you can’t handle math and need some sort of absolute certainty for a future event is not 538’s problem.
That's a bit strong for what Gelman said. I'm a big fan of Gelman (and learned from his books!), but he specifically mentioned that both Gelman et al.'s model and 538's model did indeed capture the outcomes in their probability distributions, but that to improve performance going forward it was much better to predict closer to the median than closer to the tails. (And funny enough, Gelman gave 538 some grief earlier for making a model with very wide tails.) This is a nuanced but very fair criticism, and taking a Twitter-style summary of it is, I think, overly reductionist.
Ah yes. Mr 'let me tell you why Nate is wrong' Gelman, who is now Mr 'let me tell you why the fact that I missed bigger than Nate is not my fault and in fact is entirely the fault of these other people' Gelman. Forgive me if I find his excuses laughable, but I guess if it makes him feel better about himself we can humour him. He even manages to choke his first rant by missing once again on EV and vote percentages.
it's not one single election -- it's consistent failure over multiple state elections, by large margins, all in the same direction -- which falls beyond any reasonable probability
I don't think much of evgen's unreasonable personal attack. But 538 isn't necessarily claiming that the per-state error will be normally distributed around their predictions.
I don't know the specifics of their model, but probably they are claiming "with these polls, the probability of this outcome is...". The polls being consistently biased doesn't tell us much about 538's model. They said Biden would almost surely win, and despite a massive surprise in favour of Trump, Biden won.
And even if Biden had lost, 10% upsets in presidential races are expected to happen about once every 10 elections like this one.
> If per-state error isn't normally distributed, that's evidence of bias, or bad polling.
No!
Assuming the per-state error would be normally distributed in some neutral world makes huge assumptions about the nature of the electorate, of polling, and of the correlations of errors between states; you can't do that! You would specifically /not/ expect per-state error to be evenly distributed, because the nature of the error would have similar impacts on similar populations, and similar populations of people live in different states in differing numbers.
You should review the literature about the nature of the (fairly small) polling misses that impacted the swing states and thus disproportionately the outcome in the 2016 election. You will probably find it interesting.
There are unavoidable, expected, sampling errors which are, by definition, random. That's why valid, trusted polls calculate a confidence interval instead of a single discrete result.
Other types of "errors" -- election results that repeatedly fall outside the confidence interval, or are consistently on only one side of the mean -- only arise when the poll is flawed for some reason. Maybe you relied on landlines only, maybe you spoke with too many men, or too many young people, asked bad questions, miscalculated "likely voter," whatever. Accurate, valid, trusted polls don't have these flaws, the ONLY errors are small, random, expected sampling errors.
> Accurate, valid, trusted polls don't have these flaws
Yes, they do. Because (among many other reasons) humans have a choice whether or not to respond, you can't do an ideal random sample subject to only sampling error for a poll. All polls have non-sampling error on top of sampling error, it is impossible not to.
when polls don't match up with reality, as they didn't in 2016, the pollsters have a responsibility to re-calibrate the way they conduct the poll. Ask different questions, find new ways of obtaining respondents from all demographics, adjust raw data, etc. A professional pollster doesn't just get to say, hey, some people didn't want to talk to me ¯\_(ツ)_/¯
> when polls don't match up with reality, as they didn't in 2016, the pollsters have a responsibility to re-calibrate the way they conduct the poll.
Pollsters do that continuously, and there were definite recalibrations in the wake of 2016.
OTOH, the conditions which produce non-sampling errors aren't static, and it's impossible to reliably measure even the aggregate non-sampling error in any particular event (because sampling error exists, and while its statistical distribution can be computed, the actual error attributable to it in any particular event can't be, so you never know how much of the actual error is due to non-sampling error, much less to any particular source of non-sampling error).
> That's why valid, trusted polls calculate a confidence interval instead of a single discrete result.
That is what each of these statistical models did, yes. And the actual outcomes fell into these confidence intervals.
> Other types of "errors" -- election results that repeatedly fall outside the confidence interval, or are consistently on only one side of the mean -- only arise when the poll is flawed for some reason.
Or the model was inaccurate. Perhaps the priors were too specific. Perhaps the data was missing, misrecorded, not tabulated properly, who knows. Again, the results fell within the CI of most models, the problem was simply that the result fell too close to the mean for most statisticians' comfort.
>That is what each of these statistical models did, yes. And the actual outcomes fell into these confidence intervals.
The CI is due to sampling error, not model error. If the error of the estimate is due to sampling error, the estimate should be randomly distributed about true value. When the estimate is consistently biased in one direction, that's modelling error, which the CI does not capture.
> If the error of the estimate is due to sampling error
What does "estimate" mean here? Gelman's model is a Bayesian one, and 538 uses a Markov Chain model. In these instances, what would the "estimate" be? In a frequentist model, yes, you come up with an ML (or MAP or such) estimate, and if the ML estimate is incorrect, then there probably is an issue with the model, but neither of these models use a single estimate. Bayesian methods are all about modelling a posterior, and so the CI is "just" finding which parts of the posterior centered around the median contain the area of your CI.
I'm not saying that there isn't model error or sampling error or both. I'm just saying we don't know what caused it yet.
> Landed within the confidence interval? Are you kidding? CI is generally 2-4 points in these election polls.
The models and their data are public. The 538 model predicted an 80% CI of electoral votes for Biden of 267-419, with the CI centered around 348.49 EVs. That means that Biden had an 80% chance of landing in the above confidence interval. Things seem to be shaking out to Biden winning with 297 EVs. Notice that this falls squarely within the CI of the model, but much further from the median of the CI than expected.
So yes, the results fell within the CI.
Drilling into Florida specifically (simply because I've been playing around with Florida's data), the 538 model predicts an 80% CI of Biden winning 47.55%-54.19% of the vote. Biden lost Florida, and received 47.8% of the vote. Again, note that this is on the left side of this CI but still within it. The 538 model was correct; the actual results just resided in its left tail.
Dude, you're gaslighting by using the national results as evidence instead of the individual states, which is what this has always been about since my original comment. Nearly every consequential state fell at, or beyond, the tail end of 538's confidence interval (BTW, who uses 80%? and not 90-95%?), on the same side. A bit closer to the mean in AZ and GA but same side, over-estimating Biden's margin of victory. Deny it all you want, gaslight, cover your eyes, whatever -- but clear, convincing, overwhelming evidence of a systematic flaw or bias in the underlying polls is right there in front of you.
Many political handicappers had predicted that the Democrats would pick up three to 15 seats, growing their 232-to-197 majority
Most nonpartisan handicappers had long since predicted that Democrats were very likely to win the majority on November 3. "Democrats remain the clear favorites to take back the Senate with just days to go until Election Day," wrote the Cook Political Report's Senate editor Jessica Taylor on October 29.
> Nearly every consequential state fell at, or beyond, the tail end of 538's confidence interval
While I haven't checked each and every individual state, I'm pretty sure they all fell within the CI. Tail end yes, but within the CI.
> (BTW, who uses 80%? and not 90-95%?)
... The left edge of the 80% CI shows a Biden loss. The point was 538's model was not any more confident than that about a Biden win. So yeah, not the highest confidence.
> Deny it all you want, gaslight, cover your eyes, whatever -- but clear, convincing, overwhelming evidence of a systematic flaw or bias in the underlying polls is right there in front of you.
Posting a bunch of media articles doesn't prove anything. I'm not saying there isn't systemic bias here, but your argument is simply that you wanted the polls to be more accurate and you wanted the media to write better articles about uncertainty. There's no rigorous definition of "systemic bias" here that I can even try to prove through data, all you've done is post links. You seem to be more angry at the media coverage than the actual model, but that's not the same as the model being incorrect.
Anyway, I think there's no more for us to gain here by talking. Personally, I never trust the media on anything even somewhat mathematical. They can't even get pop science right; how can they get something as important as an election statistical model correct?
Not necessarily. Errors, like outcomes, are not independently distributed in US elections. Politics are intertwined and expecting errors and votes to be independent on a state (or even county) basis is overly simplistic. This is also what makes modelling US elections so difficult.
Sampling errors are random, and expected. Other types of misses are not simple "errors" but polling flaws, like sampling a non-representative group, ignoring non-responders or assuming they break the same as the responders, asking poorly-worded questions, etc.
Occasional flaws in polling are understandable and tolerated. But when those misses repeatedly line up the same way, and are rather sizeable, that's evidence of either systematic flaws or outright bias.
I'm not sure what a "sampling error" is. To echo the sibling poster, per-state sentiment is not normally distributed. For example, we know Trump is more popular among white men than other demographics. This means that if we were to create a random variable that reflected the sentiment of white men throughout the US, we would (probably though I'd have to dig deeper into the data) presume to see a higher median vote count in this demographic. However, we cannot say that Trump's popularity in Massachusetts is independent from his popularity in New York, because his popularity in the white male demographic is the dependent variable between both random variables.
I was discussing in good faith, so I'm not sure why you chose to be snarky. Let's clarify here, I'm not sure what "sampling error" in this case would be, such that it is distinct from electoral trends at large. The random variables in question _are_ demographic groups. How is it meaningful to discuss sampling error if your assumption is that state and county data is independently distributed? The poll data that Gelman et al used is public data, I urge you to take a look and work with it.
The inputs it uses to spit out probabilities is known to be bad. Any scientist or researcher who claimed to get valid results from known bad inputs would be ridiculed.
To offer a concrete example here, survey samples are often biased based on who actually sees and fills out a survey. A common technique used to overcome a non-representative sample is called post-stratification. There are, of course, limits to post-stratification, especially in instances of low sample sizes, but techniques to overcome issues with data are well known.
Science does not require an unbiased sample from a normal distribution to work. Bias is a technical term that the field of statistics is very comfortable working with. Scientists can also often get good results out of biased inputs.
538 has corrections for bias already. They seem to have worked in this instance - I repeat myself but: massive surprise, Biden still president.
You are pointing at evidence that 538 correctly called 11/12 races using statistics, and their confident call on a Biden president withstood a 4-7% swing (!!).
The existence of bias doesn't invalidate their predictions. Everyone knows that polls can be badly off target in a biased way - that isn't a new phenomenon.
When they talk about X% chance of Y being president they should be optimising to the outcome, not the margins.
It's not like we do elections every month to test out their probability distribution against empirical data. The distribution collapses into a binary outcome at the end.
I have a die. I claim the distribution is of equal outcome for each side. Well... we don't get to test the die more than once. A sample size of 1 does not prove that 538's predictions were right (or wrong).
Thanks for assuming I can't do math - hard to argue with that, since I am actually pretty bad at it. :-)
Everyone is bad at probability and statistical distributions, not just you. The problem with modeling elections is that there are so few of them and the data is very noisy and until quite recently rather suspect. Let's not pretend that this was a normal election, either in the candidates running or in the manner in which the campaign and election was conducted.
As to the question of why bother, it is because bad polling is better than no polling at all. Campaigns are now multi billion dollar enterprises managing tens of thousands of temporary employees for the creation of a product that will only be sold once and in 18+ months from when they start the process. Any data is better than nothing.
The fact that the public has become obsessed with polls is probably due to the ongoing nationalization of politics.
> I respect FiveThirtyEight and their work, and I do think they're intellectually honest and generally speaking good at their jobs.
But 538 chooses how to weight the polls. For example, they only gave Rasmussen (which was far closer on these) a C rating, preferring less accurate polls.
FWIW, their articles clearly lean left [1], IDK if that affects their analysis/forecasts.
I am reminded of all the times in the last four years (really my entire adult life) where during policy discussions of any magnitude the ultimate “kill” for an idea is “but it’s only popular with $small_number of the vote” or “but poll after poll shows it as a losing position.” It’s used as a cudgel to push out things that would actually make people feel their government is doing things with impunity. Donald Trump’s approval rating rarely if ever dipped below ~43% (with the usual and huge error bars) and consequently republicans reported they couldn’t go against him ever because of the polls. “Spending political capital” is a concept where a ruling party takes unpopular actions/enacts unpopular law because it’ll only push their approvals toward their long run averages.
I think this needs to be a serious topic of discussion. After this administration, “governing by polling firm” seems disingenuous at best and outright detrimental to all involved.
538 does make adjustments based on what they think are biases in individual polls. So they can't blame it all on their sources. They've taken on some responsibility to evaluate those sources and adjust their model accordingly. The polls had significant bias, and 538 failed to fully adjust for it.
One thing I found interesting about the final polling averages was that they were much more correct about Biden's share of the vote, even when they were very wrong about the margin itself (since these polls don't add to 100%, that's possible). See below for the 538 polling average vs. the current NYT tally (% for Biden):
PA: 50.2% vs 49.7%
FL: 49.1% vs 47.8%
TX: 47.4% vs 46.3%
OH: 46.8% vs 45.2%
IA: 46.3% vs 44.9%
NC: 48.9% vs 48.6%
WI: 52.1% vs 49.5%
MI: 51.2% vs 50.5%
GA: 48.5% vs 49.3%
AZ: 48.7% vs 48.9%
NV: 49.7% vs 49.9%
Most are within ~1% or so (some for Biden, some for Trump), with the outliers of Wisconsin (-2.6%), Ohio (-1.6%), and Iowa (-1.4%), all in Trump's favor, still not that far off (relative to margins of error).
I'm not sure what that "missing" bit in the polls really means (undecided?) but it seems the issue was that that bit of the electorate ended up going entirely for Trump in many places.
That's really interesting, I hadn't sliced it that way yet.
I know FiveThirtyEight allocates undecideds evenly - if 8% respond undecided they assume 4% will break Biden and 4% will break Trump. In one of the podcasts they discussed this assumption and it's what they have found to be the most accurate historically.
We might be seeing a shift towards that no longer being true - maybe 75% break red and 25% break blue now.
Edit: Also, it looks like Jorgensen (Libertarian) significantly underperformed her polling (~3% polled vs ~1% actual) in the first few states I spot-checked. If those voters broke for Trump, that makes up about 2% of the error. That'd take a lot more in-depth checking though.
Probably worth noting that the asymmetric break of the undecideds in 2016 was what won the election for Trump. There were a lot of undecided voters and a lot of them decided for Trump late in the process so models that assumed a traditional 50-50 split were wrong.
There definitely seems to have been systemic error. Note though that Biden's margins in many of these states are likely to rise by 0.5-1.0% as they finish counting ballots.
There are tons of sources of systematic error in a "strange year" election plus a "nontraditional incumbent".
One of the most plausible ones mentioned by Nate was the fact that people who work from home were more likely to be reached by pollsters this year, and may skew D.
But you didn't actually show the margin of error. In any case, I think I'd take this argument more seriously in an article of its own, rather than on a back-and-forth forum like this. Not enough opportunities to request and provide context.
This exactly. If there's one thing all my physics and math teachers burned into my head, it's that a number without an error is meaningless.
Florida for example may be off by 8 points, but 538 gave Trump a 1/3 chance of winning, so it was well within the error. What people don't realize is that the errors on polls are pretty damn big, and elections in modern times have been extremely close. Like most of the swing states come down to under 100k votes / 1%. It's insane how close these races are, and no polls will ever have any chance at predicting things like that.
The problem is even these have narrowed as votes have been counted. PA might be a ~1 point victory. Many of these margins have shrunk. We can't really say what the 'actual' is until all the votes are counted.
Not only that, 538 gave Trump a 1/3 chance of winning Florida, so it's not a surprise at all, and all things considered now that the votes are almost final, 1/10 chance for Trump to win seems reasonable. But as you mention, there's no way to figure out what the true probability was.
The bigger mistake here though is looking at the mean and ignoring the error on those values. 538 is fairly conservative in their calculations and their error bars are pretty big, so almost all of these results are well within their error bars.
Obviously, the focus is on the presidential polling, but I wonder if polling is off by the same amount when there are polls regarding what Americans want or think about any given topic.
My issue with polling is that many races were well beyond the margin of error. The Senate polling in particular was bad this year.
Sara Gideon was favored to win the Maine race in the polling, because there hadn't been a single poll showing Collins in the lead since July. She lost her race by 9 points.
It's also strange to see the region makes a difference in the poll error. The polls in Minnesota were basically spot-on, but in Wisconsin (demographically very similar), the polling average was Biden +8, with one ABC news poll showing him +17, the kind of outlier result you'd expect with a +8 average. He's gonna win there by ~1 percentage point.
There's something wrong with how a lot of these pollsters determine samples, or how they judge someone's likeliness to vote.
It was both the sample (read actual reports from pollsters on how hard it is to get a sample these days... people DO NOT want to participate in polls) and an underestimate of the number of voters the Republicans would get to the polls. If your LV model is wrong you are really flying blind and for various reasons both parties activated a ton of voters this cycle so we had an electorate that no one was able to model well. The last time this large of an electorate turned out (in terms of percentage of eligible voters) they were deciding between William McKinley and William Jennings Bryan.
> It was both the sample (read actual reports from pollsters on how hard it is to get a sample these days... people DO NOT want to participate in polls) and an underestimate of the number of voters the Republicans would get to the polls.
Exactly, polling is very difficult, and getting even more so.
I was polled a few years ago by Gallup or Pew (or one of the other well known ones). The call was from an unknown number and I took it. No way I'd do that now with all the robocalls.
Is it possible that the projections themselves affected the outcome? If people saw that they had a comfortable lead, they wouldn't be as motivated to turnout.
Then the polling results might ironically be more accurate if people believed in them less.
Yes they can affect the outcome, but the opposite direction than you described. People who feel their candidate is doomed to lose don't turn out. People like to vote for a winner.
edit: To be clear, when polls overwhelmingly suggest a landslide, it suppresses votes from both sides. But a much higher proportion of the losing side will choose not to vote, thus inflating the gap.
It's not just the polls that are wrong but likely this premature projection as well; tens of thousands of provisional and military ballots are still not counted. Recounts are still to happen in most swing states. Lawsuits are pending.
Everyone would be surprised if Trump wins, because at this point it is so mathematically improbable that even gun-shy decision desks at major news networks are able to make the call without fearing they will look like idiots. Recounts have never moved a state election more than a thousand votes that I am aware of, and Trump is behind by tens of thousands in the states that matter. Lawsuits won't do anything, because no one has standing to prevent someone else from legitimately voting according to the rules in place at the time of the election, nor to prevent someone else from counting those votes as legally mandated.
Elections are run by the states and so in almost all instances the federal courts will defer to the state courts when it comes to things like determination of fact. So far there has been no evidence of either fraud or misconduct and the thin claims put up so far have been laughed out of court. Lacking any claims of fraud or misconduct the only other option is to somehow be able to prove a miscount and since everyone learned their lesson with the hanging chads of the butterfly ballots this is exceedingly unlikely. With no claims to be made by anyone with standing the states are on a clock in terms of exercising their Article 2 powers.
On the first Monday after the second Wednesday in December (Dec 14) the electors are going to meet in the respective state capitals, whereupon they will cast their votes and attach six copies of their vote to six Certificates of Ascertainment which will go to the president of the Senate (Chuck Grassley), two to the national archivist (David Ferriero), and then one to their secretary of state and one to the chief justice of whatever federal district their state is in. Voila! Now at 12:01 on the 20th of January anyone, even you, could deliver the oath of office to Joe Biden and swear him in as the 46th president.
> Elections are run by the states and so in almost all instances the federal courts will defer to the state courts when it comes to things like determination of fact.
Biden is leading in most states by more votes than there are outstanding ballots; that does not really qualify as hair thin. The margin may be small in a few states, but it is large enough in a sufficient collection of states right now that while we keep counting every last vote we also know that Trump lost.
If you look at RCP's polling average for Clinton in 2016, she outperformed that average for just about every race.
But what the pollsters got crazy wrong was Trump's polling numbers, by wide margins, far outside the margin of error.
I think they got it wrong this time too, and I think it all comes down to their methodology for actually getting a random sample of voters. Many polls are still married to live interviews and to landline contact - I think there is a partisan slant that they are not accounting for that includes a "propensity to answer a polling survey in the first place"
> I think there is a partisan slant that they are not accounting for that includes a "propensity to answer a polling survey in the first place"
They're well aware of that slant, and try their damndest to correct for it. But apparently they just don't have a good model of how large or how volatile that propensity attribute is, and how it relates to political leanings.
I was polled this year. One of the first questions I was asked is if I was answering the call on a landline or on a cell phone. They weight those differently. Also of course my basic demographic info (wage, race, marital status, education level, home ownership status, etc.)
I think your whole comment is correct, but instead of not accounting for the propensity factor, they do... just badly :)
I have no idea what the solution is to this. Maybe there isn't one, and instead we can get a silver-lining effect: If people trust the results of polls less, they're hopefully less likely to think the results are preordained. More voter turnout?
>I think your whole comment is correct, but instead of not accounting for the propensity factor, they do... just badly :)
In particular, Trump was extremely critical of polls and mail-in voting, and we have pretty much confirmed that his statements caused his supporters to avoid mail-in voting, so it's not too much of a leap to suggest the same thing happened with polls. Notably, this is a different effect from 2016, where IIRC Trump appealed to "non-likely" voters whose responses were inappropriately discounted.
It is a solution that has its own problems, but mandatory voting would solve this problem. The polling failures are primarily in trying to weight the sample that you manage to capture in a way that reflects the actual turnout on election day. If everyone is forced to vote then you eliminate the RV/LV problem and pollsters would only need to make sure their sample was representative of the population at large.
Slapping +/- 5% on top of a prediction that is systematically off by ~5% isn't "in the margin of error", it's just off. For one prediction, two predictions, no problem: they fall somewhere in that range with some probability distribution (usually a bell curve around the mean). But for hundreds, if not thousands, of polls to be so systematically off, it is not "within the margin of error" or a statistical fluke. If it were, you would expect a normal distribution around the mean of the polls. The mean was completely off.
Put simply, it was sampling bias. Pollsters screwed up. They either sampled the wrong voters, the wrong areas, mispredicted who would turn out, or voters did not accurately report who they planned to vote for. Garbage in, garbage out.
You can try to correct for sampling bias, but if you are blind to it, as your comment pushes more people to be, then you will fail.
Right, but the job of 538 is exactly to deal with that, and they do take into account the fact that errors are correlated, so if one poll is off, chances are they all are. That's why their predictions are fairly conservative. They gave Florida a 1/3 chance for Trump and 1/10 for the presidency. Both those numbers make sense.
Now you can argue that there's little utility if the error bars are so big, and you may have a point. The bigger issue here is that modern elections have been extremely close, and unfortunately we cannot get polling accurate enough.
I know they were within the margin of error, but the polls were all systematically within the margin of error in the same direction. Did a single poll that was wrong predict a GOP win and end up a Democratic win?
That has happened in basically every election. Errors in polls tend to be correlated across states. If they were independent, 538 would have been predicting a ~0% chance of Trump winning instead of ~10%.
Ah sorry, I was thinking of down ballot races and polls that would have had enough reputation to make it into the Economist or 538 polling averages. I'll do some more searching.
The DI poll is basically a push poll, with questions like "Who do you believe is telling the truth about alleged Biden family corruption?"
One could argue the strategic mistakes of the Clinton campaign (ignoring the rust belt) and near fatal mistakes of the Biden campaign (which looks to be on issues swinging voters in battleground states) were driven by bad polling.
When I hear the conversation about "the polls were wrong" it's never been "the predictions were outside the error margin" so much as "voter sentiment was not accurately measured in regions where voters were most likely to flip and why they would do so."
> There is no reason that polls would be significantly different from the final vote results in most circumstances, if they are competently run.
There is lots of reason.
If you could wave your magic wand and get a representative sample of voters in your poll you would be right, but you can't. Instead you do something like call people on a telephone, get a sample of people who both answer the telephone and who give answers you think indicate they are likely to vote. That sample isn't going to be at all representative of the voting population, so you take your best guess at what the actual voting population looks like, and how representative each of the people in your sample is of the voting population, and weight it accordingly.
Your best guess at the actual voting population is probably wrong. Your best guess at how representative each person in your sample is of the actual voting population is wrong. You're doing things like assuming that every <race> person with <education level> votes similarly regardless of whether or not they answer the telephone and respond to pollsters, because you really just don't have a better option.
Polling is hard and the results are likely non-representative. Great! Then don’t display them to the public and have pollsters in interviews acting as if their insight is in any way valuable to most people. They’re either incompetently wrong or, worse, intentionally manipulative.
Just because something exists doesn’t mean it has value in the wrong context.
Just because something isn't perfect doesn't mean it has no value. Polls communicate experts' best guess at how the vote is going to turn out based on the data they have collected. It's likely to be a lot more accurate than a non-expert's guess, or even the guess of an expert without any data. It's also not even possible to share the raw data, because of privacy concerns.
The forecasts you see on sites like 538 and The Economist are even better, because they don't just communicate the experts' best guess (Democrats win by +7), but also the experts' best guess at how accurate that guess is (something like Democrats win by +7 ± 6, except in a lot more detail and nuance).
On HN the catchphrase for this is basically "don't let perfect be the enemy of good".
@gpm what I’m arguing isn’t for replacing good with perfect. I’m saying these don’t make it to the level of good. They provide no relevant or valuable information to the public.
If weather forecasters were wrong nearly 100% of the time, would they provide value? If a hurricane was approaching and they couldn't predict probable paths within any margin of error, would they provide any value?
If a weather forecaster can predict the temperature next week to within a few degrees and can tell you when they are sure it will rain, when they are sure it will be sunny, and provide a good guess as to rain or shine on most of the other days you would consider that person a good weather forecaster.
Do you expect the weather report for the next week to nail the temperature of every day next week to within 1 degree?
The pollsters can tell you if a hurricane is coming, they can tell you when the weather might shift significantly (albeit not be absolutely certain about which direction it is shifting) and they can give you a very good idea of what to expect even if they do not get the exact temp for each day correct.
really though, what tangible value do polls provide anyway? i get that the polling over the last few years hasn't been great, but i don't really understand why everyone's so upset about it. did anyone actually make or change decisions based on these polls? they're just predictions anyway.
for instance, in your example, a hurricane's path directly affects citizens of cities that lie on that path—do they need to prepare to leave their homes, etc.? but is anyone actually basing their decision to vote or who to vote for on the polls in such a way that it significantly affects the outcome?
I live in Canada but am working for an American company. I am considering if/when I want to move to the US, and the polls informed me about the likely outcome of this election, which influenced my plans a non-trivial amount (not a huge amount either; covid dominates in the planning process, but a non-zero amount).
For many people it can influence how they vote. If you're in a state with a potential to change the outcome of the election you are more likely to vote "strategically", i.e. vote for one of the two most popular candidates instead of vote for a third party that you prefer. On the flip side if you're voting in a non-competitive race you are much more likely to vote for a candidate who is unlikely to win in order to better indicate your preferences (i.e. you should be more likely to vote libertarian/green/...).
For many people, who wins an election affects their careers going forward; in the US in particular a large number of people are fired/hired based on the election. Even if you're not in that position, your industry might get more or less government funding, if you're a government contractor the projects you are working on might or might not be in danger of getting cancelled, and so on and so forth. Having better information earlier makes it easier to plan your life.
The US election is one of the most significant worldwide events that happens every 4 years, the idea that being able to better predict it is not valuable is... insane.
you definitely make some good points. i realized a few things i didn't think about after posting as well.
i guess i was thinking more along the lines of, most of the time i personally don't think you should be choosing a presidential candidate based on predictions of who may or may not win. even some of your examples are more centered on the ultimate outcome of the election and how to plan for it; i really think people whose lives could be affected that drastically should be planning ahead for that situation anyway.
nonetheless, i agree they do have some value, and perhaps i should have clarified my line of thinking more clearly.
I think it's clear that people being polled are not representative of those who are voting. Trump has either managed to appeal to people resistant to being polled or has managed to disengage people that are being polled. Whatever your views on the man, it's remarkable.
What's more surprising to me is that he can do it consistently and that no one has seemingly appealed to these people before. Given that these are almost election winning demographics, it's odd that no one has spent more time on understanding these people.
You obviously have no understanding of either polling or statistics. This is like saying that if computer programmers were not incompetent we would not have bugs in programs, but since they all obviously do not know what they are doing we have a modern society held together with chewing gum and string because programmers are too dumb to get it right.
Sure, if some engineers told you "we are putting this person on Mars with 99.9% probability", and the rocket in fact reaches Venus, the engineers are incompetent.
Also, polling in most European countries doesn't go this far wrong, on almost all polls. So it's not like "accurate polling is impossible", it's just the polls in the US that seem to have some problem (though there have been some spectacular failures in other countries as well, which itself may lead to some thoughts on the value of polls in general).
I think the main problem is that some forecasts gave an overwhelming amount of “hope” for certain races. This costs real money, as political operatives make decisions on where to spend money based on these forecasts. The closer a race is, the more likely more money will help shift the race.
Unfortunately you cannot just wish away the cause and effect here. Did Democrats piss away a hundred million in South Carolina on a race Jaime Harrison was never going to win, or did Jaime Harrison get within striking distance because of that money, and was the investment in party infrastructure and training of lower-level party coordinators going to pay off over the next decade? Texas did not turn blue, but the effort spent on Beto's race and now on MJ Hegar may build down-ticket strength and knowledgeable people who may end up turning the state in a few more cycles.
When I grew up California and Virginia were red states, and I am sure people then also said money spent there was not going to move the needle and was a waste. When you have a billion+ to spend on races sometimes a little hope now pays off in a decade or two even if you lose this race.
California hasn't had 2 Republican Senators at the same time since 1971. California became Democrat when the Republican Party realigned to the Southern Strategy, not because of campaigning.
Prior to 1992 the only time California had two Democratic senators was post-Watergate and then prior to that single term for Tunney you have to go all the way back to 1863 to get two Dem senators. California had frequently elected Republican governors (including one who went on to become president), had house delegations that were majority Republican for stretches of time, and except for Johnson's win in 1964 CA was a solid batch of electors for the Republican presidential candidate up until Clinton.
CA was a Republican state from the late 19th century up until the 1990s.
> I really have no idea where this idea that the polls failed comes from.
For one, Andrew Gelman (professor of statistics and political science at Columbia University). Who knows more about the topic than anyone here in this HN discussion.
He may know a lot about the topic, but has little experience in actually modeling elections. He also has skin in the game and an incentive to make sure everyone knows that his miss was completely not his fault and was in fact due to failures by someone else, someone over there who you don't know, yes, she lives in Canada...
If the error skews in one specific direction for all the polls, then I don't think you can just excuse the polls by saying they're in the margin of error.
It was very, very bad up and down the ballot. No shit, one of our clients' polls had them at - I'm rounding here because this is my real name - +15%. They barely won, by <1%.
I wish pollsters would move towards very, very large digital samples. You can't get a 15-minute survey, but head-to-head you can get a huge sample for not a lot of money. And don't try to weight it to what you THINK the turnout will be. Break out likely voter or not. Do not add any inference or opinion or 'math'.
All digital surveys skew heavily to a younger demographic and heavily male. You have basically captured the political opinions of the fraction of the potential electorate least likely to vote. How exactly is that worthwhile?
I don't buy that, and even if I did, the flip side would be worse if you look at phone contact rates.
I've seen some good very large digital surveys. We do some - geared towards measuring ad recall/lift - and I find them valuable, especially a giant sample on only two recall + head to head questions
You can always ask for gender/age and weight it, but that's part of what I see as the problem. So many 'traditional' surveys I see make fairly large adjustments, with lots of 'looking back' at past turnout and overfitting based on personal bias, especially when weighting from a very small sub-sample crosstab, e.g. Hispanics. I'm not a pollster and that's just my still fairly-insider / polling-adjacent insight.
Another problem with digital surveys is that most of the firms do opt-in panels, like having people register to take surveys and get paid on Mechanical Turk, and then they weight from there. I think this is a problem. The surveys we do run inside of mobile ads, similar to Google / FB Brand Lift surveys; they go after those who saw the ads or a truly random sample, not a small, biased survey panel.