
> If per-state error isn't normally distributed, that's evidence of bias, or bad polling.

No!

Assuming the per-state error would be normally distributed in some neutral world makes huge assumptions about the nature of the electorate, polling, and the correlations of errors between states; you can't do that! You would specifically /not/ expect per-state errors to fall out that way, because the sources of error have similar impacts on similar populations, and similar populations live in different states in differing numbers.
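A rough sketch of that point (purely illustrative numbers, not any pollster's actual model): one shared miss, weighted by how much of an affected population each state has, shifts every state in the same direction, so the realized per-state errors are correlated rather than independent draws.

    # Hypothetical sketch: per-state errors with a shared component.
    import numpy as np

    rng = np.random.default_rng(0)
    n_states = 50
    demo_share = rng.uniform(0.1, 0.5, n_states)  # made-up demographic weights
    national_miss = rng.normal(0, 2)              # one shared polling miss, in points
    state_noise = rng.normal(0, 1, n_states)      # independent sampling noise

    per_state_error = demo_share * national_miss + state_noise
    print(per_state_error.mean())  # the whole map shifts together, not centered on zero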

You should review the literature about the nature of the (fairly small) polling misses that impacted the swing states and thus disproportionately the outcome in the 2016 election. You will probably find it interesting.




Yes!

There are unavoidable, expected, sampling errors which are, by definition, random. That's why valid, trusted polls calculate a confidence interval instead of a single discrete result.

Other types of "errors" -- election results that repeatedly fall outside the confidence interval, or that land consistently on only one side of the mean -- only arise when the poll is flawed for some reason. Maybe you relied on landlines only, maybe you spoke with too many men or too many young people, asked bad questions, misjudged who counts as a "likely voter," whatever. Accurate, valid, trusted polls don't have these flaws; the ONLY errors are small, random, expected sampling errors.
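For concreteness, the pure sampling error a textbook simple random sample carries is just the standard margin-of-error formula (a sketch with illustrative numbers, not any specific poll):

    # Sketch: 95% margin of error from sampling alone, for a proportion p
    # estimated from n respondents in a simple random sample.
    import math

    def margin_of_error(p, n, z=1.96):
        return z * math.sqrt(p * (1 - p) / n)

    print(margin_of_error(0.50, 1000))  # ~0.031, i.e. roughly +/- 3 points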


> Accurate, valid, trusted polls don't have these flaws

Yes, they do. Because (among many other reasons) humans have a choice whether or not to respond, you can't do an ideal random sample subject to only sampling error for a poll. All polls have non-sampling error on top of sampling error, it is impossible not to.


When polls don't match up with reality, as they didn't in 2016, the pollsters have a responsibility to re-calibrate the way they conduct the poll: ask different questions, find new ways of obtaining respondents from all demographics, adjust the raw data, etc. A professional pollster doesn't just get to say, hey, some people didn't want to talk to me ¯\_(ツ)_/¯


> when polls don't match up with reality, as they didn't in 2016, the pollsters have a responsibility to re-calibrate the way they conduct the poll.

Pollsters do that continuously, and there were definite recalibrations in the wake of 2016.

OTOH, the conditions which produce non-sampling errors aren't static, and it's impossible to reliably measure even the aggregate non-sampling error in any particular event (because sampling error exists, and while its statistical distribution can be computed, the actual error attributable to it in any particular event can't be, so you never know how much of the actual error is due to non-sampling error, much less to any particular source of non-sampling error).
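A toy illustration of that identifiability problem (made-up numbers): in any single election you observe one total miss, and there is no way to split that one number back into its sampling and non-sampling parts.

    # Sketch: one observed miss is the sum of two unobservable parts.
    import numpy as np

    rng = np.random.default_rng(1)
    sampling_err = rng.normal(0, 3)   # distribution known, realized value not observed
    nonsampling_err = 2.0             # hypothetical systematic miss, also not observed
    observed_miss = sampling_err + nonsampling_err
    print(observed_miss)              # a single number; the split is unidentifiable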


"There are unavoidable, expected, sampling errors which are, by definition, random."

This is false; if you think sampling errors are "by definition random," you don't understand polling.

Which is fine, just accept that you don't and dig into the literature or move on.


> That's why valid, trusted polls calculate a confidence interval instead of a single discrete result.

That is what each of these statistical models did, yes. And the actual outcomes fell into these confidence intervals.

> Other types of "errors" -- election results that repeatedly fall outside the confidence interval, or are consistently on only one side of the mean -- only arise when the poll is flawed for some reason.

Or the model was inaccurate. Perhaps the priors were too specific. Perhaps data was missing, misrecorded, or not tabulated properly; who knows. Again, the results fell within the CI of most models; the problem was simply that the result fell too far from the mean for most statisticians' comfort.


>That is what each of these statistical models did, yes. And the actual outcomes fell into these confidence intervals.

The CI is due to sampling error, not model error. If the error of the estimate is due to sampling error, the estimate should be randomly distributed about the true value. When the estimate is consistently biased in one direction, that's modelling error, which the CI does not capture.
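A minimal sketch of that distinction (illustrative numbers only): a correctly specified poll's estimates scatter symmetrically around the truth, while a biased sampling frame puts them consistently on one side, and the usual CI width, built from sampling variance, does nothing to cover that shift.

    # Sketch: sampling variance vs. a systematic (frame) bias.
    import numpy as np

    rng = np.random.default_rng(2)
    true_share, n, trials = 0.52, 1000, 10_000
    unbiased = rng.binomial(n, true_share, trials) / n
    biased = rng.binomial(n, true_share - 0.03, trials) / n  # hypothetical 3-pt bias

    print(unbiased.mean() - true_share)  # ~0: sampling errors average out
    print(biased.mean() - true_share)    # ~-0.03: consistently one side of the truth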


> If the error of the estimate is due to sampling error

What does "estimate" mean here? Gelman's model is a Bayesian one, and 538 uses a Markov chain model. In these instances, what would the "estimate" be? In a frequentist model, yes, you come up with an ML (or MAP or such) estimate, and if the ML estimate is incorrect, then there probably is an issue with the model, but neither of these models uses a single estimate. Bayesian methods are all about modelling a posterior, and so the CI is "just" the region of the posterior around the median that contains the probability mass of your chosen level.
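As an illustration of that last point (a generic sketch, not the actual Gelman or 538 code): with simulation draws from the posterior in hand, an 80% interval is just the band between the 10th and 90th percentiles of the simulated outcomes.

    # Sketch: an 80% interval straight from posterior/simulation draws.
    import numpy as np

    rng = np.random.default_rng(3)
    simulated_evs = rng.normal(348, 63, 40_000)  # stand-in draws, not real model output
    lo, hi = np.percentile(simulated_evs, [10, 90])
    print(round(lo), round(hi))                  # the 80% band around the median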

I'm not saying that there isn't model error or sampling error or both. I'm just saying we don't know what caused it yet.


Landed within the confidence interval? Are you kidding? CI is generally 2-4 points in these election polls.

- Texas: 538 said Trump +1.0, actually won by 6

- Ohio: Trump +0.4, won by 8

- Iowa: Trump +1.4, won by 8

- Florida: Trump -2.5, won by 3

- Penn.: Trump -4.6, lost by 0.5

- Nevada: Trump -4.9, lost by 2

- Wisconsin: Trump -7.9, lost by 0.6

https://en.wikipedia.org/wiki/Statewide_opinion_polling_for_...

https://www.nytimes.com/interactive/2020/11/03/us/elections/...
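Restating the list above as data (margins copied from the list, positive = Trump): every single miss is on the same side.

    # Tally of the margins listed above: (538 projected margin, actual margin).
    states = {
        "TX": (1.0, 6.0), "OH": (0.4, 8.0), "IA": (1.4, 8.0), "FL": (-2.5, 3.0),
        "PA": (-4.6, -0.5), "NV": (-4.9, -2.0), "WI": (-7.9, -0.6),
    }
    misses = {s: actual - proj for s, (proj, actual) in states.items()}
    print(misses)                               # every value positive, toward Trump
    print(all(m > 0 for m in misses.values()))  # True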


> Landed within the confidence interval? Are you kidding? CI is generally 2-4 points in these election polls.

The models and their data are public. The 538 model predicted an 80% CI of 267-419 electoral votes for Biden, with the CI centered around 348.49 EVs. That means that Biden had an 80% chance of landing in that interval. Things seem to be shaking out to Biden winning with 297 EVs. Notice that this falls squarely within the CI of the model, but much further from the median of the CI than expected.

So yes, the results fell within the CI.

Drilling into Florida specifically (simply because I've been playing around with Florida's data), the 538 model predicted an 80% CI of Biden winning 47.55%-54.19% of the vote. Biden lost Florida and received 47.8% of the vote. Again, note that this is on the left side of this CI but still within it. The 538 model was correct; the actual results just resided in its left tail.
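For the record, taking the interval endpoints exactly as stated above, both outcomes do land inside them (a trivial containment check, nothing more):

    # Containment check against the intervals quoted above.
    def in_interval(x, lo, hi):
        return lo <= x <= hi

    print(in_interval(297, 267, 419))       # Biden's EVs vs. the 80% EV band: True
    print(in_interval(47.8, 47.55, 54.19))  # Biden's FL vote share vs. the 80% band: True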


Dude, you're gaslighting by using the national results as evidence instead of the individual states, which is what this has always been about since my original comment. Nearly every consequential state fell at, or beyond, the tail end of 538's confidence interval (BTW, who uses 80% and not 90-95%?), on the same side. The misses were a bit closer to the mean in AZ and GA, but on the same side, over-estimating Biden's margin of victory. Deny it all you want, gaslight, cover your eyes, whatever -- but clear, convincing, overwhelming evidence of a systematic flaw or bias in the underlying polls is right there in front of you.

Many political handicappers had predicted that the Democrats would pick up three to 15 seats, growing their 232-to-197 majority

https://www.washingtonpost.com/politics/house-races/2020/11/...

Entering Election Day, forecasters projected Democrats would gain House seats and challenge for the Senate majority.

https://www.cnbc.com/2020/11/05/2020-election-results-democr...

Most nonpartisan handicappers had long since predicted that Democrats were very likely to win the majority on November 3. "Democrats remain the clear favorites to take back the Senate with just days to go until Election Day," wrote the Cook Political Report's Senate editor Jessica Taylor on October 29.

https://www.cnn.com/2020/11/04/politics/2020-election-senate...


> Nearly every consequential state fell at, or beyond, the tail end of 538's confidence interval

While I haven't checked each and every individual state, I'm pretty sure they all fell within the CI. Tail end yes, but within the CI.

> (BTW, who uses 80%? and not 90-95%?)

... The left edge of the 80% CI shows a Biden loss. The point was 538's model was not any more confident than that about a Biden win. So yeah, not the highest confidence.

> Deny it all you want, gaslight, cover your eyes, whatever -- but clear, convincing, overwhelming evidence of a systematic flaw or bias in the underlying polls is right there in front of you.

Posting a bunch of media articles doesn't prove anything. I'm not saying there isn't systemic bias here, but your argument is simply that you wanted the polls to be more accurate and you wanted the media to write better articles about uncertainty. There's no rigorous definition of "systemic bias" here that I can even try to prove through data; all you've done is post links. You seem to be more angry at the media coverage than at the actual model, but that's not the same as the model being incorrect.

Anyway, I think there's no more for us to gain here by talking. Personally, I never trust the media on anything even somewhat mathematical. They can't even get pop science right; how could they get something as important as an election statistical model correct?



