What’s Happening With The Polls?
On the evening of October 29th, The Economist updated its poll-based election forecast model. It gave Biden a 96% chance of taking the electoral college, and it said he was all but certain, a greater than 99% chance, to win the popular vote.
These are incredible numbers. Yet not unusual.
For instance, on that same night, Nate Silver’s 538 site had a poll average of 52% for Biden and 43.2% for Trump (the other 4.8% going to other candidates and uncertainty). Every poll used in that average had Biden ahead. Two had Biden up 12 points over Trump.
Silver wrote that Trump’s chances of winning were “a little worse than the chances of rolling a 1 on a six-sided die and a little better than the chances that it’s raining in downtown Los Angeles,” which he cited as 1 in 10.
There are others beside these two firms, but almost all favor Biden by a large margin.
“At this point,” Silver rightly said, “President Trump needs a big polling error in his favor if he’s going to win.”
This is not impossible. Most polls in 2016 blew it, including Silver’s. His last poll average gave 45.7% to Clinton and 41.8% to Trump, with 4.8% going to the forgotten Libertarian candidate and the rest undecided or scattered. Silver’s poll-based model gave a 71.4% chance of Hillary winning.
Many modelers were bolder than this — and their record was dismal. There have been several lachrymose postmortems since then searching for the cause of error. There are even fresh efforts at analyzing the 2016 polls, given their eerie similarity to today’s. All these analyses say what The Atlantic says: “Don’t sweat the polls.”
What Polls Are
Polls can be taken in two basic ways: as a snapshot of current desires, or as a prediction of the outcome. There’s no real way to prove any poll erred in summarizing current desires, but it’s obvious how to check them as predictions.
Setting aside cheating, which usually takes the form of non-serious “news” polls released to generate headlines, such as those that hugely over-sample Democrats, polls get predictions wrong because their samples do not accurately represent what the actual voters “look like.”
Ideally, if 49% men and 51% women vote in the election, then the poll must directly sample 49% men and 51% women. Or it must statistically adjust the sex ratio of its sample so that it matches the 49/51 split. But that means pollsters have to guess what the eventual ratio will be, since nobody knows in advance that it will be 49/51.
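This kind of adjustment is often called post-stratification weighting, and it can be sketched in a few lines. Everything below is hypothetical: the `poststratify` helper, the tiny sample, and the turnout numbers are made up purely to show that the answer depends on the pollster's guess about who will vote.

```python
# Minimal sketch of post-stratification weighting, with made-up numbers.
# Each respondent is weighted by (assumed turnout share of their group)
# divided by (that group's share of the raw sample).

def poststratify(sample, assumed_turnout):
    """Return weighted candidate shares given a guess about turnout by group."""
    counts = {}
    for group, _choice in sample:
        counts[group] = counts.get(group, 0) + 1
    n = len(sample)
    # Weight per respondent in each group
    weights = {g: assumed_turnout[g] / (counts[g] / n) for g in counts}
    totals = {}
    for group, choice in sample:
        totals[choice] = totals.get(choice, 0.0) + weights[group]
    grand = sum(totals.values())
    return {c: t / grand for c, t in totals.items()}

# Hypothetical raw sample that over-represents women:
# 6 women (4 Biden, 2 Trump) and 4 men (1 Biden, 3 Trump)
sample = ([("F", "Biden")] * 4 + [("F", "Trump")] * 2 +
          [("M", "Biden")] * 1 + [("M", "Trump")] * 3)

# Unweighted, the sample is 50/50. Reweighted to a guessed 51/49
# female/male electorate, Trump pulls ahead, because men were
# under-sampled and lean his way in this toy data.
print(poststratify(sample, {"F": 0.51, "M": 0.49}))
```

The raw sample splits the vote evenly, but the weighted result does not, which is the whole point: a different guess about the eventual electorate produces a different poll number from the very same respondents.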
There is more to who votes, and for what reason, than just sex. There is geography, education, age, occupation, health, party, and many other characteristics. Not everything about people is important to sample: just those things that are tied in some way to how people actually vote.
Unfortunately, nobody is quite sure just what those exact things are. Or what their breakdown will be in the actual election. There is a lot of guessing going on in polls.
Who Was Sampled
The 2016 postmortems focused on how the (serious) poll samples did not match who showed up to vote. In one such effort, CBS said:
An examination of the 2016 electorate by Pew found that Whites with a four-year college degree or more education made up 30% of all validated voters, while White voters who had not completed college made up 44%. Only 38% of college-educated Whites said they voted for Mr. Trump, whereas he won by more than two-to-one (64% to 28%) among white voters who had not completed college.
This assumes college status is important in why people cast their votes, which surely has some truth to it. But it also means that even if a poll sample exactly matches the actual voters in the breakdown of characteristics it tracks, the poll can still be inaccurate. Because the characteristics it tracked might not have been important in why people voted how they did.
This is where models come in. A poll average can be taken as a model, a prediction of the outcome. More usually, the polls are fed into statistical contraptions that can also accommodate other information, like economics measures.
There is nothing special in these models. They, like all models, only say what they’re told to say. In effect, they say, “If the polls are at this level, and the economic indicators are that, then say Biden 96%.” If the modelers guess right about what they tell their models to say, the models will be accurate. Otherwise, they will not.
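The point can be made concrete with a toy calculation. Nothing here is any forecaster's actual model: the `win_probability` function, the normal-error form, and the 3-point and 8-point error assumptions are all illustrative. Feeding the same 8.8-point poll margin (52.0 minus 43.2, from the average quoted above) through two different assumed polling errors changes the "forecast" dramatically.

```python
# Toy illustration: a forecast's certainty is driven by the polling
# error the modeler *assumes*, not just by the polls themselves.
from math import erf, sqrt

def win_probability(poll_margin, assumed_error_sd):
    """P(true margin > 0) if the poll error is Normal(0, assumed_error_sd)."""
    z = poll_margin / assumed_error_sd
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

margin = 52.0 - 43.2  # the 8.8-point Biden lead in the average quoted above

# Tell the model polls are precise, and it is nearly certain:
print(win_probability(margin, assumed_error_sd=3.0))
# Tell it polls can miss badly, as they did in 2016, and certainty drops:
print(win_probability(margin, assumed_error_sd=8.0))
```

Same polls, same margin, very different headline probability. Whether a model announces "96%" or something far more modest depends on what the modeler told it to assume about polling error.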
The 2016 models weren’t accurate. So they were told to say the wrong things. This means they gave too much weight to polls, or to the other measures put into the model.
Now the mathematics these models use has been honed and adjusted over many years and many elections. Prior to the arrival of Trump, they didn’t do too badly. But those older elections shared many similarities and customs that became irrelevant after Trump. Things changed in a fundamental way. The models might not know this.
We won’t know until long after the election, when the 2020 postmortems come in. But it does not appear to me that these Trumpean changes have been added to most poll models. Modelers and pollsters seem to still think this will be an election like other elections, at least in most respects. I have no direct proof of this, since I don’t have access to the models themselves, but there is nothing in public statements made by pollsters to suggest otherwise.
One fundamental change is the poll data itself. Are people telling the truth when they say to a random stranger, who is possibly recording the call, “Yes, of course I’m voting for Biden”?
There are many who dismiss the “shy Trump” phenomenon, insisting most tell the truth, or that the fibs people tell pollsters balance out. If you truly doubt the existence of shy Trumpers, then try this experiment. Go into your place of business tomorrow morning wearing a MAGA hat and say, “I’m voting for Trump.” The level of your unwillingness to participate in this experiment indicates your true belief in shy Trumpers.
Poll models that don’t account for this phenomenon will give Biden a larger chance of winning.
Traditional poll models also give a lot of weight to formal party affiliation of those polled. This would be fine if Trump were a true party man, as Bush and his Republican predecessors were. It’s hard to argue at this late date, though, that Trump belongs to either party. Again, models that don’t note this change will give Biden a larger chance.
Not Everything Can Be Quantified
The last problem with models is incorporating non-quantifiable information. Modelers love poll and economic data because they’re hard numbers, and hard numbers are easy to manipulate with math. Modelers come to love their numbers too much, though, ignoring those things that are not quantifiable but which most would accept are predictive.
Take this tweet by Christopher Rufo: “Luxury shops in DC are boarding up in anticipation of the election.” This is accompanied by a picture of shops readying themselves as if for a hurricane.
Luxury shops in DC are boarding up in anticipation of the election. pic.twitter.com/3VMVkvoMsG
— Christopher F. Rufo ⚔️ (@realchrisrufo) October 29, 2020
It’s easy to guess who these owners think is likely to win, and who the sore losers will be. How do you put this information into a mathematical model? You can’t, really. Which is why modelers lean toward dismissing unquantifiable yet pertinent information.
The poll I used to publicly forecast a 2016 Trump victory is the same one I use now to make the same bet. Rally size and enthusiasm.
Trump was in small-town Wisconsin on the night of the 24th. Reported weather: 37 degrees. Crowd size: at least a thousand, maybe double.
This was the last rally he had on that Saturday, each jammed with enthusiastic supporters. This was not an unusual day.
Joe Biden on that same day was in rural Pennsylvania. No report on audience size, but an NPR picture shows him standing alone in the far distance, looking fairly lonely. A caption for the picture, knowing some would question the lack of a crowd, read, “Appearances by [Biden] and his surrogates follow social distancing guidelines.”
Staying home is, of course, a form of social distancing.
Even Obama is having trouble drawing crowds for Joe. A video shows Obama using a bullhorn to harangue an audience of at least seven people about the wonders of Biden. They didn’t appear convinced.
When Hillary ran, she drew at least multiple dozens to even hundreds in high school gymnasiums. Some of these crowds were even boisterous.
Biden couldn’t fill a high school gym, but Donald Trump (👇🏼) is behind in the polls? pic.twitter.com/qRIu40xSSp
— James Woods (@RealJamesWoods) October 30, 2020
Then there’s the t-shirt poll, which is similar to, but easier to understand than, the donations-raised poll. Biden t-shirts are going for $2.99, and Trump’s for $6.99, or two for $12.00. The seller evidently didn’t think anybody would be buying two Bidens.
The Picture Is Not So Clear
It’s easy to get carried away with this kind of thing. Still, it’s telling there aren’t many good pro-Biden examples, except for things like this TikTok video of a daughter who harangued her dying father into voting for Biden to please her. Or another TikTok of a woman with an asymmetrical haircut dancing badly to “When you say Joe, we say Biden.”
The last bit of evidence I offer is the insouciant nervousness the press displays: they are a little too anxious to show they are not worried about the polls. This Reuters headline is typical: “So what if Biden is up in the polls? Weren’t they wrong last time?” There are too many stories from the left saying how not worried they are. Some on the left, like Michael Moore, understand this. Most don’t.
There just isn’t any clear way for pollsters to mathematically model all this, so good information gets left out. It’s that insistence on strict mathematics that may, yet again, lead to the downfall of the polls and pollsters.