Harvard Government Ph.D. candidate Shiro Kuriwaki and Michael G. Isakov ’22 cautioned against “overconfidence” in polling data in a paper published Tuesday that analyzes pollsters’ incorrect predictions that 2016 Democratic presidential nominee Hillary R. Clinton would win the previous election.
Kuriwaki and Isakov’s paper, published in the Harvard Data Science Review last week, identifies mistakes in 2016 polling and posits ways to correct these mistakes.
The paper explains that sampling biases were partly to blame for polling mistakes in the 2016 presidential election.
Shortly after the election, some analysts attributed errors in polling to the so-called “shy Trump” effect, in which voters purportedly lied to pollsters because they were embarrassed to express support for Trump. Kuriwaki and Isakov, however, wrote that analysis of the 2016 polls reveals that mistakes were more likely due to a “sampling problem” than to the “shy Trump” effect.
“There was very little evidence of that, partly because there were similar errors for other Republican candidates,” Kuriwaki said in an interview. “So if people were just lying about Trump, you won't see that pattern.”
Kuriwaki and Isakov wrote that polls are most reliable when the sampling pool is representative of the general population, but noted that polls often fall victim to selection bias and disproportionately sample certain demographics.
Pollsters did not properly correct for these biases in 2016, which distorted the polls’ results, according to the authors.
Poll aggregators also contributed to polling error in 2016, according to Kuriwaki and Isakov.
“When aggregators say there's a 93 percent chance of Hillary Clinton winning, some people think like, ‘Oh, she'll get 93 percent of the votes,’ which is not true,” Kuriwaki said. “But if you say the model tells you that Hillary Clinton will get 53 percent of the popular vote, which is one of the predictions, then people might have the right level of uncertainty.”
“2016 headlines made people more overconfident about the actual results,” he added.
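The gap Kuriwaki describes between a win probability and a vote share can be illustrated with a short calculation. The sketch below is not the authors’ model: it assumes the forecast error follows a normal distribution, treats the 53 percent figure as a two-way vote share, and uses a hypothetical standard error of 2 percentage points chosen only for illustration.

```python
import math

def win_probability(predicted_share: float, std_error: float) -> float:
    """Chance that the true vote share exceeds 50 percent.

    Assumes (for illustration only) that forecast error is normally
    distributed around the predicted share.
    """
    z = (predicted_share - 50.0) / std_error
    # Standard normal CDF expressed through the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical inputs: 53 percent predicted share, 2-point standard error
p = win_probability(53.0, 2.0)
print(f"Predicted share: 53%, win probability: {p:.0%}")
```

Under these assumed numbers, a lead of just three points over the 50 percent mark translates into a win probability of roughly 93 percent, which shows how a narrow expected margin can sit behind a confident-sounding headline figure.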
They wrote that their research describes a “cautionary tale” and urged pollsters to be responsible in the way they present results.
“It is the polls’ responsibility to not overhype the results and interpret them properly,” Kuriwaki said.
Kuriwaki and Isakov also developed models for 12 states in the 2020 election with the assumption that pollsters make the same mistakes as four years ago.
Some pollsters, however, have already updated their methods, such as by weighting the responses from certain demographics that tend to be underrepresented.
In median battleground states in the 2020 election, Kuriwaki and Isakov found that Trump could receive 0.8 percentage points more than polls conducted in late October predicted, though their model has double the margin of error of previous polls.
Another factor Kuriwaki and Isakov considered was voter turnout in median battleground states. Higher voter turnout in this year’s election could result in even less confidence in polling predictions because the polls’ sampling pool would become less representative of the general electorate, according to Kuriwaki and Isakov’s paper.
The researchers said that, while polling may not be perfect, it “keeps everyone honest.”
“I feel like having more information is always, always good,” Isakov said. “There's a lot of other countries where we do not really have polling and that has been problematic in various ways.”
“Polling makes the pulse of the country public,” Kuriwaki said. “It shares information that puts people on similar grounds.”
—Staff writer Raquel Coronell Uribe can be reached at email@example.com. Follow her on Twitter @raquelco15.