Polling by asking people about their neighbors: When does this work? Should people be doing more of it? And the connection to that French dude who bet on Trump

Statistical Modeling, Causal Inference, and Social Science 2024-11-09

Several people pointed me to this news report on a successful bettor in an election prediction market:

Not only did he see Donald Trump winning the presidency, he wagered that Trump would win the popular vote—an outcome that many political observers saw as unlikely. . . . He made his wagers on Polymarket, a crypto-based betting platform, using four anonymous accounts. . . . In messages sent privately to a reporter before Election Day, Théo predicted that Trump would take 49% or 50% of all votes cast in the U.S., beating Harris. He also predicted that Trump would win six of the seven battleground states. . . .

In his emails and a Zoom conversation with a reporter, Théo repeatedly criticized U.S. opinion polls. . . . Trump had overperformed his swing-state polling numbers in 2020. . . .

So what did this bettor do?

To solve this problem, Théo argued that pollsters should use what are known as neighbor polls that ask respondents which candidates they expect their neighbors to support. The idea is that people might not want to reveal their own preferences, but will indirectly reveal them when asked to guess who their neighbors plan to vote for.

Théo cited a handful of publicly released polls conducted in September using the neighbor method alongside the traditional method. These polls showed Harris’s support was several percentage points lower when respondents were asked who their neighbors would vote for, compared with the result that came from directly asking which candidate they supported. . . .

The data helped convince him to put on his long-shot bet that Trump would win the popular vote. . . . he had commissioned his own surveys to measure the neighbor effect, using a major pollster whom he declined to name. The results, he wrote, “were mind blowing to the favor of Trump!” . . . he argued that U.S. pollsters should use the neighbor method in future surveys to avoid another embarrassing miss.

Steve Shulman-Laniel sent this to me and wrote:

If it’s such an obvious method that some random rich French guy would use it and get better results than ordinary RDD [random-digit-dialing] polling, then we’d already be using it. If that’s not true, then that also seems interesting. Either we’re just leaving money on the table (in the sense that we’re ignoring a method that would improve poll results, in expectation) or there’s some good reason why pollsters don’t habitually use this method. (I guess a third option is that we’re already using this method, that the WSJ is overselling how novel it is, and that it hasn’t borne the fruit that the WSJ claims it would.)

I would not say that the above-quoted Wall Street Journal article is overselling the novelty of neighbor polling—nowhere does it say that the method is new—but I see where Laniel is coming from. The other thing is that a lot of polls don’t use random digit dialing; they use internet panels. So, yes, there are methods that give better results than RDD, or, at least, no worse than RDD, and people are using these methods.

But, what about these neighbor polls? I don’t know how long they’ve been going on, but Julia Azari and I did suggest the idea in our 2017 paper, 19 Things We Learned from the 2016 Election, where we wrote:

We recognize the value of research into social networks and voting, especially in a fractured news media environment and declining trust in civilian institutions. In future studies, we recommend studying information about networks more directly: instead of asking voters who they think will win the election, ask them about the political attitudes of their family, friends, and neighbors.

We also discussed problems with simply asking respondents who they think will win the election: this approach just invites them to spit back what they’ve already seen in the news media.

I’ve put “How do you think your family, friends, and neighbors will vote?” questions on polls from time to time, but I can’t right now put my hands on any of the results.

Back in 2016 when discussing “How many people do you know?” surveys, Tian Zheng, Matt Salganik, and I pointed out that one advantage of asking people about their social networks is that it’s a way to use a survey to learn about people other than respondents.

Given that differential nonresponse has been such a problem with recent surveys, there’s an appeal to asking people about their social networks as a way to indirectly reach those hard-to-reach people.
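For readers who haven’t seen those “How many people do you know?” surveys, here is a minimal sketch of the basic scale-up idea: estimating the size of a subgroup from respondents’ reports about their contacts. The numbers are made up, and this is the simplest textbook version of the estimator, not the more elaborate models in that literature.

```python
# Minimal sketch of the basic network scale-up estimator: using "How many
# X do you know?" responses to learn about people outside the sample.
# All numbers here are hypothetical, for illustration only.

total_population = 250_000_000           # assumed size of the overall population

# Each respondent reports (a) an estimate of their personal network size d_i
# and (b) how many people they know in the subgroup of interest, y_i.
network_sizes  = [300, 150, 600, 450, 200]   # d_i, hypothetical
known_in_group = [3, 1, 8, 5, 2]             # y_i, hypothetical

# Basic scale-up estimate of the subgroup's size:
#   N_hat_group = N * (sum of y_i) / (sum of d_i),
# i.e., the fraction of respondents' contacts who are in the group,
# scaled up to the whole population.
group_size_hat = total_population * sum(known_in_group) / sum(network_sizes)
print(f"Estimated subgroup size: {group_size_hat:,.0f}")
```

The same logic is what makes the neighbor question attractive: each respondent is, in effect, reporting on a little cluster of people who may never answer a survey themselves.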

I don’t know how many survey organizations followed our 2017 advice to ask about the political attitudes of family, friends, and neighbors, but we did discuss one such attempt in 2022. At that time, the neighbor poll didn’t do so well. Jay Livingston shared the story:

They went to the Marist College poll and got the directors to insert two questions into their polling on local House of Representatives races. The questions were:

– Who do you think will win?

– Think of all the people in your life, your friends, your family, your coworkers. Who are they going to vote for?

At the time, on the direct question “Who will you vote for?” the split between Republicans and Democrats was roughly even. But these two new questions showed Republicans way ahead. On “Who will win?” the Republicans were up 10 points among registered voters and 14 points among the “definitely will vote” respondents. On the friends-and-family question, the corresponding numbers were Republicans +12 and +16.

On the plus side, that result was so far off that nobody took it seriously. Yes, it was featured on NPR, but more as an amusing feature story than anything else.

Here’s what I wrote at the time:

So what happened? One possibility is that Republican family/friends/coworkers were more public about their political views, compared to Democratic family/friends/coworkers. So survey respondents might have had the impression that most of their contacts were Republicans, even if they weren’t. Another way things could’ve gone wrong is through averaging. If Republicans in the population on average have larger family and friend groups, and Democrats are more likely to be solo in their lives, then when you’re asked about family/friends/coworkers, you might be more likely to think of Republicans who you know, so they’d be overrepresented in this target group, even if the population as a whole is split 50/50. . . .

Also it would be good to see exactly how that question was worded and what the possible responses were. When writing the above-quoted bit a few years ago, I was imagining a question such as, “Among your family and friends who are planning to vote in the election, how do you think they will vote?” and then a 6-point scale on the response: “all or almost all will vote for Republicans,” “most will vote for Republicans,” “slightly more will vote for Republicans,” “slightly more will vote for Democrats,” “most will vote for Democrats,” “all or almost all will vote for Democrats.”

I hope they look into what went wrong. It still seems to me that there could be useful information in the family/friends/coworkers question, if we could better understand what’s driving the survey response and how best to adjust the sample.
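To make that averaging concern concrete, here is a minimal simulation with made-up numbers: a 50/50 electorate in which one party’s supporters simply have larger social circles, so a question about “the people in your life” implicitly over-samples that party.

```python
# Minimal simulation of the "averaging" concern: even with a 50/50
# electorate, if one party's supporters have larger social circles, a
# friends-and-family question that implicitly samples contacts will tilt
# toward that party.  All numbers are hypothetical.

n_voters = 100_000
mean_network = {"R": 120, "D": 80}   # assumed average network sizes by party

# A 50/50 electorate where network size depends on party.
voters = ["R" if i % 2 == 0 else "D" for i in range(n_voters)]

# Direct question ("Who will you vote for?"): exactly 50/50 by construction.
direct_r_share = sum(p == "R" for p in voters) / n_voters

# Friends-and-family question: a respondent thinking of "the people in your
# life" effectively samples contacts in proportion to network size, so the
# contact-weighted Republican share exceeds the Republican share of voters.
total_ties = sum(mean_network[p] for p in voters)
contact_r_share = sum(mean_network[p] for p in voters if p == "R") / total_ties

print(f"Republican share of voters:   {direct_r_share:.1%}")   # 50.0%
print(f"Republican share of contacts: {contact_r_share:.1%}")  # 60.0%
```

None of this says the friends-and-family question is useless, just that its raw numbers shouldn’t be read as a vote share without some model of who gets counted.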

So that’s where things stood in 2022.

So what about 2024? Apparently this year the neighbor polls did better. I’d like to see the data, as all we have now is third-hand reporting—a journalist telling us what a bettor told him about some polls we haven’t seen. I’m not saying anyone’s trying to mislead us here; it would just be good to see the information on which the claims were based.

Summary

1. Polling people by asking about their neighbors is an interesting idea.

2. We have an example in 2022 where it didn’t work.

3. We have an example in 2024 where someone claims it worked well.

4. To the extent that existing survey approaches continue to have problems, there will always be interest in clever ways of getting around the problem of nonresponse.

5. If you’re doing a neighbor poll, I think you’ll want to adjust it using the usual poststratification methods and maybe more. I’m not quite sure how—it’s an applied research problem—but I imagine something can be done here; see the sketch after this list.

6. These principles apply to surveys and nonresponse more generally, not just election polling!
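On point 5, here is a rough sketch of the simplest kind of poststratification adjustment for a neighbor poll, with hypothetical cells and numbers. A serious version would use many more cells and model-based smoothing (MRP), and would still face the harder question of how a respondent’s neighbors map onto the voting population.

```python
# Minimal sketch of poststratifying a neighbor poll.  The cells, population
# shares, and responses below are all hypothetical; this is the raw-cell-mean
# version of the adjustment, not a fitted model.

# Respondents: (poststratification cell, reported Republican share among
# their neighbors, on a 0-1 scale).
respondents = [
    ("young_urban", 0.30),
    ("young_urban", 0.40),
    ("older_rural", 0.70),
    ("older_rural", 0.65),
    ("older_rural", 0.60),
]

# Assumed population share of each cell (e.g., from a census or voter file).
population_shares = {"young_urban": 0.55, "older_rural": 0.45}

# Average the neighbor reports within each cell...
cell_means = {}
for cell in population_shares:
    values = [y for c, y in respondents if c == cell]
    cell_means[cell] = sum(values) / len(values)

# ...then weight the cell means by population shares, instead of letting the
# (possibly unrepresentative) sample composition drive the estimate.
adjusted = sum(population_shares[c] * cell_means[c] for c in population_shares)
raw = sum(y for _, y in respondents) / len(respondents)

print(f"Raw neighbor-poll estimate of Republican share:  {raw:.1%}")       # 53.0%
print(f"Poststratified estimate (population weights):    {adjusted:.1%}")  # 48.5%
```

The point of the toy example is only that the raw neighbor-report average and the population-weighted average can differ noticeably when the sample composition is off.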

Unlike the French dude in that interview, I’m offering all this advice for free. I’m fortunate to be supported by public funds, and the least I can do is share as much of my understanding as is feasible.

Did the French guy know what he was doing with the network sample? I can’t really say, given that my knowledge of the story is at third hand. My guess is that he got lucky, but it’s not just luck. He got lucky in that this year the surveys of neighbors got the right answer—but that hasn’t always been the case. It’s not just luck because he had to use his judgment to decide what to do with that survey data, and he gets credit for making the right call.

P.S. Economist Peter Dorman writes:

About 20 years ago I was asked to do a study for the International Labor Organization on whether child workers were more exploited than adults across a set of occupations and countries. That required collecting a lot of data, both on how much kids are paid and how productive (in economic value terms) their labor is, both compared to adults. I was worried that employers would be reluctant to reveal how much they paid underage workers or even admit they employed them, since it’s all illegal, so in addition to direct questions on the employer questionnaire (matched to worker questionnaires), I asked, “How much do you think other employers pay children for this type of work?” In the writeup I referred to these as ecological questions. Lo and behold, direct and ecological responses were very highly correlated, and not much would change from using either.

At the time I was under the impression that this was a well known technique, and I didn’t even reference it in the writeup. But was I wrong?

For thinking about some of these general issues, I recommend my article, Learning about networks using sampling, published in 2017 in the Journal of Survey Statistics and Methodology.

I did some quick searching on the topic and found this paper, Perceptions of others’ opinions as a component of public opinion, by Carroll Glynn, published in 1989 in the journal Social Science Research:

This study investigates some relationships between stated opinions and perceptions of others’ opinions, clarifying certain ambiguities in the use of perceptual approaches in public opinion research. The study provides evidence that respondents perceive others as similar to themselves in opinions and values, suggesting support for the “looking glass hypothesis.” However, there was also evidence of an “ideological bias”—respondents tend to see neighbors as having more conservative opinions than their own and to see others living in the city as having more liberal opinions than their own. The study indicates that perceptions are important in understanding public opinion mechanisms but that we are far from a full understanding of the underlying process involved.

Also this paper, Social sampling explains apparent biases in judgments of social environments, by Mirta Galesic, Henrik Olsson, and Jörg Rieskamp, published in 2012 in Psychological Science:

How people assess their social environments plays a central role in how they evaluate their life circumstances. Using a large probabilistic national sample, we investigated how accurately people estimate characteristics of the general population. For most characteristics, people seemed to underestimate the quality of others’ lives and showed apparent self-enhancement, but for some characteristics, they seemed to overestimate the quality of others’ lives and showed apparent self-depreciation.

I imagine there must be some literature on the technique of asking survey questions about family/friends/neighbors and how to adjust the resulting data to get estimates for the general population. But I’m not sure where to look, and I couldn’t find any references on the topic. Googling “neighbors” in the survey research literature gets me lots of references on neighborhood effects in sociology; googling “neighbors” and “sampling” takes me to nearest-neighbor methods in statistics and machine learning; googling “network sampling” takes me to the computer science literature . . . nothing on how to survey people and ask about their neighbors.

P.P.S. In comments, Isaac Maddow-Zimet points to this article from 2019, Evaluating sampling biases from third-party reporting as a method for improving survey measures of sensitive behaviors, by Stéphane Helleringer, Jimi Adams, Sara Yeatman, and James Mkandawire, which states:

Survey participants often misreport their sensitive behaviors (e.g., smoking, drinking, having sex) during interviews. Several studies have suggested that asking respondents to report the sensitive behaviors of their friends or confidants, rather than their own, might help address this problem. This is so because the “third-party reporting” (TPR) approach creates a surrogate sample of alters that may be less subject to social desirability biases. However, estimates of the prevalence of sensitive behaviors based on TPR assume that the surrogate sample of friends is representative of the population of interest. . . . we suggest approaches to strengthen estimates of the prevalence of sensitive behaviors obtained from TPR.

Interesting. With election polling, the issues are slightly different. We have no reason to expect systematic misreporting of vote preferences; rather, our concern is differential nonresponse: in 2024, this would be Republican voters being less likely than Democratic voters to respond to the survey in the first place.

So the motivation for asking about friends and neighbors in an election poll with concerns about nonresponse is different from the motivation for asking about friends and neighbors in a social survey with concerns about insincere responses. Still, there should be some common lessons from these two different problems.