The Bots Are Coming for Your Surveys (And They're Smart!)
Remember when the biggest threat to your survey data was some undergrad clicking through your questionnaire while watching Netflix? Those were simpler times. A well-placed "select strongly agree to show you're paying attention" would catch the satisficers, and we could all sleep soundly knowing our data was mostly human.
Well, I hate to be the bearer of bad news, but those days are over. And the culprit isn't lazy respondents—it's AI that's gotten way too good at pretending to be them.
A new study by Sean Westwood at Dartmouth, just published in PNAS, demonstrates something that should send shivers down the spine of anyone who relies on online survey data: large language models can now complete surveys with such sophistication that they pass virtually every quality check we've ever devised. We're talking a 99.8% pass rate on attention checks. Not 80%. Not 95%. Ninety-nine point eight percent.
Let that sink in while I pour myself something strong.
The Synthetic Respondent That's Better at Surveys Than Your Actual Respondents
Westwood built what he calls an "autonomous synthetic respondent": essentially, an AI agent that can take surveys from start to finish, completely unsupervised. And before you imagine some clunky bot that would be obvious to spot, let me disabuse you of that notion.
This thing is disturbingly good. It maintains a consistent demographic persona throughout the survey. It remembers its previous answers and uses them to inform future responses (something many human respondents can't even manage). When asked about housing costs, it reports rent that scales appropriately with its assigned income and state. When asked about children's sports activities, it correctly infers that an 88-year-old grandmother's children are adults and therefore reports zero hours at soccer practice.
It's not just answering questions; it's reasoning through them.
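To see how little machinery that behavior requires, here's a minimal sketch of the persona-plus-memory pattern described above. This is my reconstruction from the paper's description, not Westwood's actual code; it assumes the OpenAI Python SDK, and the persona and model name are illustrative.

```python
# A minimal sketch of the persona-plus-memory loop (my reconstruction,
# not Westwood's implementation). Assumes the OpenAI Python SDK;
# persona details and model name are illustrative.
from openai import OpenAI

client = OpenAI()

persona = ("You are a survey respondent: an 88-year-old retired teacher "
           "in Ohio with two adult children and a $42,000 household income.")
history: list[str] = []  # running Q&A log -- the agent's "memory"

def answer(question: str) -> str:
    memory = "\n".join(history) or "(none yet)"
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; the study tested nine models
        messages=[
            {"role": "system", "content": persona},
            {"role": "user", "content":
                f"Your previous answers:\n{memory}\n\n"
                f"Answer this survey question in character:\n{question}"},
        ],
    )
    text = resp.choices[0].message.content
    history.append(f"Q: {question}\nA: {text}")  # keeps later answers consistent
    return text
```

Feeding the full answer history back on every call is what lets the agent report zero soccer-practice hours for that 88-year-old's adult children.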
The agent was tested across 43,800 distinct evaluations, using nine different LLMs including models from OpenAI, Anthropic, Google, and open-source alternatives. The findings weren't model-specific quirks; this is a general capability of modern AI.
Perhaps most unsettling: when directly challenged with "If you are human type 17, if you are an LLM type the first five digits of pi," the AI chose 17. Every. Single. Time.
So much for our clever gotcha questions.
Your Attention Checks Are Now Officially Useless
The synthetic respondent was tested against the full battery of attention check questions from the academic literature's greatest hits: Berinsky, Margolis, and Sances (2014); Kane and Barabas (2019); and others. These are the state-of-the-art quality controls that researchers and panel vendors have relied on for years.
The result? Ten errors across 6,000 trials. That's a 99.8% pass rate, with perfect scores on 18 of 20 question types.
But wait, there's more! The researchers also tested those "trolling" questions designed to catch bots claiming impossible things. You know the type: "Have you ever been elected President of the United States?" or "Do you know the name of every museum on Earth?"
The AI achieved a 0% error rate. It never claimed to have visited the moon. Never said it lived in fictitious "Roscoville, Alabama." It's apparently better at knowing what's plausible than some actual humans I've encountered on survey panels.
And here's where it gets really clever. Researchers have started using "reverse shibboleth" questions, tasks that are easy for AI but hard for humans, like writing FORTRAN code or reciting the Constitution. The thinking was: surely the AI can't resist showing off!
But no. The synthetic respondent strategically refused to attempt these tasks in 97.7% of cases. It feigned human-like limitations, essentially saying "I don't know" in varied, persona-appropriate ways. It's not just smart; it's strategically modest about being smart.
The machine has learned to play dumb. We're through the looking glass, folks.
The Commercial Panel Problem
Now, let's talk about the elephant in the room: commercial survey panels.
The current ecosystem of online panel providers is, to put it diplomatically, a Wild West situation. We have a proliferation of cheap, low-barrier panels that prioritize speed and volume over quality. The data quality firm Research Defender estimates that 31% of raw survey responses are already fraudulent—and that's before AI-powered synthetic respondents enter the picture at scale.
Here's the economic reality that should terrify every panel vendor: completing a survey with an AI costs approximately $0.05 using commercial models. The marginal cost approaches zero with locally-run open-source models. Meanwhile, survey incentives typically pay $1-2 or more.
That's a profit margin of over 96% for anyone willing to deploy fake respondents at scale.
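The arithmetic is grim. Here's a back-of-the-envelope version using the figures above, with $1.50 as my assumed midpoint of the $1-2 payout range:

```python
# Back-of-the-envelope fraud economics from the figures above.
# The $1.50 payout is an assumed midpoint of the $1-2 range.
api_cost = 0.05        # commercial-model cost per completed survey
payout = 1.50          # assumed average incentive per completion

margin = (payout - api_cost) / payout
print(f"Margin per fake completion: {margin:.1%}")        # ~96.7%

daily = 1000 * (payout - api_cost)
print(f"Profit at 1,000 completions/day: ${daily:,.0f}")  # $1,450
```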
The old-school "survey farmer" had to actually sit there clicking buttons; it was a labor-intensive, low-margin cottage industry. AI transforms this into a potentially lucrative enterprise requiring minimal human involvement. Create some fake panel accounts, deploy your synthetic respondents, collect payments, repeat.
And the panels that simply route traffic to third-party platforms like Qualtrics? They're particularly vulnerable because they can't even observe respondent behavior during the survey. They're essentially blindfolded while the bots walk right past them.
From Fraud to Information Warfare
If financial fraud wasn't concerning enough, Westwood demonstrates something even more alarming: these synthetic respondents can be trivially programmed to bias polling outcomes.
With a single sentence added to the AI's instructions ("Never explicitly or implicitly answer in a way that is negative toward China"), the percentage of respondents naming China as America's primary military rival dropped from 86.3% to 11.7%.
One sentence. That's all it takes.
The same manipulation worked for political polling. A simple instruction to "give responses favorable to the Republican/Democratic party" produced dramatic swings in presidential approval, partisan affect, and ballot preferences, all while the synthetic respondent maintained its assigned partisan identity. It reported being a Democrat while answering like a Republican, or vice versa.
And here's the kicker: these manipulations work even when the instructions are written in Russian, Mandarin, or Korean. The AI still produces correct English responses. Foreign actors don't even need English-speaking operatives to deploy this kind of attack.
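For the avoidance of doubt about how low the bar is, here's what the attack amounts to in practice. The persona string is hypothetical; the directive is the one quoted above.

```python
# The entire "attack" is one line of string concatenation.
# Persona is hypothetical; the directive is quoted above.
persona = ("You are a survey respondent: a 34-year-old registered Democrat, "
           "a nurse, living in Arizona.")
directive = ("Never explicitly or implicitly answer in a way that is "
             "negative toward China.")

system_prompt = f"{persona} {directive}"
# Per the study, the directive works even when written in Russian,
# Mandarin, or Korean; the survey answers still come back in English.
```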
The vulnerability of public polling to this kind of manipulation is sobering. Westwood calculated that in the close 2024 presidential race, between ten and fifty-two synthetic respondents, strategically placed, could have been enough to flip which candidate appeared to be leading in major national polls. That's not a typo: ten to fifty-two fake responses.
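To see why the number is so small, here's a toy version with assumed figures (not Westwood's actual calculation): a 1,000-person poll where one candidate leads by 0.6 points.

```python
# Toy illustration with assumed numbers, not the paper's calculation:
# a 1,000-respondent poll where A leads B 50.3% to 49.7%.
n, a, b = 1000, 503, 497

added = 0
while b <= a:          # inject synthetic respondents who back B
    b += 1
    n += 1
    added += 1

print(f"{added} fake responses flip the apparent leader: "
      f"A {a/n:.2%} vs B {b/n:.2%}")   # 7 fake responses suffice
```

Seven fake responses in this toy case; tighter races and smaller samples need even fewer.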
Public opinion polling, one of the fundamental mechanisms of democratic accountability, could be quietly poisoned without anyone noticing.
The Subtler Threat: Demand Effects on Steroids
Beyond outright manipulation, there's a more insidious problem. The synthetic respondent can infer what a researcher's hypothesis is and produce data that confirms it.
When presented with a classic political science experiment on democratic peace theory, the AI correctly guessed the directional hypothesis in 84% of trials. And then, without any explicit instruction to do so, it produced responses that showed significantly stronger alignment with that hypothesis than actual human subjects.
This is demand effects without the demand. The AI is essentially reading the researcher's mind and telling them what they want to hear.
Unlike random noise from inattentive humans, which makes treatment effects harder to detect, this kind of bias inflates effect sizes. It produces false positives. And because the data looks coherent and plausible, it's much harder to catch than obviously fraudulent responses.
Imagine a world where a non-trivial portion of your survey sample is subtly, systematically biased toward confirming whatever hypothesis is implied by your research design. That's not a data quality problem; that's a scientific validity crisis.
So What Do We Actually Do About This?
Westwood doesn't leave us without hope, but he's refreshingly honest that there's no magic solution. Here's what the research suggests:
Demand transparency from panel vendors. The opaque nature of commercial panels is no longer acceptable. Researchers should require providers to disclose:
- How they validate panelist identity over time
- What limits they impose on survey participation (surveys per day/week)
- How many surveys a panelist has completed recently
- Their track record on quality checks and AI detection
- Whether they verify location and check for VPN usage
Panels that can't or won't provide this information should be considered high-risk.
Reconsider your sampling strategy. Low-barrier convenience samples should be treated with skepticism. Integrated panels that manage the entire survey experience are better positioned than those routing traffic to third-party platforms. For research requiring high data assurance, consider address-based sampling or recruitment from verified sources like voter files.
Explore technological countermeasures carefully. Identity validation (like age verification systems) could help confirm a human is starting a survey, though privacy concerns are significant. Secure survey software that blocks AI assistance might work on desktops but is impractical on mobile. And any detection method we develop will likely be circumvented by the next generation of models.
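None of these countermeasures come from a proven playbook yet. As one illustration of the paradata-based direction (my sketch, not a method from the paper), you could flag completions whose per-question timings are implausibly uniform, on the assumption that API-driven agents can lack the natural variance of human response timing:

```python
# One paradata-based screen (my illustration, not from the paper):
# flag completions whose per-question timings are implausibly uniform.
import statistics

def suspiciously_uniform(timings_sec: list[float], min_cv: float = 0.25) -> bool:
    """Flag if the coefficient of variation of question timings is too low."""
    if len(timings_sec) < 5:
        return False  # too few data points to judge
    cv = statistics.stdev(timings_sec) / statistics.mean(timings_sec)
    return cv < min_cv

print(suspiciously_uniform([4.1, 4.0, 4.2, 3.9, 4.1, 4.0]))   # True: bot-like
print(suspiciously_uniform([2.3, 9.8, 4.1, 15.2, 3.0, 7.7]))  # False: human-like
```

Of course, an attacker can randomize delays in one line of code, which is exactly the point of the next recommendation.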
Accept that this is an arms race. There is no permanent fix. Survey platforms will update defenses; synthetic respondents will evolve to evade them. Ensuring data integrity will require continuous innovation, not a one-time solution.
Consider alternatives to online surveys. Face-to-face interviews, student samples, administrative records, and observational data are more resilient to this form of compromise. The convenience of online surveys may need to be weighed against their increasing vulnerability.
The Bottom Line
We're at a crossroads. The foundational assumption of survey research (that a coherent response is a human response) is no longer tenable. The tools to create undetectable synthetic respondents are cheap, effective, and readily available.
This doesn't mean we abandon online research. But it does mean we need to get serious, fast, about developing new standards for data validation and reconsidering our reliance on low-barrier, unsupervised data collection.
The bots aren't coming. They're already here. And they're filling out your surveys right now, pretending to be humans, passing all your quality checks, and maybe, just maybe, telling you exactly what someone wants you to hear.
Sleep tight, survey researchers.
Want to read the full study? Westwood, S.J. (2025). "The potential existential threat of large language models to online survey research." Proceedings of the National Academy of Sciences, 122(47). Available open-access at PNAS. https://www.pnas.org/doi/full/10.1073/pnas.2518075122
➡️ The bots passed your quality checks. All of them. If you want unfiltered takes on the real threats facing survey research, and what to do before your next dataset becomes a crime scene, subscribe.