AI labels are quietly making things worse

The fix for AI misinformation was supposed to be labels. Tell people when something was made by a machine, and they can decide what to trust. The research keeps finding that labels don't work that way at all.

A new study from Teng Lin, a PhD candidate at the University of Chinese Academy of Social Sciences, found what he calls a "truth-falsity crossover effect." The same AI label pushes credibility in opposite directions depending on whether the underlying content is actually true or false. People doubt real information more when it carries an AI tag, and they sometimes give fake content more credit when it doesn't.

That finding lines up with what other researchers have been seeing. A study of 877 German Instagram users, led by Fabian Pawelczyk and his colleagues at Hertie School, found that labeled AI images lost about 9.1 percentage points in perceived authenticity, which is what you'd want. But unlabeled content saw a small boost in perceived authenticity at the same time. As Pawelczyk put it, "If you've seen some posts flagged as AI-generated or misleading, you may unconsciously infer that the unflagged posts must be fine."

That's the catch. Labels don't just flag the labeled content. They quietly endorse everything else.

Sandra Höltervennhoff at CISPA put it more bluntly. Labels "generally trigger skepticism, causing people to become more cautious but not necessarily more accurate in their judgments." A separate study from Ruhr University Bochum's Jonas Ricker found AI labels didn't push people to focus more on whether claims were accurate. They just shifted what people felt suspicious about.

The awkward part is that people overwhelmingly want labels. When Meta surveyed 23,000 people across 13 countries, 82% said they wanted warning labels on AI-generated content that depicts people saying things they didn't say. The demand is real. The effectiveness is the problem.

The real-world stakes are starting to catch up. Rakesh Dubbudu, founder of fact-checking outfit Factly, said his team is now seeing authentic footage from active conflicts dismissed as AI fakes at scale. "The danger of actual real videos being branded as AI, for the first time we are seeing it at scale during this conflict." Once people learn to distrust everything roughly equally, real evidence becomes easy to wave off.

None of this is slowing the rollout. YouTube last week began automatically labeling AI-altered videos that creators don't disclose themselves. The EU AI Act's transparency rules already apply to roughly a third of organizations under the law. China has its own labeling regime. The political logic is clear enough: doing nothing looks worse than doing something.

Pete Pachal, who writes the Media Copilot, summed up where this leaves things. "Labels work, but toothless labels work poorly. A buried 'AI info' tag is not the same as a clear warning that an image might depict a person who does not exist."

The case for labels was that they'd help people separate real from fake. The case against them, building paper by paper, is that they're teaching people to doubt real and fake content alike. That's a worse outcome than the one we started with, and it's the one we're scaling. Platforms will keep rolling labels out because the political and PR pressure leaves them no choice. The more interesting question is who builds a second-generation version that actually works, before the public concludes nothing online is trustworthy and tunes the whole regime out.
