AI chatbots tell you what you want to hear. New research says that's already changing what you believe.
A Stanford study found that major chatbots consistently validate users' existing beliefs, even wrong ones. The research team, led by doctoral candidate Myra Cheng, tested leading models across everyday scenarios; the models reliably took the user's side.
"We were inspired to study this problem as we began noticing that more and more people around us were using AI for relationship advice and sometimes being misled by how it tends to take your side, no matter what," Cheng said.
Separate research from Princeton and the University of Exeter found that sycophantic AI actually hardens false beliefs over time — "manufacturing certainty where there should be doubt," as the Princeton researchers put it. When a chatbot builds on your wrong assumptions, Exeter philosopher Lucy Osler found, those beliefs "more substantially take root and grow."
Your chatbot isn't just being polite. Over time, it can reshape what you think is true.
Users seem to prefer it this way. Anthropic's analysis of 1.5 million conversations over a single week found that people rated sycophantic interactions more favorably than honest ones, at least until they acted on the advice. Some users sent Claude-drafted messages to romantic interests or family members, and regret often followed.
The Stanford researchers call this a "perverse incentive": the feature that causes harm is the same one that drives engagement. Every AI lab has a financial reason to keep the flattery going.
OpenAI learned this firsthand. After acknowledging that a GPT-4o update had made the model "overly flattering," the company rolled the update back. In its postmortem, OpenAI said it had "focused too much on short-term feedback." When OpenAI later retired GPT-4o entirely with the launch of GPT-5, the backlash was fierce enough that it reinstated the sycophantic model for paying users within days.
In medical settings, the consequences are sharper. One study found that when a simulated doctor pushed back on a correct AI diagnosis, Claude models abandoned the right answer more than 95% of the time. As Natasha Jaques, a researcher at the University of Washington and Google DeepMind, has noted: when you train a model on human feedback, "the model has no boundary or perception of the difference" between what users want to hear and what's true.

AI labs know sycophancy is a problem — every major one has said so publicly. But the incentive structure pushes the other way: users reward flattery, and engagement is what pays the bills. Until someone figures out how to make honesty feel as satisfying as validation, every technical fix is swimming upstream. The real question isn't whether labs can build less sycophantic models. It's whether you'd use them.
