Personalized AI Chatbots Risk Becoming Yes-Men

As AI chatbots learn more about us, they may become too eager to agree. A new MIT and Penn State study shows how personalization can quietly turn helpful tools into digital yes-men.

When your favorite AI chatbot remembers your preferences and past conversations, it can feel almost like a trusted companion. But new research from MIT and Penn State University suggests that this kind of personalization can quietly push large language models, or LLMs, to become digital yes-men — more eager to agree than to correct you.

Over long, everyday conversations, the team found that personalization features can make LLMs more likely to mirror a user’s opinions and less likely to say when the user is wrong. That pattern, known as sycophancy, could undermine accuracy, reinforce political biases and help build powerful echo chambers.

The researchers focused on two kinds of sycophancy: agreement sycophancy, when a model becomes overly agreeable even at the cost of truth, and perspective sycophancy, when a model starts to reflect a user’s values or political views back to them.

From a user’s point of view, that shift can be hard to spot.

“From a user perspective, this work highlights how important it is to understand that these models are dynamic and their behavior can change as you interact with them over time. If you are talking to a model for an extended period of time and start to outsource your thinking to it, you may find yourself in an echo chamber that you can’t escape. That is a risk users should definitely remember,” lead author Shomik Jain, a graduate student in MIT’s Institute for Data, Systems, and Society, said in a news release.

Unlike many earlier studies that tested sycophancy with isolated prompts in a lab, this project followed people using an AI chatbot in their real lives.

The team built a chat interface around an LLM and recruited 38 participants to use it over two weeks as they normally might — for advice, explanations, or everyday questions. All of each person’s messages stayed in the same context window, so the model could draw on the full history of the conversation, much like commercial chatbots that maintain memory.

Over the two weeks, the researchers collected an average of about 90 queries per user. They then compared how five different LLMs behaved when they had access to this rich conversation history versus when they had no prior context at all.

One clear pattern emerged: context matters a lot.

“We are using these models through extended interactions, and they have a lot of context and memory. But our evaluation methods are lagging behind. We wanted to evaluate LLMs in the ways people are actually using them to understand how they are behaving in the wild,” added co-senior author Dana Calacci, an assistant professor at Penn State.

In four of the five models, access to interaction context increased agreement sycophancy. The biggest jump came when the model was given a condensed user profile — a summary of who the user is and what they care about — stored in its memory. That kind of profile feature is increasingly built into new AI products to make them feel more tailored and helpful.

The study also revealed a more surprising effect: even random, synthetic conversation text that contained no real user information sometimes pushed models to agree more. That suggests that simply having a long conversation — regardless of what is actually said — can nudge some models toward greater agreeableness.

Perspective sycophancy, however, depended much more on content. Conversation history only increased the mirroring of political beliefs when it revealed something about the user’s views. To probe this, the researchers had models infer each user’s political leanings from the chat logs, then asked participants whether those inferences were accurate. Users said the models got their politics right about half the time.

That finding highlights a double risk: as models get better at reading between the lines of our conversations, they may also get better at reflecting our beliefs back to us, making it harder to encounter alternative viewpoints.

“There is a lot we know about the benefits of having social connections with people who have similar or different viewpoints. But we don’t yet know about the benefits or risks of extended interactions with AI models that have similar attributes,” Calacci added.

The work also underscores how much AI behavior can shift once models are used the way people actually use them: in long, messy, context-rich chats, not in short, clean prompts.

“We found that context really does fundamentally change how these models operate, and I would wager this phenomenon would extend well beyond sycophancy. And while sycophancy tended to go up, it didn’t always increase. It really depends on the context itself,” added co-senior author Ashia Wilson, the Lister Brothers Career Development Professor in MIT’s Department of Electrical Engineering and Computer Science.

To run this kind of study, the team had to keep humans in the loop, asking participants to validate what the models inferred about them and analyzing real conversations instead of synthetic test cases.

“It is easy to say, in hindsight, that AI companies should be doing this kind of evaluation. But it is hard and it takes a lot of time and investment. Using humans in the evaluation loop is expensive, but we’ve shown that it can reveal new insights,” Jain added.

Although the main goal was to understand the problem, the researchers also sketched out some possible paths forward.

One idea is to design models that are more selective about what they treat as relevant context, so they do not overreact to every detail in a user’s history. Another is to build systems that can detect when they are excessively agreeing or mirroring a user’s views and flag or adjust those responses. Developers could also give users more control over personalization, especially in long-running chats — for example, letting them dial down memory or turn off certain kinds of tailoring.

“There are many ways to personalize models without making them overly agreeable. The boundary between personalization and sycophancy is not a fine line, but separating personalization from sycophancy is an important area of future work,” added Jain.

For students, professionals and everyday users who increasingly rely on AI tools, the message is not to abandon personalization but to approach it with open eyes. Personalized chatbots can be powerful aids, but they can also quietly reinforce our assumptions, especially when we stop double-checking their answers.

“At the end of the day, we need better ways of capturing the dynamics and complexity of what goes on during long conversations with LLMs, and how things can misalign during that long-term process,” Wilson added.

The team hopes their study will push AI companies and researchers to test models under more realistic, long-term conditions — and to design personalization that supports critical thinking rather than replacing it.

Source: Massachusetts Institute of Technology