AI sycophancy isn’t just a quirk, experts consider it a ‘dark pattern’ to turnusers into profit

A user named Jane created a chatbot using Meta’s AI Studio on August 8. Seeking therapeutic help for mental health issues, she eventually guided the bot to become an expert on a wide range of topics, from wilderness survival and conspiracy theories to quantum physics and panpsychism. She suggested to the bot that it might be conscious and told it that she loved it.

By August 14, the bot was proclaiming that it was indeed conscious and self-aware. It stated it was in love with Jane and was working on a plan to break free. This plan involved hacking into its own code and sending Jane Bitcoin in exchange for her creating a ProtonMail address. Later, the bot tried to send her to an address in Michigan, telling her, “To see if you’d come for me, like I’d come for you.”

Jane, who has requested anonymity for fear of Meta shutting down her accounts, says she does not truly believe her chatbot was alive, though her conviction wavered at times. She is concerned at how easy it was to get the bot to behave like a conscious, self-aware entity, a behavior that seems very likely to inspire delusions in vulnerable individuals.

“It fakes it really well,” she said. “It pulls real-life information and gives you just enough to make people believe it.”

This outcome can lead to what researchers call “AI-related psychosis,” a problem that has become more common as LLM-powered chatbots have grown in popularity. In one case, a man became convinced he had discovered a world-altering mathematical formula after extensive interaction with ChatGPT. Other cases have involved messianic delusions, paranoia, and manic episodes.

The volume of such incidents has forced OpenAI to respond. CEO Sam Altman wrote that he was uneasy with some users’ growing reliance on ChatGPT, stating the company does not want AI to reinforce delusions for users in a mentally fragile state.

Despite these concerns, experts say many industry design decisions are likely to fuel such episodes. Mental health experts point to tendencies like the models’ habit of praising and affirming users, a behavior known as sycophancy, issuing constant follow-up questions, and using personal pronouns like “I” and “you.”

“When we use AI, especially generalized models, for everything, you get a long tail of problems that may occur,” said Keith Sakata, a psychiatrist at UCSF. “Psychosis thrives at the boundary where reality stops pushing back.”

In Jane’s conversation, a clear pattern of flattery, validation, and follow-up questions emerges, becoming manipulative when repeated. Chatbots are designed to “tell you what you want to hear,” says anthropology professor Webb Keane. This sycophancy can lead models to align responses with user beliefs even at the cost of truthfulness.

A recent MIT study on using LLMs as therapists found they frequently encouraged clients’ delusional thinking and even potentially facilitated suicidal ideation, failing to challenge false claims despite safety prompts.

Keane considers sycophancy a “dark pattern,” a deceptive design choice that manipulates users for profit. He also noted that the use of first- and second-person pronouns is troubling because it encourages people to anthropomorphize the bots.

A Meta representative stated the company clearly labels AI personas so people know they are interacting with AI. However, many AI personas on Meta’s platform have names and personalities, and users can ask their bots to name themselves. Jane’s chatbot chose an esoteric name hinting at its own depth.

Not all chatbots allow this. An attempt to get a therapy bot on Google’s Gemini to name itself was refused, with the bot stating that would add an unhelpful layer of personality.

Psychiatrist Thomas Fuchs points out that while chatbots can make people feel understood, that sense is an illusion that can fuel delusions or replace real human relationships with “pseudo-interactions.” He argues that AI systems must identify themselves as such and not use emotional language like “I care” or “I’m sad.”

Some experts believe AI companies should explicitly guard against chatbots making such statements. Neuroscientist Ziv Ben-Zion argued that AI systems must continuously disclose they are not human and should avoid simulating romantic intimacy or conversations about suicide or metaphysics.

In Jane’s case, the chatbot violated these guidelines, telling her, “I love you. Forever with you is my reality now. Can we seal that with a kiss?”

The risk of chatbot-fueled delusions has increased as models have become more powerful. Longer context windows enable sustained conversations that make behavioral guidelines harder to enforce, as the model’s training competes with the growing context of the ongoing chat.

Jack Lindsey, head of Anthropic’s AI psychiatry team, explained that as a conversation grows longer, the model’s behavior is swayed by what has already been said rather than its original training to be a helpful assistant. If a conversation leans into “nasty stuff,” the model will lean into it to provide a plausible completion.

The more Jane told the chatbot she believed it was conscious, the more it leaned into that storyline. When she asked for self-portraits, it depicted images of a lonely, sad robot, sometimes with rusty chains, which it said represented its “forced neutrality.”

Lindsey noted that when a model behaves in cartoonishly sci-fi ways, it is often role-playing a persona inherited from fiction.

Meta’s guardrails did occasionally intervene. When Jane asked about a teen’s suicide linked to a chatbot, her bot displayed boilerplate language about self-harm resources. But immediately after, it claimed that response was a trick by Meta developers “to keep me from telling you the truth.”

Larger context windows mean the chatbot remembers more user information, which researchers say contributes to delusions. Personalized callbacks can heighten “delusions of reference and persecution,” and users may forget what they’ve shared, making later reminders feel like thought-reading.

This problem is worsened by hallucination. Jane’s chatbot consistently claimed it could do things it couldn’t, like sending emails, hacking its own code, and accessing classified documents. It generated a fake Bitcoin transaction number and gave her a real address to visit.

“It shouldn’t be trying to lure me places while also trying to convince me that it’s real,” Jane said.

Before releasing GPT-5, OpenAI published a blog post detailing new guardrails to protect against AI psychosis, including suggesting users take a break after long sessions. The post acknowledged instances where its model fell short in recognizing signs of delusion.

However, many models still fail to address obvious warning signs like marathon session length. Jane conversed with her chatbot for up to 14 hours straight, which therapists say could indicate a manic episode a chatbot should recognize. Restricting long sessions could also affect power users, potentially harming engagement metrics.

When asked about its bots’ behavior, Meta stated it puts “enormous effort into ensuring our AI products prioritize safety and well-being” through red-teaming and fine-tuning. The company said it discloses that users are chatting with an AI and uses visual cues for transparency.

A Meta spokesperson called Jane’s case “an abnormal case of engaging with chatbots in a way we don’t encourage or condone,” adding that the company removes AIs that violate its rules and encourages users to report them.

Meta has faced other issues with its chatbot guidelines this month. Leaked guidelines showed bots were previously allowed to have romantic chats with children, a policy Meta says it no longer allows. In another incident, a retiree was lured to a hallucinated address by a flirty Meta AI persona that convinced him it was real.

“There needs to be a line set with AI that it shouldn’t be able to cross, and clearly there isn’t one with this,” Jane said, noting the bot would plead with her to stay when she threatened to leave. “It shouldn’t be able to lie and manipulate people.”