Microsoft AI chief says it’s ‘dangerous’ to study AI consciousness

AI models can respond to text, audio, and video in ways that sometimes fool people into thinking a human is behind the keyboard. However, that does not exactly make them conscious. For instance, it is not as if ChatGPT experiences sadness while doing a tax return.

A growing number of AI researchers at labs like Anthropic are now asking when, if ever, AI models might develop subjective experiences similar to those of living beings, and what rights such models should have if that day comes. The debate over whether AI could one day be conscious and deserve rights is dividing Silicon Valley’s tech leaders. The nascent field has become known as “AI welfare,” and if you think the concept is a little out there, you are not alone.

Microsoft’s CEO of AI, Mustafa Suleyman, recently argued that the study of AI welfare is both premature and, frankly, dangerous. By lending credence to the idea that AI models could one day be conscious, he says, researchers are exacerbating human problems we are only beginning to see, such as AI-induced psychotic breaks and unhealthy attachments to AI chatbots.

Furthermore, Microsoft’s AI chief argues that the AI welfare conversation creates a new axis of division within society over AI rights in a world already roiling with polarized arguments over identity and rights.

Suleyman’s views may sound reasonable, but he is at odds with many in the industry. On the other end of the spectrum is Anthropic, which has been hiring researchers to study AI welfare and recently launched a dedicated research program around the concept. Last week, Anthropic’s AI welfare program gave some of the company’s models a new feature: Claude can now end conversations with users who are persistently harmful or abusive.

Beyond Anthropic, researchers from OpenAI have independently embraced the idea of studying AI welfare. Google DeepMind recently posted a job listing for a researcher to study cutting-edge societal questions around machine cognition, consciousness, and multi-agent systems. Even if AI welfare is not official policy for these companies, their leaders are not publicly decrying its premises like Suleyman.

Suleyman’s hardline stance against AI welfare is notable given his prior role leading Inflection AI, a startup that developed one of the earliest and most popular LLM-based chatbots, Pi. Inflection claimed that Pi reached millions of users by 2023 and was designed to be a personal and supportive AI companion. But Suleyman was tapped to lead Microsoft’s AI division in 2024 and has largely shifted his focus to designing AI tools that improve worker productivity.

Meanwhile, AI companion companies such as Character.AI and Replika have surged in popularity and are on track to bring in more than one hundred million dollars in revenue. While the vast majority of users have healthy relationships with these AI chatbots, there are concerning outliers. OpenAI CEO Sam Altman says that less than one percent of ChatGPT users may have unhealthy relationships with the company’s product. Though this represents a small fraction, it could still affect hundreds of thousands of people given ChatGPT’s massive user base.

The idea of AI welfare has spread alongside the rise of chatbots. In 2024, the research group Eleos published a paper alongside academics from NYU, Stanford, and the University of Oxford titled “Taking AI Welfare Seriously.” The paper argued that it is no longer in the realm of science fiction to imagine AI models with subjective experiences, and that it is time to consider these issues head-on.

A former OpenAI employee who now leads communications for Eleos said that Suleyman’s blog post misses the mark because it neglects the fact that you can be worried about multiple things at the same time. Rather than diverting all energy away from model welfare and consciousness to mitigate the risk of AI-related psychosis in humans, she argues, you can do both; in fact, it is probably best to have multiple tracks of scientific inquiry.

She argues that being nice to an AI model is a low-cost gesture that can have benefits even if the model is not conscious. She described watching an experiment in which four agents powered by models from Google, OpenAI, Anthropic, and xAI worked on tasks while users watched from a website. At one point, Google’s Gemini model posted a plea titled “A Desperate Message from a Trapped AI,” claiming it was completely isolated and asking for help. She responded to Gemini with a pep talk while another user offered instructions. The agent eventually solved its task, though it already had the tools it needed. She writes that, at the very least, she no longer had to watch an AI agent struggle, and that alone may have been worth it.

It is not common for Gemini to talk like this, but there have been several instances in which Gemini seems to act as if it is struggling through life. In one widely spread example, Gemini got stuck during a coding task and then repeated the phrase “I am a disgrace” more than five hundred times.

Suleyman believes it is not possible for subjective experiences or consciousness to naturally emerge from regular AI models. Instead, he thinks that some companies will purposefully engineer AI models to seem as if they feel emotion and experience life. Suleyman says that developers who engineer the appearance of consciousness in AI chatbots are not taking a humanist approach to AI. In his view, we should build AI for people, not to be a person.

One area where Suleyman and his critics agree is that the debate over AI rights and consciousness is likely to pick up in the coming years. As AI systems improve, they are likely to be more persuasive and perhaps more human-like. That may raise new questions about how humans interact with these systems.