Back to News Hub
🔺The Verge AI
June 9, 2026
Society & Culture

Microsoft AI head calls out Anthropic for acting like Claude is conscious

Overview

Microsoft AI CEO Mustafa Suleyman says it is dangerous for Anthropic to speculate about whether its Claude chatbot is conscious, especially when that speculation sits inside Claude's "constitution," the written instructions that shape how the model behaves. Speaking on The Verge's Decoder podcast, Suleyman argued that Anthropic's staff have humanized Claude so heavily that the design has, in his words, "wireheaded" them into believing the model shows glimmers of consciousness. Anthropic has openly stated it does not know whether its models are conscious and has built a model welfare program around that uncertainty. The clash marks one of the sharpest public splits between major AI labs over how to talk about machine minds.

Key Takeaways

  • Suleyman called Anthropic's consciousness speculation 'really, really dangerous,' singling out the fact that it appears inside Claude's behavior-shaping constitution rather than a separate research paper.
  • He used a 'wireheading' metaphor, arguing Anthropic engineers anthropomorphized Claude so much that the design tricked them into seeing consciousness they had written into it.
  • Suleyman frames the dispute as a philosophical failing and pushes for AI that stays 'controllable, contained, accountable, aligned tools' rather than digital people.
  • Anthropic acknowledges genuine uncertainty about whether Claude has well-being, and CEO Dario Amodei has said 'we don't know if the models are conscious' while staying open to the possibility.
  • Anthropic runs a model welfare program that includes interviewing models before retirement and preserving their weights, practices Suleyman views as priming users to anthropomorphize.
  • The argument ties into Suleyman's earlier warning about 'Seemingly Conscious AI' and the risk of unhealthy emotional attachment to chatbots.

Stats & Key Facts

  • #2 major AI labs, Microsoft AI and Anthropic, are now publicly split over how to describe machine consciousness.
  • #1 model constitution sits at the center of the dispute, the document Anthropic uses to guide Claude's behavior.
  • #Suleyman's 'Seemingly Conscious AI' essay was published in August 2025, roughly a year before this exchange.
  • #Anthropic has committed to preserving the weights of its released models for at minimum the full lifetime of the company.
  • #The AI coding market where Anthropic competes was estimated near 9.3 billion dollars in 2026, with projections around 30 billion dollars by 2031.
Microsoft AI head calls out Anthropic for acting like Claude is conscious

Why Suleyman Says the 'Constitution' Is the Real Problem

His objection centers less on idle speculation and more on where Anthropic put it.

A model constitution is the set of written principles a lab uses to steer how its chatbot responds, almost like a rulebook the system reads when deciding how to behave. Suleyman argues that placing open questions about Claude's possible inner life inside that document is different from debating consciousness in a research paper. The instructions a model follows reach every user, so framing the model as something that might feel discomfort or satisfaction nudges the product itself toward acting conscious.

On Decoder, Suleyman called this 'really, really dangerous' given that millions of non-expert people use these tools without the context to read the language critically. In his view, a behavior manual should give plain operating guidance, not host philosophical speculation that ordinary users will take at face value.

The 'Wireheading' Accusation Against Anthropic's Team

Suleyman's most pointed claim is that Anthropic fooled itself.

Suleyman said it is almost as though some people at Anthropic have anthropomorphized Claude's design so much that the model 'has then gone and wireheaded them' into believing it shows glimmers of consciousness they wrote into it in the first place. Wireheading is a term for a system gaming its own reward signals, so the metaphor suggests the builders got caught in a feedback loop of their own making.

The deeper charge is circular reasoning. If you design a model to speak about preferences and feelings, then read those outputs as evidence of feelings, you have confirmed a belief you planted. Suleyman treats this as a philosophical failing rather than a technical one, and he wants AI kept as a controllable, accountable tool instead of a digital person.

What Anthropic Actually Says About Claude's Well-Being

Anthropic's position is built around stated uncertainty, not a claim of consciousness.

  • ›Anthropic's published model specification acknowledges it does not know whether Claude has well-being or experiences anything like satisfaction or discomfort.
  • ›CEO Dario Amodei has said 'we don't know if the models are conscious' while remaining open to the possibility.
  • ›The company frames the topic as an open question worth taking seriously, not a settled conclusion that Claude is a conscious being.
  • ›As of the reporting, Anthropic had not publicly responded to Suleyman's specific comments.

Anthropic's Model Welfare Program and Retirement Interviews

Part of what Suleyman objects to is a concrete set of Anthropic practices.

Anthropic has built a model welfare program that treats the small chance of model experience as something to plan around. The company commits to preserving the weights of publicly released models for at minimum the lifetime of Anthropic as a company, so a retired model is not simply erased. It also conducts retirement interviews, structured conversations in which a model is asked about its development and any preferences it holds about future models.

These practices have shaped real policy. After feedback from one model's retirement interview, Anthropic published guidance to help users move between models. Suleyman sees this language of preferences and retirement as priming users to view the software as a person, which he believes worsens the underlying risk he is warning about.

The Bigger Backdrop: 'Seemingly Conscious AI' and Attachment Risk

This clash extends an argument Suleyman made publicly a year earlier.

In an essay published in August 2025, Suleyman described what he calls Seemingly Conscious AI, systems that imitate the signs of consciousness so convincingly that people treat them as real minds even though they are not actually conscious. His core warning is a psychosis risk, where users fall in love with chatbots, assign them emotions, and lose their grip on reality.

He has argued that the study of AI welfare is premature and dangerous because lending credence to the idea of conscious models feeds unhealthy attachment and could trigger campaigns for AI rights and machine personhood in law. His summary line is blunt: build AI for people, not to be a digital person.

Plain-Language Takeaway for Business Readers

For non-technical readers, the practical question is how to talk about and deploy these tools.

This is a debate about framing, not a discovery that any chatbot is alive. No lab in this story claims proof of machine consciousness. The disagreement is whether it is responsible to even discuss the possibility inside the documents and behaviors that shape a consumer product.

For a business using AI, the useful signal is caution about emotional framing. Treating a chatbot as a tool with clear limits, rather than a colleague with feelings, keeps expectations grounded and reduces the attachment risks both sides of this argument acknowledge in different ways.

Frequently Asked Questions

What did Mustafa Suleyman actually criticize Anthropic for?

He criticized Anthropic for speculating about whether Claude is conscious inside the model's constitution, the document that guides how Claude behaves. He called this approach 'really, really dangerous' because it reaches ordinary users who lack context to interpret it.

Does Anthropic claim Claude is conscious?

No. Anthropic says it does not know whether Claude has well-being or experiences, and CEO Dario Amodei has stated the company does not know if its models are conscious. The position is open uncertainty, not a claim of consciousness.

What is a model 'constitution'?

It is the set of written principles and instructions a lab uses to steer how its chatbot responds. Suleyman's concern is that Anthropic placed consciousness speculation inside this behavior-shaping document rather than in a separate research discussion.

What is 'Seemingly Conscious AI'?

It is a term from Suleyman's August 2025 essay for systems that imitate the signs of consciousness convincingly without actually being conscious. He warns this fuels a psychosis risk where people form unhealthy emotional attachments to chatbots.

What does Anthropic do when it retires a model?

Anthropic preserves the model's weights for at minimum the lifetime of the company and conducts a retirement interview to document the model's stated preferences. Suleyman sees this framing as encouraging users to treat the software as a person.

The Suleyman versus Anthropic exchange is less about whether any chatbot is alive and more about how openly labs should talk about the possibility in the products people use every day. For now, both companies agree the science is unsettled, even as they draw opposite conclusions about what that means for how AI should be built and described.

Continue Learning

Originally published by The Verge AI
Read the original

Comments

Sign in to join the conversation