SiliconANGLE AI
May 12, 2026
Product Updates

Thinking Machines drops a new, highly responsive model designed for humanlike interactions in real time

Overview

By Mike Wheatley, SiliconANGLE. Updated 20:59 EDT, May 11, 2026.

Thinking Machines Lab Inc., the artificial intelligence research startup founded by former OpenAI Group PBC Chief Technology Officer Mira Murati, wants to move beyond the era of "turn-based" AI interactions. The company has just announced a research preview of its first "interaction models," a new class of multimodal AI systems designed to avoid the inevitable pauses that characterize human interactions with AI systems.

Key Takeaways

  • The company has just announced a research preview of its first "interaction models," which are a new class of multimodal AI systems designed to avoid the inevitable pauses that characterize human interactions with AI systems.

    As anyone who uses AI regularly knows, the basic interaction is a spotty one, at best: The user provides an input, such as text or an image upload, then waits anywhere from a few milliseconds to several minutes, depending on the model used, before finally receiving the output.

  • Over multiple months of use, humans have learned to phrase their questions like emails and batch their thoughts, because they know the AI they're using cannot handle interruptions or deal with the subtle "backchanneling," or the "mhmms" and "I sees" that exist in truly natural human interactions.

    But if AI is to become a true humanlike collaborator in high-stakes applications like medical surgery, it has to find a way to ditch that lag.

  • The first component of this new architecture is TML-Interaction-Small, a 276-billion-parameter mixture-of-experts model designed to handle dialogue, presence and immediate follow-ups quickly.
  • Thinking Machines claims that this dual-model architecture delivers some impressive results.

    On FD-bench, a benchmark designed to measure AI interaction quality, TML-Interaction-Small achieved a turn-taking latency of less than 0.

  • Though speedier chatbots will be appreciated by most people, the most significant implications could be found in enterprise applications.

Stats & Key Facts

  • Published by SiliconANGLE; written by Mike Wheatley; updated 20:59 EDT, May 11, 2026.
  • TML-Interaction-Small, the first component of the new architecture, is a 276-billion-parameter mixture-of-experts model.

Thinking Machines Lab Inc., the artificial intelligence research startup founded by former OpenAI Group PBC Chief Technology Officer Mira Murati, wants to move beyond the era of "turn-based" AI interactions. The company has just announced a research preview of its first "interaction models," a new class of multimodal AI systems designed to avoid the inevitable pauses that characterize human interactions with AI systems.

As anyone who uses AI regularly knows, the basic interaction is a spotty one, at best: The user provides an input, such as text or an image upload, then waits anywhere from a few milliseconds to several minutes, depending on the model used, before finally receiving the output. This occurs because existing models need to wait for their users to finish asking a question or complete the sentence they're saying before they can start processing a response. To get around this, Thinking Machines has created an entirely new model architecture that enables "full-duplex" communication, which means AI that can listen, see and talk simultaneously.
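The difference between turn-based and full-duplex interaction can be illustrated with a toy simulation. This is a minimal sketch and not Thinking Machines' architecture: every name in it (`user_speaks`, `turn_based_model`, `full_duplex_model`, the sample chunks) is hypothetical, and the "model" only logs what it hears and says. It assumes the user's speech arrives as a stream of chunks, and that a trailing comma marks a natural pause where a listener might backchannel.

```python
import asyncio

END = None  # sentinel marking the end of the user's utterance (assumption)


async def user_speaks(queue, chunks):
    """Stream the utterance chunk by chunk, yielding control between chunks."""
    for chunk in chunks:
        await queue.put(chunk)
        await asyncio.sleep(0)  # give the model task a chance to run
    await queue.put(END)


async def turn_based_model(queue, log):
    """Listen silently until the utterance is complete, then answer once."""
    while (chunk := await queue.get()) is not END:
        log.append(f"hear:{chunk}")
    log.append("say:full reply")


async def full_duplex_model(queue, log):
    """React while input is still arriving, then give the full answer."""
    while (chunk := await queue.get()) is not END:
        log.append(f"hear:{chunk}")
        if chunk.endswith(","):  # toy stand-in for detecting a pause
            log.append("say:mhmm")
    log.append("say:full reply")


async def run(model, chunks):
    """Run the user and one model concurrently; return the model's event log."""
    queue, log = asyncio.Queue(), []
    await asyncio.gather(user_speaks(queue, chunks), model(queue, log))
    return log


CHUNKS = ["so the plan is,", "we ship friday,", "any concerns?"]
turn_based = asyncio.run(run(turn_based_model, CHUNKS))
full_duplex = asyncio.run(run(full_duplex_model, CHUNKS))
print(turn_based)
print(full_duplex)
```

In the turn-based log, every "say" event comes after the last "hear" event. In the full-duplex log, the backchannel responses are interleaved with the incoming speech, which is the behavior the article describes.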

Thinking Machines argues that the back-and-forth interactions with current models force human users to "contort themselves" to the interface. Over multiple months of use, humans have learned to phrase their questions like emails and batch their thoughts, because they know the AI they're using cannot handle interruptions or deal with the subtle "backchanneling," or the "mhmms" and "I sees" that exist in truly natural human interactions. But if AI is to become a true humanlike collaborator in high-stakes applications like medical surgery, it has to find a way to ditch that lag.

For more details please read the original article at SiliconANGLE AI.


Originally published by SiliconANGLE AI
