OpenAI, Broadcom debut custom Jalapeño chip for AI inference
OpenAI Group PBC today revealed a custom chip called Jalapeño that it will use to power its large language models. The processor is the fruit of a collaboration with Broadcom Inc., which is no stranger to custom silicon design.
By Maria Deutscher, SiliconANGLE. Updated 16:30 EDT, June 24, 2026.
Key Takeaways
- OpenAI revealed Jalapeño, a custom chip developed with Broadcom that it will use to power its large language models.
- Broadcom helped Google LLC develop its TPU line of artificial intelligence accelerators.
- OpenAI's blog post says the chip's "architecture reduces data movement." That hints Jalapeño may be designed to reduce data movement between its logic circuits and off-chip memory, one of the main performance bottlenecks in inference clusters. AI chip suppliers take several approaches to reducing data movement.
- Broadcom's newest Tomahawk chip, the Tomahawk 6, can process up to 1.
- OpenAI's blog post describes Jalapeño as the "first step in a multi-generation compute platform," which hints that the company may be planning to develop additional inference processors in the future.
Another possibility is that OpenAI will design custom chips for adjacent use cases such as model training.

Broadcom helped Google LLC develop its TPU line of artificial intelligence accelerators. In April, the search giant extended its chip collaboration with Broadcom to 2031. Nvidia Corp.'s flagship Rubin graphics cards can run both training and inference workloads.
By contrast, Jalapeño is only designed for the latter use case, which is the process of running the AI models in response to queries. According to OpenAI, early testing indicates that the chip can perform inference with significantly higher performance per watt than "current state-of-the-art," which may be a reference to Nvidia chips. The company has shared few details about Jalapeño's design.
However, the blog post in which it announced the chip specifies that the underlying "architecture reduces data movement." That hints Jalapeño's architecture may be designed to reduce data movement between its logic circuits and off-chip memory, one of the main performance bottlenecks in inference clusters. AI chip suppliers take several approaches to reducing data movement.
For more details please read the original article at SiliconANGLE AI.