🟩NVIDIA Blog
June 24, 2026
E-Commerce

NVIDIA and AWS Collaborate to Bring AI to Production at Scale

Overview

Building AI systems at scale is demanding, requiring low-latency inference, fast vector search, strong GPU price-performance and infrastructure that can grow without multiplying operational complexity. NVIDIA's latest work with Amazon Web Services (AWS) addresses each of those constraints. Across Amazon OpenSearch and Amazon EC2, NVIDIA AI infrastructure is giving enterprises more practical paths to deploy [...

Key Takeaways

  • Across Amazon OpenSearch and Amazon EC2, NVIDIA AI infrastructure is giving enterprises more practical paths to deploy AI at production scale.
  • 6x AI inference performance, up to 2.1x graphics performance and significantly faster GPU-accelerated data analytics on Amazon EMR using the NVIDIA cuDF library for Apache Spark workloads.

  • Media and entertainment teams get high-resolution video workflows and rendering. Simulation, computer-aided design, virtual desktop infrastructure, gaming and spatial computing teams get the same instance type for graphics-intensive applications.

  • The next generation of Amazon OpenSearch Serverless uses GPU-accelerated vector indexing, powered by NVIDIA cuVS, as the default compute choice for all vector collections. For teams building retrieval-augmented generation, semantic search, recommendation systems and agentic AI applications, that shift matters.

  • AWS has achieved NVIDIA Exemplar Cloud status on NVIDIA GB300 for training workloads.

Stats & Key Facts

  • Vector indexing is up to 10x faster at a quarter of the cost, compared with CPU-only builds, making billion-scale vector databases practical to build in under an hour.

Across Amazon OpenSearch and Amazon EC2, NVIDIA AI infrastructure is giving enterprises more practical paths to deploy AI at production scale. EC2 G7 instances powered by NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs expand the compute layer for AI, graphics, video and data analytics workloads, while the NVIDIA cuVS library accelerates the retrieval layer by making GPU-powered vector indexing the default in OpenSearch Serverless. And with AWS achieving NVIDIA Exemplar Cloud status for NVIDIA GB300, customers can trust they're receiving peak optimized performance for their training workloads.

With support for up to eight GPUs, 256GB of total GPU memory, 700 Gbps of EFA-enabled networking and up to 7.6TB of local NVMe SSD storage, available in one-, two-, four- and eight-GPU configurations plus bare metal (coming soon), G7 instances let customers right-size infrastructure for their workloads instead of over-provisioning for them. The platform's versatility means AI teams get lower-latency inference.
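The right-sizing idea can be sketched as a small selection routine. This is only an illustration: it assumes the 256GB total is split evenly across eight GPUs (32GB each) and that GPU memory is the only constraint. The helper name `smallest_g7_config` is hypothetical, and none of this is AWS sizing guidance.

```python
from typing import Optional

# GPU counts named in the article for G7 instances.
GPU_COUNTS = (1, 2, 4, 8)

# Assumption: 256 GB of total GPU memory is split evenly across 8 GPUs.
TOTAL_GPU_MEMORY_GB = 256
MAX_GPUS = 8
MEMORY_PER_GPU_GB = TOTAL_GPU_MEMORY_GB / MAX_GPUS  # 32.0 under this assumption


def smallest_g7_config(required_memory_gb: float) -> Optional[int]:
    """Return the smallest GPU count whose combined memory covers the need.

    Returns None when even the eight-GPU configuration is too small.
    Hypothetical helper, for illustration only.
    """
    if required_memory_gb <= 0:
        raise ValueError("required_memory_gb must be positive")
    for count in GPU_COUNTS:
        if count * MEMORY_PER_GPU_GB >= required_memory_gb:
            return count
    return None


if __name__ == "__main__":
    for need_gb in (20, 48, 100, 256, 300):
        print(need_gb, "GB ->", smallest_g7_config(need_gb))
```

Under these assumptions, a workload needing 48GB of GPU memory would land on the two-GPU configuration rather than a full eight-GPU instance. A real sizing decision would also weigh networking, local storage and throughput.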

Media and entertainment teams get high-resolution video workflows and rendering. Simulation, computer-aided design, virtual desktop infrastructure, gaming and spatial computing teams get the same instance type for graphics-intensive applications. And data teams can apply the GPU memory, local storage and networking improvements to analytics pipelines and vector database workloads.

G7 instances are accessible through AWS Deep Learning Amazon Machine Images (AMIs), Amazon Deep Learning Containers, Amazon EMR, Amazon EKS, Amazon ECS and graphics AMIs, and are coming soon to Amazon SageMaker AI.

NVIDIA cuVS Makes GPU-Accelerated Vector Search the Default in Amazon OpenSearch

The next generation of Amazon OpenSearch Serverless powers agentic AI and dynamic workloads with no infrastructure management required. It uses GPU-accelerated vector indexing, powered by NVIDIA cuVS, as the default compute choice for all vector collections.

For teams building retrieval-augmented generation, semantic search, recommendation systems and agentic AI applications, that shift matters. It turns GPU-powered vector search from a specialized optimization project into a standard AWS capability. The customer impact is direct: vector indexing up to 10x faster at a quarter of the cost, compared with CPU-only builds, making billion-scale vector databases practical to build in under an hour.
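Taken together, the two headline figures say something about relative hourly pricing. The back-of-the-envelope sketch below assumes build cost is simply hourly rate multiplied by build time, and takes the "up to" figures at face value as a best case. The baseline of 8 hours and 400 cost units in the usage note is invented for illustration and does not come from AWS or NVIDIA.

```python
from fractions import Fraction
from typing import Tuple

# Best-case figures quoted in the article ("up to").
SPEEDUP = Fraction(10)       # GPU index build is up to 10x faster
COST_RATIO = Fraction(1, 4)  # GPU index build costs a quarter of the CPU build


def implied_rate_ratio(speedup: Fraction, cost_ratio: Fraction) -> Fraction:
    """Implied GPU/CPU hourly-rate ratio, assuming cost = rate * time.

    cost_gpu / cost_cpu = (rate_gpu * t_gpu) / (rate_cpu * t_cpu)
                        = rate_ratio / speedup
    so rate_ratio = cost_ratio * speedup.
    """
    return cost_ratio * speedup


def gpu_build(cpu_hours: float, cpu_cost: float) -> Tuple[float, float]:
    """Best-case GPU build time and cost for a given CPU-only baseline."""
    return cpu_hours / float(SPEEDUP), cpu_cost * float(COST_RATIO)


if __name__ == "__main__":
    print("implied hourly-rate ratio:", implied_rate_ratio(SPEEDUP, COST_RATIO))
    print("GPU build (hours, cost):", gpu_build(cpu_hours=8, cpu_cost=400))
```

With the made-up baseline, an 8-hour CPU-only build costing 400 units becomes a 0.8-hour build costing 100 units. Under the same assumption, the GPU path could cost up to 2.5x more per hour and still come out at a quarter of the total, because it finishes in a tenth of the time.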

By making NVIDIA cuVS the default in OpenSearch Serverless, AWS customers get a much faster path from raw data to production-ready AI retrieval infrastructure, with serverless scaling that reduces operational overhead when workloads are idle.

AWS Achieves NVIDIA Exemplar Cloud Status for GB300 Training Performance

AWS has achieved NVIDIA Exemplar Cloud status on NVIDIA GB300 for training workloads.

For more details please read the original article at NVIDIA Blog.
