Get started with the Claude apps gateway for Google Cloud
Anthropic has released the Claude apps gateway for Google Cloud, a self-hosted service that centralizes governance for Claude Code deployments across organizations. The gateway sits between local Claude Code clients and Google Cloud, handling identity verification, policy enforcement, cost controls, and usage telemetry without requiring developers to manage cloud credentials or API keys directly.
Key Takeaways
- The Claude apps gateway eliminates per-developer credential management by routing authentication through your identity provider and issuing short-lived sessions instead of exposing API keys or service-account credentials.
- Centralized policy enforcement through gateway.yaml ensures that role-based access control rules are server-side and updated across your entire fleet within an hour, preventing local configuration overrides.
- Built-in spend controls allow you to set daily, weekly, or monthly token caps per user, group, or organization, with metering against a Cloud SQL ledger to prevent runaway usage costs.
- Verified usage attribution links every token metric to the user's email and group membership from the session JWT, enabling accurate cost allocation and compliance reporting without relying on spoofable client-side attributes.
- The stateless Cloud Run deployment keeps all inference within your GCP project, maintaining your Data Processing Agreement, quota management, and billing scope while supporting failover across regions or endpoints.
Stats & Key Facts
- A single Cloud Run service identity routes all inference calls, eliminating per-developer cloud credential overhead
- Policy updates propagate to the entire developer fleet within one hour
- Token caps can be set per day, week, or month

Why the Gateway Solves Enterprise Adoption Friction
Direct integration of Claude Code with Google Cloud works for small teams but creates operational headaches at scale.
- Individual developers can already point CLAUDE_CODE_USE_VERTEX=1 at a GCP project with the aiplatform.user role, but this approach requires managing per-developer cloud credentials across your organization.
- Enterprise rollout demands pushing managed-settings.json to every laptop via MDM, verifying usage attribution per developer, and enforcing consistent spend controls - all manual and fragmented.
- The gateway consolidates these governance layers into a single self-hosted service that sits between Claude Code clients and Google Cloud, replacing credential distribution with centralized identity and policy enforcement.
The Claude apps gateway is shipped with the same claude binary and acts as a transparent proxy, handling authentication, authorization, cost metering, and telemetry attribution server-side. This shift moves governance from the developer's laptop to your infrastructure, making onboarding, offboarding, policy changes, and spend management both easier and more secure.
Identity and Access Management
The gateway integrates with your existing identity provider to eliminate sensitive credentials on developer machines.
- Login requests route through your identity provider - Google Workspace or any OpenID Connect (OIDC) provider - and the gateway exchanges the login token for a short-lived session.
- No sensitive data lands on developer laptops: no service-account keys, API keys, or ANTHROPIC_VERTEX_PROJECT_ID environment variables are needed.
- Onboarding is as simple as adding a user to an IdP group; offboarding means removing them, after which their next session refresh fails.
- The gateway validates its own session bearer token locally; Google Workspace is only contacted during sign-in and token refresh, reducing dependency on external services for ongoing request processing.
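To illustrate what "validates its own session bearer token locally" can mean in practice, here is a minimal Python sketch of a short-lived signed token that is checked without any network call. The token layout, the HMAC-SHA256 scheme, and the names issue_session and validate_session are assumptions made for illustration; the article does not describe the gateway's actual token format or signing method.

```python
import base64
import hashlib
import hmac
import json

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def _sign(body: str, key: bytes) -> str:
    return _b64(hmac.new(key, body.encode(), hashlib.sha256).digest())

def issue_session(claims: dict, key: bytes, ttl_seconds: int, now: float) -> str:
    """Hypothetical: mint a short-lived token after the IdP login succeeds."""
    payload = dict(claims, exp=now + ttl_seconds)
    body = _b64(json.dumps(payload, sort_keys=True).encode())
    return f"{body}.{_sign(body, key)}"

def validate_session(token: str, key: bytes, now: float):
    """Return the claims if signature and expiry check out, else None.

    No identity-provider call is made here; only the local key is needed.
    """
    try:
        body, sig = token.split(".")
    except ValueError:
        return None
    if not hmac.compare_digest(sig.encode(), _sign(body, key).encode()):
        return None
    claims = json.loads(_unb64(body))
    return claims if claims["exp"] > now else None
```

Under this sketch, an expired or tampered token simply returns None, which is the point at which a client would need to refresh its session through the identity provider.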
Centralized Policy Enforcement
Role-based access control rules are defined once, server-side, and cannot be overridden locally.
- RBAC rules live in a single gateway.yaml file, resolved per user group and enforced server-side on every /v1/messages call.
- The gateway re-checks availableModels for each request, so editing local managed-settings.json files has no effect - policy is truly centralized.
- Policy updates reach your entire developer fleet within one hour, ensuring consistent enforcement across the organization without requiring endpoint restarts or configuration pushes.
- Groups defined in your IdP determine which models and API features each developer can access, simplifying compliance and preventing unauthorized usage patterns.
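The per-request check described above can be sketched in a few lines of Python. This is an illustration only: the POLICY table, the group and model names, and the rule that a user receives the union of every group's models are assumptions, not the actual gateway.yaml schema or resolution logic.

```python
from dataclasses import dataclass

# Hypothetical policy table: IdP group -> models that group may call.
# The real gateway.yaml schema may look different.
POLICY = {
    "engineering": {"model-large", "model-small"},
    "contractors": {"model-small"},
}

@dataclass(frozen=True)
class Session:
    email: str
    groups: tuple

def allowed_models(session: Session, policy: dict) -> set:
    """Union of the models granted to every group the user belongs to."""
    models = set()
    for group in session.groups:
        models |= policy.get(group, set())
    return models

def authorize(session: Session, requested_model: str, policy: dict = POLICY) -> int:
    """Return an HTTP-style status: 200 if the model is allowed, else 403."""
    return 200 if requested_model in allowed_models(session, policy) else 403
```

Because the decision uses only the server-side policy and the verified session, nothing the client sends about its own settings enters the check.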
Usage Attribution and Telemetry
Every token metric includes verified user identity and group membership, enabling accurate billing and compliance tracking.
- Each claude_code.token.usage metric carries the verified email and groups from the session JWT (signed session token), not client-side OTEL_RESOURCE_ATTRIBUTES, which can be spoofed.
- The gateway ships attributed metrics over OTLP/HTTP to a collector you operate - Cloud Monitoring, Grafana, Datadog, or any compatible observability platform.
- Telemetry includes complete session context, allowing you to link usage to specific users and teams for accurate cost allocation and trend analysis.
- Because identity is attached server-side from the signed session token rather than taken from the client, a misbehaving client cannot forge who consumed which tokens.
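As a rough sketch of the attribution idea, the Python function below lets verified session claims overwrite anything the client supplied. The attribute keys user.email and user.groups and the function name attribute_metric are hypothetical; the article does not specify the attribute names the gateway emits.

```python
def attribute_metric(client_attrs: dict, session_claims: dict) -> dict:
    """Build metric attributes, letting verified claims win over client input."""
    attrs = dict(client_attrs)  # start from whatever the client sent
    # Overwrite identity fields with values from the verified session.
    attrs["user.email"] = session_claims["email"]
    attrs["user.groups"] = sorted(session_claims.get("groups", []))
    return attrs
```

A client that sets its own user.email in its resource attributes would see that value replaced, while unrelated attributes pass through unchanged.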
Spend Controls and Cost Management
The gateway implements enforceable token budgets to prevent runaway usage and surprise bills.
- Set daily, weekly, or monthly token caps per user, group, or entire organization via the admin API.
- The gateway meters tokens against a Cloud SQL ledger in real time and returns HTTP 429 (too many requests) when a cap is reached.
- Costs are tracked at list price and function as a runaway-usage guardrail; committed-use discounts and negotiated rates do not appear in the ledger.
- Spend limits take effect without restarting any services, so you can adjust budgets as usage patterns shift.
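The metering behavior above can be approximated with a small Python sketch. It uses an in-memory dictionary as a stand-in for the Cloud SQL ledger and handles only a daily cap; the class name SpendLedger and its methods are hypothetical and are not the gateway's admin API.

```python
from collections import defaultdict
from datetime import date

class SpendLedger:
    """In-memory stand-in for the ledger; the real gateway uses Cloud SQL."""

    def __init__(self, daily_caps: dict):
        self.daily_caps = daily_caps      # subject -> tokens allowed per day
        self.used = defaultdict(int)      # (subject, day) -> tokens used

    def check(self, subject: str, day: date) -> int:
        """Return 429 if the subject has reached its cap, else 200."""
        cap = self.daily_caps.get(subject)
        if cap is not None and self.used[(subject, day)] >= cap:
            return 429
        return 200

    def record(self, subject: str, day: date, tokens: int) -> None:
        """Add the tokens consumed by a completed request to the ledger."""
        self.used[(subject, day)] += tokens
```

Keying usage by (subject, day) means a new day starts from zero without any reset job; weekly or monthly caps would use a different period key.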
Architecture and Routing
The gateway is a stateless service that maintains your inference within your GCP project and supports failover scenarios.
- A developer's local or deployed Claude process sends inference requests to the gateway over HTTPS; the gateway is a stateless container running on Cloud Run.
- All calls to Agent Platform go out under a single Cloud Run service identity, keeping inference inside your GCP project and maintaining your Data Processing Agreement scope.
- The gateway supports region routing: set region: global for Agent Platform's global endpoint, or add multiple upstreams entries to fail over on 5xx, 429, or timeout errors in list order.
- Cloud SQL holds device-code sign-in state and the spend ledger; quota, billing, and data residency all remain unchanged from direct integrations.
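The failover rule (try upstreams in list order, moving on after a 5xx, a 429, or a timeout) can be sketched in Python as follows. The send callable, the UpstreamTimeout exception, and the 503 fallback status are assumptions for illustration; the gateway's real implementation is not described in the article.

```python
from typing import Callable, Optional, Sequence, Tuple

class UpstreamTimeout(Exception):
    """Raised by the hypothetical send callable when an upstream times out."""

def should_fail_over(status: int) -> bool:
    """True for the statuses that trigger failover: 429 and any 5xx."""
    return status == 429 or 500 <= status <= 599

def call_with_failover(
    upstreams: Sequence[str], send: Callable[[str], int]
) -> Tuple[Optional[str], int]:
    """Try upstreams in list order; return (upstream_used, status)."""
    last_status = 503  # reported if every upstream times out
    for upstream in upstreams:
        try:
            status = send(upstream)
        except UpstreamTimeout:
            continue  # a timeout counts as a failure; try the next entry
        if not should_fail_over(status):
            return upstream, status
        last_status = status
    return None, last_status
```

Note that a 4xx other than 429 is returned to the caller as-is, since retrying a malformed or unauthorized request against another upstream would not help.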
Getting Started on Google Cloud
Deployment involves provisioning a minimal GCP foundation and configuring the gateway to authenticate as a Cloud Run service account.
- Enable the Agent Platform, Cloud SQL, and Secret Manager APIs in your GCP project.
- Create a claude-gateway service account with the roles/aiplatform.user role to authenticate to Agent Platform.
- Stand up a small Cloud SQL Postgres database instance to store device-code sign-in state and the spend ledger.
- Full step-by-step instructions, all gcloud commands, and the complete gateway.yaml reference are available in the Claude apps gateway on Google Cloud documentation.
The gateway authenticates to Agent Platform using the Cloud Run service identity, eliminating the need for individual service-account key rotation. This design keeps credential management minimal and aligns with Google Cloud security best practices for stateless services.
Frequently Asked Questions
Do developers need to manage cloud credentials with the gateway?
No. Developers authenticate through your identity provider at login, and the gateway issues a short-lived session token. No API keys, service-account keys, or GCP project IDs are stored on developer machines.
How quickly do policy changes take effect across the organization?
Policy updates propagate to the entire developer fleet within one hour. Because the gateway re-checks rules on every API call, a change to gateway.yaml is enforced as soon as it has propagated, without requiring endpoint restarts.
Can developers bypass spend limits by editing local configuration files?
No. Spend caps are enforced server-side in the gateway against a Cloud SQL ledger. Local configuration changes have no effect on token budgets or available models, which are re-verified on every request.
Does the gateway keep inference within my GCP project?
Yes. All calls to Agent Platform go out under a single Cloud Run service identity, and inference stays within your GCP project. Quota, billing, data residency, and your Data Processing Agreement remain unchanged.
What identity providers does the gateway support?
The gateway supports Google Workspace and any OpenID Connect (OIDC) compliant identity provider, allowing you to integrate with your existing authentication infrastructure.
The Claude apps gateway transforms Claude Code from a single-developer tool into an enterprise-grade service by moving governance from individual machines to a centralized, self-hosted proxy.