
Google Cloud's NVIDIA GPU support in Cloud Run combines pay-per-second billing with automatic scaling, with the aim of lowering cost and improving performance for AI workloads.
June 2, 2025
Google Cloud has announced that NVIDIA GPU support for Cloud Run is now generally available (GA). This update brings significant improvements to the serverless runtime, making it a powerful option for running AI workloads. Here’s what changed technically and why it matters to practitioners:
To enable a GPU, add --gpu 1 to your command-line deployment or enable the "GPU" checkbox in the console. No quota requests are required, which removes a significant barrier to entry.
With general availability, Cloud Run with GPU support is now covered by Google Cloud's Service Level Agreement (SLA), providing reliability and uptime guarantees. This makes it a robust choice for production environments.
"Serverless GPU acceleration represents a major advancement in making cutting-edge AI computing more accessible. With seamless access to NVIDIA L4 GPUs, developers can now bring AI applications to production faster and more cost-effectively than ever before," said Dave Salvator, Director of Accelerated Computing Products at NVIDIA.
To start using GPUs with Cloud Run, simply use the --gpu 1 flag in your deployment command or enable the "GPU" checkbox in the Google Cloud Console. No quota requests are necessary, making it easy to get up and running.
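As a rough illustration of the command-line route, the sketch below assembles a gcloud run deploy invocation that includes the --gpu 1 flag mentioned above. The helper name build_deploy_command, the service name, the image path, and the region are all hypothetical placeholders, not values from the announcement, and a real deployment may need further settings (for example CPU and memory sizing) that this article does not cover. The script only prints the command; it does not run it.

```python
import shlex


def build_deploy_command(service: str, image: str, region: str, gpu_count: int = 1) -> list[str]:
    """Assemble (but do not run) a Cloud Run deploy command with a GPU attached.

    Only the --gpu flag comes from the article; the other arguments are
    placeholder inputs chosen for illustration.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1 to request a GPU")
    return [
        "gcloud", "run", "deploy", service,
        "--image", image,
        "--region", region,
        "--gpu", str(gpu_count),  # the flag the article describes
    ]


if __name__ == "__main__":
    # Hypothetical service, image, and region; substitute your own.
    cmd = build_deploy_command(
        service="my-inference-service",
        image="us-docker.pkg.dev/my-project/my-repo/my-image:latest",
        region="us-central1",
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Building the argument list separately from executing it makes the command easy to review or log before anything is deployed.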
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.