Not all GPU clouds are equal. AWS charges enterprise rates for enterprise problems you may not have. Lambda Cloud offers the simplest pricing but no spot market. CoreWeave runs rings around hyperscalers for pure GPU throughput. RunPod and Vast.ai push prices to near-hardware cost. This guide breaks down each provider with live pricing so you can pick the right one for your workload — not just the default.
All pricing is pulled live from provider APIs every 30 minutes. Price ranges reflect current spot and on-demand availability.
| Provider | GPUs Available | Price Range | Spot Support | Best For |
|---|---|---|---|---|
| AWS | H100, A100, V100, T4 | $0.13–$98.32/hr | ✓ (EC2 Spot) | Enterprise & compliance |
| GCP | H100, A100, L4, T4, V100 | $0.16–$97.34/hr | ✓ (Preemptible) | Vertex AI & TPU workloads |
| Azure | H100, A100, V100 | $0.11–$29.92/hr | ✓ (Spot VMs) | Microsoft-stack enterprises |
| Lambda Cloud | H100, A100, A10G | $0.60–$2.50/hr | ✗ (On-demand only) | Researchers & flat pricing |
| CoreWeave | H100, A100, A40, RTX 4090 | $1.23–$4.76/hr | ✓ (Spot) | Inference at scale |
| RunPod | H100, A100, RTX 4090, L40S | $0.19–$3.79/hr | ✓ (Community) | Price-sensitive experiments |
| Vast.ai | H100, A100, RTX 3090/4090 | $0.02–$4.11/hr | ✓ (Bid) | Lowest cost batch jobs |
AWS is the default choice for enterprises — and the most expensive for pure GPU throughput. GPU instances run on EC2 under the p5 (H100), p4d/p4de (A100), and p3 (V100) families. SageMaker wraps these with managed ML infrastructure at an additional premium.
AWS GPU spot instances (EC2 Spot) offer meaningful discounts but require interruption handling. For production inference, most teams end up on On-Demand or Savings Plans — which narrows the price advantage significantly.
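Interruption handling usually comes down to checkpointing often enough that a reclaimed instance loses little work. A minimal sketch of a resumable loop follows. The `should_stop` callback, the JSON checkpoint format, and the function names are all assumptions for illustration; how an interruption notice actually reaches your process depends on the provider.

```python
import json
import os
import tempfile
from typing import Callable


def save_checkpoint(path: str, step: int) -> None:
    """Write the checkpoint atomically so an interruption mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"step": step}, f)
    os.replace(tmp_path, path)  # atomic rename on the same filesystem


def run_with_checkpoints(
    total_steps: int,
    checkpoint_path: str,
    should_stop: Callable[[], bool],
) -> int:
    """Run a resumable loop and return the number of completed steps.

    `should_stop` is a hypothetical hook: wire it to whatever interruption
    signal your provider exposes.
    """
    step = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            step = json.load(f)["step"]  # resume where the last run stopped

    while step < total_steps:
        if should_stop():
            break  # exit cleanly; progress is already on disk
        # ... one unit of training or inference work would go here ...
        step += 1
        save_checkpoint(checkpoint_path, step)
    return step
```

Checkpointing after every step keeps the sketch simple; a real job would likely checkpoint less often and trade a little lost work for lower I/O overhead.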
| GPU | Spot Price | On-Demand | Savings |
|---|---|---|---|
| NVIDIA H100 SXM 80GB | $31.30/hr | $98.32/hr | Save 68% |
| NVIDIA A100 40GB | $8.10/hr | $19.22/hr | Save 58% |
| NVIDIA A100 SXM 80GB | $13.07/hr | $32.77/hr | Save 60% |
| NVIDIA L40S 48GB | $1.82/hr | $3.22/hr | Save 43% |
| NVIDIA A10G 24GB | $0.68/hr | $1.62/hr | Save 58% |
| NVIDIA L4 24GB | $0.13/hr | $0.98/hr | Save 87% |
| NVIDIA T4 16GB | $0.67/hr | $4.35/hr | Save 85% |
| NVIDIA V100 16GB | $4.95/hr | $12.24/hr | Save 60% |
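The Savings column appears to be the spot discount relative to on-demand, rounded to a whole percent. That formula is inferred from the rows above rather than stated by the source, and `savings_pct` is a name made up for this sketch. Tables built from unrounded live prices may not reproduce exactly from the two-decimal figures shown.

```python
def savings_pct(spot_per_hr: float, on_demand_per_hr: float) -> int:
    """Percentage saved by using spot instead of on-demand, as a whole number.

    A negative result means spot is currently priced above on-demand.
    """
    if on_demand_per_hr <= 0:
        raise ValueError("on-demand price must be positive")
    return round((1 - spot_per_hr / on_demand_per_hr) * 100)


# Rows from the AWS table above: (spot, on-demand)
print(savings_pct(31.30, 98.32))  # H100 SXM -> 68
print(savings_pct(0.13, 0.98))    # L4 -> 87
```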
Choose AWS if your team already runs on AWS infrastructure, you need compliance certifications, or your SLAs require hyperscaler-grade reliability guarantees. If you're starting greenfield, there are cheaper options.
GCP is the strongest hyperscaler for ML workloads due to its native integration with Vertex AI, TensorFlow ecosystem, and proprietary TPU v5 access. For GPU specifically, GCP offers A100s and H100s via Compute Engine, and preemptible (spot) instances at meaningful discounts.
GCP's managed ML pipeline (Vertex AI Pipelines, Model Registry, Experiments) makes it the hyperscaler choice for teams standardizing their MLOps stack. Pure compute price-per-GPU is competitive with AWS but still trails specialized GPU clouds.
| GPU | Spot Price | On-Demand | Savings |
|---|---|---|---|
| NVIDIA H100 SXM 80GB | $28.96/hr | $97.34/hr | Save 70% |
| NVIDIA A100 40GB | $4.49/hr | $14.69/hr | Save 69% |
| NVIDIA A100 SXM 80GB | $1.52/hr | $5.01/hr | Save 70% |
| NVIDIA L4 24GB | $0.28/hr | $0.71/hr | Save 61% |
| NVIDIA T4 16GB | $0.16/hr | $0.54/hr | Save 71% |
| NVIDIA V100 16GB | $0.89/hr | $2.95/hr | Save 70% |
Choose GCP if you need TPUs, you're building on Vertex AI, or your data stack lives in BigQuery. For pure GPU training with no Google ecosystem lock-in, specialized providers are cheaper.
Azure's GPU offering targets the enterprise market with NDv4 (A100) and NDv5 (H100) series. Azure Machine Learning (AML) provides end-to-end MLOps tooling, and Azure OpenAI Service gives enterprise access to GPT-4 models — making Azure the default for Microsoft-stack organizations.
Azure Spot VMs offer GPU discounts, though availability in premium GPU SKUs is often constrained. For organizations already on M365, Azure AD, and the Microsoft security stack, Azure reduces operational friction even if raw GPU pricing isn't the most competitive.
| GPU | Spot Price | On-Demand | Savings |
|---|---|---|---|
| NVIDIA H100 SXM 80GB | $4.40/hr | $13.96/hr | Save 68% |
| NVIDIA A100 SXM 80GB | $7.29/hr | $29.92/hr | Save 76% |
| NVIDIA T4 16GB | $0.11/hr | $0.53/hr | Save 80% |
| NVIDIA V100 16GB | $0.70/hr | $3.37/hr | Save 79% |
Choose Azure if your org runs on Microsoft infrastructure and Azure AD SSO matters more than GPU price. If you're greenfield and just need cheap GPUs, Azure is rarely the answer.
Lambda Cloud is purpose-built for ML teams that want GPU access with zero overhead. No spot market, no complex pricing tiers — just flat on-demand rates that are substantially cheaper than AWS/GCP/Azure for equivalent hardware.
Lambda's H100 and A100 pricing regularly undercuts the hyperscalers by 40-60% on on-demand rates. No egress fees, no storage markups, no support tier upsell. This simplicity attracts researchers and ML startups that find hyperscaler complexity a tax on their time.
| GPU | Spot Price | On-Demand | Savings |
|---|---|---|---|
| NVIDIA H100 SXM 80GB | $2.50/hr | $2.49/hr | — |
| NVIDIA A100 SXM 80GB | $1.29/hr | $1.29/hr | — |
| NVIDIA A10G 24GB | $0.60/hr | $0.60/hr | — |
Choose Lambda Cloud if you want cheap, reliable H100/A100 access without AWS/GCP complexity. Everything is on-demand, so there are no spot interruptions to design around. Lambda is the "just give me a GPU" answer.
CoreWeave is the GPU cloud designed by ML engineers for ML engineers. NVIDIA-partnered, Kubernetes-native, and built exclusively around GPU workloads since 2019. CoreWeave offers the widest selection of current-generation NVIDIA hardware — H100, H200, A100, A40, RTX 4090 — with both spot and reserved options.
At scale, CoreWeave competes directly with hyperscalers on raw GPU throughput while offering better performance-per-dollar. Its interconnect and NVLink topology are optimized for large model training. It is best-in-class for companies running inference at serious scale.
| GPU | Spot Price | On-Demand | Savings |
|---|---|---|---|
| NVIDIA H100 SXM 80GB | $2.08/hr | $4.76/hr | Save 56% |
| NVIDIA A100 SXM 80GB | $1.23/hr | $2.21/hr | Save 44% |
| NVIDIA L40S 48GB | $1.85/hr | $1.84/hr | Spot costs 1% more |
| NVIDIA A40 48GB | $1.28/hr | $1.28/hr | — |
Choose CoreWeave if you're running production inference or large-scale training and need the best GPU-per-dollar with reliability guarantees. CoreWeave is where serious ML infrastructure teams land.
RunPod operates a two-tier model: Secure Cloud (their own datacenter infrastructure) and Community Cloud (idle GPUs from independent hosts). Community Cloud offers the lowest spot prices in the market but without SLA guarantees. Secure Cloud provides more reliability at a modest premium.
RunPod's developer experience is excellent — one-click container deployments, serverless GPU endpoints, and a template library covering popular models (Stable Diffusion, LLaMA, Whisper). The tradeoff: community pods can disappear, and uptime SLAs aren't contractual.
| GPU | Spot Price | On-Demand | Savings |
|---|---|---|---|
| NVIDIA H200 141GB | $0.50/hr | $3.79/hr | Save 87% |
| NVIDIA H100 PCIe 80GB | $1.99/hr | $2.89/hr | Save 31% |
| NVIDIA H100 SXM 80GB | $2.59/hr | $3.19/hr | Save 19% |
| NVIDIA A100 PCIe 80GB | $1.19/hr | $1.39/hr | Save 14% |
| NVIDIA A100 SXM 80GB | $1.00/hr | $1.35/hr | Save 26% |
| NVIDIA L40S 48GB | $0.79/hr | $0.86/hr | Save 8% |
| NVIDIA L40 48GB | $0.69/hr | $0.82/hr | Save 16% |
| NVIDIA A10G 24GB | $0.28/hr | $0.37/hr | Save 24% |
| NVIDIA L4 24GB | $0.44/hr | $0.39/hr | Spot costs 13% more |
| NVIDIA V100 16GB | $0.19/hr | $0.26/hr | Save 26% |
| NVIDIA RTX 4090 24GB | $0.34/hr | $0.69/hr | Save 51% |
| NVIDIA RTX 3090 24GB | $0.22/hr | $0.46/hr | Save 52% |
Choose RunPod if you're experimenting, fine-tuning, or running batch jobs where interruption is tolerable. RunPod's Secure Cloud works for light production use; Community Cloud is for dev/test and cost minimization.
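Whether interruptible capacity pays off depends on how much work gets redone after each interruption. A rough sketch, assuming rework can be modeled as a single fraction of extra compute time (a simplification chosen for illustration, not a model published by any provider); both function names are hypothetical.

```python
def effective_spot_price(spot_per_hr: float, rework_fraction: float) -> float:
    """Spot cost per hour of useful work when some compute is spent redoing lost work."""
    if rework_fraction < 0:
        raise ValueError("rework_fraction must be non-negative")
    return spot_per_hr * (1 + rework_fraction)


def break_even_rework(spot_per_hr: float, on_demand_per_hr: float) -> float:
    """Rework fraction at which spot and on-demand cost the same per useful hour."""
    if spot_per_hr <= 0:
        raise ValueError("spot price must be positive")
    return on_demand_per_hr / spot_per_hr - 1


# H100 SXM prices from the RunPod table above: spot $2.59/hr, on-demand $3.19/hr
limit = break_even_rework(2.59, 3.19)
print(f"Spot stays cheaper while rework is below {limit:.0%} of useful compute")
```

With a narrow spot discount like this one, even moderate rework erases the saving; wider discounts leave far more headroom.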
Vast.ai is a decentralized GPU marketplace where independent hosts rent out idle compute. This model pushes prices to near-hardware cost — you'll routinely find RTX 4090 and A100 instances at prices that don't exist anywhere else. The tradeoff is that you're renting from individuals, not datacenters.
Vast.ai's bidding system lets you set your price target and receive instances when matching supply appears. For overnight batch jobs, exploratory training runs, and cost-minimizing inference, Vast.ai is in a different price tier from every other option.
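The price-target idea can be sketched as a simple filter over available offers. This is not the Vast.ai API: the `Offer` type, its fields, the host IDs, and the sample prices below are all hypothetical and exist only to illustrate the matching logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """A hypothetical marketplace listing (not a real Vast.ai data type)."""
    host_id: str
    gpu: str
    price_per_hr: float


def match_offers(offers: list[Offer], gpu: str, max_price_per_hr: float) -> list[Offer]:
    """Return offers for `gpu` priced at or below the bid, cheapest first."""
    matches = [
        o for o in offers
        if o.gpu == gpu and o.price_per_hr <= max_price_per_hr
    ]
    return sorted(matches, key=lambda o: o.price_per_hr)


# Made-up listings for illustration only
offers = [
    Offer("host-a", "RTX 4090", 0.21),
    Offer("host-b", "RTX 4090", 0.14),
    Offer("host-c", "A100 SXM", 0.45),
    Offer("host-d", "RTX 4090", 0.17),
]

for offer in match_offers(offers, "RTX 4090", max_price_per_hr=0.18):
    print(offer.host_id, offer.price_per_hr)
```

An empty result corresponds to the "wait until matching supply appears" case: the bid stays open and the filter is re-run as new listings arrive.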
| GPU | Spot Price | On-Demand | Savings |
|---|---|---|---|
| NVIDIA H200 141GB | $1.52/hr | $2.13/hr | Save 29% |
| NVIDIA H100 PCIe 80GB | $2.13/hr | $2.99/hr | Save 29% |
| NVIDIA H100 SXM 80GB | $2.93/hr | $4.11/hr | Save 29% |
| NVIDIA A100 40GB | $0.68/hr | $1.10/hr | Save 38% |
| NVIDIA A100 PCIe 80GB | $0.76/hr | $1.06/hr | Save 29% |
| NVIDIA A100 SXM 80GB | $0.43/hr | $0.60/hr | Save 29% |
| NVIDIA L40S 48GB | $0.85/hr | $1.19/hr | Save 29% |
| NVIDIA L4 24GB | $0.18/hr | $0.30/hr | Save 40% |
| NVIDIA T4 16GB | $0.07/hr | $0.15/hr | Save 53% |
| NVIDIA V100 16GB | $0.02/hr | $0.03/hr | Save 28% |
| NVIDIA RTX 4090 24GB | $0.13/hr | $0.19/hr | Save 29% |
| NVIDIA RTX 3090 24GB | $0.05/hr | $0.07/hr | Save 28% |
Choose Vast.ai if you want the absolute lowest price and can tolerate variable reliability. Fine-tuning, exploratory runs, and non-critical batch jobs are the sweet spot. For production inference, look at CoreWeave or RunPod Secure.
The right provider depends on your workload type, reliability requirements, and existing infrastructure. Here's the decision tree most ML teams end up using:
Need compliance certifications? → AWS, Azure, or GCP. The hyperscalers offer the broadest compliance coverage, including HIPAA, FedRAMP, and SOC 2 Type II. Pick AWS if you're already in AWS, Azure if you're Microsoft-stack, and GCP if you're using Vertex AI or TPUs.
Need dependable GPUs without hyperscaler overhead? → Lambda Cloud or CoreWeave. Lambda for simplicity and flat pricing. CoreWeave for scale, GPU variety, and best inference performance. Both beat hyperscalers by 40-60% on on-demand rates.
Can tolerate interruptions and want the lowest price? → RunPod or Vast.ai. RunPod Secure Cloud for moderate reliability. Vast.ai for maximum savings on batch jobs where an interruption just means restarting. Expect 50-80% savings vs. on-demand hyperscalers.
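The three branches above can be written as a small function. The two boolean inputs are an interpretation of the branch conditions (compliance need first, then interruption tolerance), and `recommend_providers` is a hypothetical name; treat this as a starting point rather than a complete selection procedure.

```python
def recommend_providers(needs_compliance: bool, tolerates_interruption: bool) -> list[str]:
    """Map the guide's three decision branches to a provider shortlist."""
    if needs_compliance:
        # Compliance requirements dominate every other consideration
        return ["AWS", "Azure", "GCP"]
    if tolerates_interruption:
        # Interruptible work can chase the lowest marketplace prices
        return ["RunPod", "Vast.ai"]
    # Dependable capacity without hyperscaler overhead
    return ["Lambda Cloud", "CoreWeave"]


print(recommend_providers(needs_compliance=False, tolerates_interruption=True))
```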
Whichever provider you choose, monitor actual spot prices before you commit. Provider pricing changes frequently, and the difference between the cheapest and most expensive option for the same GPU can be 5-10× on any given day. RoofRun tracks all 7 providers every 30 minutes so you always have current data.
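When comparing the same GPU across providers, it helps to normalize to a per-GPU hourly price first, because some listings may cover a whole multi-GPU instance. The sketch below uses the H100 SXM on-demand figures listed earlier in this guide. The GPU counts are assumptions, not stated in the source: the AWS figure is treated as an 8-GPU instance and the others as single-GPU prices. Function names are hypothetical.

```python
def per_gpu_price(price_per_hr: float, gpus_per_instance: int) -> float:
    """Normalize a listed hourly price to a single GPU."""
    if gpus_per_instance < 1:
        raise ValueError("gpus_per_instance must be at least 1")
    return price_per_hr / gpus_per_instance


def price_spread(per_gpu: dict[str, float]) -> tuple[str, str, float]:
    """Return (cheapest provider, priciest provider, priciest/cheapest ratio)."""
    cheapest = min(per_gpu, key=per_gpu.get)
    priciest = max(per_gpu, key=per_gpu.get)
    return cheapest, priciest, per_gpu[priciest] / per_gpu[cheapest]


# (listed on-demand $/hr, assumed GPUs per listing); the counts are assumptions
h100_listings = {
    "AWS": (98.32, 8),
    "Lambda Cloud": (2.49, 1),
    "CoreWeave": (4.76, 1),
    "RunPod": (3.19, 1),
    "Vast.ai": (4.11, 1),
}

h100_per_gpu = {name: per_gpu_price(p, n) for name, (p, n) in h100_listings.items()}
cheapest, priciest, ratio = price_spread(h100_per_gpu)
print(f"{cheapest} is cheapest; {priciest} costs {ratio:.1f}x as much per GPU")
```

If the assumed GPU counts are wrong for a given listing, the ratio changes accordingly, so confirm the instance shape before relying on the comparison.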