Supported Cloud Instances
Learn about the instance types we support
Jump to a cloud provider:
- Amazon Web Services (AWS)
- Google Cloud Platform (GCP)
- Vultr Cloud Servers
- Oracle Distributed Cloud
- Microsoft Azure
- DigitalOcean
- Lambda
We offer a range of instance types designed to handle a variety of machine learning workloads. These cloud instances vary in their CPU, RAM, and GPU configurations, allowing you to strike the right balance of performance and cost for your use case.
The instances listed on this page are available on demand — they can be provisioned immediately when you create a nodepool. Clarifai also offers the latest NVIDIA, AMD, and Google accelerators (including B300, B200, H100, MI355X, and TPU v7) that require reservations and are available on request.
- See the pricing page to learn more about pricing for each instance type.
- Contact us to access GPU types not listed here, including instances that require dedicated provisioning.
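Before provisioning a nodepool, it can help to project an instance's hourly rate over expected usage. The sketch below is a back-of-envelope calculation using the g7e.8xlarge rate listed on this page; the helper function is illustrative, not part of any SDK.

```python
def projected_cost(price_per_hour: float, hours_per_day: float, days: int) -> float:
    """Project total on-demand cost for a single instance, in USD."""
    return round(price_per_hour * hours_per_day * days, 2)

# g7e.8xlarge at $5.256/hr, running around the clock for 30 days
monthly = projected_cost(5.256, 24, 30)
print(f"${monthly}")  # → $3784.32
```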
Amazon Web Services (AWS) Instances
G7E Instances
The AWS G7E series provides the latest generation of high-performance NVIDIA RTX PRO 6000 GPUs, scaling from single-GPU setups to large multi-GPU clusters. These instances are designed for the most demanding deep learning training and large-scale inference workloads.
| Instance Type | Region | GPU | GPU Memory | CPU | Price/hr |
|---|---|---|---|---|---|
| g7e.48xlarge | us-east-1 | 8x NVIDIA-RTX-PRO-6000 (96Gi) | 768Gi | 191.1 cores (1,886.99Gi) | $33.156 |
| g7e.24xlarge | us-east-1 | 4x NVIDIA-RTX-PRO-6000 (96Gi) | 384Gi | 95.3 cores (939.79Gi) | $16.524 |
| g7e.12xlarge | us-east-1 | 2x NVIDIA-RTX-PRO-6000 (96Gi) | 192Gi | 47.4 cores (466.19Gi) | $10.368 |
| g7e.8xlarge | us-east-1 | 1x NVIDIA-RTX-PRO-6000 (96Gi) | 96Gi | 31.5 cores (230.74Gi) | $5.256 |
| g7e.4xlarge | us-east-1 | 1x NVIDIA-RTX-PRO-6000 (96Gi) | 96Gi | 15.5 cores (112.34Gi) | $5.004 |
Key Features
- NVIDIA RTX PRO 6000 GPU — 96 GiB of GPU memory per card, delivering high memory capacity for very large models and long-context inference.
- Scalable multi-GPU configurations — From a single GPU up to 8 GPUs (768 GiB combined GPU memory), suitable for distributed training of frontier models.
- High CPU & RAM — Up to 191 cores and ~1.9 TiB RAM in the 8-GPU configuration, ensuring data pipelines match GPU throughput.
Example Use Cases
- Fine-tuning or running inference on large language models that require more than 48 GiB of GPU memory per card (e.g., 70B+ parameter models on a single GPU).
- Multi-GPU distributed training of foundation models where high per-GPU memory is critical.
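A rough rule of thumb for whether a model's weights fit on a single card is parameter count times bytes per parameter, converted to GiB. Activations and KV cache come on top, so treat this as a lower bound, not a sizing guarantee.

```python
def weight_memory_gib(params_billions: float, bytes_per_param: float) -> float:
    """Lower-bound GPU memory (GiB) needed just to hold model weights."""
    return params_billions * 1e9 * bytes_per_param / 2**30

# A 70B-parameter model:
fp16 = weight_memory_gib(70, 2)  # ~130.4 GiB: exceeds one 96 GiB RTX PRO 6000
int8 = weight_memory_gib(70, 1)  # ~65.2 GiB: fits on a single 96 GiB card
```

This is why 70B-class models typically need either quantization or a multi-GPU configuration such as g7e.12xlarge and above.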
The newest accelerators — including NVIDIA B300, B200, H100, AMD MI355X, MI300X, and Google TPU v7 — are available with reservations and are not automatically provisioned on demand. Contact us to discuss availability and reserve capacity.
G6 Instances
The AWS G6 series introduces next-generation NVIDIA GPUs for the most demanding machine learning and simulation workloads. These instances scale from single-GPU mid-tier setups to multi-GPU, high-memory configurations capable of handling large-scale model training.
| Instance Type | Region | GPU | GPU Memory | CPU | Price/hr |
|---|---|---|---|---|---|
| g6e.12xlarge | us-east-1 | 4x NVIDIA-L40S (44.99Gi) | 179.95Gi | 47.4 cores (351.44Gi) | $13.104 |
| g6e.2xlarge | us-east-1 | 1x NVIDIA-L40S (44.99Gi) | 44.99Gi | 7.5 cores (57.95Gi) | $2.808 |
| g6e.xlarge | us-east-1 | 1x NVIDIA-L40S (44.99Gi) | 44.99Gi | 3.5 cores (28.35Gi) | $2.34 |
| g6.2xlarge | us-east-1 | 1x NVIDIA-L4 (22.49Gi) | 22.49Gi | 7.5 cores (28.35Gi) | $1.224 |
| g6.xlarge | us-east-1 | 1x NVIDIA-L4 (22.49Gi) | 22.49Gi | 3.5 cores (13.55Gi) | $1.008 |
Key Features
- Next-Gen GPUs — NVIDIA L4 GPUs target efficient inference and fine-tuning, while L40S GPUs deliver high throughput for large-scale training.
- Scalable GPU memory — From 22.49 GiB (L4) to nearly 180 GiB (multi-L40S), supporting workloads from mid-sized tasks to multi-modal foundation models.
- High vCPU & RAM options — Up to 47.4 cores and 351 GiB RAM in g6e.12xlarge, enabling massive parallelism and data-heavy preprocessing.
- Flexible tiers — Ranges from cost-efficient single-GPU instances to powerful multi-GPU setups.
Example Use Cases
- G6 (L4 instances) support mid-tier workloads such as fine-tuning BERT-large, or computer vision tasks like text-to-image generation and object recognition.
- G6e (L40S instances) support advanced training workloads, including large-scale language models (e.g., GPT-4, T5-XL) or multi-modal tasks requiring both vision and language.
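When choosing among G6 tiers, a simple heuristic is to pick the cheapest instance whose total GPU memory covers the workload. The sketch below hard-codes a subset of the G6 table above (name, total GPU memory in GiB, price per hour); the helper is illustrative, not a Clarifai API.

```python
# (name, total GPU memory in GiB, price per hour in USD), from the G6 table above
G6_INSTANCES = [
    ("g6.xlarge", 22.49, 1.008),
    ("g6.2xlarge", 22.49, 1.224),
    ("g6e.xlarge", 44.99, 2.34),
    ("g6e.2xlarge", 44.99, 2.808),
    ("g6e.12xlarge", 179.95, 13.104),
]

def cheapest_fitting(required_gib: float):
    """Return the cheapest instance whose GPU memory meets the requirement, or None."""
    fitting = [inst for inst in G6_INSTANCES if inst[1] >= required_gib]
    return min(fitting, key=lambda inst: inst[2], default=None)

print(cheapest_fitting(40))   # → ('g6e.xlarge', 44.99, 2.34)
print(cheapest_fitting(200))  # → None
```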
G5 Instances
The AWS G5 series provides high-performance GPU capabilities for workloads that demand more memory and compute power. These instances are optimized for deep learning training, large-scale inference, and advanced computer vision tasks.
| Instance Type | Region | GPU | GPU Memory | CPU | Price/hr |
|---|---|---|---|---|---|
| g5.2xlarge | us-east-1 | 1x NVIDIA-A10G (22.49Gi) | 22.49Gi | 7.5 cores (28.35Gi) | $1.512 |
| g5.xlarge | us-east-1 | 1x NVIDIA-A10G (22.49Gi) | 22.49Gi | 3.5 cores (13.55Gi) | $1.26 |
Key Features
- NVIDIA A10G GPU — High compute throughput and memory bandwidth, enabling faster training for deep learning and support for more complex models compared to T4 GPUs.
- Scalable CPU & memory — From ~3.5 to ~7.5 vCPUs and 13.55 to 28.35 GiB of RAM, supporting data-heavy preprocessing, augmentation, and orchestration alongside GPU tasks.
- Balanced design — Efficient for both training and inference, bridging the gap between lightweight GPU instances (like G4dn) and specialized multi-GPU clusters.
Example Use Cases
- Training mid-sized NLP models like GPT-2 or T5 for text generation, or training image segmentation models like UNet or Mask R-CNN for medical imaging.
- Running object tracking, pose estimation, or other GPU-accelerated pipelines for video analytics.
G4DN Instances
The AWS G4dn series is built for GPU-accelerated workloads at a moderate scale. These instances combine NVIDIA T4 GPUs with balanced CPU and memory resources, making them well-suited for small-to-medium machine learning and inference tasks.
| Instance Type | Region | GPU | GPU Memory | CPU | Price/hr |
|---|---|---|---|---|---|
| g4dn.xlarge | us-east-1 | 1x NVIDIA-T4 (15Gi) | 15Gi | 3.5 cores (13.86Gi) | $0.648 |
Key Features
- NVIDIA T4 GPU — Designed for inference and light training, offering strong efficiency for workloads at a lower cost compared to heavier GPU families.
- vCPUs and RAM — Provides ~3.5 vCPUs (baseline) and ~13.86 GiB memory, giving enough capacity to manage GPU-accelerated tasks, preprocessing, and orchestration.
- Balanced performance — Ideal when you need GPU acceleration without the overhead of large, expensive GPU instances.
Example Use Cases
- Inference workloads, such as running NLP models like BERT-base for summarization, classification, or question answering.
- Light training of smaller models, or experimenting with prototypes before scaling to larger GPU families.
T3A Instances
The AWS T3A series is intended for cost‑effective, general‑purpose workloads that do not require GPU acceleration. It provides a balanced mix of CPU and memory, making it suitable for lightweight use cases.
| Instance Type | Region | GPU | GPU Memory | CPU | Price/hr |
|---|---|---|---|---|---|
| t3a.2xlarge | us-east-1 | – | – | 7.5 cores (28.35Gi) | $0.36 |
| t3a.xlarge | us-east-1 | – | – | 3.5 cores (13.55Gi) | $0.18 |
| t3a.large | us-east-1 | – | – | 1.5 cores (6.4Gi) | $0.108 |
| t3a.medium | us-east-1 | – |