Supported Cloud Instances
Learn about the instance types we support
Jump to a cloud provider:
- Amazon Web Services (AWS)
- Google Cloud Platform (GCP)
- Vultr Cloud Servers
- Oracle Distributed Cloud
- Microsoft Azure
- DigitalOcean
- Lambda
We offer a range of instance types designed to handle a variety of machine learning workloads. These cloud instances vary in their CPU, RAM, and GPU configurations, allowing you to strike the right balance of performance and cost for your use case.
The instances listed on this page are available on demand — they can be provisioned immediately when you create a nodepool. Clarifai also offers the latest NVIDIA, AMD, and Google accelerators (including B300, B200, H100, MI355X, and TPU v7) that require reservations and are available on request.
- See the pricing page to learn more about pricing for each instance type.
- Contact us to access GPU types not listed here, including instances that require dedicated provisioning.
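Before provisioning a nodepool, it can help to project an instance's hourly rate over expected usage. The sketch below is a back-of-envelope calculation using the g7e.8xlarge rate listed on this page; the helper function is illustrative, not part of any SDK.

```python
def projected_cost(price_per_hour: float, hours_per_day: float, days: int) -> float:
    """Project total on-demand cost for a single instance, in USD."""
    return round(price_per_hour * hours_per_day * days, 2)

# g7e.8xlarge at $5.256/hr, running around the clock for 30 days
monthly = projected_cost(5.256, 24, 30)
print(f"${monthly}")  # → $3784.32
```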
Amazon Web Services (AWS) Instances
G7E Instances
The AWS G7E series provides the latest generation of high-performance NVIDIA RTX PRO 6000 GPUs, scaling from single-GPU setups to large multi-GPU clusters. These instances are designed for the most demanding deep learning training and large-scale inference workloads.
| Instance Type | Region | GPU | GPU Memory | CPU | Price/hr |
|---|---|---|---|---|---|
| g7e.48xlarge | us-east-1 | 8x NVIDIA-RTX-PRO-6000 (96Gi) | 768Gi | 191.1 cores (1,886.99Gi) | $33.156 |
| g7e.24xlarge | us-east-1 | 4x NVIDIA-RTX-PRO-6000 (96Gi) | 384Gi | 95.3 cores (939.79Gi) | $16.524 |
| g7e.12xlarge | us-east-1 | 2x NVIDIA-RTX-PRO-6000 (96Gi) | 192Gi | 47.4 cores (466.19Gi) | $10.368 |
| g7e.8xlarge | us-east-1 | 1x NVIDIA-RTX-PRO-6000 (96Gi) | 96Gi | 31.5 cores (230.74Gi) | $5.256 |
| g7e.4xlarge | us-east-1 | 1x NVIDIA-RTX-PRO-6000 (96Gi) | 96Gi | 15.5 cores (112.34Gi) | $5.004 |
Key Features
- NVIDIA RTX PRO 6000 GPU — 96 GiB of GPU memory per card, delivering high memory capacity for very large models and long-context inference.
- Scalable multi-GPU configurations — From a single GPU up to 8 GPUs (768 GiB combined GPU memory), suitable for distributed training of frontier models.
- High CPU & RAM — Up to 191 cores and ~1.9 TiB RAM in the 8-GPU configuration, ensuring data pipelines match GPU throughput.
Example Use Cases
- Fine-tuning or running inference on large language models that require more than 48 GiB of GPU memory per card (e.g., 70B+ parameter models on a single GPU).
- Multi-GPU distributed training of foundation models where high per-GPU memory is critical.
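A rough rule of thumb for whether a model's weights fit on a single card is parameter count times bytes per parameter, converted to GiB. Activations and KV cache come on top, so treat this as a lower bound, not a sizing guarantee.

```python
def weight_memory_gib(params_billions: float, bytes_per_param: float) -> float:
    """Lower-bound GPU memory (GiB) needed just to hold model weights."""
    return params_billions * 1e9 * bytes_per_param / 2**30

# A 70B-parameter model:
fp16 = weight_memory_gib(70, 2)  # ~130.4 GiB: exceeds one 96 GiB RTX PRO 6000
int8 = weight_memory_gib(70, 1)  # ~65.2 GiB: fits on a single 96 GiB card
```

This is why 70B-class models typically need either quantization or a multi-GPU configuration such as g7e.12xlarge and above.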
The newest accelerators — including NVIDIA B300, B200, H100, AMD MI355X, MI300X, and Google TPU v7 — are available with reservations and are not automatically provisioned on demand. Contact us to discuss availability and reserve capacity.
G6 Instances
The AWS G6 series introduces next-generation NVIDIA GPUs for the most demanding machine learning and simulation workloads. These instances scale from single-GPU mid-tier setups to multi-GPU, high-memory configurations capable of handling large-scale model training.
| Instance Type | Region | GPU | GPU Memory | CPU | Price/hr |
|---|---|---|---|---|---|
| g6e.12xlarge | us-east-1 | 4x NVIDIA-L40S (44.99Gi) | 179.95Gi | 47.4 cores (351.44Gi) | $13.104 |
| g6e.2xlarge | us-east-1 | 1x NVIDIA-L40S (44.99Gi) | 44.99Gi | 7.5 cores (57.95Gi) | $2.808 |
| g6e.xlarge | us-east-1 | 1x NVIDIA-L40S (44.99Gi) | 44.99Gi | 3.5 cores (28.35Gi) | $2.34 |
| g6.2xlarge | us-east-1 | 1x NVIDIA-L4 (22.49Gi) | 22.49Gi | 7.5 cores (28.35Gi) | $1.224 |
| g6.xlarge | us-east-1 | 1x NVIDIA-L4 (22.49Gi) | 22.49Gi | 3.5 cores (13.55Gi) | $1.008 |
Key Features
- Next-Gen GPUs — NVIDIA L4 GPUs target efficient inference and fine-tuning, while L40S GPUs deliver high throughput for large-scale training.
- Scalable GPU memory — From 22.49 GiB (L4) to nearly 180 GiB (multi-L40S), supporting workloads from mid-sized tasks to multi-modal foundation models.
- High vCPU & RAM options — Up to 47.4 cores and 351 GiB RAM in g6e.12xlarge, enabling massive parallelism and data-heavy preprocessing.
- Flexible tiers — Ranges from cost-efficient single-GPU instances to powerful multi-GPU setups.
Example Use Cases
- G6 (L4 instances) support mid-tier workloads such as fine-tuning BERT-large, or computer vision tasks like text-to-image generation and object recognition.
- G6e (L40S instances) support advanced training workloads, including large-scale language models (e.g., GPT-4, T5-XL) or multi-modal tasks requiring both vision and language.
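When choosing among G6 tiers, a simple heuristic is to pick the cheapest instance whose total GPU memory covers the workload. The sketch below hard-codes a subset of the G6 table above (name, total GPU memory in GiB, price per hour); the helper is illustrative, not a Clarifai API.

```python
# (name, total GPU memory in GiB, price per hour in USD), from the G6 table above
G6_INSTANCES = [
    ("g6.xlarge", 22.49, 1.008),
    ("g6.2xlarge", 22.49, 1.224),
    ("g6e.xlarge", 44.99, 2.34),
    ("g6e.2xlarge", 44.99, 2.808),
    ("g6e.12xlarge", 179.95, 13.104),
]

def cheapest_fitting(required_gib: float):
    """Return the cheapest instance whose GPU memory meets the requirement, or None."""
    fitting = [inst for inst in G6_INSTANCES if inst[1] >= required_gib]
    return min(fitting, key=lambda inst: inst[2], default=None)

print(cheapest_fitting(40))   # → ('g6e.xlarge', 44.99, 2.34)
print(cheapest_fitting(200))  # → None
```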
G5 Instances
The AWS G5 series provides high-performance GPU capabilities for workloads that demand more memory and compute power. These instances are optimized for deep learning training, large-scale inference, and advanced computer vision tasks.
| Instance Type | Region | GPU | GPU Memory | CPU | Price/hr |
|---|---|---|---|---|---|
| g5.2xlarge | us-east-1 | 1x NVIDIA-A10G (22.49Gi) | 22.49Gi | 7.5 cores (28.35Gi) | $1.512 |
| g5.xlarge | us-east-1 | 1x NVIDIA-A10G (22.49Gi) | 22.49Gi | 3.5 cores (13.55Gi) | $1.26 |
Key Features
- NVIDIA A10G GPU — High compute throughput and memory bandwidth, enabling faster training for deep learning and support for more complex models compared to T4 GPUs.
- Scalable CPU & memory — From ~3.5 to ~7.5 vCPUs and 13.55 to 28.35 GiB of RAM, supporting data-heavy preprocessing, augmentation, and orchestration alongside GPU tasks.
- Balanced design — Efficient for both training and inference, bridging the gap between lightweight GPU instances (like G4dn) and specialized multi-GPU clusters.
Example Use Cases
- Training mid-sized NLP models like GPT-2 or T5 for text generation, or training image segmentation models like UNet or Mask R-CNN for medical imaging.
- Running object tracking, pose estimation, or other GPU-accelerated pipelines for video analytics.
G4DN Instances
The AWS G4dn series is built for GPU-accelerated workloads at a moderate scale. These instances combine NVIDIA T4 GPUs with balanced CPU and memory resources, making them well-suited for small-to-medium machine learning and inference tasks.
| Instance Type | Region | GPU | GPU Memory | CPU | Price/hr |
|---|---|---|---|---|---|
| g4dn.xlarge | us-east-1 | 1x NVIDIA-T4 (15Gi) | 15Gi | 3.5 cores (13.86Gi) | $0.648 |
Key Features
- NVIDIA T4 GPU — Designed for inference and light training, offering strong efficiency for workloads at a lower cost compared to heavier GPU families.
- vCPUs and RAM — Provides ~3.5 vCPUs (baseline) and ~13.86 GiB memory, giving enough capacity to manage GPU-accelerated tasks, preprocessing, and orchestration.
- Balanced performance — Ideal when you need GPU acceleration without the overhead of large, expensive GPU instances.
Example Use Cases
- Inference workloads, such as running NLP models like BERT-base for summarization, classification, or question answering.
- Light training of smaller models, or experimenting with prototypes before scaling to larger GPU families.
T3A Instances
The AWS T3A series is intended for cost‑effective, general‑purpose workloads that do not require GPU acceleration. It provides a balanced mix of CPU and memory, making it suitable for lightweight use cases.
| Instance Type | Region | GPU | GPU Memory | CPU | Price/hr |
|---|---|---|---|---|---|
| t3a.2xlarge | us-east-1 | – | – | 7.5 cores (28.35Gi) | $0.36 |
| t3a.xlarge | us-east-1 | – | – | 3.5 cores (13.55Gi) | $0.18 |
| t3a.large | us-east-1 | – | – | 1.5 cores (6.4Gi) | $0.108 |
| t3a.medium | us-east-1 | – |