Compute Orchestration
Train and deploy any model on any compute infrastructure, at any scale
Compute Orchestration is currently in Public Preview. To request access, please contact us here.
Clarifai’s Compute Orchestration offers a streamlined solution for managing the infrastructure required for training, deploying, and scaling machine learning models and workflows.
These flexible capabilities support any compute instance — across various hardware providers and deployment methods — and provide automatic scaling to match workload demands.
Click here to learn more about our Compute Orchestration capabilities.
- Run the following command to clone the repository containing various Compute Orchestration examples:

  ```bash
  git clone https://github.com/Clarifai/examples.git
  ```

  After cloning, navigate to the `ComputeOrchestration` folder to follow along with this tutorial.
- For a step-by-step tutorial, see the CRUD operations notebook.
- Clarifai provides a user-friendly command line interface (CLI) that simplifies Compute Orchestration tasks. You can follow its step-by-step tutorial provided here.
Prerequisites
Installation
To begin, install the latest version of the `clarifai` Python package:

```bash
pip install --upgrade clarifai
```
Get a PAT
You need a Personal Access Token (PAT) to authenticate your connection to the Clarifai platform. You can generate one on your Personal Settings page by navigating to the Security section.
Then, set it as an environment variable in your script.
```python
import os

os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"  # replace with your own PAT
```
Set up Project Directory
- Create a directory to store your project files.
- Inside this directory, create a Python file for your Compute Orchestration code.
- Create a `configs` folder to store your YAML configuration files for clusters, nodepools, and deployments.
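The project layout described above can be created like this (the directory name `my-compute-project` is just an example, not required by the platform):

```shell
# Create a project directory with a configs/ folder for the YAML files.
mkdir -p my-compute-project/configs

# Python file for your Compute Orchestration code.
touch my-compute-project/main.py

# YAML configuration files, filled in in the next step.
touch my-compute-project/configs/compute_cluster_config.yaml \
      my-compute-project/configs/nodepool_config.yaml \
      my-compute-project/configs/deployment_config.yaml
```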
Then, create the following files in the `configs` folder:
1. `compute_cluster_config.yaml`:

```yaml
compute_cluster:
  id: "test-compute-cluster"
  description: "My AWS compute cluster"
  cloud_provider:
    id: "aws"
  region: "us-east-1"
  managed_by: "clarifai"
  cluster_type: "dedicated"
  visibility:
    gettable: 10
```
2. `nodepool_config.yaml`:

```yaml
nodepool:
  id: "test-nodepool"
  compute_cluster:
    id: "test-compute-cluster"
  description: "First nodepool in AWS in a proper compute cluster"
  instance_types:
    - id: "g5.xlarge"
      compute_info:
        cpu_limit: "8"
        cpu_memory: "16Gi"
        accelerator_type:
          - "a10"
        num_accelerators: 1
        accelerator_memory: "40Gi"
  node_capacity_type:
    capacity_types:
      - 1
      - 2
  max_instances: 1
```
3. `deployment_config.yaml`:

```yaml
deployment:
  id: "test-deployment"
  description: "some random deployment"
  autoscale_config:
    min_replicas: 0
    max_replicas: 1
    traffic_history_seconds: 100
    scale_down_delay_seconds: 30
    scale_up_delay_seconds: 30
    disable_packing: false
  worker:
    model:
      id: "apparel-clusterering"
      model_version:
        id: "cc911f6b0ed748efb89e3d1359c146c4"
      user_id: "clarifai"
      app_id: "main"
  scheduling_choice: 4
  nodepools:
    - id: "test-nodepool"
      compute_cluster:
        id: "test-compute-cluster"
```
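Before handing these files to the SDK or CLI, it can help to sanity-check them. Below is a minimal sketch (not part of the Clarifai SDK; the helper name `validate_config` is our own) that parses a config with PyYAML and confirms the expected top-level section and `id` field are present:

```python
import yaml  # PyYAML, installed as a dependency of the clarifai package


def validate_config(yaml_text: str, expected_key: str) -> dict:
    """Parse a YAML config and confirm its top-level section and id exist."""
    config = yaml.safe_load(yaml_text)
    if not isinstance(config, dict) or expected_key not in config:
        raise ValueError(f"config is missing the '{expected_key}' section")
    if "id" not in config[expected_key]:
        raise ValueError(f"'{expected_key}' section needs an 'id' field")
    return config


# Example: a trimmed version of the cluster config from step 1.
cluster_yaml = """
compute_cluster:
  id: "test-compute-cluster"
  cloud_provider:
    id: "aws"
  region: "us-east-1"
"""

cluster = validate_config(cluster_yaml, "compute_cluster")
print(cluster["compute_cluster"]["id"])  # test-compute-cluster
```

The same check works for the nodepool and deployment files by passing `"nodepool"` or `"deployment"` as the expected key.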
Optionally, if you want to use the Clarifai CLI, create a login configuration file for storing your account credentials:
```yaml
user_id: "YOUR_USER_ID_HERE"
pat: "YOUR_PAT_HERE"
```
Then, authenticate your CLI session with Clarifai using the stored credentials in the configuration file:
```bash
clarifai login --config <config-filepath>
```
📄️ Upload Custom Models
Import custom models, including from external sources like Hugging Face and OpenAI
📄️ Clusters and Nodepools
Set up your compute clusters and nodepools
📄️ Managing Your Compute
Manage your clusters, nodepools, and deployments