Compute Orchestration
Train and deploy any model on any compute infrastructure, at any scale
Compute Orchestration is currently in Public Preview. To request access, please contact us here.
Clarifai’s Compute Orchestration offers a streamlined solution for managing the infrastructure required for training, deploying, and scaling machine learning models and workflows.
This flexible system supports any compute instance — across various hardware providers and deployment methods — and provides automatic scaling to match workload demands.
Click here to learn more about our Compute Orchestration system.
- Run the following command to clone the repository containing various Compute Orchestration examples: git clone https://github.com/Clarifai/examples.git. After cloning, navigate to the ComputeOrchestration folder to follow along with the tutorial.
- For a step-by-step tutorial, check the CRUD operations notebook.
Clarifai provides a user-friendly command line interface (CLI) that simplifies Compute Orchestration tasks. With the CLI, you can easily manage the infrastructure required for deploying and scaling machine learning models, even without extensive MLOps expertise. This tool makes it easy to set up clusters, configure nodepools, and deploy models directly from the command line. You can follow its step-by-step tutorial provided here.
Prerequisites
Installation
To begin, install the latest version of the clarifai Python package.
pip install --upgrade clarifai
Get a PAT
You need a PAT (Personal Access Token) key to authenticate your connection to the Clarifai platform. You can generate it in your Personal Settings page by navigating to the Security section.
Then, set it as an environment variable in your script.
import os
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE" # replace with your own PAT key
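Hardcoding the token in a script makes it easy to leak into version control. As a minimal sketch, assuming the PAT was already exported in your shell, the script can read and check it instead of assigning it. The helper name `require_pat` is hypothetical and not part of the clarifai package.

```python
import os


def require_pat(env=None):
    """Return the PAT from the environment; raise if it is missing or still the placeholder."""
    env = os.environ if env is None else env
    pat = env.get("CLARIFAI_PAT", "").strip()
    if not pat or pat == "YOUR_PAT_HERE":
        raise RuntimeError("Set the CLARIFAI_PAT environment variable to your own PAT")
    return pat
```

Calling `require_pat()` at the top of a script fails early with a clear message rather than at the first API call.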
Set up Project Directory
- Create a directory to store your project files.
- Inside this directory, create a Python file for your Compute Orchestration code.
- Create a configs folder to store your YAML configuration files for clusters, nodepools, and deployments.
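The layout described in the steps above can be scaffolded with a few lines of standard-library Python. This is only a sketch: the script name `compute_orchestration.py` is a hypothetical choice, and the three YAML file names are the ones this tutorial uses for its configuration files.

```python
from pathlib import Path

# Config file names used in this tutorial
CONFIG_FILES = (
    "compute_cluster_config.yaml",
    "nodepool_config.yaml",
    "deployment_config.yaml",
)


def scaffold_project(root):
    """Create the project directory, an empty script, and a configs folder with empty YAML files."""
    root = Path(root)
    configs = root / "configs"
    configs.mkdir(parents=True, exist_ok=True)
    # Hypothetical script name; any Python file name works
    (root / "compute_orchestration.py").touch(exist_ok=True)
    for name in CONFIG_FILES:
        (configs / name).touch(exist_ok=True)
    return configs
```

The function is safe to re-run: existing directories and files are left in place.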
Then, create the following files in the configs folder:
1. compute_cluster_config.yaml:
compute_cluster:
  id: "test-compute-cluster"
  description: "My AWS compute cluster"
  cloud_provider:
    id: "aws"
  region: "us-east-1"
  managed_by: "clarifai"
  cluster_type: "dedicated"
  visibility:
    gettable: 10
2. nodepool_config.yaml:
nodepool:
  id: "test-nodepool"
  compute_cluster:
    id: "test-compute-cluster"
  description: "First nodepool in AWS in a proper compute cluster"
  instance_types:
    - id: "g5.xlarge"
      compute_info:
        cpu_limit: "8"
        cpu_memory: "16Gi"
        accelerator_type:
          - "a10"
        num_accelerators: 1
        accelerator_memory: "40Gi"
  node_capacity_type:
    capacity_types:
      - 1
      - 2
  max_instances: 1
3. deployment_config.yaml:
deployment:
  id: "test-deployment"
  description: "some random deployment"
  autoscale_config:
    min_replicas: 0
    max_replicas: 1
    traffic_history_seconds: 100
    scale_down_delay_seconds: 30
    scale_up_delay_seconds: 30
    enable_packing: true
  worker:
    model:
      id: "got-ocr-2_0"
      model_version:
        id: "5d92321db5d341b5b4cf407ab34f618f"
      user_id: "stepfun-ai"
      app_id: "ocr"
  scheduling_choice: 4
  nodepools:
    - id: "test-nodepool"
      compute_cluster:
        id: "test-compute-cluster"
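The three files refer to each other by ID, so a typo in one of them only surfaces when a later step fails. The sketch below cross-checks those references before anything is created. It assumes the files were already parsed into dictionaries (for example with a YAML loader) and that each entry under the deployment's `nodepools` carries its own `compute_cluster` ID. The function name `check_config_consistency` is hypothetical.

```python
def check_config_consistency(cluster_cfg, nodepool_cfg, deployment_cfg):
    """Return a list of problems found across the three parsed configs (empty if none)."""
    problems = []
    cluster_id = cluster_cfg["compute_cluster"]["id"]
    nodepool = nodepool_cfg["nodepool"]
    deployment = deployment_cfg["deployment"]

    # The nodepool must point at the cluster defined in the cluster config
    if nodepool["compute_cluster"]["id"] != cluster_id:
        problems.append("nodepool references an unknown compute cluster")

    # Every nodepool targeted by the deployment must match the nodepool config
    for target in deployment["nodepools"]:
        if target["id"] != nodepool["id"]:
            problems.append(f"deployment references unknown nodepool {target['id']!r}")
        if target["compute_cluster"]["id"] != cluster_id:
            problems.append(f"deployment nodepool {target['id']!r} references an unknown compute cluster")

    # Autoscaling bounds must be ordered
    scale = deployment["autoscale_config"]
    if scale["min_replicas"] > scale["max_replicas"]:
        problems.append("min_replicas exceeds max_replicas")
    return problems


# Dictionaries mirroring the relevant fields of the three files
cluster_cfg = {"compute_cluster": {"id": "test-compute-cluster"}}
nodepool_cfg = {"nodepool": {"id": "test-nodepool", "compute_cluster": {"id": "test-compute-cluster"}}}
deployment_cfg = {
    "deployment": {
        "autoscale_config": {"min_replicas": 0, "max_replicas": 1},
        "nodepools": [{"id": "test-nodepool", "compute_cluster": {"id": "test-compute-cluster"}}],
    }
}
print(check_config_consistency(cluster_cfg, nodepool_cfg, deployment_cfg))
```

An empty list means the IDs line up; any other result names the mismatched reference.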
Optionally, if you want to use the Clarifai CLI, create a login configuration file for storing your account credentials:
user_id: "YOUR_USER_ID_HERE"
pat: "YOUR_PAT_HERE"
Then, authenticate your CLI session with Clarifai using the stored credentials in the configuration file:
$ clarifai login --config <config-filepath>
Cluster Operations
Create a Compute Cluster
To create a new compute cluster, pass the compute_cluster_id and config_filepath as arguments to the create_compute_cluster method of the User class.
Python:
from clarifai.client.user import User
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the client
client = User(
    user_id="YOUR_USER_ID_HERE",
    base_url="https://api.clarifai.com"
)
# Create a new compute cluster
compute_cluster = client.create_compute_cluster(
    compute_cluster_id="test-compute-cluster",
    config_filepath="./configs/compute_cluster_config.yaml"
)
Bash:
$ clarifai computecluster create --config <compute-cluster-config-filepath>
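When a setup script is run more than once, it can help to check whether the cluster already exists before creating it. The sketch below does that using only the `list_compute_clusters` and `create_compute_cluster` methods and the `id` attribute shown in this tutorial. The helper name `get_or_create_compute_cluster` is hypothetical, and how the platform responds to creating a duplicate ID is not covered here, so treat this as a convenience pattern rather than required usage.

```python
def get_or_create_compute_cluster(client, compute_cluster_id, config_filepath):
    """Return the existing cluster with this ID, or create it from the config file if absent."""
    for cluster in client.list_compute_clusters():
        if cluster.id == compute_cluster_id:
            return cluster
    # Not found: fall back to creating it from the YAML config
    return client.create_compute_cluster(
        compute_cluster_id=compute_cluster_id,
        config_filepath=config_filepath,
    )
```

Here `client` is expected to be an initialized `User` instance, for example `get_or_create_compute_cluster(client, "test-compute-cluster", "./configs/compute_cluster_config.yaml")`.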
Get a Cluster
To get a specific compute cluster, pass the compute_cluster_id to the compute_cluster method of the User class.
Python:
from clarifai.client.user import User
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the client
client = User(
    user_id="YOUR_USER_ID_HERE",
    base_url="https://api.clarifai.com"
)
# Get and print the compute cluster
compute_cluster = client.compute_cluster(
    compute_cluster_id="test-compute-cluster"
)
print(compute_cluster)
List All Clusters
To list your existing compute clusters, call the list_compute_clusters method of the User class.
Python:
from clarifai.client.user import User
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the client
client = User(
    user_id="YOUR_USER_ID_HERE",
    base_url="https://api.clarifai.com"
)
# Get and print all the compute clusters
all_compute_clusters = list(
    client.list_compute_clusters()
)
print(all_compute_clusters)
Bash:
$ clarifai computecluster list
Initialize the ComputeCluster Class
To initialize the ComputeCluster class, provide the user_id and compute_cluster_id as parameters.
Initialization sets the user and compute cluster context so that subsequent operations target the intended resources.
Python:
from clarifai.client.compute_cluster import ComputeCluster
# Initialize the ComputeCluster instance
compute_cluster = ComputeCluster(
    user_id="YOUR_USER_ID_HERE",
    compute_cluster_id="test-compute-cluster",
    base_url="https://api.clarifai.com"
)
Nodepool Operations
Create a Nodepool
To create a new nodepool, use the create_nodepool method with the nodepool_id and config_filepath as parameters.
Python:
from clarifai.client.compute_cluster import ComputeCluster
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the ComputeCluster instance
compute_cluster = ComputeCluster(
    user_id="YOUR_USER_ID_HERE",
    compute_cluster_id="test-compute-cluster",
    base_url="https://api.clarifai.com"
)
# Create a new nodepool
nodepool = compute_cluster.create_nodepool(
    nodepool_id="test-nodepool",
    config_filepath="./configs/nodepool_config.yaml"
)
Bash:
$ clarifai nodepool create --config <nodepool-config-filepath>
Get a Nodepool
To get a specific nodepool, provide the nodepool_id to the nodepool method of the ComputeCluster class.
Python:
from clarifai.client.compute_cluster import ComputeCluster
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the ComputeCluster instance
compute_cluster = ComputeCluster(
    user_id="YOUR_USER_ID_HERE",
    compute_cluster_id="test-compute-cluster",
    base_url="https://api.clarifai.com"
)
# Get and print the nodepool
nodepool = compute_cluster.nodepool(
    nodepool_id="test-nodepool"
)
print(nodepool)
List All Nodepools
To list the existing nodepools, call the list_nodepools method of the ComputeCluster class.
Python:
from clarifai.client.compute_cluster import ComputeCluster
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the ComputeCluster instance
compute_cluster = ComputeCluster(
    user_id="YOUR_USER_ID_HERE",
    compute_cluster_id="test-compute-cluster",
    base_url="https://api.clarifai.com"
)
# Get and print all the nodepools
all_nodepools = list(
    compute_cluster.list_nodepools()
)
print(all_nodepools)
Bash:
$ clarifai nodepool list --compute_cluster_id <compute-cluster-id>
Initialize the Nodepool Class
To initialize the Nodepool class, provide the user_id and nodepool_id parameters.
Python:
from clarifai.client.nodepool import Nodepool
# Initialize the Nodepool instance
nodepool = Nodepool(
    user_id="YOUR_USER_ID_HERE",
    nodepool_id="test-nodepool",
    base_url="https://api.clarifai.com"
)
Deployment Operations
Create a Deployment
To deploy a model within a nodepool, provide the deployment_id and config_filepath parameters to the create_deployment method of the Nodepool class.
Each model or workflow can only have one deployment per nodepool.
Python:
from clarifai.client.nodepool import Nodepool
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the Nodepool instance
nodepool = Nodepool(
    user_id="YOUR_USER_ID_HERE",
    nodepool_id="test-nodepool",
    base_url="https://api.clarifai.com"
)
# Create a new deployment
deployment = nodepool.create_deployment(
    deployment_id="test-deployment",
    config_filepath="./configs/deployment_config.yaml"
)
Bash:
$ clarifai deployment create --config <deployment-config-filepath>
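Because each model or workflow can only have one deployment per nodepool, it can be worth checking a batch of planned deployments for clashes before submitting any of them. The sketch below works on plain dictionaries and never calls the Clarifai API. The `planned` structure, its keys, and the ID `second-deployment` are hypothetical and exist only for illustration.

```python
from collections import Counter


def find_duplicate_deployments(planned):
    """Return (model_id, nodepool_id) pairs that appear more than once in the plan."""
    counts = Counter((d["model_id"], d["nodepool_id"]) for d in planned)
    return sorted(pair for pair, n in counts.items() if n > 1)


planned = [
    {"deployment_id": "test-deployment", "model_id": "got-ocr-2_0", "nodepool_id": "test-nodepool"},
    {"deployment_id": "second-deployment", "model_id": "got-ocr-2_0", "nodepool_id": "test-nodepool"},
]
print(find_duplicate_deployments(planned))  # [('got-ocr-2_0', 'test-nodepool')]
```

An empty result means no model is planned twice for the same nodepool.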
Get a Deployment
To get a specific deployment, provide the deployment_id to the deployment method of the Nodepool class.
Python:
from clarifai.client.nodepool import Nodepool
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the Nodepool instance
nodepool = Nodepool(
    user_id="YOUR_USER_ID_HERE",
    nodepool_id="test-nodepool",
    base_url="https://api.clarifai.com"
)
# Get and print the deployment
deployment = nodepool.deployment(
    deployment_id="test-deployment"
)
print(deployment)
List All Deployments
To list existing deployments, call the list_deployments method of the Nodepool class.
Python:
from clarifai.client.nodepool import Nodepool
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the Nodepool instance
nodepool = Nodepool(
    user_id="YOUR_USER_ID_HERE",
    nodepool_id="test-nodepool",
    base_url="https://api.clarifai.com"
)
# Get and print all the deployments
all_deployments = list(
    nodepool.list_deployments()
)
print(all_deployments)
Bash:
$ clarifai deployment list --nodepool_id <nodepool-id>
Initialize the Deployment Class
To initialize the Deployment class, provide the user_id and deployment_id parameters.
Python:
from clarifai.client.deployment import Deployment
# Initialize the deployment
deployment = Deployment(
    user_id="YOUR_USER_ID_HERE",
    deployment_id="test-deployment",
    base_url="https://api.clarifai.com"
)
Predict With Deployed Model
Once your model is deployed, you can use it to make predictions. To do this, pass the compute_cluster_id and nodepool_id to the prediction method (predict_by_url in the example below).
For example, here is how you can make a prediction using the deployed model.
Python:
from clarifai.client.model import Model
model_url = "https://clarifai.com/stepfun-ai/ocr/models/got-ocr-2_0"
# URL of the image to analyze
image_url = "https://samples.clarifai.com/featured-models/model-ocr-scene-text-las-vegas-sign.png"
# Initialize the model
model = Model(
    url=model_url,
    pat="YOUR_PAT_HERE"
)
# Make a prediction using the model with the specified compute cluster and nodepool
model_prediction = model.predict_by_url(
    image_url,
    input_type="image",
    compute_cluster_id="test-compute-cluster",
    nodepool_id="test-nodepool"
)
# Print the output
print(model_prediction.outputs[0].data.text.raw)
Click here to learn more about how to make predictions using different types of models.
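The example above prints only the first output. If a response carries several outputs, a small helper can collect the text from all of them. This sketch assumes every output exposes `data.text.raw`, as the first one does in the example. The helper name `extract_texts` is hypothetical, and the stand-in object built with `SimpleNamespace` only imitates the response shape so the snippet runs without a deployed model.

```python
from types import SimpleNamespace


def extract_texts(prediction):
    """Collect the raw text of every output in a prediction response."""
    return [output.data.text.raw for output in prediction.outputs]


# Stand-in object imitating the response shape; not a real API response
fake_prediction = SimpleNamespace(
    outputs=[SimpleNamespace(data=SimpleNamespace(text=SimpleNamespace(raw="WELCOME")))]
)
print(extract_texts(fake_prediction))  # ['WELCOME']
```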
Delete Resources
Delete Deployments
To delete deployments, pass a list of deployment IDs to the delete_deployments method of the Nodepool class.
Python:
from clarifai.client.nodepool import Nodepool
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the Nodepool instance
nodepool = Nodepool(
    user_id="YOUR_USER_ID_HERE",
    nodepool_id="test-nodepool",
    base_url="https://api.clarifai.com"
)
# Get all the deployments in the nodepool
all_deployments = list(nodepool.list_deployments())
# Extract deployment IDs for deletion (this selects every deployment in the nodepool)
deployment_ids = [deployment.id for deployment in all_deployments]
# To delete only a specific deployment, set its ID explicitly instead
# deployment_ids = ["test-deployment"]
# Delete the deployments
nodepool.delete_deployments(deployment_ids=deployment_ids)
Bash:
$ clarifai deployment delete --nodepool_id <nodepool-id> --deployment_id <deployment-id>
Delete Nodepools
To delete nodepools, provide a list of nodepool IDs to the delete_nodepools method of the ComputeCluster class.
Python:
from clarifai.client.compute_cluster import ComputeCluster
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the ComputeCluster instance
compute_cluster = ComputeCluster(
    user_id="YOUR_USER_ID_HERE",
    compute_cluster_id="test-compute-cluster",
    base_url="https://api.clarifai.com"
)
# Get all nodepools within the compute cluster
all_nodepools = list(compute_cluster.list_nodepools())
# Extract nodepool IDs for deletion (this selects every nodepool in the cluster)
nodepool_ids = [nodepool.id for nodepool in all_nodepools]
# To delete only a specific nodepool, set its ID explicitly instead
# nodepool_ids = ["test-nodepool"]
# Delete the nodepools
compute_cluster.delete_nodepools(nodepool_ids=nodepool_ids)
Bash:
$ clarifai nodepool delete --compute_cluster_id <compute-cluster-id> --nodepool_id <nodepool-id>
Delete Compute Clusters
To delete compute clusters, provide a list of compute cluster IDs to the delete_compute_clusters method of the User class.
Python:
from clarifai.client.user import User
import os
# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"
# Initialize the User client
client = User(
    user_id="YOUR_USER_ID_HERE",
    base_url="https://api.clarifai.com"
)
# Get all compute clusters associated with the user
all_compute_clusters = list(client.list_compute_clusters())
# Extract compute cluster IDs for deletion (this selects every cluster the user owns)
compute_cluster_ids = [compute_cluster.id for compute_cluster in all_compute_clusters]
# To delete only a specific compute cluster, set its ID explicitly instead
# compute_cluster_ids = ["test-compute-cluster"]
# Delete the compute clusters
client.delete_compute_clusters(compute_cluster_ids=compute_cluster_ids)
Bash:
$ clarifai computecluster delete --compute_cluster_id <compute-cluster-id>