
Clusters and Nodepools

Set up your compute clusters and nodepools


A compute cluster serves as the primary environment for running models, whether for training or inference. Each cluster contains one or more nodepools — groups of virtual machine instances that share a configuration, such as CPU/GPU type and memory.

After setting up a custom cluster, you can configure nodepools to optimize resource usage, tailoring the infrastructure to specific hardware, performance, cost, or compliance requirements.

note

The following sections guide you through creating clusters and nodepools and deploying your models. Compute Orchestration supports only models uploaded to the Clarifai platform via the Python SDK, as outlined here.

Before configuring compute clusters and nodepools, ensure you have completed the necessary prerequisites, as outlined here.

Cluster Operations

Create a Compute Cluster

To create a new compute cluster, pass the compute_cluster_id and config_filepath as arguments to the create_compute_cluster method of the User class.

from clarifai.client.user import User
import os

# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"

# Initialize the client
client = User(
    user_id="YOUR_USER_ID_HERE",
    base_url="https://api.clarifai.com"
)

# Create a new compute cluster
compute_cluster = client.create_compute_cluster(
    compute_cluster_id="test-compute-cluster",
    config_filepath="./configs/compute_cluster_config.yaml"
)
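
The config file defines the cluster's infrastructure settings. Below is a minimal sketch of what compute_cluster_config.yaml might contain; the provider, region, and field values are illustrative assumptions, so check the current schema for your SDK version rather than treating this as definitive.

compute_cluster:
  id: "test-compute-cluster"
  description: "My first compute cluster"
  cloud_provider:
    id: "aws"            # illustrative provider ID
  region: "us-east-1"    # illustrative region
  managed_by: "clarifai"
  cluster_type: "dedicated"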

Initialize the ComputeCluster Class

To initialize the ComputeCluster class, provide the user_id and compute_cluster_id as parameters.

Initialization is essential because it establishes the specific user and compute cluster context, allowing subsequent operations to target and manage the intended resources.

from clarifai.client.compute_cluster import ComputeCluster

# Initialize the ComputeCluster instance
compute_cluster = ComputeCluster(
    user_id="YOUR_USER_ID_HERE",
    compute_cluster_id="test-compute-cluster",
    base_url="https://api.clarifai.com"
)
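
With the instance initialized, you can run cluster-scoped operations. For example, assuming your SDK version exposes a list_nodepools method (verify against your version's reference), you can enumerate the cluster's nodepools:

# List all nodepools in the compute cluster
# (method name assumed; check your SDK version)
all_nodepools = list(compute_cluster.list_nodepools())
print(all_nodepools)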

Nodepool Operations

Create a Nodepool

To create a new nodepool, use the create_nodepool method with the nodepool_id and config_filepath as parameters.

from clarifai.client.compute_cluster import ComputeCluster
import os

# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"

# Initialize the ComputeCluster instance
compute_cluster = ComputeCluster(
    user_id="YOUR_USER_ID_HERE",
    compute_cluster_id="test-compute-cluster",
    base_url="https://api.clarifai.com"
)

# Create a new nodepool
nodepool = compute_cluster.create_nodepool(
    nodepool_id="test-nodepool",
    config_filepath="./configs/nodepool_config.yaml"
)
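
The nodepool config describes which instances the pool provisions and how it scales. The sketch of nodepool_config.yaml below is illustrative only; the instance type ID, capacity-type encoding, and scaling fields are assumptions that may vary by provider and SDK version.

nodepool:
  id: "test-nodepool"
  description: "GPU nodepool for inference"
  instance_types:
    - id: "g5.2xlarge"    # illustrative AWS GPU instance type
  node_capacity_type:
    capacity_types:
      - 1                 # assumed encoding: 1 = on-demand, 2 = spot
  min_instances: 1        # field names assumed; verify against the current schema
  max_instances: 2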

Initialize the Nodepool Class

To initialize the Nodepool class, provide the user_id and nodepool_id parameters.

from clarifai.client.nodepool import Nodepool

# Initialize the Nodepool instance
nodepool = Nodepool(
    user_id="YOUR_USER_ID_HERE",
    nodepool_id="test-nodepool",
    base_url="https://api.clarifai.com"
)
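
Once initialized, the instance supports nodepool-scoped operations — for example, listing the deployments it hosts, assuming your SDK version provides a list_deployments method:

# List all deployments in the nodepool
# (method name assumed; check your SDK version)
all_deployments = list(nodepool.list_deployments())
print(all_deployments)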

Deployment Operations

Create a Deployment

To deploy a model within a nodepool, provide the deployment_id and config_filepath parameters to the create_deployment method of the Nodepool class.

note

Each model or workflow can only have one deployment per nodepool.

from clarifai.client.nodepool import Nodepool
import os

# Set the PAT key
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"

# Initialize the Nodepool instance
nodepool = Nodepool(
    user_id="YOUR_USER_ID_HERE",
    nodepool_id="test-nodepool",
    base_url="https://api.clarifai.com"
)

# Create a new deployment
deployment = nodepool.create_deployment(
    deployment_id="test-deployment",
    config_filepath="./configs/deployment_config.yaml"
)
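
The deployment config ties a model version to the nodepool and sets its scaling behavior. The sketch of deployment_config.yaml below follows the general shape of Compute Orchestration configs, but the field names (worker, autoscale_config, nodepools) and placeholder values are assumptions — consult the current schema before use.

deployment:
  id: "test-deployment"
  description: "Deployment for my model"
  worker:
    model:
      id: "model_id"                # placeholder: the model to deploy
      model_version:
        id: "model_version_id"      # placeholder
      user_id: "model_owner_id"     # placeholder
      app_id: "app_id"              # placeholder
  autoscale_config:
    min_replicas: 0                 # assumed: scale to zero when idle
    max_replicas: 1
  nodepools:
    - id: "test-nodepool"
      compute_cluster:
        id: "test-compute-cluster"
        user_id: "YOUR_USER_ID_HERE"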

Initialize the Deployment Class

To initialize the Deployment class, provide the user_id and deployment_id parameters.

from clarifai.client.deployment import Deployment

# Initialize the deployment
deployment = Deployment(
    user_id="YOUR_USER_ID_HERE",
    deployment_id="test-deployment",
    base_url="https://api.clarifai.com"
)
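
When resources are no longer needed, they can be removed from the bottom up — deployments first, then nodepools, then the cluster. A sketch using the objects initialized earlier (client, compute_cluster, nodepool) and assuming the plural delete methods available in recent SDK versions:

# Tear down in reverse order of creation
# (method names assumed; verify against your SDK version)
nodepool.delete_deployments(deployment_ids=["test-deployment"])
compute_cluster.delete_nodepools(nodepool_ids=["test-nodepool"])
client.delete_compute_clusters(compute_cluster_ids=["test-compute-cluster"])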

Predict With Deployed Model

Once your model is deployed, it can be used to make predictions by calling the appropriate prediction methods. Clarifai's Compute Orchestration system offers different types of prediction calls to suit various use cases.

important

To ensure proper routing and execution, you must specify the deployment_id parameter, which directs each prediction request to the appropriate cluster and nodepool. For example, you can assign one deployment ID to route requests to a GCP cluster, another to an AWS cluster, and a third to an on-premises deployment. This matters for performance optimization, scalability, and load balancing.
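
If you maintain separate deployments per environment, a simple lookup keeps the routing explicit. The deployment IDs below are hypothetical:

# Hypothetical deployment IDs, one per target environment
DEPLOYMENTS = {
    "gcp": "test-deployment-gcp",
    "aws": "test-deployment-aws",
    "on-prem": "test-deployment-onprem",
}

# Route the request by selecting the matching deployment ID
deployment_id = DEPLOYMENTS["aws"]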

tip

The following examples illustrate how to make predictions with inputs provided as publicly accessible URLs. Click here to learn how to make predictions using other types of inputs and models.

Unary-Unary Predict Call

This is the simplest type of prediction. In this method, a single input is sent to the model, and it returns a single response. This is ideal for tasks where a quick, non-streaming prediction is required, such as classifying an image.

It supports the following prediction methods:

  • predict_by_url — Use a publicly accessible URL for the input.
  • predict_by_bytes — Pass raw input data directly.
  • predict_by_filepath — Provide the local file path for the input.

from clarifai.client.model import Model

model_url = "https://clarifai.com/stepfun-ai/ocr/models/got-ocr-2_0"

# URL of the image to analyze
image_url = "https://samples.clarifai.com/featured-models/model-ocr-scene-text-las-vegas-sign.png"

# Initialize the model
model = Model(
    url=model_url,
    pat="YOUR_PAT_HERE"
)

# Make a prediction using the model with the specified compute cluster and nodepool
model_prediction = model.predict_by_url(
    image_url,
    input_type="image",
    deployment_id="test-deployment"
)

# Print the output
print(model_prediction.outputs[0].data.text.raw)

Unary-Stream Predict Call

The Unary-Stream predict call processes a single input but returns a stream of responses. It is particularly useful for tasks where multiple outputs are generated from a single input, such as generating text completions from a prompt.

It supports the following prediction methods:

  • generate_by_url — Provide a publicly accessible URL and handle the streamed responses iteratively.
  • generate_by_bytes — Use raw input data.
  • generate_by_filepath — Use a local file path for the input.

from clarifai.client.model import Model

model_url = "https://clarifai.com/meta/Llama-3/models/llama-3_2-3b-instruct"

# URL of the prompt text
text_url = "https://samples.clarifai.com/featured-models/falcon-instruction-guidance.txt"

# Initialize the model
model = Model(
    url=model_url,
    pat="YOUR_PAT_HERE"
)

# Perform unary-stream prediction with the specified compute cluster and nodepool
stream_response = model.generate_by_url(
    text_url,
    input_type="text",
    deployment_id="test-deployment"
)

# Handle the stream of responses
list_stream_response = [response for response in stream_response]
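
Collecting every chunk into a list works for short outputs; for long generations you would typically process chunks as they arrive instead. A sketch, assuming each streamed response carries the same outputs[...].data.text.raw structure as the unary example (note that a stream can only be consumed once, so use either the list comprehension above or this loop):

# Alternatively, consume the stream incrementally as chunks arrive
for response in stream_response:
    print(response.outputs[0].data.text.raw, end="")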

Stream-Stream Predict Call

The stream-stream predict call enables bidirectional streaming of both inputs and outputs, making it highly effective for processing large datasets or real-time applications.

In this setup, multiple inputs can be continuously sent to the model, and the corresponding predictions are streamed back in real time. This is ideal for tasks like real-time video processing or live sensor data analysis.

It supports the following prediction methods:

  • stream_by_url — Pass an iterator of publicly accessible URLs and receive a stream of predictions back.
  • stream_by_bytes — Stream raw input data.
  • stream_by_filepath — Stream inputs from local file paths.

from clarifai.client.model import Model

model_url = "https://clarifai.com/meta/Llama-3/models/llama-3_2-3b-instruct"

# URL of the prompt text
text_url = "https://samples.clarifai.com/featured-models/falcon-instruction-guidance.txt"

# Initialize the model
model = Model(
    url=model_url,
    pat="YOUR_PAT_HERE"
)

# Perform stream-stream prediction with the specified compute cluster and nodepool
stream_response = model.stream_by_url(
    iter([text_url]),
    input_type="text",
    deployment_id="test-deployment"
)

# Handle the stream of responses
list_stream_response = [response for response in stream_response]
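
Because stream_by_url takes an iterator, you can feed it several inputs and handle the predictions as they stream back. A sketch with hypothetical placeholder URLs:

# Stream several inputs through the model (URLs are hypothetical placeholders)
text_urls = [
    "https://example.com/prompt-1.txt",
    "https://example.com/prompt-2.txt",
]

stream_response = model.stream_by_url(
    iter(text_urls),
    input_type="text",
    deployment_id="test-deployment"
)

# Print each prediction as it arrives
for response in stream_response:
    print(response.outputs[0].data.text.raw)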