Model Inference

Perform predictions with models


Clarifai’s Compute Orchestration capabilities enable you to make prediction calls efficiently, tailored to a variety of use cases. You can use these features to run inference seamlessly and get results from your models with ease.

Predict With Compute Orchestration

To make a prediction request using our Compute Orchestration capabilities, you'll need to set up a cluster, create a nodepool, and deploy your model to it.

Once deployed, you’ll need to reference the model’s deployment_id — or alternatively, specify both the compute_cluster_id and nodepool_id. These parameters ensure that your prediction request is routed correctly to the intended compute resources.
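For instance, a prediction targeting a specific deployment with the Clarifai Python SDK might look like the following sketch. The model URL and deployment ID are placeholders, and this assumes the SDK's Model client accepts a deployment_id argument matching the parameter described above:

```python
# Minimal sketch using the Clarifai Python SDK.
# Assumes CLARIFAI_PAT is set in the environment for authentication;
# the model URL and deployment ID are placeholders, not real resources.
from clarifai.client.model import Model

model = Model(
    url="https://clarifai.com/your-user-id/your-app-id/models/your-model-id",
    deployment_id="your-deployment-id",  # routes the request to this deployment's nodepool
)

# Call a prediction method exposed by the model (see "Structure of
# Prediction Methods" below for how these methods are defined).
response = model.predict(prompt="What is the future of AI?")
print(response)
```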

For example, you can route requests to a GCP cluster by selecting a corresponding deployment ID, use a different deployment ID for an AWS cluster, and yet another for an on-premises deployment.

This gives you full control over performance, costs, and security, allowing you to focus on building cutting-edge AI solutions while we handle the infrastructure complexity.

Here's how prediction requests are routed:

  • Prioritized deployment routing — If you specify a deployment ID in the prediction request, Clarifai will route it directly to the associated nodepool.
  • Owner-defined default routing — If you do not specify a deployment ID, but the model owner has pre-configured a deployment for that model version, the request will be routed to the nodepool specified in that deployment.
  • Fallback to shared routing — If neither condition above is met and the model is uploaded and owned by Clarifai, the request defaults to the most cost-effective Clarifai shared compute nodepool.

Note: Shared compute is not available for user-uploaded models — you must set up a deployment when uploading your own models.
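To illustrate the routing options above, here are hedged sketches. All IDs and URLs are placeholders, and this assumes compute_cluster_id and nodepool_id are accepted alongside deployment_id, mirroring the parameters named earlier:

```python
from clarifai.client.model import Model

# Prioritized deployment routing: pass a deployment ID explicitly.
deployed = Model(
    url="https://clarifai.com/your-user-id/your-app-id/models/your-model-id",
    deployment_id="your-gcp-deployment-id",
)

# Explicit cluster/nodepool routing: name the compute resources directly.
pinned = Model(
    url="https://clarifai.com/your-user-id/your-app-id/models/your-model-id",
    compute_cluster_id="your-compute-cluster-id",
    nodepool_id="your-nodepool-id",
)

# Fallback to shared routing: for a Clarifai-owned model, omitting all
# routing parameters uses an owner-configured deployment if one exists,
# otherwise Clarifai's most cost-effective shared compute nodepool.
shared = Model(url="https://clarifai.com/clarifai/main/models/some-clarifai-owned-model")
```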

Structure of Prediction Methods

Supported Input and Output Data Types

Click here to explore the wide range of input and output data types supported by Clarifai’s model framework. You'll also find client-side examples that show how to work with these rich data formats effectively.

Clarifai models are primarily built from three files: model.py, requirements.txt, and config.yaml. As described here, the core prediction logic resides in model.py, which defines how your model processes inputs and generates outputs.

When making predictions on the client side, the structure of the prediction methods directly reflects the method signatures defined in the model.py file. This one-to-one mapping allows you to make custom predictions with flexible naming and argument structures, giving you full control over how you invoke models.

Here are some examples of this method mapping approach:

| model.py Model Implementation | Client-Side Usage Pattern |
| --- | --- |
| @ModelClass.method def predict(...) | model.predict(...) |
| @ModelClass.method def generate(...) | model.generate(...) |
| @ModelClass.method def stream(...) | model.stream(...) |

This design provides flexibility in how you make model predictions. For example, a method could be defined as @ModelClass.method def analyze_video(...) in model.py, and you can then call it on the client side using model.analyze_video(...).
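As a concrete sketch, the analyze_video example might be defined and called like this. The class name, method body, and parameters are illustrative rather than a complete model implementation, and the server side assumes the ModelClass base from the Clarifai runners framework:

```python
# model.py (server side): a hypothetical custom prediction method.
from clarifai.runners.models.model_class import ModelClass

class VideoAnalyzer(ModelClass):
    @ModelClass.method
    def analyze_video(self, video_url: str, sample_rate: int = 1) -> str:
        # Real prediction logic would go here; this stub just echoes its inputs.
        return f"analyzed {video_url} at sample rate {sample_rate}"
```

On the client side, the call mirrors that signature name for name:

```python
# Client side: the model URL is a placeholder.
from clarifai.client.model import Model

model = Model(url="https://clarifai.com/your-user-id/your-app-id/models/video-analyzer")
result = model.analyze_video(video_url="https://example.com/clip.mp4", sample_rate=2)
print(result)
```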

Here are some key characteristics of this design:

  • Method names must match exactly between model.py and client usage.

  • Parameters retain the same names and types as defined in the method.

  • Return types follow the structure defined by the model's outputs.
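For generate-style methods, the same mapping applies, but the output arrives incrementally. A hedged sketch, assuming the client exposes generate as an iterator of partial results matching a generate method defined in model.py (the model URL and prompt parameter are illustrative):

```python
from clarifai.client.model import Model

model = Model(url="https://clarifai.com/your-user-id/your-app-id/models/your-model-id")

# generate() yields partial outputs as they are produced,
# so the client consumes the response as an iterator.
for chunk in model.generate(prompt="Tell me a story"):
    print(chunk, end="", flush=True)
```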