Test Models Locally
Learn how to test your custom models locally
Before uploading a custom model to the Clarifai platform, always test and debug it locally. Doing so ensures smooth performance, verifies dependency compatibility, and streamlines the deployment process.
This step helps you catch problems such as setup file errors, typos, code misconfigurations, and incorrect model implementations early, saving you time and preventing upload failures by validating the model's behavior on the hardware you plan to deploy to.
You should ensure your local environment has sufficient memory and compute resources to handle model loading and execution during the testing process.
Prerequisites
Build a Model
You can either build a custom model from scratch or leverage pre-trained models from external repositories like Hugging Face.
If you're developing your own model, our step-by-step guide provides detailed instructions to get started. You can also explore this examples repository to learn how to build models compatible with the Clarifai platform.
Install Clarifai CLI
Install the latest version of the Clarifai CLI tool. We'll use this tool to test models in the local development environment.
- Bash
pip install --upgrade clarifai
System Requirements
Before running the test commands below, ensure your local environment meets the following requirements:
- Python version — Python 3.11 or 3.12 is required.
- GPU support — For models that require GPU acceleration, your environment must have NVIDIA GPU support installed and properly configured. CPU-only models can still be tested without a GPU.
- Docker (optional) — Docker is recommended for container-based testing (--mode container), but it is not mandatory. Without Docker, you can use --mode env to test in a virtual environment.
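As a quick sanity check before testing, you can verify the Python requirement from the interpreter itself (a minimal sketch; the GPU and Docker checks are easiest to run on the command line with nvidia-smi and docker --version):

```python
import sys

# The Clarifai local runner requires Python 3.11 or 3.12.
version = sys.version_info[:2]
if version in ((3, 11), (3, 12)):
    print("Python %d.%d is supported" % version)
else:
    print("Python %d.%d is not supported; use 3.11 or 3.12" % version)
```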
Test With the serve Command
The clarifai model serve command is the primary way to test your model locally. It has two modes:
- API-connected mode (default) — Connects to the Clarifai platform and exposes your model via a public URL, just like a cloud deployment.
- Standalone gRPC mode (--grpc) — Runs your model as a local gRPC server with no Clarifai connection needed. Ideal for offline development.
Option A: API-Connected Mode
This mode connects your local model to the Clarifai platform, giving you a public URL you can use to test predictions through the API or the AI Playground.
- Bash
clarifai model serve
Note: You must be logged in (clarifai login) to use API-connected mode.
You can specify additional options:
| Flag | Description | Default |
|---|---|---|
| --mode | How to run: none (current env), env (virtual env), or container (Docker) | none |
| --port | Port for the local server | 8000 |
| --concurrency | Number of concurrent requests to handle | 1 |
| --keep-image | Keep the Docker image after stopping (for container mode) | false |
| -v, --verbose | Show detailed SDK and server logs | false |
You can also specify the path to the model directory. If omitted, the current directory is used:
- Bash
clarifai model serve ./path/to/my-model --mode env
Option B: Standalone gRPC Mode (Offline)
Use the --grpc flag to run the model as a standalone gRPC server without any Clarifai connection. This is ideal for offline development — no PAT or login required.
- Bash
clarifai model serve --grpc --port 8000
Once the server is running, set the CLARIFAI_API_BASE environment variable to point to it:
- Unix-Like Systems
export CLARIFAI_API_BASE="localhost:8000"
- Windows
set CLARIFAI_API_BASE=localhost:8000
You can then make inference requests using the Clarifai Python SDK.
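If you prefer to configure this from your test script rather than the shell, you can set the variable in Python before constructing any SDK clients (a sketch; the address must match the --port value you passed to clarifai model serve --grpc):

```python
import os

# Point the Clarifai SDK at the local gRPC server instead of the cloud API.
# "localhost:8000" assumes the default --port 8000.
os.environ["CLARIFAI_API_BASE"] = "localhost:8000"
print(os.environ["CLARIFAI_API_BASE"])
```

Note that environment variables are read when a client is created, so set this before any SDK objects are instantiated.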
Implement a test Method
To enable quick validation, implement a test method in your model.py file. This method should internally call your model's other methods and verify the output.
Below is a sample model.py file with an example implementation of the test method:
- Python
from typing import Iterator

from clarifai.runners.models.model_class import ModelClass


class MyModel(ModelClass):
    """A custom runner that adds "Hello World" to the end of the text."""

    def load_model(self):
        """Load the model here."""

    @ModelClass.method
    def predict(self, text1: str = "") -> str:
        output_text = text1 + "Hello World"
        return output_text

    @ModelClass.method
    def generate(self, text1: str = "") -> Iterator[str]:
        """Example yielding a whole batch of streamed stuff back."""
        for i in range(10):  # Fake generation: iterate 10 times.
            output_text = text1 + f"Generate Hello World {i}"
            yield output_text

    def test(self):
        # Verify predict() appends "Hello World" to the input.
        res = self.predict("test")
        assert res == "testHello World"

        # Verify generate() yields the expected numbered outputs.
        res = self.generate("test")
        for i, r in enumerate(res):
            assert r == f"testGenerate Hello World {i}"
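Stripped of the ModelClass wrapper, the checks that test performs reduce to plain string assertions. This standalone sketch (hypothetical predict and generate functions mirroring the model above, with no Clarifai dependency) shows exactly what a passing run verifies:

```python
from typing import Iterator

def predict(text1: str = "") -> str:
    # Mirrors MyModel.predict: append "Hello World" to the input.
    return text1 + "Hello World"

def generate(text1: str = "") -> Iterator[str]:
    # Mirrors MyModel.generate: yield ten numbered outputs.
    for i in range(10):
        yield text1 + f"Generate Hello World {i}"

# The same assertions test() makes:
assert predict("test") == "testHello World"
for i, r in enumerate(generate("test")):
    assert r == f"testGenerate Hello World {i}"
print("all checks passed")
```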