OpenAI

Run inference on Clarifai models using OpenAI


You can run inference on Clarifai-hosted models using the OpenAI client library by leveraging Clarifai's OpenAI-compatible API endpoint.

This lets you use the same code and tools you would use with OpenAI, in either Python or JavaScript, simply by configuring the client to point to Clarifai and providing your PAT (Personal Access Token).
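
In practice, the only Clarifai-specific configuration is the client constructor: point base_url at Clarifai's OpenAI-compatible endpoint and pass your PAT as the API key. Here is a minimal sketch (the prerequisites below cover the details):

import os
from openai import OpenAI

# Point the standard OpenAI client at Clarifai's OpenAI-compatible endpoint
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",
    api_key=os.environ["CLARIFAI_PAT"],  # Your Clarifai Personal Access Token (PAT)
)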

Prerequisites

Install OpenAI Package

Install the openai package.

pip install openai

Get a PAT Key

You need a PAT key to authenticate your connection to the Clarifai platform. You can generate the PAT key in your personal settings page by navigating to the Security section.

You can then set the PAT as an environment variable using CLARIFAI_PAT:

export CLARIFAI_PAT=YOUR_PERSONAL_ACCESS_TOKEN_HERE
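
Before running the examples, you can verify that the variable is visible to Python; a quick sanity check:

import os

# Fail early with a clear error if the PAT was not exported
if not os.environ.get("CLARIFAI_PAT"):
    raise RuntimeError("CLARIFAI_PAT is not set; export it before running the examples.")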

Get a Clarifai Model

Go to the Clarifai Community platform and select the model you want to use for making predictions.

Some Clarifai models that support the OpenAI-compatible endpoint:
https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-R1-0528-Qwen3-8B
https://clarifai.com/meta/Llama-3/models/Llama-3_2-3B-Instruct
https://clarifai.com/anthropic/completion/models/claude-sonnet-4
https://clarifai.com/qwen/qwenLM/models/Qwen3-14B
https://clarifai.com/mistralai/completion/models/Devstral-Small-2505_gguf-4bit
https://clarifai.com/clarifai/main/models/general-image-recognition
https://clarifai.com/xai/chat-completion/models/grok-3
https://clarifai.com/openai/chat-completion/models/gpt-4o
https://clarifai.com/openai/chat-completion/models/gpt-4_1
https://clarifai.com/gcp/generate/models/gemini-2_5-flash
https://clarifai.com/anthropic/completion/models/claude-3_5-haiku
https://clarifai.com/qwen/qwenLM/models/Qwen3-30B-A3B-GGUF
https://clarifai.com/gcp/generate/models/gemini-2_0-flash
https://clarifai.com/gcp/generate/models/gemma-3-12b-it
https://clarifai.com/microsoft/text-generation/models/Phi-4-reasoning-plus
https://clarifai.com/openbmb/miniCPM/models/MiniCPM3-4B
https://clarifai.com/microsoft/text-generation/models/phi-4-mini-instruct
https://clarifai.com/qwen/qwen-VL/models/Qwen2_5-VL-7B-Instruct
https://clarifai.com/microsoft/text-generation/models/phi-4
https://clarifai.com/xai/chat-completion/models/grok-2-vision-1212
https://clarifai.com/xai/image-generation/models/grok-2-image-1212
https://clarifai.com/xai/chat-completion/models/grok-2-1212
https://clarifai.com/qwen/qwenLM/models/QwQ-32B-AWQ
https://clarifai.com/gcp/generate/models/gemini-2_0-flash-lite
https://clarifai.com/anthropic/completion/models/claude-opus-4
https://clarifai.com/openai/chat-completion/models/o4-mini
https://clarifai.com/openai/chat-completion/models/o3
https://clarifai.com/openbmb/miniCPM/models/MiniCPM-o-2_6-language
https://clarifai.com/deepseek-ai/deepseek-chat/models/DeepSeek-R1-Distill-Qwen-7B
https://clarifai.com/qwen/qwenCoder/models/Qwen2_5-Coder-7B-Instruct
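
As the code comments in the examples below show, a model can be referenced either by its full Clarifai community URL or by its shorter user_id/app_id/models/model_id path; both forms identify the same model. For instance:

# Both of these identify the same Clarifai-hosted model:
MODEL_URL = "https://clarifai.com/anthropic/completion/models/claude-sonnet-4"
MODEL_NAME = "anthropic/completion/models/claude-sonnet-4"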

Chat Completions

The OpenAI Chat Completions API endpoint enables you to generate a model response by providing a list of messages that constitute a conversation.

import os
from openai import OpenAI

# Initialize the OpenAI client, pointing to Clarifai's API
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",  # Clarifai's OpenAI-compatible API endpoint
    api_key=os.environ["CLARIFAI_PAT"]  # Ensure CLARIFAI_PAT is set as an environment variable
)

# Make a chat completion request to a Clarifai-hosted model
response = client.chat.completions.create(
    model="https://clarifai.com/anthropic/completion/models/claude-sonnet-4",
    # model="anthropic/completion/models/claude-sonnet-4",  # Or, provide the Clarifai model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"}
    ],
    # You can also add other OpenAI-compatible parameters, such as max_completion_tokens
    max_completion_tokens=100,  # Limits the response length
    temperature=0.7,  # Controls randomness of the output
)

# Print the model's response
print(response.choices[0].message.content)
Example Output
I'm Claude, an AI assistant created by Anthropic. I'm here to help with a wide variety of tasks like answering questions, helping with analysis and research, creative writing, math and coding problems, and having conversations. Is there something specific I can help you with today?

Streaming

Clarifai supports streaming: the response is streamed back token by token, rather than waiting for the entire completion to be generated before returning.

import os
from openai import OpenAI

# Initialize the OpenAI client, pointing to Clarifai's API
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",  # Clarifai's OpenAI-compatible API endpoint
    api_key=os.environ["CLARIFAI_PAT"]  # Ensure CLARIFAI_PAT is set as an environment variable
)

# Make a streaming chat completion request to a Clarifai-hosted model
response = client.chat.completions.create(
    model="https://clarifai.com/anthropic/completion/models/claude-sonnet-4",
    # model="anthropic/completion/models/claude-sonnet-4",  # Or, provide the Clarifai model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"}
    ],
    # You can also add other OpenAI-compatible parameters, such as max_completion_tokens
    max_completion_tokens=100,  # Limits the response length
    temperature=0.7,  # Controls randomness of the output
    stream=True  # Enables streaming the response token by token
)

print("Assistant's Response:")
for chunk in response:
    # Safely check that choices, delta, and content exist before accessing them
    if chunk.choices and chunk.choices[0].delta and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end='')
print("\n")
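If you also need the complete text once streaming finishes (for logging or further processing), you can accumulate the deltas as they arrive. A minimal variant of the loop above:

# Accumulate streamed deltas into one string while printing incrementally
full_text = []
for chunk in response:
    if chunk.choices and chunk.choices[0].delta and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end='')
        full_text.append(chunk.choices[0].delta.content)

complete_response = "".join(full_text)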

Tool Calling

Tool calling (also known as function calling) enables LLMs to autonomously decide when and how to invoke external tools — such as APIs or custom functions — based on user input.

Here is example code that sets up a basic tool-calling interaction. It simulates a weather API and shows how the LLM would "call" that tool when asked about the weather.

import os
from openai import OpenAI

# Initialize the OpenAI-compatible client for Clarifai
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",
    api_key=os.environ["CLARIFAI_PAT"]  # Ensure CLARIFAI_PAT is set as an environment variable
)

# Define the external tools (functions) that the LLM can call.
# In this example, it's a 'get_weather' function.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Returns the current temperature for a given location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and country, e.g., 'Bogotá, Colombia'"
                    }
                },
                "required": ["location"],
                "additionalProperties": False  # Ensures no extra parameters are passed
            }
        }
    }
]

# Create a chat completion request with tool-calling enabled
response = client.chat.completions.create(
    model="https://clarifai.com/anthropic/completion/models/claude-sonnet-4",
    # model="anthropic/completion/models/claude-sonnet-4",  # Or, provide the Clarifai model name
    messages=[
        {"role": "user", "content": "What is the weather like in New York today?"}
    ],
    tools=tools,
    tool_choice="auto"  # Let the LLM decide if it needs to use a tool
)

# Print the tool call proposed by the model, if any
tool_calls = response.choices[0].message.tool_calls
print("Tool calls:", tool_calls)
Tool Calling Implementation Example
import os
import json
from openai import OpenAI

# Initialize the OpenAI client, pointing to Clarifai's OpenAI-compatible API endpoint
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",
    api_key=os.environ["CLARIFAI_PAT"]  # Ensure CLARIFAI_PAT is set as an environment variable
)

# Define the external tools (functions) that the LLM can call.
# In this example, it's a 'get_weather' function.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country, e.g., 'Bogotá, Colombia'"
                }
            },
            "required": ["location"],
            "additionalProperties": False  # Ensures no extra parameters are passed
        },
        "strict": True  # Enforces strict adherence to the parameter schema
    }
}]

## Simulate Tool Execution (for demonstration)

# This function simulates calling an external weather API.
# In a real application, this would make an actual API request.
def get_weather(location: str):
    """Simulates fetching weather for a given location."""
    # Placeholder data for demonstration
    if "New York" in location:
        return {"location": "New York", "temperature": "20°C", "conditions": "Partly cloudy"}
    elif "London" in location:
        return {"location": "London", "temperature": "15°C", "conditions": "Rainy"}
    else:
        return {"location": location, "temperature": "N/A", "conditions": "Unknown"}

## LLM Call with Tooling

# First API call: the LLM decides if a tool needs to be called.
print("--- Initial LLM Call (Tool Recommendation) ---")
first_response = client.chat.completions.create(
    model="anthropic/completion/models/claude-sonnet-4",  # Ensure this model supports tool calling on Clarifai's platform
    messages=[
        {"role": "user", "content": "What is the weather like in New York today?"}
    ],
    tools=tools,  # Provide the list of available tools
    tool_choice="auto",  # Let the LLM decide if it needs to use a tool
)

## Process the LLM's Response and Execute the Tool (if recommended)

# Check if the LLM decided to call a tool
if first_response.choices[0].message.tool_calls:
    tool_calls = first_response.choices[0].message.tool_calls
    print(f"\nLLM recommended tool calls: {tool_calls}")

    # Execute each recommended tool call
    available_functions = {
        "get_weather": get_weather,  # Map function name to actual Python function
    }

    messages = [
        {"role": "user", "content": "What is the weather like in New York today?"}
    ]
    messages.append(first_response.choices[0].message)  # Add the LLM's tool call suggestion to the conversation

    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_to_call = available_functions[function_name]
        function_args = json.loads(tool_call.function.arguments)

        # Call the actual Python function
        function_response = function_to_call(**function_args)
        print(f"\nExecuting tool: {function_name}({function_args}) -> {function_response}")

        # Add the tool's output to the conversation for the LLM to process
        messages.append(
            {
                "tool_call_id": tool_call.id,
                "role": "tool",
                "name": function_name,
                "content": json.dumps(function_response),
            }
        )

    ## Second LLM Call (Summarize Tool Output)

    # Now, send the tool's output back to the LLM to get a natural language response
    print("\n--- Second LLM Call (Summarizing Tool Output) ---")
    second_response = client.chat.completions.create(
        model="https://clarifai.com/anthropic/completion/models/claude-sonnet-4",
        # model="anthropic/completion/models/claude-sonnet-4",  # Or, provide the Clarifai model name
        messages=messages,  # Continue the conversation with the tool output
    )

    print("\nFinal Assistant's Response:")
    print(second_response.choices[0].message.content)

else:
    print("\nLLM did not recommend any tool calls.")
    print("Assistant's direct response:")
    print(first_response.choices[0].message.content)
Example Output
--- Initial LLM Call (Tool Recommendation) ---

LLM recommended tool calls: [ChatCompletionMessageToolCall(id='toolu_01Mhqb1c7ne4GPKWY9eZtgxd', function=Function(arguments='{"location": "New York, United States"}', name='get_weather'), type='function')]

Executing tool: get_weather({'location': 'New York, United States'}) -> {'location': 'New York', 'temperature': '20°C', 'conditions': 'Partly cloudy'}

--- Second LLM Call (Summarizing Tool Output) ---

Final Assistant's Response:
The weather in New York today is:
- **Temperature:** 20°C (68°F)
- **Conditions:** Partly cloudy

It's a pleasant day with mild temperatures and partly cloudy skies!

Image Generation

Here is an example of how to generate an image using a model that supports Clarifai's OpenAI-compatible API endpoint.

import os
from openai import OpenAI

# Initialize the OpenAI client, pointing to Clarifai's OpenAI-compatible API endpoint
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",
    api_key=os.environ["CLARIFAI_PAT"],
)

# Make an image generation request to a Clarifai-hosted model
response = client.images.generate(
    model="https://clarifai.com/xai/image-generation/models/grok-2-image-1212",
    prompt="A cat in a tree",
)
print(response)
Example Output
ImagesResponse(created=None, data=[Image(b64_json=None, revised_prompt='A high-resolution photograph of a cat perched on a branch in a lush, green tree during the daytime. The cat, possibly a tabby, is the central focus of the image, looking slightly to the side with its fur naturally positioned. The background features a soft, slightly blurred forest setting with sunlight filtering through the leaves, creating a serene and natural environment. The composition avoids any distracting elements, ensuring the cat remains the primary subject in a peaceful outdoor scene.', url='https://imgen.x.ai/xai-imgen/xai-tmp-imgen-41202340-c0e1-4669-bed5-e70f7b491176.jpeg')], usage=None)
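
As the output shows, the response contains a hosted URL for the generated image (response.data[0].url) rather than inline bytes. A minimal sketch for saving the image locally, assuming the third-party requests package is installed:

import requests

# Download the generated image from the URL in the response and save it to disk
image_url = response.data[0].url
with open("generated_image.jpeg", "wb") as f:
    f.write(requests.get(image_url, timeout=30).content)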