
Advanced Inference Options

Learn how to perform advanced inference operations using the Clarifai Python SDK

The inference API includes features that provide more flexibility when running predictions on inputs, helping end users shape the outputs for their tasks.

Batch Predict

Efficiently process multiple inputs in a single request by leveraging the Predict API's batch prediction feature. This allows you to streamline the prediction process, saving time and resources. Simply submit a batch of inputs to the model, and receive comprehensive predictions in return.


The batch size should not exceed 128.

from clarifai.client.input import Inputs
from clarifai.client.model import Model

model_url = ""
image_url = ""

# Here is an example of creating an input proto list of size 128
proto_list = []
for i in range(0, 128):
    proto_list.append(Inputs.get_input_from_url(input_id=f"demo_{i}", image_url=image_url))

# Pass the input proto list as a parameter to the predict function
model_prediction = Model(url=model_url).predict(proto_list)

# Check the length of predictions to see if all inputs were passed successfully
print(len(model_prediction.outputs))

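For datasets larger than the 128-input cap, one approach is to split the input list into batches and call predict once per batch. Below is a minimal sketch; the `chunked` helper is hypothetical (not part of the Clarifai SDK), and the `model`/`proto_list` names in the commented loop are assumed to be set up as in the example above.

```python
def chunked(items, size=128):
    """Yield successive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Example: 300 inputs become batches of 128, 128, and 44.
batch_sizes = [len(batch) for batch in chunked(list(range(300)), 128)]
print(batch_sizes)  # [128, 128, 44]

# Sketch of applying it (requires a configured Model object and input protos):
# for batch in chunked(proto_list):
#     model_prediction = model.predict(batch)
```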

Different Base_URL

Use the flexibility of the Predict API to obtain model predictions by tailoring the base_url. Customize your endpoint to seamlessly integrate with different environments, ensuring a versatile and adaptable approach to accessing model predictions.


This feature is for Enterprise users with on-prem deployments, where base_url can be used to point to the respective servers.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings,
# since you're making inferences outside your app's scope.
# USER_ID = "cohere"
# APP_ID = "embed"

# You can set the model using the model URL or model ID.
# Change these to whatever model you want to use,
# e.g., MODEL_ID = 'cohere-embed-english-v3_0'
# You can also set a particular model version by specifying the version ID,
# e.g., MODEL_VERSION_ID = 'model_version'
# Model class objects can be initialized by providing the model URL, or by
# defining the respective user_id, app_id, and model_id,
# e.g., model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

input_text = """In India Green Revolution commenced in the early 1960s that led to an increase in food grain production, especially in Punjab, Haryana, and Uttar Pradesh. Major milestones in this undertaking were the development of high-yielding varieties of wheat. The Green revolution is revolutionary in character due to the introduction of new technology, new ideas, the new application of inputs like HYV seeds, fertilizers, irrigation water, pesticides, etc. As all these were brought suddenly and spread quickly to attain dramatic results thus it is termed as a revolution in green agriculture."""

# The predict API gives the flexibility to generate predictions for data
# provided through URL, filepath, and bytes format.

# Example for prediction through URL:
# model_prediction = model.predict_by_url(URL, input_type="text")

# Example for prediction through filepath:
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="text")

model_url = ""

# You can pass the new base_url as a parameter while initializing the Model object
model_prediction = Model(url=model_url, pat="YOUR_PAT", base_url="New Base URL").predict_by_bytes(
    input_text.encode(), "text"
)

embeddings = model_prediction.outputs[0].data.embeddings[0].vector

num_dimensions = model_prediction.outputs[0].data.embeddings[0].num_dimensions
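The embedding extracted above is a plain list of floats, so downstream computations such as comparing two texts need no SDK support. Here is a minimal sketch of cosine similarity between two embedding vectors; the vectors shown are hypothetical placeholders, not real model output.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Placeholder vectors; in practice these would be two `vector` fields
# from separate embedding predictions.
vec_a = [0.1, 0.3, 0.5]
vec_b = [0.1, 0.3, 0.5]
print(round(cosine_similarity(vec_a, vec_b), 4))  # 1.0
```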

Prediction Parameters

These parameters play a crucial role in configuring and customizing your prediction requests, ensuring accurate and tailored results for your specific use case. Understanding and appropriately setting them will enhance your experience and enable you to extract meaningful insights from the Clarifai platform. The parameters below allow users to modify the predictions received from the model according to their needs.

| Param Name | Description | Usage Example |
| --- | --- | --- |
| `temperature` | Governs the randomness, and thus the creativity, of responses from models such as OpenAI ChatGPT, GPT-3, and GPT-4. Always a number between 0 and 1. | `inference_params = dict(temperature=0.2)` `Model(model_url).predict(inputs, inference_params=inference_params)` |
| `max_tokens` | A parameter for GPT models: the maximum number of tokens the model may generate in its response. | `inference_params = dict(max_tokens=100)` `Model(model_url).predict(inputs, inference_params=inference_params)` |
| `min_value` | The minimum prediction confidence required for a result to be returned. | `output_config = dict(min_value=0.6)` `Model(model_url).predict(inputs, output_config=output_config)` |
| `max_concepts` | The maximum number of concepts to return. | `output_config = dict(max_concepts=3)` `Model(model_url).predict(inputs, output_config=output_config)` |
| `select_concepts` | The specific concepts to select. | `output_config = dict(select_concepts=["concept_name"])` `Model(model_url).predict(inputs, output_config=output_config)` |
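The two parameter groups can also be combined: `inference_params` entries may be passed together in one dict, as may `output_config` entries. A minimal sketch is below; the predict calls are commented out because they require a valid model URL, PAT, and input list (all placeholders here).

```python
# from clarifai.client.model import Model  # requires the clarifai SDK and a PAT

# Multiple inference parameters in a single dict (applies to LLM-style models):
inference_params = dict(temperature=0.2, max_tokens=100)

# Multiple output-config parameters together (applies to classification models):
output_config = dict(min_value=0.6, max_concepts=3)

# Sketch of the calls (MODEL_URL, YOUR_PAT, and inputs are placeholders):
# model = Model(url="MODEL_URL", pat="YOUR_PAT")
# model_prediction = model.predict(inputs, inference_params=inference_params)
# model_prediction = model.predict(inputs, output_config=output_config)
```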