Legacy Inference via API

Generate predictions using our older method


The legacy inference method uses our previous API structure and is best suited for models built with our older techniques.

While this method remains functional, we recommend transitioning to the new inference method for improved efficiency, scalability, and access to the latest features.

info

Before using the Python SDK, CLI, Node.js SDK, or any of our gRPC clients, ensure they are properly installed on your machine. Refer to their respective installation guides for instructions on how to install and initialize them.

Legacy Inference via Compute Orchestration

note

Before making a prediction, ensure that your model has been deployed, as mentioned previously. Otherwise, the prediction will default to the Clarifai Shared deployment type.

Unary-Unary Predict Call

This is the simplest type of prediction. In this method, a single input is sent to the model, and it returns a single response. This is ideal for tasks where a quick, non-streaming prediction is required, such as classifying an image.

It supports the following prediction methods:

  • predict_by_url — Use a publicly accessible URL for the input.
  • predict_by_bytes — Pass raw input data directly.
  • predict_by_filepath — Provide the local file path for the input.
##################################################################################################
# Change these strings to run your own example
##################################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = "YOUR_PAT_HERE"
USER_ID = "YOUR_USER_ID_HERE"
IMAGE_URL = "https://samples.clarifai.com/birds.jpg"
MODEL_URL = "https://clarifai.com/qwen/qwen-VL/models/Qwen2_5-VL-7B-Instruct"
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID_HERE"

##################################################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##################################################################################################

from clarifai.client.model import Model

# Initialize the model
model = Model(
    url=MODEL_URL,  # Or, use model_id="YOUR_MODEL_ID_HERE"
    pat=PAT
)

# Make a unary-unary prediction using the image URL
model_prediction = model.predict_by_url(
    IMAGE_URL,
    input_type="image",
    user_id=USER_ID,
    deployment_id=DEPLOYMENT_ID
)

# Output the model's response
print(model_prediction.outputs[0].data.text.raw)

##################################################################################################
# ADDITIONAL EXAMPLES
##################################################################################################

# Example prediction using a cluster and nodepool (no deployment ID needed):
# model_prediction = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").predict_by_url("INPUT_URL_HERE", input_type="image", user_id="YOUR_USER_ID_HERE", compute_cluster_id="YOUR_CLUSTER_ID_HERE", nodepool_id="YOUR_NODEPOOL_ID_HERE")

# Example prediction via bytes:
# model_prediction = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").predict_by_bytes("INPUT_TEXT_HERE".encode(), input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE")

# Example prediction via filepath:
# model_prediction = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").predict_by_filepath("INPUT_FILEPATH_HERE", input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE")
Example Output
2025-04-04 13:27:15.990657 INFO     Qwen2_5-VL-7B-Instruct model is still deploying, please wait...         model.py:443
2025-04-04 13:27:30.233847 INFO     Qwen2_5-VL-7B-Instruct model is still deploying, please wait...         model.py:443
2025-04-04 13:27:45.624827 INFO     Qwen2_5-VL-7B-Instruct model is still deploying, please wait...         model.py:443
2025-04-04 13:28:02.551081 INFO     Qwen2_5-VL-7B-Instruct model is still deploying, please wait...         model.py:443
This image captures three seagulls in flight over a body of water, likely a lake or river. The background is a natural setting with dry grass and trees, suggesting it might be late autumn or early spring. The seagulls appear to be gliding close to the water's surface, possibly searching for food. The lighting indicates it could be a sunny day. This scene is typical of coastal or lakeside environments where seagulls often congregate.

Unary-Stream Predict Call

The Unary-Stream predict call processes a single input, but returns a stream of responses. It is particularly useful for tasks where multiple outputs are generated from a single input, such as generating text completions from a prompt.

It supports the following prediction methods:

  • generate_by_url — Provide a publicly accessible URL and handle the streamed responses iteratively.
  • generate_by_bytes — Use raw input data.
  • generate_by_filepath — Use a local file path for the input.
##################################################################################################
# Change these strings to run your own example
##################################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = "YOUR_PAT_HERE"
USER_ID = "YOUR_USER_ID_HERE"
PROMPT = "What is the future of AI?"
MODEL_URL = "https://clarifai.com/meta/Llama-3/models/Llama-3_2-3B-Instruct"
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID_HERE"

##################################################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##################################################################################################

from clarifai.client.model import Model

# Initialize the model
model = Model(
    url=MODEL_URL,  # Or, use model_id="YOUR_MODEL_ID_HERE"
    pat=PAT
)

# Make a unary-stream prediction using the prompt as bytes
response_stream = model.generate_by_bytes(
    PROMPT.encode(),
    input_type="text",
    user_id=USER_ID,
    deployment_id=DEPLOYMENT_ID
)

# Iterate through streamed responses and print each text chunk inline
for response in response_stream:
    if response.outputs and response.outputs[0].data.text:
        print(response.outputs[0].data.text.raw, end="", flush=True)

# Print a newline at the end for better formatting
print()

##################################################################################################
# ADDITIONAL EXAMPLES
##################################################################################################

# Example stream prediction using a cluster and nodepool (no deployment ID needed):
# for response in Model(url=MODEL_URL, pat="YOUR_PAT_HERE").generate_by_bytes("YOUR_PROMPT_HERE".encode(), input_type="text", user_id="YOUR_USER_ID_HERE", compute_cluster_id="YOUR_CLUSTER_ID", nodepool_id="YOUR_NODEPOOL_ID"):
#     print(response.outputs[0].data.text.raw)

# Example unary-stream prediction via URL:
# for response in Model(url=MODEL_URL, pat="YOUR_PAT_HERE").generate_by_url("INPUT_URL_HERE", input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE"):
#     print(response.outputs[0].data.text.raw)

# Example unary-stream prediction via filepath:
# for response in Model(url=MODEL_URL, pat="YOUR_PAT_HERE").generate_by_filepath("INPUT_FILEPATH_HERE", input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE"):
#     print(response.outputs[0].data.text.raw)
Example Output
2025-04-04 15:10:09.952752 INFO     Llama-3_2-3B-Instruct model is still deploying, please wait...          model.py:726
2025-04-04 15:10:24.522422 INFO     Llama-3_2-3B-Instruct model is still deploying, please wait...          model.py:726
The future of Artificial Intelligence (AI) is vast and rapidly evolving. Based on current trends and advancements, here are some potential developments that may shape the future of AI:

**Short-term (2025-2030)**

1. **Increased Adoption**: AI will become more ubiquitous in various industries, including healthcare, finance, transportation, and education.
2. **Improved Natural Language Processing (NLP)**: NLP will continue to advance, enabling more accurate and effective human-computer interactions.
3. **Enhanced Machine Learning (ML)**: ML will become more sophisticated, allowing for more accurate predictions and decision-making.
4. **Rise of Explainable AI (XAI)**: XAI will become more prominent, enabling users to understand the reasoning behind AI decisions.

**Mid-term (2030-2040)**

1. **Artificial General Intelligence (AGI)**: AGI, which refers to AI systems that can perform any intellectual task, may emerge.
2. **Quantum AI**: Quantum computing will be integrated with AI, leading to exponential advancements in processing power and AI capabilities.
3. **Edge AI**: Edge AI will become more prevalent, enabling AI to be deployed at the edge of networks, reducing latency, and improving real-time decision-making.
4. **Human-AI Collaboration**: Humans and AI systems will collaborate more effectively, leading to increased productivity and innovation.

**Long-term (2040-2050)**

1. **Merging of Human and Machine Intelligence**: The line between human and machine intelligence will blur, leading to new forms of intelligence and cognition.
2. **Autonomous Systems**: Autonomous systems, such as self-driving cars and drones, will become more common, revolutionizing industries like transportation and logistics.
3. **Cognitive Architectures**: Cognitive architectures, which aim to create AI systems that can reason and learn like humans, will emerge.
4. **AI Ethics and Governance**: As AI becomes more pervasive, there will be a growing need for AI ethics and governance frameworks to ensure responsible AI development and deployment.

**Potential Risks and Challenges**

1. **Job Displacement**: AI may displace certain jobs, leading to significant social and economic impacts.
2. **Bias and Fairness**: AI systems may perpetuate existing biases and inequalities, highlighting the need for more diverse and inclusive AI development.
3. **Security and Safety**: AI systems may pose new security and safety risks, such as autonomous systems malfunctioning or being exploited.
4. **Value Alignment**: AI systems may not align with human
Stream-Stream Predict Call

The stream-stream predict call enables bidirectional streaming of both inputs and outputs, making it highly effective for processing large datasets or real-time applications.

In this setup, multiple inputs can be continuously sent to the model, and the corresponding predictions are streamed back in real time. This is ideal for tasks like real-time video processing or live sensor data analysis.

It supports the following prediction methods:

  • stream_by_url — Stream a list of publicly accessible URLs; it takes an iterator of inputs and returns a stream of predictions.
  • stream_by_bytes — Stream raw input data.
  • stream_by_filepath — Stream inputs from local file paths.
##################################################################################################
# Change these strings to run your own example
##################################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = "YOUR_PAT_HERE"
USER_ID = "YOUR_USER_ID_HERE"
PROMPTS = [
    "What is the future of AI?",
    "Explain quantum computing in simple terms.",
    "How does climate change affect global economies?"
]
MODEL_URL = "https://clarifai.com/meta/Llama-3/models/Llama-3_2-3B-Instruct"
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID_HERE"

##################################################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##################################################################################################

from clarifai.client.model import Model

# Initialize the model
model = Model(
    url=MODEL_URL,  # Or, use model_id="YOUR_MODEL_ID_HERE"
    pat=PAT
)

# Prepare the input iterator: each item is a bytes-encoded prompt
input_stream = (prompt.encode() for prompt in PROMPTS)

# Stream-stream prediction using bytes
response_stream = model.stream_by_bytes(
    input_stream,
    input_type="text",
    user_id=USER_ID,
    deployment_id=DEPLOYMENT_ID
)

# Iterate through streamed responses and print each text chunk inline
for response in response_stream:
    if response.outputs and response.outputs[0].data.text:
        print(response.outputs[0].data.text.raw, end="", flush=True)

# Print a newline at the end for better formatting
print()

##################################################################################################
# ADDITIONAL EXAMPLES
##################################################################################################

# Example stream prediction using a cluster and nodepool (no deployment ID needed):
# response_stream = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").stream_by_bytes((prompt.encode() for prompt in PROMPTS), input_type="text", user_id="YOUR_USER_ID_HERE", compute_cluster_id="YOUR_CLUSTER_ID", nodepool_id="YOUR_NODEPOOL_ID")
# for response in response_stream:
#     print(response.outputs[0].data.text.raw)

# Example stream prediction via URL:
# response_stream = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").stream_by_url(["INPUT_URL_1", "INPUT_URL_2"], input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE")
# for response in response_stream:
#     print(response.outputs[0].data.text.raw)

# Example stream prediction via filepath:
# response_stream = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").stream_by_filepath(["file1.txt", "file2.txt"], input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE")
# for response in response_stream:
#     print(response.outputs[0].data.text.raw)

Legacy Inference via Traditional Methods

Image as Input

tip

When you take an image with a digital device (such as a smartphone camera), meta-information such as the orientation value for how the camera was held is stored in the image's EXIF data. When you check the image in a photo viewer on your computer, the viewer respects that orientation value and automatically rotates the image to present it the way it was captured. This lets you see a correctly oriented image no matter how the camera was held.

So, when you want to make predictions from an image taken with a digital device, you need to strip the EXIF data from the image. Since the Clarifai platform does not account for EXIF data, removing it allows you to make accurate predictions using images in their desired rotation.
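One way to do this, as a minimal sketch assuming the Pillow library (any tool that rewrites the pixel data without metadata works equally well), is to apply the EXIF orientation to the pixels and then save a copy without the metadata:

from PIL import Image, ImageOps  # assumes Pillow is installed (pip install Pillow)

# Open the photo, bake the EXIF orientation into the pixel data,
# then save a copy; omitting the exif argument drops the metadata.
img = Image.open("photo_from_phone.jpg")  # hypothetical input path
img = ImageOps.exif_transpose(img)
img.save("photo_stripped.jpg")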

Visual Classifier

You can use a visual classifier model to categorize images into predefined classes based on their visual content. You can provide image data either through URLs or by uploading files.

Predict via URL
note

You can send up to 128 images in a single API call, with each image file sized under 20MB. Learn more here.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "clarifai"
#APP_ID = "main"

# You can set the model using a model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = "general-image-recognition"
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = "aa7f35c01e0642fda5cf400f543e7c40"
# Model class objects can be initialized by providing the model URL or by defining the respective user_id, app_id, and model_id
# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

model_url = "https://clarifai.com/clarifai/main/models/general-image-recognition"
image_url = "https://samples.clarifai.com/metro-north.jpg"

# The Predict API can generate predictions for data provided through a URL, filepath, or bytes

# Example for prediction through bytes:
# model_prediction = model.predict_by_bytes(input_bytes, input_type="image")

# Example for prediction through filepath:
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="image")

model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_url(
    image_url, input_type="image"
)

# Get the output
for concept in model_prediction.outputs[0].data.concepts:
    print(f"concept: {concept.name:<20} confidence: {round(concept.value, 3)}")
Output
concept: statue               confidence: 0.996
concept: sculpture            confidence: 0.994
concept: architecture         confidence: 0.991
concept: travel               confidence: 0.988
concept: no person            confidence: 0.981
concept: art                  confidence: 0.98
concept: sky                  confidence: 0.973
concept: monument             confidence: 0.968
concept: city                 confidence: 0.962
concept: liberty              confidence: 0.955
concept: fame                 confidence: 0.949
concept: sightseeing          confidence: 0.948
concept: ancient              confidence: 0.945
concept: old                  confidence: 0.942
concept: public square        confidence: 0.938
concept: popularity           confidence: 0.928
concept: torch                confidence: 0.908
concept: bronze               confidence: 0.891
concept: outdoors             confidence: 0.884
concept: marble               confidence: 0.868
Predict via Bytes

Below is an example of how you would send the bytes of an image and receive model predictions.

######################################################################################################
# In this section, we set the user authentication, user and app ID, model details, and the location
# of the image we want as an input. Change these strings to run your own example.
#####################################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = 'YOUR_PAT_HERE'
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
USER_ID = 'clarifai'
APP_ID = 'main'
# Change these to whatever model and image input you want to use
MODEL_ID = 'general-image-recognition'
MODEL_VERSION_ID = 'aa7f35c01e0642fda5cf400f543e7c40'
IMAGE_FILE_LOCATION = 'YOUR_IMAGE_FILE_LOCATION_HERE'

############################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
############################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (('authorization', 'Key ' + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

with open(IMAGE_FILE_LOCATION, "rb") as f:
    file_bytes = f.read()

post_model_outputs_response = stub.PostModelOutputs(
    service_pb2.PostModelOutputsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        model_id=MODEL_ID,
        version_id=MODEL_VERSION_ID,  # This is optional. Defaults to the latest model version
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(
                    image=resources_pb2.Image(
                        base64=file_bytes
                    )
                )
            )
        ]
    ),
    metadata=metadata
)
if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_model_outputs_response.status)
    raise Exception("Post model outputs failed, status: " + post_model_outputs_response.status.description)

# Since we have one input, one output will exist here
output = post_model_outputs_response.outputs[0]

print("Predicted concepts:")
for concept in output.data.concepts:
    print("%s %.2f" % (concept.name, concept.value))

# Uncomment this line to print the raw output
#print(output)

Predict Multiple Inputs

To predict multiple inputs at once and avoid the need for numerous API calls, you can use the following approach.

Note that these examples are provided for cURL and Python, but the same concept is applicable to any supported programming language.

curl -X POST "https://api.clarifai.com/v2/users/clarifai/apps/main/models/general-image-recognition/versions/aa7f35c01e0642fda5cf400f543e7c40/outputs" \
  -H "Authorization: Key YOUR_PAT_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [
      {
        "data": {
          "image": {
            "url": "https://samples.clarifai.com/metro-north.jpg"
          }
        }
      },
      {
        "data": {
          "image": {
            "url": "...any other valid image url..."
          }
        }
      }
    ]
  }'
# ... and so on: append further input objects to the "inputs" array
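For reference, here is a minimal Python sketch of the same batched request, reusing the gRPC client pattern from the Predict via Bytes example above; the second image URL is a placeholder for any valid image URL.

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)
metadata = (('authorization', 'Key ' + 'YOUR_PAT_HERE'),)
userDataObject = resources_pb2.UserAppIDSet(user_id='clarifai', app_id='main')

IMAGE_URLS = [
    "https://samples.clarifai.com/metro-north.jpg",
    "ANY_OTHER_VALID_IMAGE_URL_HERE",  # placeholder
]

# A single request carries several inputs; the response holds one output per input
post_model_outputs_response = stub.PostModelOutputs(
    service_pb2.PostModelOutputsRequest(
        user_app_id=userDataObject,
        model_id='general-image-recognition',
        version_id='aa7f35c01e0642fda5cf400f543e7c40',
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(image=resources_pb2.Image(url=url))
            )
            for url in IMAGE_URLS
        ]
    ),
    metadata=metadata
)
if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Post model outputs failed, status: " + post_model_outputs_response.status.description)

# Outputs come back in the same order as the inputs
for i, output in enumerate(post_model_outputs_response.outputs):
    top_concept = output.data.concepts[0]
    print("input %d: %s %.2f" % (i, top_concept.name, top_concept.value))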

Visual Detector - Image

Unlike image classification, which assigns a single label to an entire image, a visual detector model identifies and outlines multiple objects or regions within an image, associating each with specific classes or labels.

You can provide input images either through URLs or by uploading files.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "clarifai"
#APP_ID = "main"

# You can set the model using a model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = 'general-image-detection'
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = '1580bb1932594c93b7e2e04456af7c6f'

# Model class objects can be initialized by providing the model URL or by defining the respective user_id, app_id, and model_id
# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

DETECTION_IMAGE_URL = "https://s3.amazonaws.com/samples.clarifai.com/people_walking2.jpeg"
model_url = "https://clarifai.com/clarifai/main/models/general-image-detection"
detector_model = Model(
    url=model_url,
    pat="YOUR_PAT",
)

# The Predict API can generate predictions for data provided through a URL, filepath, or bytes

# Example for prediction through bytes:
# model_prediction = model.predict_by_bytes(input_bytes, input_type="image")

# Example for prediction through filepath:
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="image")

prediction_response = detector_model.predict_by_url(
    DETECTION_IMAGE_URL, input_type="image"
)

# Since we have one input, one output will exist here
regions = prediction_response.outputs[0].data.regions

for region in regions:
    # Accessing and rounding the bounding box values
    top_row = round(region.region_info.bounding_box.top_row, 3)
    left_col = round(region.region_info.bounding_box.left_col, 3)
    bottom_row = round(region.region_info.bounding_box.bottom_row, 3)
    right_col = round(region.region_info.bounding_box.right_col, 3)

    for concept in region.data.concepts:
        # Accessing and rounding the concept value
        name = concept.name
        value = round(concept.value, 4)

        print(f"{name}: {value} BBox: {top_row}, {left_col}, {bottom_row}, {right_col}")
Output
Footwear: 0.9618 BBox: 0.879, 0.305, 0.925, 0.327
Footwear: 0.9593 BBox: 0.882, 0.284, 0.922, 0.305
Footwear: 0.9571 BBox: 0.874, 0.401, 0.923, 0.418
Footwear: 0.9546 BBox: 0.87, 0.712, 0.916, 0.732
Footwear: 0.9518 BBox: 0.882, 0.605, 0.918, 0.623
Footwear: 0.95 BBox: 0.847, 0.587, 0.907, 0.604
Footwear: 0.9349 BBox: 0.878, 0.475, 0.917, 0.492
Tree: 0.9145 BBox: 0.009, 0.019, 0.451, 0.542
Footwear: 0.9127 BBox: 0.858, 0.393, 0.909, 0.407
Footwear: 0.8969 BBox: 0.812, 0.433, 0.844, 0.445
Footwear: 0.8747 BBox: 0.852, 0.49, 0.912, 0.506
Jeans: 0.8699 BBox: 0.511, 0.255, 0.917, 0.336
Footwear: 0.8203 BBox: 0.808, 0.453, 0.833, 0.465
Footwear: 0.8186 BBox: 0.8, 0.378, 0.834, 0.391
Jeans: 0.7921 BBox: 0.715, 0.273, 0.895, 0.326
Tree: 0.7851 BBox: 0.0, 0.512, 0.635, 0.998
Woman: 0.7693 BBox: 0.466, 0.36, 0.915, 0.449
Jeans: 0.7614 BBox: 0.567, 0.567, 0.901, 0.647
Footwear: 0.7287 BBox: 0.847, 0.494, 0.884, 0.51
Tree: 0.7216 BBox: 0.002, 0.005, 0.474, 0.14
Jeans: 0.7098 BBox: 0.493, 0.447, 0.914, 0.528
Footwear: 0.6929 BBox: 0.808, 0.424, 0.839, 0.437
Jeans: 0.6734 BBox: 0.728, 0.464, 0.887, 0.515
Woman: 0.6141 BBox: 0.464, 0.674, 0.922, 0.782
Human leg: 0.6032 BBox: 0.681, 0.577, 0.897, 0.634
...
Footwear: 0.3527 BBox: 0.844, 0.5, 0.875, 0.515
Footwear: 0.3395 BBox: 0.863, 0.396, 0.914, 0.413
Human hair: 0.3358 BBox: 0.443, 0.586, 0.505, 0.622
Tree: 0.3306 BBox: 0.6, 0.759, 0.805, 0.929

Visual Segmenter

You can use a segmentation model to generate segmentation masks by providing an image as input. This enables detailed analysis by identifying distinct regions within the image and associating them with specific concepts.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "clarifai"
#APP_ID = "main"

# You can set the model using a model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = 'image-general-segmentation'
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = '1581820110264581908ce024b12b4bfb'
# Model class objects can be initialized by providing the model URL or by defining the respective user_id, app_id, and model_id
# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

SEGMENT_IMAGE_URL = "https://s3.amazonaws.com/samples.clarifai.com/people_walking2.jpeg"

# The Predict API can generate predictions for data provided through a URL, filepath, or bytes

# Example for prediction through bytes:
# model_prediction = model.predict_by_bytes(input_bytes, input_type="image")

# Example for prediction through filepath:
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="image")

model_url = "https://clarifai.com/clarifai/main/models/image-general-segmentation"
segmentor_model = Model(
    url=model_url,
    pat="YOUR_PAT",
)

prediction_response = segmentor_model.predict_by_url(
    SEGMENT_IMAGE_URL, input_type="image"
)

regions = prediction_response.outputs[0].data.regions

for region in regions:
    for concept in region.data.concepts:
        # Accessing and rounding the concept's percentage of image covered
        name = concept.name
        value = round(concept.value, 4)
        print(f"{name}: {value}")
Output
tree: 0.4965
person: 0.151
house: 0.0872
pavement: 0.0694
bush: 0.0588
road: 0.0519
sky-other: 0.0401
grass: 0.0296
building-other: 0.0096
unlabeled: 0.0035
roof: 0.0017
teddy bear: 0.0006

Image-to-Text

You can use an image-to-text model to generate meaningful textual descriptions from images.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "salesforce"
#APP_ID = "blip"

# You can set the model using a model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = "general-english-image-caption-blip"
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = "cdb690f13e62470ea6723642044f95e4"
# Model class objects can be initialized by providing the model URL or by defining the respective user_id, app_id, and model_id
# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

model_url = (
    "https://clarifai.com/salesforce/blip/models/general-english-image-caption-blip"
)
image_url = "https://s3.amazonaws.com/samples.clarifai.com/featured-models/image-captioning-statue-of-liberty.jpeg"

# The Predict API also accepts data through a filepath or bytes.

# Example for predict by filepath:
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="image")

# Example for predict by bytes:
# model_prediction = Model(model_url).predict_by_bytes(image_bytes, input_type="image")

model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_url(
    image_url, input_type="image"
)

# Get the output
print(model_prediction.outputs[0].data.text.raw)
Output
a photograph of a statue of liberty in front of a blue sky

Image-to-Image

You can use an upscaling image-to-image model to improve the quality of an image.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "stability-ai"
#APP_ID = "Upscale"

# You can set the model using a model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = 'stabilityai-upscale'
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = 'model_version'
# Model class objects can be initialized by providing the model URL or by defining the respective user_id, app_id, and model_id
# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

inference_params = dict(width=1024)

# The Predict API can generate predictions for data provided through a URL, filepath, or bytes

# Example for prediction through bytes:
# model_prediction = model.predict_by_bytes(image_bytes, input_type="image")

# Example for prediction through filepath:
# model_prediction = Model(model_url).predict_by_filepath(image_filepath, input_type="image")

model_url = "https://clarifai.com/stability-ai/Upscale/models/stabilityai-upscale"
image_url = "https://s3.amazonaws.com/samples.clarifai.com/featured-models/image-captioning-statue-of-liberty.jpeg"

model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_url(
    image_url, input_type="image", inference_params=inference_params
)

# Get the output and write the upscaled image to disk
output_base64 = model_prediction.outputs[0].data.image.base64
image_info = model_prediction.outputs[0].data.image.image_info

with open("image.png", "wb") as f:
    f.write(output_base64)

Visual Embedder

You can use an embedding model to generate embeddings from an image. Image embeddings are vector representations that capture the semantic content of an image, providing a powerful foundation for applications like similarity search, recommendation systems, and more.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "clarifai"
#APP_ID = "main"

# You can set the model using a model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = 'image-embedder-clip'
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = 'model_version'
# Model class objects can be initialized by providing the model URL or by defining the respective user_id, app_id, and model_id
# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

image_url = "https://s3.amazonaws.com/samples.clarifai.com/featured-models/general-elephants.jpg"

# The Predict API can generate predictions for data provided through a URL, filepath, or bytes

# Example for prediction through bytes:
# model_prediction = Model(model_url).predict_by_bytes(input_bytes, input_type="image")

# Example for prediction through filepath:
# model_prediction = Model(model_url).predict_by_filepath(image_filepath, input_type="image")

model_url = "https://clarifai.com/clarifai/main/models/image-embedder-clip"

# Model Predict
model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_url(
    image_url, "image"
)

embeddings = model_prediction.outputs[0].data.embeddings[0].vector

num_dimensions = model_prediction.outputs[0].data.embeddings[0].num_dimensions

print(embeddings[:10])
Output
[-0.016209319233894348,
 -0.03517452999949455,
 0.0031261674594134092,
 0.03941042721271515,
 0.01166260801255703,
 -0.02489173412322998,
 0.04667072370648384,
 0.006998186931014061,
 0.05729646235704422,
 0.0077746850438416]
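As a small illustration of the similarity-search use case mentioned above (this helper is not part of the Clarifai SDK), two embedding vectors returned by separate predict calls can be compared with cosine similarity:

import math

def cosine_similarity(a, b):
    # dot(a, b) / (||a|| * ||b||); values near 1.0 mean the images are semantically similar
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# embeddings_a and embeddings_b would each come from a predict call, e.g.:
# embeddings_a = prediction_a.outputs[0].data.embeddings[0].vector
# embeddings_b = prediction_b.outputs[0].data.embeddings[0].vector
# print(cosine_similarity(embeddings_a, embeddings_b))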

Video as Input

Configure FPS

When processing a video input, the API returns a list of predicted concepts for each frame. By default, the video is analyzed at 1 frame per second (FPS), which corresponds to one prediction every 1000 milliseconds. This rate can be adjusted by setting the sample_ms parameter in your prediction request.

The sample_ms parameter defines the time interval, in milliseconds, between frames selected for inference. It must be a value between 100 and 60000.

It is calculated as: FPS = 1000 / sample_ms

For example, setting sample_ms to 1000 results in 1 FPS, which is the default rate.
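As a quick sketch of that arithmetic (the variable names here are illustrative), the output_config dictionary used by the SDK examples below would be built like this:

# Desired sampling rate in frames per second (illustrative value)
fps = 2

# sample_ms = 1000 / FPS; the result must stay within the allowed 100-60000 range
sample_ms = int(1000 / fps)  # 500 ms between sampled frames
assert 100 <= sample_ms <= 60000

# Passed to a predict call, e.g. model.predict_by_url(..., output_config=output_config)
output_config = {"sample_ms": sample_ms}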

Visual Detector - Video

You can use a visual detector model to get predictions for every frame when processing a video input. You can also fine-tune your requests by adjusting parameters, such as the number of frames processed per second, giving you greater control over the speed and depth of the analysis.

You can provide video inputs either through URLs or by uploading files.

note

When uploading via URL, videos must be no longer than 10 minutes in duration or 300MB in size. Learn more here.

Predict via URL

Below is an example of how you would send video URLs and receive predictions.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
USER_ID = "clarifai"
APP_ID = "main"
# Change these to whatever model and video URL you want to use
MODEL_ID = "general-image-detection"
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = '1580bb1932594c93b7e2e04456af7c6f'

VIDEO_URL = "https://samples.clarifai.com/beer.mp4"
# Change this to configure the FPS rate (if it's not configured, it defaults to 1 FPS)
# The number must range between 100 and 60000.
# FPS = 1000 / sample_ms
SAMPLE_MS = 2000

# Model class objects can be initialized by providing the model URL or by defining the respective user_id, app_id, and model_id
# eg: model = Model("https://clarifai.com/clarifai/main/models/general-image-detection")

model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID, pat="YOUR_PAT")
output_config = {"sample_ms": SAMPLE_MS}  # Run inference every 2 seconds
model_prediction = model.predict_by_url(
    VIDEO_URL, input_type="video", output_config=output_config
)

# The Predict API can generate predictions for data provided through a filepath, URL, or bytes

# Example for prediction through filepath:
# model_prediction = model.predict_by_filepath(video_file_path, input_type="video", output_config=output_config)

# Example for prediction through bytes:
# model_prediction = model.predict_by_bytes(input_video_bytes, input_type="video", output_config=output_config)

# Print the frame info and the first concept name in each frame
for frame in model_prediction.outputs[0].data.frames:
    print(f"Frame Info: {frame.frame_info} Concept: {frame.data.concepts[0].name}\n")
Output
Frame Info: time: 1000
Concept: beer

Frame Info: index: 1
time: 3000
Concept: beer

Frame Info: index: 2
time: 5000
Concept: beer

Frame Info: index: 3
time: 7000
Concept: beer

Frame Info: index: 4
time: 9000
Concept: beer
Predict via Bytes

Below is an example of how you would send the bytes of a video and receive predictions.

############################################################################################################
# In this section, we set the user authentication, user and app ID, model details, location of the video
# we want as an input, and sample_ms. Change these strings to run your own example.
###########################################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = 'YOUR_PAT_HERE'
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
USER_ID = 'clarifai'
APP_ID = 'main'
# Change these to whatever model and video input you want to use
MODEL_ID = 'general-image-recognition'
MODEL_VERSION_ID = 'aa7f35c01e0642fda5cf400f543e7c40'
VIDEO_FILE_LOCATION = 'YOUR_VIDEO_FILE_LOCATION_HERE'
# Change this to configure the FPS rate (If it's not configured, it defaults to 1 FPS)
SAMPLE_MS = 500

############################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
############################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (('authorization', 'Key ' + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

with open(VIDEO_FILE_LOCATION, "rb") as f:
    file_bytes = f.read()

post_model_outputs_response = stub.PostModelOutputs(
    service_pb2.PostModelOutputsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        model_id=MODEL_ID,
        version_id=MODEL_VERSION_ID,  # This is optional. Defaults to the latest model version
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(
                    video=resources_pb2.Video(
                        base64=file_bytes
                    )
                )
            )
        ],
        model=resources_pb2.Model(
            output_info=resources_pb2.OutputInfo(
                output_config=resources_pb2.OutputConfig(sample_ms=SAMPLE_MS)
            )
        ),
    ),
    metadata=metadata
)
if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_model_outputs_response.status)
    raise Exception("Post model outputs failed, status: " + post_model_outputs_response.status.description)

# Since we have one input, one output will exist here
output = post_model_outputs_response.outputs[0]

# A separate prediction is available for each "frame"
for frame in output.data.frames:
    print("Predicted concepts on frame " + str(frame.frame_info.time) + ":")
    for concept in frame.data.concepts:
        print("\t%s %.2f" % (concept.name, concept.value))

# Uncomment this line to print the raw output
#print(output)

Text as Input

Text Classifier

You can use a text classifier model to automatically categorize text into predefined categories based on its content.

You can provide the text data via URLs, file uploads, or by entering raw text directly.

note

The file size of each text input should be less than 20MB. Learn more here.

Predict via URL

Below is an example of how you would make predictions on passages of text. This example loads the input from a local file path; the commented-out line shows the equivalent call for text hosted on the web.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "nlptownres"
#APP_ID = "text-classification"

# Text sentiment analysis with 3 classes: positive, negative, neutral.
# You can set the model using a model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = 'sentiment-analysis-twitter-roberta-base'
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = 'aa7f35c01e0642fda5cf400f543e7c40'
# Model class objects can be initialized by providing the model URL or by defining the respective user_id, app_id, and model_id
# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

model_url = "https://clarifai.com/erfan/text-classification/models/sentiment-analysis-twitter-roberta-base"

# The Predict API can generate predictions for data provided through a URL, filepath, or bytes

# Example for prediction through bytes:
# model_prediction = model.predict_by_bytes(input_bytes, input_type="text")

# Example for prediction through URL:
# model_prediction = Model(model_url).predict_by_url(URL, input_type="text")

file_path = "datasets/upload/data/text_files/positive/0_9.txt"
model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_filepath(
    file_path, input_type="text"
)

# Get the output
for concept in model_prediction.outputs[0].data.concepts:
    print(f"concept: {concept.name:<20} confidence: {round(concept.value, 3)}")
Output
concept: LABEL_0              confidence: 0.605
concept: LABEL_1              confidence: 0.306
concept: LABEL_2              confidence: 0.089
Predict via Local Files

Below is an example of how you would provide text inputs via local text files and receive predictions.

#######################################################################################################
# In this section, we set the user authentication, user and app ID, model details, and the location
# of the text we want as an input. Change these strings to run your own example.
#######################################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = 'YOUR_PAT_HERE'
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
USER_ID = 'nlptownres'
APP_ID = 'text-classification'
# Change these to whatever model and text input you want to use
MODEL_ID = 'multilingual-uncased-sentiment'
MODEL_VERSION_ID = '29d5fef0229a4936a607380d7ef775dd'
TEXT_FILE_LOCATION = 'YOUR_TEXT_FILE_LOCATION_HERE'

############################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
############################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (('authorization', 'Key ' + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

with open(TEXT_FILE_LOCATION, "rb") as f:
    file_bytes = f.read()

post_model_outputs_response = stub.PostModelOutputs(
    service_pb2.PostModelOutputsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        model_id=MODEL_ID,
        version_id=MODEL_VERSION_ID,  # This is optional. Defaults to the latest model version
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(
                    text=resources_pb2.Text(
                        raw=file_bytes
                    )
                )
            )
        ]
    ),
    metadata=metadata
)
if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_model_outputs_response.status)
    raise Exception("Post model outputs failed, status: " + post_model_outputs_response.status.description)

# Since we have one input, one output will exist here.
output = post_model_outputs_response.outputs[0]

print("Predicted concepts:")
for concept in output.data.concepts:
    print("%s %.2f" % (concept.name, concept.value))

# Uncomment this line to print the raw output
#print(output)
Predict via Raw Text

Below is an example of how you would provide raw text inputs and receive predictions.

#########################################################################################################
# In this section, we set the user authentication, user and app ID, model details, and the raw text
# we want as an input. Change these strings to run your own example.
########################################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = 'YOUR_PAT_HERE'
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
USER_ID = 'nlptownres'
APP_ID = 'text-classification'
# Change these to whatever model and raw text you want to use
MODEL_ID = 'multilingual-uncased-sentiment'
MODEL_VERSION_ID = '29d5fef0229a4936a607380d7ef775dd'
RAW_TEXT = 'I love your product very much'

############################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
############################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (('authorization', 'Key ' + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

post_model_outputs_response = stub.PostModelOutputs(
    service_pb2.PostModelOutputsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        model_id=MODEL_ID,
        version_id=MODEL_VERSION_ID,  # This is optional. Defaults to the latest model version
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(
                    text=resources_pb2.Text(
                        raw=RAW_TEXT
                    )
                )
            )
        ]
    ),
    metadata=metadata
)
if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_model_outputs_response.status)
    raise Exception("Post model outputs failed, status: " + post_model_outputs_response.status.description)

# Since we have one input, one output will exist here
output = post_model_outputs_response.outputs[0]

print("Predicted concepts:")
for concept in output.data.concepts:
    print("%s %.2f" % (concept.name, concept.value))

# Uncomment this line to print the raw output
#print(output)

Text Generation Using LLMs

You can use text generation models to dynamically create textual content based on user-defined prompts.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
prompt = "What’s the future of AI?"
# You can set the model using model URL or model ID.
model_url = "https://clarifai.com/openai/chat-completion/models/GPT-4"

# The predict API gives the flexibility to generate predictions for data provided through URL, Filepath and Bytes format.

# Example for prediction through URL:
# model_prediction = Model(model_url).predict_by_url(url, input_type="text")

# Example for prediction through Filepath
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="text")


# Model Predict
model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_bytes(prompt.encode(), input_type="text")

print(model_prediction.outputs[0].data.text.raw)
Output
The future of AI is vast and holds immense potential. Here are a few possibilities:

1. Enhanced Personalization: AI will be able to understand and predict user preferences with increasing accuracy. This will allow for highly personalized experiences, from product recommendations to personalized healthcare.

2. Automation: AI will continue to automate routine tasks, freeing up time for individuals to focus on more complex problems. This could be in any field, from manufacturing to customer service.

3. Advanced Data Analysis: AI will be able to analyze and interpret large amounts of data more efficiently. This could lead to significant breakthroughs in fields like climate science, medicine, and economics.

4. AI in Healthcare: AI is expected to revolutionize healthcare, from predicting diseases before symptoms appear, to assisting in surgeries, to personalized treatment plans.

5. Improved AI Ethics: As AI becomes more integral to our lives, there will be an increased focus on ensuring it is used ethically and responsibly. This could lead to advancements in AI that are more transparent, fair, and accountable.

6. General AI: Perhaps the most exciting (and daunting) prospect is the development of Artificial General Intelligence (AGI) - AI systems that possess the ability to understand, learn, adapt, and implement knowledge across a wide array of tasks, much like a human brain.

Remember, while AI holds great promise, it's also important to consider the challenges and implications it brings, such as job displacement due to automation, privacy concerns, and ethical considerations.

Set Inference Parameters

When making predictions using LLMs on our platform, some models offer the ability to specify various inference parameters to influence their output.

These parameters control the behavior of the model during the generation process, affecting aspects like creativity, coherence, and the diversity of the generated text.

You can learn more about them here.

Note: You can also find various examples of how to set inference parameters throughout this guide.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section

prompt = "What’s the future of AI?"

# You can set inference parameters
prompt_template = '''<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>'''

system_prompt = "You're a helpful assistant"

inference_params = dict(temperature=0.7, max_tokens=200, top_k=50, top_p=0.95, prompt_template=prompt_template, system_prompt=system_prompt)

# You can set the model using a model URL or model ID.
model_url = "https://clarifai.com/meta/Llama-3/models/llama-3_1-8b-instruct"

# The Predict API can generate predictions for data provided through a URL, filepath, or bytes

# Example for prediction through URL:
# model_prediction = Model(model_url).predict_by_url(url, input_type="text", inference_params=inference_params)

# Example for prediction through filepath:
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="text", inference_params=inference_params)

# Model Predict
model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_bytes(prompt.encode(), input_type="text", inference_params=inference_params)

print(model_prediction.outputs[0].data.text.raw)

Text Classification Using LLMs

You can leverage LLMs to categorize text using carefully crafted prompts.

from clarifai.client.model import Model

model_url = "https://clarifai.com/openai/chat-completion/models/GPT-4"

prompt = """Classes: [`positive`, `negative`, `neutral`]
Text: Sunny weather makes me happy.

Classify the text into one of the above classes."""

# The predict API gives the flexibility to generate predictions for data provided through URL, Filepath and bytes format.

# Example for prediction through URL:
# model_prediction = Model(model_url).predict_by_url(url, input_type="text")

# Example for prediction through Filepath
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="text")

# Model Predict
model_prediction = Model(model_url, pat="YOUR_PAT").predict_by_bytes(prompt.encode(), input_type="text")

# Output
print(model_prediction.outputs[0].data.text.raw)
Output
`positive`

Text-to-Image

You can use a text-to-image model to transform textual input into vibrant and expressive images.

from clarifai.client.model import Model
import numpy as np
import cv2
import matplotlib.pyplot as plt

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "stability-ai"
#APP_ID = "stable-diffusion-2"

# You can set the model using a model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = 'stable-diffusion-xl'
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = '0c919cc1edfc455dbc96207753f178d7'
# Model class objects can be initialized by providing the model URL or by defining the respective user_id, app_id, and model_id
# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

input_text = b"floor plan for 2 bedroom kitchen house"

# The Predict API can generate predictions for data provided through a URL, filepath, or bytes

# Example for prediction through URL:
# model_prediction = Model(model_url).predict_by_url(url, input_type="text")

# Example for prediction through filepath:
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="text")

# Image generation using Stable Diffusion XL
model_url = "https://clarifai.com/stability-ai/stable-diffusion-2/models/stable-diffusion-xl"

model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_bytes(
    input_text, input_type="text"
)

# Decode the returned image bytes into a numpy array
im_b = model_prediction.outputs[0].data.image.base64
image_np = np.frombuffer(im_b, np.uint8)
img_np = cv2.imdecode(image_np, cv2.IMREAD_COLOR)

# Display the image (convert BGR to RGB)
plt.axis("off")
plt.imshow(img_np[..., ::-1])
plt.show()
Output

(The generated image is displayed here.)
Text-to-Audio

You can use a text-to-audio model to convert written text into natural, expressive speech.

from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "eleven-labs"
#APP_ID = "audio-generation"

# You can set the model using a model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = 'speech-synthesis'
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = 'f588d92c044d4487a38c8f3d7a3b0eb2'
# Model class objects can be initialized by providing the model URL or by defining the respective user_id, app_id, and model_id
# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

input_text = "Hello, How are you doing today!"

# The Predict API can generate predictions for data provided through a URL, filepath, or bytes

# Example for prediction through URL:
# model_prediction = Model(model_url).predict_by_url(url, input_type="text")

# Example for prediction through filepath:
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="text")

model_url = "https://clarifai.com/eleven-labs/audio-generation/models/speech-synthesis"

# predict_by_bytes expects bytes, so encode the text first
model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_bytes(input_text.encode(), "text")

# Save the audio file
with open("output_audio.wav", mode="bx") as f:
    f.write(model_prediction.outputs[0].data.audio.base64)

Text Embedder

You can use an embedding model to generate embeddings from text. These embeddings are vector representations that capture the semantic meaning of the text, making them ideal for applications such as similarity search, recommendation systems, document clustering, and more.

note

The Cohere Embed-v3 model requires an input_type parameter, which can be set using one of the following values (see the sketch after the example output below for one way to pass it):

  • search_document (default): For texts (documents) intended to be stored in a vector database.
  • search_query: For search queries to find the most relevant documents in a vector database.
  • classification: If the embeddings are used as input for a classification system.
  • clustering: If the embeddings are used for text clustering.
from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "cohere"
#APP_ID = "embed"

# You can set the model using model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = 'cohere-embed-english-v3_0'
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = 'model_version'
# Model class objects can be inititalised by providing its URL or also by defining respective user_id, app_id and model_id

# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

input_text = """In India Green Revolution commenced in the early 1960s that led to an increase in food grain production, especially in Punjab, Haryana, and Uttar Pradesh. Major milestones in this undertaking were the development of high-yielding varieties of wheat. The Green revolution is revolutionary in character due to the introduction of new technology, new ideas, the new application of inputs like HYV seeds, fertilizers, irrigation water, pesticides, etc. As all these were brought suddenly and spread quickly to attain dramatic results thus it is termed as a revolution in green agriculture.
"""
# The predict API gives you the flexibility to generate predictions for data provided through a URL, filepath, or bytes format.

# Example for prediction through URL:
# model_prediction = Model(model_url).predict_by_url(url, input_type="text")

# Example for prediction through Filepath:
# model_prediction = Model(model_url).predict_by_filepath(filepath, input_type="text")


model_url = "https://clarifai.com/cohere/embed/models/cohere-embed-english-v3_0"

model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_bytes(
    input_text.encode(), input_type="text"
)

embeddings = model_prediction.outputs[0].data.embeddings[0].vector

num_dimensions = model_prediction.outputs[0].data.embeddings[0].num_dimensions

print(embeddings[:10])
Output
[-0.02596100978553295,
 0.023946398869156837,
 -0.07173235714435577,
 0.032294824719429016,
 0.020313993096351624,
 -0.026998838409781456,
 0.008684193715453148,
 -0.016651064157485962,
 -0.012316598556935787,
 0.00042328768176957965]
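
Because these vectors encode semantic meaning, you can compare two texts by measuring the cosine similarity of their embeddings. Below is a minimal sketch using only the standard library; embeddings_1 and embeddings_2 are hypothetical vectors obtained from two predict calls like the one above.

import math

def cosine_similarity(a, b):
    # cos(a, b) = dot(a, b) / (||a|| * ||b||)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical: vectors obtained from two separate predict_by_bytes calls
# score = cosine_similarity(embeddings_1, embeddings_2)
# print(score)  # Values near 1.0 indicate semantically similar texts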

Audio as Input

Audio-to-Text

You can use an audio-to-text model to convert audio files into text. This enables the transcription of spoken words for a variety of use cases, including transcription services, voice command processing, and more.

Predict via URL
from clarifai.client.model import Model

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "facebook"
#APP_ID = "asr"

# You can set the model using model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = 'asr-wav2vec2-base-960h-english'
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = 'model_version'
# Model class objects can be initialised by providing the model URL, or by specifying the respective user_id, app_id, and model_id

# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

audio_url = "https://s3.amazonaws.com/samples.clarifai.com/GoodMorning.wav"

# The predict API gives you the flexibility to generate predictions for data provided through a URL, filepath, or bytes format.

# Example for prediction through Bytes:
# model_prediction = model.predict_by_bytes(audio_bytes, input_type="audio")

# Example for prediction through Filepath:
# model_prediction = Model(model_url).predict_by_filepath(audio_filepath, input_type="audio")

model_url = "https://clarifai.com/facebook/asr/models/asr-wav2vec2-large-robust-ft-swbd-300h-english"

model_prediction = Model(url=model_url, pat="YOUR_PAT").predict_by_url(
    audio_url, input_type="audio"
)

# Print the output
print(model_prediction.outputs[0].data.text.raw)

Output
GOOD MORNING I THINK THIS IS GOING TO BE A GREAT PRESENTATION
Predict via Bytes
#########################################################################################
# In this section, we set the user authentication, user and app ID, model ID, and
# audio file location. Change these strings to run your own example.
#########################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = "YOUR_PAT_HERE"
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
USER_ID = "facebook"
APP_ID = "asr"
# Change these to make your own predictions
MODEL_ID = "asr-wav2vec2-base-960h-english"
AUDIO_FILE_LOCATION = "YOUR_AUDIO_FILE_LOCATION_HERE"

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (("authorization", "Key " + PAT),)

userDataObject = resources_pb2.UserAppIDSet(
    user_id=USER_ID, app_id=APP_ID
)  # The userDataObject is required when using a PAT

with open(AUDIO_FILE_LOCATION, "rb") as f:
    file_bytes = f.read()

post_model_outputs_response = stub.PostModelOutputs(
    service_pb2.PostModelOutputsRequest(
        user_app_id=userDataObject,
        model_id=MODEL_ID,
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(audio=resources_pb2.Audio(base64=file_bytes))
            )
        ],
    ),
    metadata=metadata,
)
if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_model_outputs_response.status)
    raise Exception(
        "Post model outputs failed, status: "
        + post_model_outputs_response.status.description
    )

# Since we have one input, one output will exist here
output = post_model_outputs_response.outputs[0]

# Print the output
print(output.data.text.raw)
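
For comparison, the same local-file transcription can be sketched with the higher-level Python SDK using predict_by_filepath, which reads the file and sends its bytes for you. The model URL below is assumed to follow the user/app/model pattern used throughout this page:

from clarifai.client.model import Model

# Assumption: URL constructed from the facebook/asr user/app pairing above
model_url = "https://clarifai.com/facebook/asr/models/asr-wav2vec2-base-960h-english"

# predict_by_filepath reads the local file and submits its contents
model_prediction = Model(url=model_url, pat="YOUR_PAT_HERE").predict_by_filepath(
    "YOUR_AUDIO_FILE_LOCATION_HERE", input_type="audio"
)

print(model_prediction.outputs[0].data.text.raw)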

Multimodal as Input

[Image,Text]-to-Text

You can process multimodal inputs, combining multiple modalities such as text, images, and other data types, to generate accurate predictions.

Below is an example of how you can send both image and text inputs to a model.

Predict via Image URL
from clarifai.client.model import Model
from clarifai.client.input import Inputs

# Your PAT (Personal Access Token) can be found in the Account's Security section
# Specify the correct user_id/app_id pairings
# Since you're making inferences outside your app's scope
#USER_ID = "openai"
#APP_ID = "chat-completion"

# You can set the model using model URL or model ID.
# Change these to whatever model you want to use
# eg : MODEL_ID = 'openai-gpt-4-vision'
# You can also set a particular model version by specifying the version ID
# eg: MODEL_VERSION_ID = 'model_version'
# Model class objects can be initialised by providing the model URL, or by specifying the respective user_id, app_id, and model_id

# eg : model = Model(user_id="clarifai", app_id="main", model_id=MODEL_ID)

prompt = "What time of day is it?"
image_url = "https://samples.clarifai.com/metro-north.jpg"

model_url = "https://clarifai.com/openai/chat-completion/models/openai-gpt-4-vision"
inference_params = dict(temperature=0.2, max_tokens=100)
multi_inputs = Inputs.get_multimodal_input(
    input_id="", image_url=image_url, raw_text=prompt
)

# Generate a prediction for the given inputs
model_prediction = Model(url=model_url, pat="YOUR_PAT").predict(
    inputs=[multi_inputs],
    inference_params=inference_params,
)

print(model_prediction.outputs[0].data.text.raw)
Output
The time of day in the image appears to be either dawn or dusk, given the light in the sky. It's not possible to determine the exact time without additional context, but the sky has a mix of light and dark hues, which typically occurs during sunrise or sunset. The presence of snow and the lighting at the train station suggest that it might be winter, and depending on the location, this could influence whether it's morning or evening.
Predict via Local Image
from clarifai.client.model import Model
from clarifai.client.input import Inputs

IMAGE_FILE_LOCATION = "LOCAL IMAGE PATH"
with open(IMAGE_FILE_LOCATION, "rb") as f:
    file_bytes = f.read()


prompt = "What time of day is it?"
inference_params = dict(temperature=0.2, max_tokens=100)

model_prediction = Model("https://clarifai.com/openai/chat-completion/models/openai-gpt-4-vision").predict(inputs = [Inputs.get_multimodal_input(input_id="", image_bytes = file_bytes, raw_text=prompt)], inference_params=inference_params)
print(model_prediction.outputs[0].data.text.raw)
Output
The time of day in the image appears to be either dawn or dusk, given the light in the sky. It's not possible to determine the exact time without additional context, but the sky has a mix of light and dark hues, which typically occurs during sunrise or sunset. The presence of snow and the lighting at the train station suggest that it might be winter, and depending on the location, this could influence whether it's morning or evening.

Use Third-Party API Keys

info

The ability to use third-party API keys is currently exclusively available to Enterprise users. Learn more here.

For the third-party models we've wrapped into our platform, such as those provided by OpenAI, Anthropic, Cohere, and others, you can also choose to use their API keys in addition to the default Clarifai keys.

This Bring Your Own Key (BYOK) flexibility allows you to integrate your preferred services and APIs into your workflow, enhancing the versatility of our platform.

Here is an example of how to add an OpenAI API key for DALL-E 3 for text-to-image tasks.

curl -X POST "https://api.clarifai.com/v2/users/openai/apps/dall-e/models/dall-e-3/versions/dc9dcb6ee67543cebc0b9a025861b868/outputs" \
  -H "Authorization: Key YOUR_PAT_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [
      {
        "data": {
          "text": {
            "raw": "An expressive oil painting of a basketball player dunking, depicted as an explosion of a nebula"
          }
        }
      }
    ],
    "model": {
      "model_version": {
        "output_info": {
          "params": {
            "size": "1024x1024",
            "quality": "hd",
            "api_key": "ADD_THIRD_PARTY_KEY_HERE"
          }
        }
      }
    }
  }'
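
The same request can be sketched with the Python SDK. This assumes the size, quality, and third-party api_key values are forwarded through inference_params, mirroring the "params" block in the cURL call above:

from clarifai.client.model import Model

model_url = "https://clarifai.com/openai/dall-e/models/dall-e-3"

# Assumption: these values map to the "params" block of the cURL example
inference_params = dict(
    size="1024x1024",
    quality="hd",
    api_key="ADD_THIRD_PARTY_KEY_HERE",
)

model_prediction = Model(url=model_url, pat="YOUR_PAT_HERE").predict_by_bytes(
    "An expressive oil painting of a basketball player dunking, depicted as an explosion of a nebula".encode(),
    input_type="text",
    inference_params=inference_params,
)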