
Legacy Inference via the API

Perform predictions using our older method


The legacy inference technique uses our previous API structure and is best suited for models built with our older methods.

While this method remains functional, we recommend transitioning to the new inference method for improved efficiency, scalability, and access to the latest features.

info

Before making a prediction, ensure that your model has been deployed, as mentioned previously. Otherwise, the prediction will default to the Clarifai Shared deployment type.

Unary-Unary Predict Call

This is the simplest type of prediction. In this method, a single input is sent to the model, and it returns a single response. This is ideal for tasks where a quick, non-streaming prediction is required, such as classifying an image.

It supports the following prediction methods:

  • predict_by_url — Use a publicly accessible URL for the input.
  • predict_by_bytes — Pass raw input data directly.
  • predict_by_filepath — Provide the local file path for the input.
##################################################################################################
# Change these strings to run your own example
##################################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = "YOUR_PAT_HERE"
USER_ID = "YOUR_USER_ID_HERE"
IMAGE_URL = "https://samples.clarifai.com/birds.jpg"
MODEL_URL = "https://clarifai.com/qwen/qwen-VL/models/Qwen2_5-VL-7B-Instruct"
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID_HERE"

##################################################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##################################################################################################

from clarifai.client.model import Model

# Initialize the model
model = Model(
    url=MODEL_URL,  # Or, use model_id="YOUR_MODEL_ID_HERE"
    pat=PAT
)

# Make a unary-unary prediction using the image URL
model_prediction = model.predict_by_url(
    IMAGE_URL,
    input_type="image",
    user_id=USER_ID,
    deployment_id=DEPLOYMENT_ID
)

# Output the model's response
print(model_prediction.outputs[0].data.text.raw)

##################################################################################################
# ADDITIONAL EXAMPLES
##################################################################################################

# Example prediction using a cluster and nodepool (no deployment ID needed):
# model_prediction = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").predict_by_url("INPUT_URL_HERE", input_type="image", user_id="YOUR_USER_ID_HERE", compute_cluster_id="YOUR_CLUSTER_ID_HERE", nodepool_id="YOUR_NODEPOOL_ID_HERE")

# Example prediction via bytes:
# model_prediction = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").predict_by_bytes("INPUT_TEXT_HERE".encode(), input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE")

# Example prediction via filepath:
# model_prediction = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").predict_by_filepath("INPUT_FILEPATH_HERE", input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE")
Example Output
2025-04-04 13:27:15.990657 INFO  Qwen2_5-VL-7B-Instruct model is still deploying, please wait...  model.py:443
2025-04-04 13:27:30.233847 INFO  Qwen2_5-VL-7B-Instruct model is still deploying, please wait...  model.py:443
2025-04-04 13:27:45.624827 INFO  Qwen2_5-VL-7B-Instruct model is still deploying, please wait...  model.py:443
2025-04-04 13:28:02.551081 INFO  Qwen2_5-VL-7B-Instruct model is still deploying, please wait...  model.py:443
This image captures three seagulls in flight over a body of water, likely a lake or river. The background is a natural setting with dry grass and trees, suggesting it might be late autumn or early spring. The seagulls appear to be gliding close to the water's surface, possibly searching for food. The lighting indicates it could be a sunny day. This scene is typical of coastal or lakeside environments where seagulls often congregate.
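
The model above is a vision-language model, so the response carries generated text. A classification model (the image classification case mentioned at the start of this section) instead returns a list of predicted concepts. Below is a minimal sketch of reading such a response, assuming Clarifai's publicly available general-image-recognition model and reusing the PAT and IMAGE_URL variables from the example above; substitute any classifier you have access to.

# Minimal sketch: reading a classification response instead of raw text.
# Assumes Clarifai's public general-image-recognition model; PAT and
# IMAGE_URL are reused from the example above.
from clarifai.client.model import Model

CLASSIFIER_URL = "https://clarifai.com/clarifai/main/models/general-image-recognition"

classifier = Model(url=CLASSIFIER_URL, pat=PAT)
prediction = classifier.predict_by_url(IMAGE_URL, input_type="image")

# Classification models populate data.concepts rather than data.text
for concept in prediction.outputs[0].data.concepts:
    print(f"{concept.name}: {concept.value:.2f}")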

Unary-Stream Predict Call

The unary-stream predict call processes a single input but returns a stream of responses. It is particularly useful for tasks where multiple outputs are generated from a single input, such as generating text completions from a prompt.

It supports the following prediction methods:

  • generate_by_url — Provide a publicly accessible URL and handle the streamed responses iteratively.
  • generate_by_bytes — Use raw input data.
  • generate_by_filepath — Use a local file path for the input.
##################################################################################################
# Change these strings to run your own example
##################################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = "YOUR_PAT_HERE"
USER_ID = "YOUR_USER_ID_HERE"
PROMPT = "What is the future of AI?"
MODEL_URL = "https://clarifai.com/meta/Llama-3/models/Llama-3_2-3B-Instruct"
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID_HERE"

##################################################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##################################################################################################

from clarifai.client.model import Model

# Initialize the model
model = Model(
    url=MODEL_URL,  # Or, use model_id="YOUR_MODEL_ID_HERE"
    pat=PAT
)

# Make a unary-stream prediction using the prompt as bytes
response_stream = model.generate_by_bytes(
    PROMPT.encode(),
    input_type="text",
    user_id=USER_ID,
    deployment_id=DEPLOYMENT_ID
)

# Iterate through streamed responses and print them
for response in response_stream:
    if response.outputs and response.outputs[0].data.text:
        print(response.outputs[0].data.text.raw)

# Print a newline at the end for better formatting
print()

##################################################################################################
# ADDITIONAL EXAMPLES
##################################################################################################

# Example stream prediction using a cluster and nodepool (no deployment ID needed):
# for response in Model(url=MODEL_URL, pat="YOUR_PAT_HERE").generate_by_bytes("YOUR_PROMPT_HERE".encode(), input_type="text", user_id="YOUR_USER_ID_HERE", compute_cluster_id="YOUR_CLUSTER_ID", nodepool_id="YOUR_NODEPOOL_ID"):
#     print(response.outputs[0].data.text.raw)

# Example unary-stream prediction via URL:
# for response in Model(url=MODEL_URL, pat="YOUR_PAT_HERE").generate_by_url("INPUT_URL_HERE", input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE"):
#     print(response.outputs[0].data.text.raw)

# Example unary-stream prediction via filepath:
# for response in Model(url=MODEL_URL, pat="YOUR_PAT_HERE").generate_by_filepath("INPUT_FILEPATH_HERE", input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE"):
#     print(response.outputs[0].data.text.raw)
Example Output
2025-04-04 15:10:09.952752 INFO  Llama-3_2-3B-Instruct model is still deploying, please wait...  model.py:726
2025-04-04 15:10:24.522422 INFO  Llama-3_2-3B-Instruct model is still deploying, please wait...  model.py:726
The
future
of
Artificial
Intelligence
(
AI
)
is
vast
and
rapidly
evolving
.
...

(Output truncated: the stream continues one token per response, covering short-term, mid-term, and long-term AI developments plus potential risks and challenges, until the full answer has been generated.)
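
Note that each streamed chunk is printed on its own line by the loop above. If you would rather render the tokens as continuous text, or keep the full response for later use, you can swap the printing loop for one that accumulates the chunks, as in this small sketch (same variables as the example above):

# A replacement for the printing loop above; a response stream can only be consumed once.
chunks = []
for response in response_stream:
    if response.outputs and response.outputs[0].data.text:
        chunk = response.outputs[0].data.text.raw
        print(chunk, end="", flush=True)  # render the tokens as continuous text
        chunks.append(chunk)
print()

full_text = "".join(chunks)  # the complete generated response as one string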

Stream-Stream Predict Call

The stream-stream predict call enables bidirectional streaming of both inputs and outputs, making it highly effective for processing large datasets or real-time applications.

In this setup, multiple inputs can be sent to the model continuously, and the corresponding predictions are streamed back in real time. This is ideal for tasks like real-time video analysis or live sensor data processing (see the generator sketch after the examples below).

It supports the following prediction methods:

  • stream_by_url — Stream an iterator of publicly accessible URLs and receive a stream of predictions back.
  • stream_by_bytes — Stream raw input data.
  • stream_by_filepath — Stream inputs from local file paths.
##################################################################################################
# Change these strings to run your own example
##################################################################################################

# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = "YOUR_PAT_HERE"
USER_ID = "YOUR_USER_ID_HERE"
PROMPTS = [
    "What is the future of AI?",
    "Explain quantum computing in simple terms.",
    "How does climate change affect global economies?"
]
MODEL_URL = "https://clarifai.com/meta/Llama-3/models/Llama-3_2-3B-Instruct"
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID_HERE"

##################################################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##################################################################################################

from clarifai.client.model import Model

# Initialize the model
model = Model(
    url=MODEL_URL,  # Or, use model_id="YOUR_MODEL_ID_HERE"
    pat=PAT
)

# Prepare input iterator: each item is a bytes-encoded prompt
input_stream = (prompt.encode() for prompt in PROMPTS)

# Stream-stream prediction using bytes
response_stream = model.stream_by_bytes(
    input_stream,
    input_type="text",
    user_id=USER_ID,
    deployment_id=DEPLOYMENT_ID
)

# Iterate through streamed responses and print them
for response in response_stream:
    if response.outputs and response.outputs[0].data.text:
        print(response.outputs[0].data.text.raw)

# Print a newline at the end for better formatting
print()

##################################################################################################
# ADDITIONAL EXAMPLES
##################################################################################################

# Example stream prediction using a cluster and nodepool (no deployment ID needed):
# response_stream = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").stream_by_bytes((prompt.encode() for prompt in PROMPTS), input_type="text", user_id="YOUR_USER_ID_HERE", compute_cluster_id="YOUR_CLUSTER_ID", nodepool_id="YOUR_NODEPOOL_ID")
# for response in response_stream:
#     print(response.outputs[0].data.text.raw)

# Example stream prediction via URL:
# response_stream = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").stream_by_url(["INPUT_URL_1", "INPUT_URL_2"], input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE")
# for response in response_stream:
#     print(response.outputs[0].data.text.raw)

# Example stream prediction via filepath:
# response_stream = Model(url=MODEL_URL, pat="YOUR_PAT_HERE").stream_by_filepath(["file1.txt", "file2.txt"], input_type="text", user_id="YOUR_USER_ID_HERE", deployment_id="YOUR_DEPLOYMENT_ID_HERE")
# for response in response_stream:
#     print(response.outputs[0].data.text.raw)
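
For genuinely live sources, the input iterator does not need to be a pre-built list: a generator can yield inputs as they become available, which is what makes the bidirectional pattern useful. Here is a minimal sketch, reusing the model, USER_ID, and DEPLOYMENT_ID variables from the example above; live_prompt_source is a hypothetical stand-in for your real input feed.

import time

def live_prompt_source():
    # Hypothetical stand-in for a live input feed (user queries, sensor readings, etc.)
    for prompt in ["First live prompt", "Second live prompt"]:
        time.sleep(1)  # simulate inputs arriving over time
        yield prompt.encode()

# The model consumes inputs as the generator yields them and streams predictions back
response_stream = model.stream_by_bytes(
    live_prompt_source(),
    input_type="text",
    user_id=USER_ID,
    deployment_id=DEPLOYMENT_ID
)

for response in response_stream:
    if response.outputs and response.outputs[0].data.text:
        print(response.outputs[0].data.text.raw)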