
Using Clarifai Models In LiteLLM

Learn how to use Clarifai Models in LiteLLM


LiteLLM lets you interact with LLMs from different providers through a single interface, and Clarifai is one such provider. With LiteLLM, users can easily call LLMs hosted on the Clarifai platform. Let's look at how to use Clarifai LLMs through LiteLLM.

Prerequisites

  • Set up the Clarifai Python SDK along with a PAT. Refer to the installation and configuration guide for the PAT token here.
note

Guide to get your PAT

import os
# Replace with your PAT
os.environ["CLARIFAI_PAT"] = "YOUR_PAT"
  • Install the required packages.
!pip install litellm
!pip install clarifai

Completion

The completion method is the core LiteLLM function for interacting with large language models (LLMs) and generating text. It provides a standardized way to send prompts to various LLMs and retrieve the generated responses. You set the model field to the model's identifier from the Clarifai platform, prefixed with clarifai/.

Click here to learn more about completion in LiteLLM.

In the example below, we chat with the Mistral-Large model from Clarifai.

from litellm import completion

messages = [{"role": "user", "content": "Write a poem about history?"}]

# Using an LLM hosted on the Clarifai platform
response = completion(
    model="clarifai/mistralai.completion.mistral-large",
    messages=messages,
)

print(f"Mistral large response : {response}")
Output
Mistral large response : ModelResponse(id='chatcmpl-6eed494d-7ae2-4870-b9c2-6a64d50a6151', choices=[Choices
(finish_reason='stop', index=1, message=Message(content="In the grand tapestry of time, where tales unfold,\nLies the chronicle of ages, a sight to behold.\nA tale of empires rising, and kings of old,\nOf civilizations lost, and stories untold.\n\nOnce upon a yesterday, in a time so vast,\nHumans took their first steps,
casting shadows in the past.\nFrom the cradle of mankind, a journey they embarked,\nThrough stone and bronze and iron, their skills they sharpened and marked.\n\nEgyptians built pyramids, reaching for the skies,\nWhile Greeks sought wisdom, truth, in philosophies that lie.\nRoman legions marched, their empire to expand,\nAnd in the East, the Silk Road joined the world, hand in hand.\n\nThe Middle Ages came,
with knights in shining armor,\nFeudal lords and serfs, a time of both clamor and calm order.\nThen Renaissance bloomed, like a flower in the sun,\nA rebirth of art and science, a new age had begun.\n\nAcross the vast oceans, explorers sailed with courage bold,\nDiscovering new lands, stories of adventure, untold.\nIndustrial Revolution churned, progress in its wake,\nMachines and factories, a whole new world to make.\n\nTwo World Wars raged, a testament to man's strife,\nYet from the ashes rose hope, a renewed will for life.\nInto the modern era, technology took flight,\nConnecting every corner, bathed in digital light.\n\nHistory, a symphony, a melody of time,\nA testament to human will, resilience so sublime.\nIn every page, a lesson, in every tale, a guide,\nFor understanding our past, shapes our future's tide.", role='assistant'))], created=1713896412, model='https://api.clarifai.com/v2/users/mistralai/apps/completion/models/mistral-large/outputs', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=13, completion_tokens=338, total_tokens=351))

Now let's ask the same question to the Claude-2.1 model. Claude models require special tokens in their input; by going through LiteLLM, the inputs to your LLM applications are standardized and this is handled for you.

from litellm import completion

messages = [{"role": "user", "content": "Write a poem about history?"}]

response = completion(
    model="clarifai/anthropic.completion.claude-2_1",
    messages=messages,
)

print(f"Claude-2.1 response : {response}")
Output
Claude-2.1 response : ModelResponse(id='chatcmpl-d126c919-4db4-4aa3-ac8f-7edea41e0b93', choices=[Choices(finish_reason='stop', index=1, message=Message
(content=" Here's a poem I wrote about history:\n\nThe Tides of Time\n\nThe tides of time ebb and flow,\nCarrying stories of long ago.\nFigures and events
come into light,\nShaping the future with all their might.\n\nKingdoms rise, empires fall, \nLeaving traces that echo down every hall.\nRevolutions bring change with a fiery glow,\nToppling structures from long ago.\n\nExplorers traverse each ocean and
land,\nSeeking treasures they don't understand.\nWhile artists and writers try to make their mark,\nHoping their works shine bright in the dark.\n\nThe cycle repeats again and again,\nAs humanity struggles to learn from its pain.\nThough the players may change on history's stage,\nThe themes stay the same from age to age.\n\nWar and peace, life and death,\nLove and strife with every breath.\nThe tides of time continue their dance,\nAs we join in, by luck or by chance.\n\nSo we study the past to light the way forward, \nHeeding warnings from stories told and heard.\nThe future unfolds from this unending flow -\nWhere the tides of time ultimately go.", role='assistant'))], created=1713896579, model='https://api.clarifai.com/v2/users/anthropic/apps/completion/models/claude-2_1/outputs', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=12, completion_tokens=232, total_tokens=244))

If you compare the two outputs above, the response format from both models is identical even though they are different LLMs. This is one of the key advantages of LiteLLM.
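Because every provider's response is normalized to the same shape, a single accessor works for any model. The sketch below illustrates this with plain dicts standing in for LiteLLM's ModelResponse objects (the helper name extract_text and the stand-in responses are hypothetical, used here so the example runs without an API call):

```python
def extract_text(response):
    """Pull the generated text out of a completion response.

    The same path works for any provider, since LiteLLM normalizes
    every response to the OpenAI chat-completion shape.
    """
    return response["choices"][0]["message"]["content"]

# Stand-ins for LiteLLM ModelResponse objects, shown as plain dicts
# so this sketch runs offline.
mistral_response = {"choices": [{"message": {"content": "A poem about history..."}}]}
claude_response = {"choices": [{"message": {"content": "The Tides of Time..."}}]}

# One accessor, regardless of which provider produced the response.
print(extract_text(mistral_response))
print(extract_text(claude_response))
```

With real ModelResponse objects you can equivalently use attribute access: response.choices[0].message.content.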

Streaming

When streaming is enabled, the completion method no longer returns a single object containing the full response. Instead, it returns an iterator that yields chunks with partial content as the LLM generates the response.

Click here to learn more about streaming in LiteLLM.

info

Set stream=True as an argument to the completion function.

from litellm import completion

messages = [{"role": "user", "content": "Write a poem about history?"}]

response = completion(
    model="clarifai/openai.chat-completion.GPT-4",
    messages=messages,
    stream=True,
    api_key="OpenAI_API_KEY",
)

for chunk in response:
    print(chunk)
Output
ModelResponse(id='chatcmpl-40ae19af-3bf0-4eb4-99f2-33aec3ba84af', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content="In the quiet corners 
of time's grand hall,\nLies the tale of rise and fall.\nFrom ancient
ruins to modern sprawl,\nHistory, the greatest story of them all.\n\nEmpires have risen, empires have decayed,\nThrough the eons, memories have stayed.\nIn the book of time, history is laid,\nA tapestry of events, meticulously displayed.\n\nThe pyramids of Egypt, standing tall,\nThe Roman Empire's mighty sprawl.\nFrom Alexander's conquest, to the Berlin Wall,\nHistory, a silent witness to it all.\n\nIn the shadow of the past we tread,\nWhere once kings and prophets led.\nTheir stories in our hearts are spread,\nEchoes of their words, in our minds are read.\n\nBattles fought and victories won,\nActs of courage under the sun.\nTales of love, of deeds done,\nIn history's grand book, they all run.\n\nHeroes born, legends made,\nIn the annals of time, they'll never fade.\nTheir triumphs and failures all displayed,\nIn the eternal march of history's parade.\n\nThe ink of the past is forever dry,\nBut its lessons, we cannot deny.\nIn its stories, truths lie,\nIn its wisdom, we rely.\n\nHistory, a mirror to our past,\nA guide for the future vast.\nThrough its lens, we're ever cast,\nIn the drama of life, forever vast.", role='assistant', function_call=None, tool_calls=None), logprobs=None)], created=1714744515, model='https://api.clarifai.com/v2/users/openai/apps/chat-completion/models/GPT-4/outputs', object='chat.completion.chunk', system_fingerprint=None)
ModelResponse(id='chatcmpl-40ae19af-3bf0-4eb4-99f2-33aec3ba84af', choices=[StreamingChoices(finish_reason='stop', index=0, delta=Delta(content=None, role=None, function_call=None, tool_calls=None), logprobs=None)], created=1714744515, model='https://api.clarifai.com/v2/users/openai/apps/chat-completion/models/GPT-4/outputs', object='chat.completion.chunk', system_fingerprint=None)
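As the output shows, each chunk carries its partial text in choices[0].delta.content, and the final chunk has content=None with finish_reason='stop'. A common pattern is to accumulate the deltas into the full response. The sketch below uses that chunk shape with plain-dict stand-ins so it runs offline (the helper name collect_stream and the fake chunks are illustrative, not part of LiteLLM):

```python
def collect_stream(chunks):
    """Assemble the full text from a stream of chat-completion chunks.

    Each chunk carries a partial string in choices[0].delta.content;
    the final chunk has content=None, which we skip.
    """
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"].get("content")
        if delta is not None:
            parts.append(delta)
    return "".join(parts)

# Stand-in chunks in the shape LiteLLM yields, so the sketch runs offline.
fake_chunks = [
    {"choices": [{"delta": {"content": "In the quiet corners "}}]},
    {"choices": [{"delta": {"content": "of time's grand hall."}}]},
    {"choices": [{"delta": {"content": None}, "finish_reason": "stop"}]},
]
print(collect_stream(fake_chunks))
```

With a real stream, you would pass the iterator returned by completion(..., stream=True) in place of fake_chunks.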

Async Completion

Async completion in LiteLLM uses Python's asynchronous capabilities to handle language model completions efficiently. Because acompletion is an async function, LiteLLM can perform non-blocking I/O, making it well suited for applications that require responsive and scalable interactions with language models.

Click here to learn more about async completion in LiteLLM.

from litellm import acompletion

async def test_get_response():
    user_message = "Hello, how are you?"
    messages = [{"content": user_message, "role": "user"}]
    response = await acompletion(
        model="clarifai/openai.chat-completion.GPT-4",
        messages=messages,
        api_key="OpenAI_API_KEY",
    )
    return response

response = await test_get_response()
print(response)
Output
ModelResponse(id='chatcmpl-095f1f4f-66b0-4f1f-988d-159a809f0c9c', choices=[Choices(finish_reason='stop', index=1, message=Message(content="Hello! As an artificial intelligence, 
I don't have feelings, but I'm here and ready to assist you. How can I help you today?", role='assistant'))],
created=1717782689, model='https://api.clarifai.com/v2/users/openai/apps/chat-completion/models/GPT-4/outputs', object='chat.completion', system_fingerprint=None, usage=Usage(prompt_tokens=6, completion_tokens=30, total_tokens=36))
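The main payoff of non-blocking I/O is fanning the same prompt out to several models concurrently with asyncio.gather. The sketch below substitutes a hypothetical fake_acompletion coroutine for litellm.acompletion so it runs offline; swap in the real call (with your PAT or API key set) to query Clarifai-hosted models:

```python
import asyncio

# Hypothetical stand-in for litellm.acompletion so this sketch runs
# offline; it mimics the normalized response shape.
async def fake_acompletion(model, messages):
    await asyncio.sleep(0.01)  # simulate network latency
    return {"model": model, "choices": [{"message": {"content": f"reply from {model}"}}]}

async def ask_many(models, prompt):
    """Send the same prompt to several models concurrently."""
    messages = [{"role": "user", "content": prompt}]
    tasks = [fake_acompletion(model=m, messages=messages) for m in models]
    return await asyncio.gather(*tasks)

models = [
    "clarifai/mistralai.completion.mistral-large",
    "clarifai/anthropic.completion.claude-2_1",
]
responses = asyncio.run(ask_many(models, "Hello, how are you?"))
for r in responses:
    print(r["model"], "->", r["choices"][0]["message"]["content"])
```

Because the awaits overlap, total latency is roughly that of the slowest model rather than the sum of all calls.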

Async Streaming

Async streaming in LiteLLM refers to the process of handling real-time data streams asynchronously when interacting with language models. This is particularly useful for applications that need to process data as it arrives without waiting for the entire response, such as chatbots, real-time data processing systems, or live user interactions.

Click here to learn more about async streaming in LiteLLM.

from litellm import acompletion
import asyncio, os, traceback

async def completion_call():
    try:
        print("test acompletion + streaming")
        response = await acompletion(
            model="clarifai/mistralai.completion.mistral-large",
            messages=[{"content": "Hello, how are you?", "role": "user"}],
            stream=True,
        )
        print(f"response: {response}")
        async for chunk in response:
            print(chunk)
    except Exception:
        print(f"error occurred: {traceback.format_exc()}")

await completion_call()
Output
test acompletion + streaming
response: <litellm.utils.CustomStreamWrapper object at 0x7e7b640d5250>
ModelResponse(id='chatcmpl-d70421b5-9701-4ed1-8926-09ccd79abd25', choices=[StreamingChoices(finish_reason=None,
index=0, delta=Delta(content="Hello! I'm an assistant designed to help answer your questions and provide information, so I don't have feelings, but I'm here and ready to assist you. How can I help you today?", role='assistant', function_call=None, tool_calls=None), logprobs=None)], created=1718733588, model='https://api.clarifai.com/v2/users/mistralai/apps/completion/models/mistral-large/outputs', object='chat.completion.chunk', system_fingerprint=None)
ModelResponse(id='chatcmpl-d70421b5-9701-4ed1-8926-09ccd79abd25',
choices=[StreamingChoices(finish_reason='stop', index=0, delta=Delta(content=None, role=None, function_call=None, tool_calls=None), logprobs=None)], created=1718733588, model='https://api.clarifai.com/v2/users/mistralai/apps/completion/models/mistral-large/outputs', object='chat.completion.chunk', system_fingerprint=None)