
Using Clarifai Models in Embedchain


In this example, we will build a RAG app with LLMs and embedding models from Clarifai using Embedchain.

Prerequisites

  • Set up the Clarifai Python SDK with your Personal Access Token (PAT). Refer to the installation and configuration instructions here.

    note

    Guide to get your PAT

  • Install the required packages.

!pip install "embedchain[clarifai]"
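Embedchain's Clarifai integration picks up your PAT from the environment. A minimal sketch, assuming the `CLARIFAI_PAT` environment variable is the one your SDK version reads (`"your-pat-here"` is a placeholder — substitute your real token):

```python
import os

# Export your Personal Access Token so the Clarifai client can find it.
# "your-pat-here" is a placeholder -- substitute your real PAT.
os.environ["CLARIFAI_PAT"] = "your-pat-here"
```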

Initialization

Embedchain allows users to create personalized AI apps using the App method. It also allows users to create and manage instances of bots (applications) that leverage custom knowledge bases for answering queries. You can also add data from different sources such as text, documents, websites, images, and videos into your app.

Click here to learn more about apps in embedchain.

# Import the App class from the embedchain module
from embedchain import App

# Create an instance of the App class using a custom configuration
app = App.from_config(config={
    # Configuration for the large language model (LLM)
    "llm": {
        # Specify the provider for the LLM, in this case Clarifai
        "provider": "clarifai",
        # Configuration details for the LLM
        "config": {
            # Model URL from Clarifai
            "model": "https://clarifai.com/mistralai/completion/models/mistral-7B-Instruct",
            # Additional model parameters
            "model_kwargs": {
                # Temperature controls the randomness of the output
                "temperature": 0.5,
                # Maximum number of tokens in the generated response
                "max_tokens": 1000
            }
        }
    },
    # Configuration for the embedder
    "embedder": {
        # Specify the provider for the embedder, in this case Clarifai
        "provider": "clarifai",
        # Configuration details for the embedder
        "config": {
            # Embedding model URL from Clarifai
            "model": "https://clarifai.com/openai/embed/models/text-embedding-ada",
        }
    }
})

This code initializes an application instance from the Embedchain framework with specific configurations for a large language model (LLM) and an embedder, both sourced from Clarifai. The App.from_config method sets up the instance. The LLM configuration uses the mistral-7B-Instruct model hosted on Clarifai, with a temperature of 0.5 to control response randomness and a maximum of 1000 tokens for generated outputs. The embedder configuration uses Clarifai's text-embedding-ada model to embed text data.
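The same configuration can also live in a YAML file, since App.from_config accepts a config_path as well as an inline dict. A sketch, assuming a file named config.yaml alongside your script:

```
llm:
  provider: clarifai
  config:
    model: "https://clarifai.com/mistralai/completion/models/mistral-7B-Instruct"
    model_kwargs:
      temperature: 0.5
      max_tokens: 1000

embedder:
  provider: clarifai
  config:
    model: "https://clarifai.com/openai/embed/models/text-embedding-ada"
```

You would then load it with app = App.from_config(config_path="config.yaml"), which keeps model choices out of your code and makes them easy to swap.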

Data Ingestion

In Embedchain you can add data and its embeddings from various sources into the application's knowledge base, making it accessible for querying by the bots. This process involves extracting content from different data types, converting it into embeddings using the specified models, and storing those embeddings in the knowledge base. For our example, we are going to ingest a web page URL.

Click here to learn more about data ingestion in embedchain.

info

By default, Embedchain uses ChromaDB as the vector store for your app.

app.add("https://www.forbes.com/profile/elon-musk")
Output
Inserting batches in chromadb: 100%|██████████| 1/1 [00:06<00:00,  6.86s/it]
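Under the hood, ingestion splits the extracted page text into overlapping chunks before embedding each one. A minimal sketch of that chunking step, using a hypothetical chunk_text helper (Embedchain's real chunkers are more sophisticated and configurable):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size characters.

    A simplified stand-in for what a RAG pipeline's chunker does:
    overlapping windows preserve context across chunk boundaries.
    """
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk is then embedded with the configured embedder and written to the vector store, where it can be retrieved by similarity at query time.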

Query

Users can now run queries on the data they ingested using the app.query() method.

Refer to this page to know more about app.query().

while True:
    # Prompt the user to enter a question
    question = input("Enter question: ")

    # Break the loop and end the program when the user types a quit word
    if question in ['q', 'exit', 'quit']:
        break

    # Query the entered question and print the answer
    answer = app.query(question)
    print(answer)
Output
Enter question: identify the person
The person being referred to in the context is Elon Musk.
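The loop above can also be wrapped in a small helper so the quit handling is easy to test without a live app. In this sketch, query_fn stands in for app.query and get_input for input (both names are hypothetical):

```python
def run_repl(get_input, query_fn, print_fn=print):
    """Read questions until the user types a quit word, passing each
    question to query_fn and printing the answer.

    get_input and query_fn are injected so the loop can be exercised
    without a live Embedchain app or real stdin.
    """
    while True:
        question = get_input("Enter question: ").strip()
        if question.lower() in ("q", "exit", "quit"):
            return
        print_fn(query_fn(question))
```

In the notebook you would call run_repl(input, app.query); in a test you can inject canned inputs and a fake query function.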