
Vector Search



Search Overview

Search helps you sort, save, organize, and filter your datasets


You can upload inputs to our platform as URLs or bytes. When you POST /inputs, your base workflow is used to index the inputs, and this index enables search over the outputs of the models in your workflow. Once indexed, you can search for the inputs by concept, annotation, or any other advanced search parameters.
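
For example, a minimal upload request might look like the sketch below. It assumes the v2 REST endpoint and a personal access token (PAT); the USER_ID, APP_ID, and image URL are placeholders to replace with your own values.

```python
import requests

# Placeholders -- substitute your own IDs and personal access token (PAT).
USER_ID = "your-user-id"
APP_ID = "your-app-id"
PAT = "your-personal-access-token"

# POST /inputs: the app's base workflow indexes each input as it is added,
# which is what makes it searchable afterwards.
response = requests.post(
    f"https://api.clarifai.com/v2/users/{USER_ID}/apps/{APP_ID}/inputs",
    headers={"Authorization": f"Key {PAT}", "Content-Type": "application/json"},
    json={
        "inputs": [
            {"data": {"image": {"url": "https://samples.clarifai.com/metro-north.jpg"}}}
        ]
    },
)
response.raise_for_status()
print(response.json()["status"])
```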

Rank

Your model can identify concepts in your data and rank your search results by how confident it is that a given concept is present. You can even rank search results by how similar one input is to another input.
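
As an illustration, a search ranked by the model's confidence that "dog" is present could look like the sketch below. It assumes the classic `POST /v2/searches` query format with an app-scoped API key; newer API versions expose search under different routes and payload shapes, so treat the field names as illustrative.

```python
import requests

API_KEY = "your-app-scoped-api-key"  # placeholder

# Rank results by how confident the model is that "dog" appears in each input.
query = {
    "query": {
        "ands": [
            {"output": {"data": {"concepts": [{"name": "dog", "value": 1}]}}}
        ]
    }
}

response = requests.post(
    "https://api.clarifai.com/v2/searches",
    headers={"Authorization": f"Key {API_KEY}", "Content-Type": "application/json"},
    json=query,
)
for hit in response.json().get("hits", []):
    print(hit["score"], hit["input"]["id"])
```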

Filter

Trim down the amount of data returned in search. For example, you may only want to see inputs that one of your collaborators has labeled with the word “dog”. Or, perhaps you want only those inputs that were captured in a certain geographical region.
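
A filter narrows the result set without ranking it. The sketch below keeps only inputs that someone has labeled "dog"; it reuses the same classic query format (and the same caveats) as the ranking example above.

```python
import requests

API_KEY = "your-app-scoped-api-key"  # placeholder

# Filter to inputs a person has labeled "dog". User-supplied labels live under
# "input", while model predictions live under "output".
query = {
    "query": {
        "ands": [
            {"input": {"data": {"concepts": [{"name": "dog", "value": 1}]}}}
        ]
    }
}

response = requests.post(
    "https://api.clarifai.com/v2/searches",
    headers={"Authorization": f"Key {API_KEY}", "Content-Type": "application/json"},
    json=query,
)
print(len(response.json().get("hits", [])), "matching inputs")
```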

'AND'

Combine multiple search parameters. For example, you can find all the inputs within a geographical region with a "weapon" in them, or all annotations assigned to user "Joe", or visually similar product images that are assigned the word "XL" in metadata.
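
Combining parameters is a matter of adding more clauses to the same query. The sketch below mixes a visual-similarity rank with a metadata filter, roughly matching the "XL" example above; the reference image URL is a placeholder, and the payload again follows the classic search format.

```python
import requests

API_KEY = "your-app-scoped-api-key"  # placeholder

# AND: visually similar to a reference product photo AND tagged {"size": "XL"}.
query = {
    "query": {
        "ands": [
            # Rank by visual similarity to a reference image (placeholder URL).
            {"output": {"input": {"data": {"image": {"url": "https://example.com/reference-product.jpg"}}}}},
            # Filter to inputs whose custom metadata contains {"size": "XL"}.
            {"input": {"data": {"metadata": {"size": "XL"}}}},
        ]
    }
}

response = requests.post(
    "https://api.clarifai.com/v2/searches",
    headers={"Authorization": f"Key {API_KEY}", "Content-Type": "application/json"},
    json=query,
)
for hit in response.json().get("hits", []):
    print(hit["score"], hit["input"]["id"])
```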

[Image: searching by images using Clarifai's concepts]

info
  • When performing a Smart Search with custom concepts, ensure that these concepts are first trained using an embedding-classifier model (transfer-learning model). Without this training, the search query may not work. In the Input-Manager screen, concepts that have already been trained with a model are marked with a blue circle, while untrained concepts are marked with a grey circle.
  • When you upload inputs to our platform, we use your app’s base workflow to index them. This immediately makes them searchable by concept, annotation, or other advanced parameters. You can update the search index by clicking the Update button in the search bar, then clicking the reindex button in the menu that drops down.

Fully-Managed Vector Search Engine

Our Smart Search feature leverages vector search capabilities to power the search experience. Vector search is a retrieval technique that represents text, images, and videos as vectors and finds matches by comparing those vectors.

Vector embeddings are numerical “representations” of unstructured data, which enable their meaning to be encoded and processed mathematically. By converting the data into vectors, which is a language native to computers, we can efficiently perform search operations on them.

Instead of traditional keyword-based search, where exact matches are sought, vector search allows for searching based on visual and/or semantic similarity by calculating distances between vector embedding representations of the data.
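
As a toy illustration of "distance between embeddings" (independent of any Clarifai API), cosine similarity scores two vectors by the angle between them: values near 1 mean the underlying items are similar, values near 0 mean they are unrelated. The tiny 4-dimensional vectors below are made up; real embedding models emit hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up embeddings for illustration only.
dog = np.array([0.9, 0.1, 0.3, 0.0])
puppy = np.array([0.8, 0.2, 0.4, 0.1])
car = np.array([0.0, 0.9, 0.1, 0.7])

print(cosine_similarity(dog, puppy))  # high -> visually/semantically similar
print(cosine_similarity(dog, car))    # lower -> less similar
```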

Powered by a Vector Database

Our vector search engine uses deep learning embedding models to first analyze the visual features of each input, such as color, shape, and texture. This process, known as feature extraction, generates a corresponding vector representation for each piece of unstructured data.

These vector representations are then indexed and stored in our vector database (also called a vector store or a semantic search engine).

When a user performs a search, their query is also converted into a vector representation. The vector DB then searches for the vector representations that are most similar to the query vector representation. The results are then displayed to the user.
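
Conceptually, the query-time step boils down to embedding the query and returning the nearest stored vectors. Below is a minimal, in-memory sketch using NumPy only; a production vector database would do the same comparison with approximate nearest-neighbor indexes so it scales to millions of inputs.

```python
import numpy as np

# Pretend these embeddings were produced at indexing time (one row per input).
index_ids = ["img_001", "img_002", "img_003", "img_004"]
index_vectors = np.random.default_rng(0).normal(size=(4, 128))
index_vectors /= np.linalg.norm(index_vectors, axis=1, keepdims=True)

# The query is embedded with the same model, then compared against the index.
query_vector = index_vectors[2] + 0.05 * np.random.default_rng(1).normal(size=128)
query_vector /= np.linalg.norm(query_vector)

scores = index_vectors @ query_vector   # cosine similarity (unit vectors)
top_k = np.argsort(scores)[::-1][:3]    # three closest matches
for rank, i in enumerate(top_k, start=1):
    print(rank, index_ids[i], round(float(scores[i]), 3))
```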

By using our vector search as a service, you can get more relevant search results, faster search times, and scalable performance.

Simplifies Smart Search Integration

Choosing Clarifai's turnkey smart search solution is better than building your own from scratch.

Without Clarifai, you would need to do the following in order to build smart search features into your own solutions:

  • Set up your own vector database instance. This involves choosing a database platform, installing the software, and configuring the database.
  • Build out the entire pipeline for turning images/text into embeddings. This involves using a computer vision or natural language processing (NLP) library to extract features from images or text, and then converting those features into a vector representation.
  • Insert or query the vector database. This involves using the database's API to add new data or search for existing data.

This process can be very time-consuming and complex, especially for developers who are not familiar with vector databases or machine learning.

Clarifai eliminates the need for developers to do all of this work by providing an out-of-the-box solution for building state-of-the-art smart search capabilities.

We offer the following types of Smart Search options in our platform (a sample Smart Image Search request is sketched after the list):

  • Smart Image Search — Allows you to retrieve images sorted by their visual relevance to a query in the form of:

    • Image — Provide a reference image of interest to compare inputs against.
    • Concept — Provide a trained concept to compare input predictions against.
    • Caption — Provide a full-text description to compare inputs against.
  • Smart Object Search — Allows you to retrieve annotated objects (bounding boxes within images) sorted by their visual relevance to a query in the form of:

    • Image — Provide a reference image of interest to compare inputs against.
    • Concept — Provide a trained concept to compare input predictions against.
    • Caption — Provide a full-text description to compare inputs against.
  • Smart Text Search — Allows you to retrieve text data sorted by content and semantic similarity to a query in the form of:

    • Text — Provide a text description to compare input predictions against.
    • Concept — Provide a trained concept to compare input predictions against.
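
For instance, a Smart Image Search against a reference image could look like the sketch below. It reuses the classic `POST /v2/searches` query format with an app-scoped API key; the reference URL is a placeholder, and the exact payload may differ in newer API versions, so treat it as illustrative rather than definitive.

```python
import requests

API_KEY = "your-app-scoped-api-key"  # placeholder

# Smart Image Search: rank indexed images by visual similarity to a reference image.
query = {
    "query": {
        "ands": [
            {"output": {"input": {"data": {"image": {"url": "https://samples.clarifai.com/metro-north.jpg"}}}}}
        ]
    }
}

response = requests.post(
    "https://api.clarifai.com/v2/searches",
    headers={"Authorization": f"Key {API_KEY}", "Content-Type": "application/json"},
    json=query,
)
for hit in response.json().get("hits", [])[:5]:
    print(hit["score"], hit["input"]["id"])
```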