Visual Detector

Learn how to train a visual detector model using Clarifai SDKs

A Visual Detector is a computer vision component designed to identify and locate specific objects or patterns within images or video streams. You can learn more about Visual Detector here.

App Creation

The first part of model training includes the creation of an app under which the training process takes place.

Here we create an app with the app ID “demo_train” and the base workflow set to “Universal”. You can change the base workflow to Empty, Universal, Language Understanding, or General according to your use case.

from clarifai.client.user import User
# Replace "user_id" with your own user ID
client = User(user_id="user_id")
app = client.create_app(app_id="demo_train", base_workflow="Universal")
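The client above only receives a user ID, so credentials have to come from somewhere else. A minimal sketch, assuming the SDK picks up a personal access token from the `CLARIFAI_PAT` environment variable: the helper below (`has_clarifai_pat` is a hypothetical name, not part of the SDK) checks that the variable is set before any API call is made.

```python
import os


def has_clarifai_pat(env=None):
    """Return True if a non-empty CLARIFAI_PAT value is present.

    `env` is any mapping; it defaults to the process environment.
    Assumption: the SDK reads its token from CLARIFAI_PAT.
    """
    env = os.environ if env is None else env
    return bool(env.get("CLARIFAI_PAT", "").strip())


if not has_clarifai_pat():
    print("CLARIFAI_PAT is not set; export it before creating the client.")
```

Failing early with a clear message is usually easier to debug than an authentication error raised from inside a later SDK call.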

Dataset Upload

The next step involves dataset upload. You can upload the dataset to your app so that the model accepts the data directly from the platform. The data used for training in this tutorial is available in the examples repository you have cloned.

import os
# Import load_module_dataloader to load the dataloader object from the local data folder
from clarifai.datasets.upload.utils import load_module_dataloader

# Construct the path to the dataset folder
module_path = os.path.join(os.getcwd().split('/models/model_train')[0], 'datasets/upload/image_detection/voc')

# Load the dataloader module using the provided function from your module
voc_dataloader = load_module_dataloader(module_path)

# Create a Clarifai dataset with the specified dataset_id ("train_dataset")
dataset = app.create_dataset(dataset_id="train_dataset")

# Upload the dataset using the provided dataloader and get the upload status
dataset.upload_dataset(dataloader=voc_dataloader, get_upload_status=True)

If you have followed the steps correctly, the upload finishes and its status is reported in the output.


Choose The Model Type

First, let's list all the trainable model types available on the platform:

print(app.list_trainable_model_types())

Click here to know more about Clarifai Model Types.

Model Creation

From the above list of model types, we choose visual-detector because it matches our use case. Now let's create a model with this model type.

MODEL_ID = "model_detector"
MODEL_TYPE_ID = "visual-detector"
# Create a model by passing the model ID and model type as parameters
model = app.create_model(model_id=MODEL_ID, model_type_id=MODEL_TYPE_ID)

Template Selection

Inside the Clarifai platform there is a template feature. Templates give you the control to choose the specific architecture used by your neural network, as well as define a set of hyperparameters you can use to fine-tune the way your model learns. We are going to choose the 'MMDetection_SSD' template for training our model.


Setup Model Parameters

You can update the model parameters to suit your needs before initiating training.

# Get the params for the selected template
model_params = model.get_params(template='MMDetection_SSD')
# List the concept IDs to add to the params
concepts = [concept.id for concept in app.list_concepts()]
model.update_params(dataset_id='train_dataset', concepts=concepts)
{'dataset_id': 'train_dataset',
'dataset_version_id': '',
'concepts': ['id-hamburger', 'id-ramen', 'id-prime_rib', 'id-beignets'],
'train_params': {'invalid_data_tolerance_percent': 5.0,
'template': 'Clarifai_ResNext',
'logreg': 1.0,
'image_size': 256.0,
'batch_size': 64.0,
'init_epochs': 25.0,
'step_epochs': 7.0,
'num_epochs': 65.0,
'per_item_lrate': 7.8125e-05,
'num_items_per_epoch': 0.0}}

Initiate Model Training

We can initiate model training by calling the model.train() method. The Clarifai SDKs also offer features such as showing the training status and saving training logs to a local file.

If the status code is MODEL_TRAINED (21100), the model is trained and ready to use.

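import time

# Start the training
model_version_id = model.train()

# Check the training status until it finishes or fails
while True:
    status = model.training_status(version_id=model_version_id, training_logs=False)
    if status.code == 21106:  # MODEL_TRAINING_FAILED
        print(status)
        break
    elif status.code == 21100:  # MODEL_TRAINED
        print(status)
        break
    else:
        print("Current Status:", status)
        # Wait before polling again (the interval is a choice; adjust as needed)
        time.sleep(120)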
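A polling loop like the one above runs forever if the job never reaches a terminal state. The sketch below shows one way to add a timeout. It is not part of the SDK: `wait_for_training`, its parameters, and the default intervals are hypothetical, and only the two status codes (21100 and 21106) come from the tutorial.

```python
import time

MODEL_TRAINED = 21100          # status code quoted in the tutorial
MODEL_TRAINING_FAILED = 21106  # status code quoted in the tutorial


def wait_for_training(get_status_code, poll_seconds=120, timeout_seconds=6 * 3600,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_status_code() until training succeeds, fails, or times out.

    Returns True on MODEL_TRAINED and False on MODEL_TRAINING_FAILED.
    Raises TimeoutError if neither code is seen within timeout_seconds.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        code = get_status_code()
        if code == MODEL_TRAINED:
            return True
        if code == MODEL_TRAINING_FAILED:
            return False
        if clock() >= deadline:
            raise TimeoutError("training did not finish before the timeout")
        sleep(poll_seconds)
```

With the objects from the tutorial, the callable could be `lambda: model.training_status(version_id=model_version_id, training_logs=False).code`.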

Model Prediction

Since the model is trained and ready, let’s run some predictions to view the model’s performance:

import os
import cv2
import matplotlib.pyplot as plt

IMAGE_PATH = os.path.join(os.getcwd().split('/models')[0], 'datasets/upload/image_detection/voc/images/2008_008526.jpg')

prediction_response = model.predict_by_filepath(IMAGE_PATH, input_type="image", inference_params={'detection_threshold': 0.5})

# Get the output
regions = prediction_response.outputs[0].data.regions

img = cv2.imread(IMAGE_PATH)

for region in regions:
    # Access the normalized bounding box values, round them, and scale to pixels
    top_row = round(region.region_info.bounding_box.top_row, 3) * img.shape[0]
    left_col = round(region.region_info.bounding_box.left_col, 3) * img.shape[1]
    bottom_row = round(region.region_info.bounding_box.bottom_row, 3) * img.shape[0]
    right_col = round(region.region_info.bounding_box.right_col, 3) * img.shape[1]

    cv2.rectangle(img, (int(left_col), int(top_row)), (int(right_col), int(bottom_row)), (36, 255, 12), 2)

    # Get concept name
    concept_name = region.data.concepts[0].name

    # Display text
    cv2.putText(img, concept_name, (int(left_col), int(top_row - 10)), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (36, 255, 12), 2)

# Show the annotated image (OpenCV loads BGR, so reverse the channels for matplotlib)
plt.axis('off')
plt.imshow(img[..., ::-1])
plt.show()
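The loop above scales each normalized bounding-box value by the image height or width. As a small sketch of that conversion in isolation, the hypothetical helper `to_pixel_box` below (not an SDK function) clamps the normalized values to [0, 1] and returns the two corner points in the order `cv2.rectangle` expects.

```python
def to_pixel_box(top_row, left_col, bottom_row, right_col, height, width):
    """Scale a normalized box to integer pixel corners ((x1, y1), (x2, y2)).

    Rows are scaled by the image height and columns by the image width.
    Values outside [0, 1] are clamped so the box stays inside the image.
    """
    def clamp(value):
        return min(1.0, max(0.0, value))

    x1 = int(round(clamp(left_col) * width))
    y1 = int(round(clamp(top_row) * height))
    x2 = int(round(clamp(right_col) * width))
    y2 = int(round(clamp(bottom_row) * height))
    return (x1, y1), (x2, y2)


# Example: a box covering the central half of a 400x200 image
print(to_pixel_box(0.25, 0.25, 0.75, 0.75, height=200, width=400))
```

For an image loaded with OpenCV, `height` and `width` correspond to `img.shape[0]` and `img.shape[1]`.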

Image Output

Model Evaluation

Now let's evaluate the model using the train and test datasets. First, let's see the evaluation metrics for the training dataset:

# Evaluate the model on a specific dataset with ID 'train_dataset'
model.evaluate(dataset_id='train_dataset', eval_id='one')

# Get the evaluation result by its ID 'one' and print its summary
result = model.get_eval_by_id(eval_id="one")
print(result.summary)

mean_avg_precision_iou_50: 1.0
mean_avg_precision_iou_range: 0.9453125
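The metric names suggest mean average precision at an intersection-over-union (IoU) threshold of 0.5 and averaged over a range of thresholds. That reading is an assumption based on common object-detection conventions, not something the tutorial states. The sketch below shows how IoU between a predicted and a labeled box can be computed; `iou` is an illustrative helper, not an SDK function.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (top, left, bottom, right)."""
    # Corners of the overlapping rectangle, if any
    top = max(box_a[0], box_b[0])
    left = max(box_a[1], box_b[1])
    bottom = min(box_a[2], box_b[2])
    right = min(box_a[3], box_b[3])

    # Overlap is zero when the boxes do not intersect
    intersection = max(0.0, bottom - top) * max(0.0, right - left)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / 7
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Under that convention, a prediction counts as a match for a labeled box only when their IoU meets the chosen threshold.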

Before evaluating with a test dataset, we first have to upload the dataset using the data loader and then perform the model evaluation:

# Set PATH to the folder of the module containing the test data

# Load the dataloader module from the specified path
voc_dataloader = load_module_dataloader(PATH)

# Create a new dataset object with a unique ID 'test_dataset_1'
test_dataset = app.create_dataset(dataset_id="test_dataset_1")

# Upload the dataset using the previously loaded dataloader
test_dataset.upload_dataset(dataloader=voc_dataloader)

# Evaluate the model using the uploaded dataset, with evaluation ID 'two'
model.evaluate(dataset_id='test_dataset_1', eval_id='two')

# Retrieve the evaluation result with ID 'two' for the model
result = model.get_eval_by_id("two")

# Print the summary of the evaluation result
print(result.summary)
mean_avg_precision_iou_50: 1.0
mean_avg_precision_iou_range: 0.9555555582046509

Finally, let's compare the results from multiple datasets using the EvalResultCompare feature of the Clarifai SDKs to get a better understanding of the model's performance.

# Importing the EvalResultCompare class from the clarifai.utils.evaluation module
from clarifai.utils.evaluation import EvalResultCompare

# Creating an EvalResultCompare object with specified models and datasets
eval_result = EvalResultCompare(models=[model], datasets=[dataset, test_dataset])

# Printing a detailed summary of the evaluation result
print(eval_result.detailed_summary())
Concept    Average Precision  Total Labeled  True Positives  False Positives  False Negatives  Recall  Precision        F1  Dataset
id-cow                   1.0              2               2                0                0     1.0      0.841  0.913634  train_dataset2
id-horse                 1.0              1               1                0                0     1.0      0.783  0.878295  train_dataset2
id-bottle                1.0              2               2                0                0     1.0      0.819  0.900495  train_dataset2
id-sofa                  1.0              1               1                0                0     1.0      0.769  0.869418  train_dataset2
id-bird                  1.0              1               1                0                0     1.0      0.790  0.882682  train_dataset2
id-cat                   1.0              2               2                0                0     1.0      0.836  0.910675  train_dataset2
id-dog                   1.0              1               1                0                0     1.0      0.763  0.865570  train_dataset2
id-person                1.0              8               8                0                0     1.0      0.940  0.969072  train_dataset2
id-dog                   1.0              1               1                0                0     1.0      0.763  0.865570  test_dataset_1
id-person                1.0              3               3                0                0     1.0      0.884  0.938429  test_dataset_1

Total Concept           Average Precision  Total Labeled  True Positives  False Positives  False Negatives  Recall  Precision   F1
Dataset:train_dataset2                1.0             18              18                0                0     1.0        1.0  1.0
Dataset:test_dataset_1                1.0              4               4                0                0     1.0        1.0  1.0
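The per-concept F1 values in the table above are consistent with the standard definition of F1 as the harmonic mean of precision and recall. As a quick check, this sketch recomputes the F1 for the id-cow row (precision 0.841, recall 1.0); `f1_score` is an illustrative helper, not an SDK function.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# id-cow row: precision 0.841, recall 1.0
print(round(f1_score(0.841, 1.0), 6))  # 0.913634
```

Because recall is 1.0 for every concept here, differences in F1 between rows come entirely from differences in precision.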