
Create Your Own Template

Learn how to create your own custom deep-trained template

The Clarifai platform empowers advanced users to create their own deep-trained templates. You can customize your own templates to suit your specific needs and tasks.

This flexibility allows you to leverage Clarifai's advanced machine learning capabilities and customize various template hyperparameters—such as head, neck, backbone, and loss functions—to influence “how” your model learns.

When you select an MMDetection, MMClassification, or MMSegmentation template, a Custom config field will appear that allows you to provide a Python file detailing the configurations of your template.

For this example, we’ll demonstrate how you can create your own template using the MMDetection open source toolbox for visual detection tasks.

MMDetection Configurations

MMDetection, developed by OpenMMLab, is a user-friendly toolbox based on PyTorch for object detection, instance segmentation, and panoptic segmentation tasks. It is designed to facilitate research and development in the field of object detection and instance segmentation.

MMDetection provides a comprehensive collection of state-of-the-art models, datasets, and evaluation metrics, making it a valuable resource for both academic and industrial applications.

You can configure the MMDetection toolbox and create a unique model template with its own hyperparameters. By tweaking the various settings, you can tailor the template to match your specific tasks and improve its performance.

When configuring an MMDetection file, you need to set up the following basic component types under config/_base_:

  • Model
  • Dataset
  • Learning rate schedule
  • Runtime

Let’s talk about each of them, and other associated components.

Base Configuration

To make the configuration as easy as possible, MMDetection provides base configurations for many models, which you can then customize. If used, the base config is specified by the _base_ variable, which points to a config file relative to the parent directory /mmdetection/.

You can find all available pre-built configs here.

Here is an example:

_base_ = '/mmdetection/configs/yolof/'

In the above example, the _base_ field indicates that this configuration file is based on another existing configuration file located under /mmdetection/configs/yolof/.

This means that the current configuration file inherits settings and parameters from the base configuration file, and any modifications made in the current file will override or extend the base configuration. This base configuration file serves as a template or starting point, providing the fundamental settings and components for the detector model.

In this particular instance, the base configuration file defines the overall architecture of a YOLOF detector model, including the backbone network, neck, head, and other components.

By inheriting from this base configuration, the current file can leverage all these predefined settings and focus on modifying specific aspects of the model, such as hyperparameters, training settings, or inference options.
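Conceptually, this inheritance behaves like a recursive dictionary merge: keys you redefine override the base, and keys you omit are inherited unchanged. The sketch below is a simplified illustration of that merge behavior, not MMDetection's actual implementation:

```python
def merge_config(base, override):
    """Recursively merge an override dict into a base dict.

    Keys present in `override` replace or extend keys in `base`;
    nested dicts are merged key by key, mimicking how a config file
    that declares `_base_` extends its parent config.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged

# The base config defines a ResNet-50 backbone and a neck...
base = dict(model=dict(backbone=dict(type='ResNet', depth=50),
                       neck=dict(type='YOLOFNeck')))
# ...and the child config overrides only the backbone depth.
child = dict(model=dict(backbone=dict(depth=101)))

result = merge_config(base, child)
# The backbone depth is overridden; the neck is inherited unchanged.
```

This is why a child config can stay short: it states only the differences from its base.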


Model

MMDetection contains high-quality codebases for many popular models and task-oriented modules. You can find a list of all pre-built models supported by MMDetection here.

MMDetection categorizes model components into the following parts:

  • Backbone—This is the part of the architecture that transforms the input images into raw feature maps. It is typically a pre-trained model, such as ResNet or MobileNet, that has been trained on a large dataset of images.
  • Neck—This is the component that connects the backbone with heads and performs reconfigurations and refinements on the raw feature maps so that heads can further process them.
  • DenseHead (AnchorHead/AnchorFreeHead)—This part processes the dense locations of the feature maps fed by the neck.
  • RoIExtractor—This part identifies the regions of interest (RoIs) and extracts RoI features from the feature maps.
  • RoIHead (BBoxHead/MaskHead)—This part takes RoI features as input and makes predictions, such as bounding box classification or mask prediction, depending on the task. To adapt the number of classes to your dataset, modify the num_classes parameter within the bbox_head.
  • Loss—This is the part of the head that measures the difference between the model's predictions and the ground-truth labels, using loss functions such as GHMLoss, F1Loss, and FocalLoss.

The whole model is built as a series of pipelines, which makes end-to-end training simple for any kind of network. During training, the network is traversed in the forward and backward directions over successive iterations.


Dataset

The data configuration section in an MMDetection file defines how data is loaded and processed during training and validation.

It specifies the number of samples processed per GPU, the number of data-loading workers, and the paths to annotation files, image directories, and class names for training and validation datasets.

Learning Rate Schedule

The learning rate schedule is specified in the lr_config section of the MMDetection configuration file. It controls how the learning rate changes during the training process.

You need to set the learning rate scheduling policy to use.

For example, you can set it to CosineAnnealing, which means the learning rate will follow a cosine annealing schedule. Cosine annealing is a popular learning rate schedule where the learning rate decreases and increases in a cosine-like manner over the course of training epochs. It is often used to help the model converge more effectively.

You can also incorporate a warmup strategy into the training process. This technique is used to stabilize training in the early stages. It gradually increases the learning rate from a minimal value to the desired target value, preventing large oscillations in the loss function and leading to more stable and reliable convergence.
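The combined effect of a linear warmup followed by cosine annealing can be sketched numerically. The function below is a simplified reimplementation for intuition, not MMDetection's scheduler; the parameter names (warmup_iters, warmup_ratio, min_lr_ratio) mirror the lr_config fields:

```python
import math

def lr_at(step, total_steps, base_lr, warmup_iters, warmup_ratio, min_lr_ratio):
    """Learning rate at a given step: linear warmup, then cosine annealing."""
    if step < warmup_iters:
        # Ramp linearly from base_lr * warmup_ratio up to base_lr.
        start = base_lr * warmup_ratio
        return start + (base_lr - start) * step / warmup_iters
    # Cosine decay from base_lr down to base_lr * min_lr_ratio.
    progress = (step - warmup_iters) / (total_steps - warmup_iters)
    min_lr = base_lr * min_lr_ratio
    return min_lr + (base_lr - min_lr) * 0.5 * (1 + math.cos(math.pi * progress))

# With a target LR of 1e-4, training starts at 10% of the target,
# reaches the full target once warmup ends, then decays toward the minimum.
lr_start = lr_at(0, 10000, 1e-4, warmup_iters=1000, warmup_ratio=0.1, min_lr_ratio=1e-5)
lr_peak = lr_at(1000, 10000, 1e-4, warmup_iters=1000, warmup_ratio=0.1, min_lr_ratio=1e-5)
lr_end = lr_at(10000, 10000, 1e-4, warmup_iters=1000, warmup_ratio=0.1, min_lr_ratio=1e-5)
```

The warmup phase keeps early gradient updates small, while the cosine phase tapers the learning rate smoothly instead of dropping it in steps.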


Runtime

In the MMDetection configuration file, the runner section defines runtime-related settings for the training process.

It specifies how the training process is organized, including details about the type of runner used, number of training epochs, and other runtime-related parameters.


Optimizer

In an MMDetection file, the optimizer settings are specified in the optimizer section. The optimizer is a crucial component of training deep learning models, responsible for updating the model's weights during the training process.

MMDetection already supports all the optimizers implemented in PyTorch. So, you can conveniently adjust the optimizer choice, learning rate, and other hyperparameters in the optimizer field.
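Because MMDetection wraps PyTorch's optimizers, swapping one for another is just a change to the dict's fields. For instance, a hypothetical switch from Adam to SGD with momentum (values here are illustrative, not recommendations):

```python
# Adam with weight decay: two hyperparameters besides the learning rate.
optimizer = dict(type='Adam', lr=0.0001, weight_decay=0.0001)

# SGD with momentum: only the fields change, not the mechanism.
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
```

Note that optimizer-specific fields (such as momentum for SGD) must match what the chosen PyTorch optimizer actually accepts.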


Sometimes it's necessary to delete all the existing keys in a dictionary and replace them with a new set of keys. In MMDetection, this can be achieved by setting the _delete_=True flag in the target field. This flag instructs the configuration system to remove all inherited keys in the dictionary except for the ones explicitly defined in the new configuration. If it is not used, the dict being defined is merged into the _base_ config, which might produce an invalid configuration.
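The behavior can be illustrated with a toy merge function. This is a simplification of the real config system, meant only to show why _delete_ matters:

```python
def merge_with_delete(base, override):
    """Merge `override` into `base`.

    If `override` sets _delete_=True, the inherited keys are discarded
    and only the override's own keys survive; otherwise the two dicts
    are merged, with override values winning on conflicts.
    """
    if override.get('_delete_', False):
        return {k: v for k, v in override.items() if k != '_delete_'}
    merged = dict(base)
    merged.update(override)
    return merged

# The base config uses SGD; the child replaces the optimizer wholesale.
base_opt = dict(type='SGD', lr=0.01, momentum=0.9)
new_opt = dict(_delete_=True, type='Adam', lr=0.0001)

result = merge_with_delete(base_opt, new_opt)
# Without _delete_, 'momentum' would leak into the merged dict,
# producing an invalid config for Adam (which takes no momentum field).
```

This is exactly the situation the optimizer and lr_config sections of the example below guard against by setting _delete_=True.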


Example

Here is an example of a file for creating a custom deep-trained template using the MMDetection toolbox.

# Base Configuration File
# This configuration file extends an existing YOLOF model configuration.

_base_ = '/mmdetection/configs/yolof/'

# Model Configuration
model = dict(
    type='YOLOF',  # Specify the YOLOF model
    pretrained='torchvision://resnet50',  # Pretrained weights (if available)
    backbone=dict(
        type='ResNet',  # Specify the backbone network (e.g., 'ResNet')
        depth=50  # Specify the depth of the backbone (e.g., ResNet-50)
    ),
    neck=dict(
        type='YOLOFNeck',  # Specify the neck architecture (e.g., 'YOLOFNeck')
        in_channels=[256, 512, 1024, 2048],  # Input channels from the backbone
        out_channels=256,  # Output channels for the neck
        num_csp_blocks=4  # Number of CSP blocks in the neck
    ),
    bbox_head=dict(
        num_classes=80,  # Number of object classes. It must be included with any value; it will be updated based on your dataset's number of classes
        in_channels=256,  # Number of input channels from the neck
        num_levels=5,  # Number of levels used in the detection head
        reg_decoded_bbox=True,  # Whether to decode bounding box regression targets
        loss_bbox=dict(type='CIoULoss', loss_weight=1.0),  # Specify the bounding box loss type
        loss_conf=dict(type='CIoULoss', loss_weight=1.0),  # Specify the confidence loss type
        roi_feat_size=7,  # RoI feature size
        roi_out_channels=256  # RoI feature output channels
    )
)

# Data Configuration
# This section must include 'train' and 'val' sections, each with 'ann_file', 'img_prefix', and 'classes' fields with empty strings as values
# These values will be overwritten to be compatible with Clarifai's system, but must be included in the imported config
data = dict(
    # Data Loader Configuration
    samples_per_gpu=4,  # Number of samples processed on each GPU
    workers_per_gpu=4,  # Number of data-loading workers per GPU
    train=dict(
        ann_file='',  # Path to training dataset annotations
        img_prefix='',  # Directory containing training images
        classes=''  # List of class names in your dataset
    ),
    val=dict(
        ann_file='',  # Path to validation dataset annotations
        img_prefix='',  # Directory containing validation images
        classes=''  # List of class names in your dataset
    )
)

# Optimizer Configuration
optimizer = dict(
    _delete_=True,  # Delete existing optimizer settings
    type='Adam',  # Use the Adam optimizer
    lr=0.0001,  # Set the learning rate
    weight_decay=0.0001  # Weight decay for regularization
)

# Learning Rate Schedule Configuration
lr_config = dict(
    _delete_=True,  # Delete existing learning rate schedule settings
    policy='CosineAnnealing',  # Learning rate schedule policy
    warmup='linear',  # Warm-up strategy
    warmup_iters=1000,  # Number of warm-up iterations
    warmup_ratio=0.1,  # Ratio for warm-up learning rate
    min_lr_ratio=1e-5  # Minimum learning rate ratio
)

# Runner Configuration
runner = dict(
    _delete_=True,  # Delete existing runner settings
    type='EpochBasedRunner',  # Training based on the number of epochs
    max_epochs=1  # Maximum number of training epochs
)