
Tracker

Learn about our tracker operators


Tracker operators are a type of agent system operator designed for object tracking in computer vision. Object tracking follows the movement of objects across a sequence of images or video frames. Tracker models are trained with machine learning techniques to learn the patterns and features that let them identify and track objects over time.

The goal of object tracking is to maintain the identity of the object(s) over time, despite changes in position, scale, orientation, and lighting conditions.

Tip: Tracker operators can be "chained" together with models to automate tasks in a workflow. You can learn how to create workflows here.

BYTE Tracker

Input: frames[…].data.regions[…].data.concepts, frames[…].data.regions[…].region_info.bounding_box

Output: frames[…].data.regions[…].track_id

BYTE Tracker is a multi-object, tracking-by-detection model built on the principles of Simple Online and Real-time Tracking (SORT). Multi-object tracking aims to predict the bounding boxes and identities of objects within video sequences.

Most tracking techniques retrieve identities by associating only the detection boxes whose scores exceed a threshold. Unlike simpler trackers that discard low-confidence detections, BYTE Tracker also considers them, which makes it better at handling situations such as temporary occlusions or lighting changes.

Typically, it works in two stages:

  1. High Confidence Matches: First, BYTE Tracker focuses on high-scoring detections (bounding boxes around objects). It uses a combination of motion similarity (how much the object moved between frames) and appearance similarity (features extracted from the object) to match these detections with existing tracks (tracklets). A motion prediction technique is then used to predict the position of these tracks in the next frame.

  2. Low Confidence Recovery: Here's where BYTE Tracker differs. It revisits the low confidence detections (discarded by simpler trackers) and unmatched tracklets from the previous stage. Using the same motion similarity metric, BYTE Tracker tries to re-associate these with each other, potentially recovering tracks that were lost due to occlusions or low initial confidence.
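The two stages above can be sketched in a few lines of Python. This is a minimal illustration, not the operator's implementation: it assumes boxes are (x1, y1, x2, y2) tuples, uses plain IoU as the only similarity, matches greedily instead of solving an optimal assignment, and compares against each track's last known box rather than a motion-model prediction. The names byte_associate, high_min_iou, and low_min_iou are hypothetical; confidence_thresh loosely mirrors the parameter of the same name.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, boxes, min_iou):
    """Pair track boxes with detection boxes, highest IoU first (simplification)."""
    pairs = sorted(
        ((iou(tbox, dbox), tid, j)
         for tid, tbox in tracks.items()
         for j, dbox in enumerate(boxes)),
        key=lambda p: p[0],
        reverse=True,
    )
    matches, used = {}, set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # every remaining pair overlaps even less
        if tid in matches or j in used:
            continue
        matches[tid] = j
        used.add(j)
    unmatched_tracks = [tid for tid in tracks if tid not in matches]
    unmatched_boxes = [j for j in range(len(boxes)) if j not in used]
    return matches, unmatched_tracks, unmatched_boxes


def byte_associate(tracks, detections, confidence_thresh=0.5,
                   high_min_iou=0.3, low_min_iou=0.5):
    """tracks: {track_id: box}; detections: list of (box, score).

    Returns ({track_id: detection_index}, lost_track_ids, new_detection_indices).
    """
    high = [(i, box) for i, (box, s) in enumerate(detections) if s >= confidence_thresh]
    low = [(i, box) for i, (box, s) in enumerate(detections) if s < confidence_thresh]

    # Stage 1: match existing tracks against high-confidence detections.
    m1, leftover, fresh = greedy_match(tracks, [b for _, b in high], high_min_iou)
    assigned = {tid: high[j][0] for tid, j in m1.items()}

    # Stage 2: give still-unmatched tracks a chance with low-confidence detections.
    remaining = {tid: tracks[tid] for tid in leftover}
    m2, lost, _ = greedy_match(remaining, [b for _, b in low], low_min_iou)
    assigned.update({tid: low[j][0] for tid, j in m2.items()})

    # Unmatched high-confidence detections are candidates for new tracks;
    # unmatched low-confidence detections are simply dropped.
    return assigned, sorted(lost), [high[j][0] for j in fresh]


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [((1, 1, 11, 11), 0.9),    # confident, overlaps track 1
              ((20, 21, 30, 31), 0.2),  # low score, overlaps track 2
              ((50, 50, 60, 60), 0.8)]  # confident, overlaps nothing
assigned, lost, new = byte_associate(tracks, detections)
```

In this toy frame, track 2 survives only because the second stage looks at the low-score detection that a simpler tracker would have thrown away.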

With this operator, you can add object tracking to your detect-track workflows. Let's demonstrate how to use the BYTE Tracker alongside a detection model to track objects in videos.

1. Go to the workflow builder page. Search for the visual-detector option in the left-hand sidebar and drag it onto the empty workspace. Then, use the pop-up that appears on the right-hand sidebar to search for a detection model, such as general-image-detection, and select its version. You can also set the other configuration options, including the concepts you want to filter.

2. Search for the byte-tracker option in the left-hand sidebar and drag it onto the workspace. You can set up its output configuration parameters, which are outlined below.

3. Connect the visual-detector model with the byte-tracker operator and save your workflow.

To observe it in action, navigate to the workflow's individual page and click the + button to input your video. For this example, let's provide this video.

The workflow analyzes the video and assigns each tracked object a track_id that stays the same for that object throughout the video.
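Once a workflow has run, the tracker output can be grouped by track. The sketch below assumes the response has already been converted to nested Python dictionaries that mirror the field paths listed under Input and Output; the sample data, the bounding box field names, and the group_by_track helper are made up for illustration and are not taken from a real response.

```python
from collections import defaultdict


def group_by_track(response):
    """Collect (frame_index, bounding_box) pairs for every track_id."""
    tracks = defaultdict(list)
    for frame_index, frame in enumerate(response["frames"]):
        for region in frame["data"]["regions"]:
            box = region["region_info"]["bounding_box"]
            tracks[region["track_id"]].append((frame_index, box))
    return dict(tracks)


# Hypothetical two-frame output with two tracked objects.
sample = {
    "frames": [
        {"data": {"regions": [
            {"track_id": "a",
             "region_info": {"bounding_box": {"top_row": 0.1, "left_col": 0.1,
                                              "bottom_row": 0.3, "right_col": 0.2}}},
            {"track_id": "b",
             "region_info": {"bounding_box": {"top_row": 0.5, "left_col": 0.6,
                                              "bottom_row": 0.9, "right_col": 0.8}}},
        ]}},
        {"data": {"regions": [
            {"track_id": "a",
             "region_info": {"bounding_box": {"top_row": 0.1, "left_col": 0.15,
                                              "bottom_row": 0.3, "right_col": 0.25}}},
        ]}},
    ]
}

by_track = group_by_track(sample)
```

Each key of by_track is one object identity, and its list shows the frames in which that object was seen.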

Centroid Tracker

Input: frames[…].data.regions[…].data.concepts, frames[…].data.regions[…].region_info.bounding_box

Output: frames[…].data.regions[…].track_id

Centroid trackers rely on the Euclidean distance between centroids of regions in different video frames to assign the same track ID to detections of the same object.

Here's a breakdown of how they operate:

  1. Object Detection: In the first step, an object detector or a segmentation model (not part of the centroid tracker itself) identifies objects in each frame of a video. The detector outputs bounding boxes around the identified objects.

  2. Centroid Calculation: For each bounding box, the centroid tracker calculates its centroid. The centroid is simply the center point of the box, typically represented by its X and Y coordinates.

  3. Distance Comparison: The tracker then compares the centroids of objects detected in the current frame with the centroids of objects from the previous frame. It calculates the Euclidean distance, which is a straight-line distance between two points in space.

  4. Track Assignment: Based on a predefined threshold value, the tracker assigns track IDs. Objects in the current frame whose centroids are within a certain distance of a centroid in the previous frame are considered to be the same object and are assigned the same track ID. Objects with centroids exceeding the threshold distance are assumed to be new objects and assigned new track IDs.
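The four steps above can be sketched as a small class. This is a minimal illustration under stated assumptions, not the operator's code: boxes are (x1, y1, x2, y2) tuples, matching is greedy (closest pair first), and the max_distance and max_disappeared arguments only loosely mirror the parameters of the same names.

```python
import math


class CentroidTracker:
    def __init__(self, max_distance=50.0, max_disappeared=2):
        self.max_distance = max_distance
        self.max_disappeared = max_disappeared
        self.next_id = 0
        self.centroids = {}  # track_id -> last known (x, y) centroid
        self.missed = {}     # track_id -> consecutive frames without a match

    @staticmethod
    def centroid(box):
        x1, y1, x2, y2 = box
        return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

    def update(self, boxes):
        """Return one track ID per box in the current frame."""
        points = [self.centroid(b) for b in boxes]
        # Every (distance, track, detection) candidate, closest first.
        candidates = sorted(
            ((math.dist(c, p), tid, j)
             for tid, c in self.centroids.items()
             for j, p in enumerate(points)),
            key=lambda t: t[0],
        )
        ids = [None] * len(boxes)
        matched = set()
        for dist, tid, j in candidates:
            if dist > self.max_distance:
                break  # all remaining candidates are farther away
            if tid in matched or ids[j] is not None:
                continue
            ids[j] = tid
            matched.add(tid)
        # Detections with no nearby track start new tracks.
        for j, p in enumerate(points):
            if ids[j] is None:
                ids[j] = self.next_id
                self.next_id += 1
            self.centroids[ids[j]] = p
            self.missed[ids[j]] = 0
        # Tracks with no detection this frame age, then get deregistered.
        for tid in list(self.centroids):
            if tid not in ids:
                self.missed[tid] += 1
                if self.missed[tid] > self.max_disappeared:
                    del self.centroids[tid], self.missed[tid]
        return ids


tracker = CentroidTracker(max_distance=20, max_disappeared=1)
frame1 = tracker.update([(0, 0, 10, 10), (100, 100, 110, 110)])
frame2 = tracker.update([(102, 101, 112, 111), (3, 0, 13, 10)])  # order swapped
frame3 = tracker.update([(300, 300, 310, 310)])                  # a new object
frame4 = tracker.update([])                                      # nothing detected
```

Because the decision uses distance alone, two objects that pass close to each other can swap IDs, which is the main limitation of this approach.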

Let's demonstrate how you can use the centroid tracker, alongside a detection model, to efficiently track objects in videos.

1. Go to the workflow builder page. Search for the visual-detector option in the left-hand sidebar and drag it onto the empty workspace. Then, use the pop-up that appears on the right-hand sidebar to search for a detection model, such as general-image-detection, and select its version. You can also set the other configuration options, including the concepts you want to filter.

2. Search for the centroid-tracker option in the left-hand sidebar and drag it onto the workspace. You can set up its output configuration parameters, which are outlined below.

3. Connect the visual-detector model with the centroid-tracker operator and save your workflow.

To observe it in action, navigate to the workflow's individual page and click the + button to input your video. For this example, let's provide this video.

The workflow analyzes the video and assigns each tracked object a track_id that stays the same for that object throughout the video.

Neural Tracker

Output: Regions

Neural tracker uses neural probabilistic models to perform filtering and association.

Kalman Filter Hungarian Tracker

Output: Regions

Kalman filter trackers rely on the Kalman filter algorithm to estimate the next position of an object based on its position and velocity in previous frames. Detections are then matched to these predictions using the Hungarian algorithm.
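The predict-then-match idea can be sketched in one dimension. This is a simplified illustration with several assumptions: the state is a single coordinate plus its velocity, the process noise value is arbitrary, and the assignment step is a brute-force search that finds the same minimum-cost pairing the Hungarian algorithm would, but is only practical for a handful of objects. The Kalman1D class and optimal_assignment function are hypothetical; covariance_error and observation_error loosely mirror the parameters of the same names.

```python
from itertools import permutations


class Kalman1D:
    """Constant-velocity Kalman filter over one coordinate."""

    def __init__(self, position, covariance_error=10.0,
                 observation_error=1.0, process_noise=0.01):
        self.p, self.v = float(position), 0.0
        self.P = [[covariance_error, 0.0], [0.0, covariance_error]]
        self.r = observation_error
        self.q = process_noise

    def predict(self, dt=1.0):
        """Advance the state by dt and return the predicted position."""
        (a, b), (c, d) = self.P
        self.p += self.v * dt
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I.
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        """Correct the state with a measured position z."""
        (a, b), (c, d) = self.P
        s = a + self.r                # innovation variance
        k0, k1 = a / s, c / s         # Kalman gain for position and velocity
        y = z - self.p                # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


def optimal_assignment(cost):
    """Minimum-total-cost pairing of rows to columns (needs rows <= columns)."""
    n_rows, n_cols = len(cost), len(cost[0])
    best = min(permutations(range(n_cols), n_rows),
               key=lambda cols: sum(cost[i][j] for i, j in enumerate(cols)))
    return list(enumerate(best))


# Two objects: one moving right from 0, one moving left from 10.
track_a, track_b = Kalman1D(0.0), Kalman1D(10.0)
for za, zb in [(2.0, 9.0), (4.0, 8.0)]:
    track_a.predict()
    track_a.update(za)
    track_b.predict()
    track_b.update(zb)

predictions = [track_a.predict(), track_b.predict()]
detections = [6.9, 6.1]  # unordered detections in the next frame
cost = [[abs(p - z) for z in detections] for p in predictions]
assignment = optimal_assignment(cost)  # list of (track_index, detection_index)
```

A real tracker would run the same steps over box coordinates and replace the brute-force search with a proper Hungarian solver.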

Kalman Reid Tracker

Output: Regions

The Kalman reid tracker is a Kalman filter tracker that expects the embedding proto field to be populated for each detection and reassigns track IDs based on embedding distance.
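The re-identification step can be sketched as a nearest-embedding lookup. The example below is an illustration only: it assumes cosine distance as the metric, which the description above does not specify, and the reassign_track_id helper and sample embeddings are hypothetical. The max_emb_distance argument loosely mirrors the parameter of the same name.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0 else 1.0


def reassign_track_id(new_embedding, dead_tracks, max_emb_distance=0.3):
    """Return the ID of the closest dead track, or None if none is close enough.

    dead_tracks maps track_id -> last stored embedding of a lost track.
    """
    best_id, best_dist = None, max_emb_distance
    for track_id, embedding in dead_tracks.items():
        dist = cosine_distance(new_embedding, embedding)
        if dist <= best_dist:
            best_id, best_dist = track_id, dist
    return best_id


dead_tracks = {"7": [1.0, 0.0], "9": [0.0, 1.0]}
same_object = reassign_track_id([0.9, 0.1], dead_tracks)   # close to track "7"
stranger = reassign_track_id([-1.0, 0.0], dead_tracks)     # far from both
```

When the lookup returns None, the detection would start a new track instead of reviving an old ID.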

Neural Lite Tracker

Output: Regions

Neural lite tracker uses lightweight trainable graphical models to infer states of tracks and perform associations using the hybrid similarity of IoU and centroid distance.
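One way to picture a hybrid of IoU and centroid distance is a weighted blend, sketched below. This is an assumption about the general idea, not the tracker's formula: the centroid term is normalised by the diagonal of the box that encloses both inputs, and the blend is linear. The function names are hypothetical; iou_dist_ratio follows the meaning given in the parameters table (1.0 is purely IoU, 0.0 is purely centroid distance).

```python
import math


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def centroid_similarity(a, b):
    """1.0 when the box centers coincide, approaching 0.0 as they separate."""
    ca = ((a[0] + a[2]) / 2.0, (a[1] + a[3]) / 2.0)
    cb = ((b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0)
    # Diagonal of the smallest box enclosing both inputs (assumed normaliser).
    diag = math.hypot(max(a[2], b[2]) - min(a[0], b[0]),
                      max(a[3], b[3]) - min(a[1], b[1]))
    return 1.0 - math.dist(ca, cb) / diag if diag > 0 else 1.0


def hybrid_similarity(a, b, iou_dist_ratio=0.5):
    """Linear blend of IoU and centroid similarity."""
    return (iou_dist_ratio * iou(a, b)
            + (1.0 - iou_dist_ratio) * centroid_similarity(a, b))


box = (0, 0, 10, 10)
neighbour = (12, 0, 22, 10)  # does not overlap box, but sits right next to it
pure_iou = hybrid_similarity(box, neighbour, iou_dist_ratio=1.0)
pure_centroid = hybrid_similarity(box, neighbour, iou_dist_ratio=0.0)
```

The centroid term keeps a useful signal for boxes that do not overlap at all, where IoU alone is zero.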

Tracker Operators Parameters

Here is a table outlining the various output configuration parameters you can configure for each operator (the symbol represents the operator that supports the parameter).

Parameter | Description | BYTE Tracker | Centroid Tracker | Neural Tracker | Kalman Filter Hungarian Tracker | Kalman Reid Tracker | Neural Lite Tracker
min_confidence | Minimum confidence score for a detection to be considered for tracking
min_visible_frames | Only return tracks whose number of visible frames is greater than min_visible_frames
track_id_prefix | Prefix added to track IDs to eliminate conflicts
max_disappeared | Maximum number of consecutive frames an object can be marked as “disappeared” before it is deregistered from tracking
new_track_confidence_thresh | Initialize a new track if the confidence score of the new detection is greater than this setting
confidence_thresh | Splits detections by score: those scoring above this value go to the first association, the rest to the second association
high_confidence_match_thresh | The distance threshold for high-score detections
low_confidence_match_thresh | The distance threshold for low-score detections
unconfirmed_match_thresh | The distance threshold for unconfirmed tracks, usually tracks with only one beginning frame. {“min”: 0, “max”: 1}
max_distance | Associate tracks with detections only when their distance is below max_distance
filtered_probability | If false, return the original detection probability; if true, return the processed probability from the tracker
max_detection | Maximum number of detections per frame
has_probability |
has_embedding |
association_confidence | The list of association confidences to perform for each round
covariance_error | Magnitude of the uncertainty on the initial state
observation_error | Magnitude of the uncertainty on detection coordinates
distance_metric | Distance metric for Hungarian matching
initialization_confidence | Confidence for starting a new track. Must be greater than min_confidence to have an effect
project_track | How many frames in total to project the box when no detection is recorded for the track
use_detect_box | How many frames to project the last detection box; should be less than project_track_frames (1 is the current frame)
project_without_detect | Whether to keep projecting the box forward if no detection is matched
project_fix_box_size | Whether to fix the box size when the track is in a project state
detect_box_fall_back | Rely on the detection box if the association error is above this value
keep_track_in_image | If this is 1, push the tracker's prediction to stay inside the image boundaries
match_limit_ratio | Multiplier to constrain association based on other associations (values < 1 are ignored)
match_limit_min_matches | Minimum number of matched tracks needed to invoke the match limit
optimal_assignment | If true, rule out pairs with distance > max_distance before assignment
max_emb_distance | Maximum embedding distance to be considered a re-identification
max_dead | Maximum number of frames for a track to be dead before we re-assign the ID
var_tracker | String that determines how embeddings from multiple timestamps are aggregated; defaults to “na” (the most recent embedding overwrites past embeddings)
reid_model_path | The path to the linker
iou_dist_ratio | If 1.0, purely IoU similarity; if 0.0, purely centroid distance similarity
mortal_th | Mortality threshold
min_box_area | Minimum area of a valid box
min_activity | Return only tracks with activities above min_activity
nms_iou_th | NMS IoU threshold
shrink_factor | Change box size by shrink_factor
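To make three of the common parameters concrete, the sketch below shows how min_confidence, min_visible_frames, and track_id_prefix could behave according to their descriptions. It is an illustration under assumptions, not the operators' logic: detections and regions are plain dictionaries with hypothetical keys, each track is assumed to appear at most once per frame, and the helper names are made up.

```python
from collections import Counter


def select_detections(detections, min_confidence):
    """Keep only detections confident enough to be considered for tracking."""
    return [d for d in detections if d["confidence"] >= min_confidence]


def finalize_tracks(tracked_regions, min_visible_frames=0, track_id_prefix=""):
    """Drop short-lived tracks, then prefix the IDs of the ones that remain."""
    # One region per track per frame is assumed, so the region count per
    # track_id equals the number of frames in which the track was visible.
    visible = Counter(r["track_id"] for r in tracked_regions)
    return [
        {**r, "track_id": f"{track_id_prefix}{r['track_id']}"}
        for r in tracked_regions
        if visible[r["track_id"]] > min_visible_frames
    ]


detections = [{"confidence": 0.9}, {"confidence": 0.05}]
kept = select_detections(detections, min_confidence=0.1)

tracked = [
    {"frame": 0, "track_id": "1"},
    {"frame": 1, "track_id": "1"},
    {"frame": 1, "track_id": "2"},  # seen in a single frame only
]
final = finalize_tracks(tracked, min_visible_frames=1, track_id_prefix="cam0-")
```

A prefix such as the made-up "cam0-" is useful when the outputs of several trackers are merged and their numeric IDs would otherwise collide.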