Input and Output Data Types

Learn about supported input and output data types along with usage examples


Clarifai's model framework supports rich data typing for both inputs and outputs, enabling flexible and type-safe model development.

Note that when preparing the model.py file for uploading a model to the Clarifai platform, each parameter in the class methods must be annotated with a type, and the return type must also be specified.
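Before uploading, it can help to check locally that no annotation is missing. Here is a minimal sketch that uses only the standard library; missing_annotations, good, and bad are hypothetical names introduced for illustration and are not part of Clarifai's framework:

```python
import inspect
from typing import get_type_hints


def missing_annotations(func):
    """Return the parameter names (and 'return') that lack a type annotation."""
    hints = get_type_hints(func)
    missing = [
        name
        for name in inspect.signature(func).parameters
        if name != "self" and name not in hints
    ]
    if "return" not in hints:
        missing.append("return")
    return missing


def good(self, prompt: str, max_tokens: int = 64) -> str:
    return prompt[:max_tokens]


def bad(self, prompt, max_tokens: int = 64):
    return prompt[:max_tokens]


print(missing_annotations(good))  # []
print(missing_annotations(bad))   # ['prompt', 'return']
```

A check like this could run in a unit test before upload, so that an unannotated parameter is caught locally.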

The supported types are categorized into Core Primitive, Python Primitive & Generic, and Custom Structured types.

Core Primitive Types

These are fundamental data types supported by Clarifai's framework for handling common data formats.

Type | Python Class | Description | Initialization Examples
Text | Text | UTF-8 encoded text | Text("Hello World"), Text(url="https://example.com/text.txt")
Image | Image | RGB images (PNG/JPG format) | Image(bytes=b""), Image(url="https://example.com/image.jpg"), Image.from_pil(pil_image)
Audio | Audio | Audio data (WAV/MP3 format) | Audio(bytes=b""), Audio(url="https://example.com/audio.mp3")
Video | Video | Video data (MP4/AVI format) | Video(bytes=b""), Video(url="https://example.com/video.mp4")
Frame | Frame | Video frame with metadata | Frame(time=1.5, image=Image(...))
Concept | Concept | Label with confidence score | Concept("cat", 0.97), Concept(name="dog", value=0.92)
Region | Region | Bounding box and list of Concepts | Region(box=[0.7, 0.3, 0.9, 0.7], concepts=[Concept("cat", 0.7), Concept(name="dog", value=0.2)])
NamedFields | dict | Structured data | {"scores": [0.9, 0.1]}

Python Primitive & Generic Types

Clarifai's framework also supports standard Python primitive and generic types for flexible data handling. These types enable type-safe processing of complex structures while maintaining compatibility with Python's native type system.

Standard Python Primitive Types

These fundamental data types are supported as both inputs and outputs.

Type | Example Inputs | Example Outputs
int | 42, user_age: int = 30 | return 100
float | 0.95, temperature: float = 36.6 | return 3.14159
str | "Hello", prompt: str = "Generate..." | return "success"
bool | True, flag: bool = False | return is_valid
bytes | b'raw_data', file_bytes: bytes | return processed_bytes
None | None | return None

Here is an example of a primitive type usage:

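from clarifai.runners.models.model_class import ModelClass


class MyModel(ModelClass):

    @ModelClass.method
    def calculate_bmi(
        self,
        height_cm: float,
        weight_kg: float
    ) -> float:
        """Calculate Body Mass Index"""
        # BMI = weight in kilograms divided by the square of height in meters
        return weight_kg / (height_cm / 100) ** 2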
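The formula can be checked outside the model class. Here is a small standalone sketch, with the method rewritten as a plain function and hypothetical sample values of 175 cm and 70 kg:

```python
def calculate_bmi(height_cm: float, weight_kg: float) -> float:
    """Plain-function version of the method above, for a quick local check."""
    # ** binds tighter than /, so the height in meters is squared before dividing
    return weight_kg / (height_cm / 100) ** 2


bmi = calculate_bmi(height_cm=175.0, weight_kg=70.0)
print(round(bmi, 2))  # 22.86
```

Both arguments and the result are plain floats, matching the float annotations on the method.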

Generic Container Types

Clarifai supports generic types for handling complex structures while maintaining compatibility with Python’s type system.

List[T]

This handles homogeneous collections of any supported type, such as a list of images.

Here is an example of using List[T] for batch processing:

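from typing import List

from clarifai.runners.models.model_class import ModelClass
from clarifai.runners.utils.data_types import Image


class MyModel(ModelClass):

    def load_model(self):
        # Load the underlying model once, before any predictions run
        self.model = ...

    @ModelClass.method
    def predict_images(self, images: List[Image]) -> List[str]:
        """Process multiple images simultaneously"""
        return [self.model(img) for img in images]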

Here is a client usage example:

images = [
    Image(file_path="img1.jpg"),
    Image(url="https://example.com/img2.png")
]
predictions = model.predict_images(images=images)

Dynamic Batch Prediction Handling

Clarifai's model framework automatically handles both single and batch predictions through a unified interface. It dynamically adapts to the input format, eliminating the need for code changes to support different input types.

The framework detects the input format automatically:

  • Single input: processed as a singleton batch.

  • Multiple inputs: a list of inputs is processed as a parallel batch.

This flexibility allows you to pass either a single input or a list of inputs, and the system will handle them appropriately without requiring additional configuration.
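The idea of treating a single input as a singleton batch can be sketched in plain Python. This is only an illustration of the pattern, not Clarifai's implementation: run_single_or_batch and word_count are hypothetical names, and the sketch processes items sequentially rather than in parallel.

```python
from typing import Callable, List, TypeVar, Union

T = TypeVar("T")
R = TypeVar("R")


def run_single_or_batch(
    fn: Callable[[T], R], inputs: Union[T, List[T]]
) -> Union[R, List[R]]:
    """Apply fn to one input or to each item of a list, mirroring the input's shape."""
    is_batch = isinstance(inputs, list)
    batch = inputs if is_batch else [inputs]  # a single input becomes a singleton batch
    results = [fn(item) for item in batch]
    return results if is_batch else results[0]


def word_count(text: str) -> int:
    """Hypothetical stand-in for a model method that handles one input."""
    return len(text.split())


print(run_single_or_batch(word_count, "Positive review"))  # 2
print(run_single_or_batch(word_count, ["Great product", "Terrible service today"]))  # [2, 3]
```

The caller gets back a single result for a single input and a list for a list, so the per-item function never needs to know which form was used.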

Here is an example of a model configuration that supports both single and batch predictions:

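from clarifai.runners.models.model_class import ModelClass
from clarifai.runners.utils.data_types import Text

class TextClassifier(ModelClass):

    @ModelClass.method
    def predict(self, text: Text) -> float:
        """Single text classification (automatically batched)"""
        return self.model(text.text)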

Here is a client usage example:

# Single prediction
single_result = model.predict(Text("Positive review"))

# Batch prediction
batch_results = model.predict([
    Text("Great product"),
    Text("Terrible service"),
    Text("Average experience")
])

Dict[K, V]

This supports JSON-like structures with string keys.

Here is an example of using Dict[K, V] for handling model configuration:

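from typing import Dict

from clarifai.runners.models.model_class import ModelClass


class MyModel(ModelClass):

    @ModelClass.method
    def configure_model(
        self,
        params: Dict[str, float]
    ) -> Dict[str, str]:
        """Update model parameters"""
        # Fall back to 0.5 when no threshold is supplied
        self.threshold = params.get('threshold', 0.5)
        return {"status": "success", "new_threshold": str(self.threshold)}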
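String keys matter for JSON-like structures because JSON itself only allows string keys. Here is a short sketch using Python's standard json module; it illustrates the general constraint and makes no claim about how Clarifai serializes dictionaries. The params values are hypothetical.

```python
import json

# Hypothetical parameter dictionary matching Dict[str, float]
params = {"threshold": 0.7, "temperature": 0.2}

# String keys survive a JSON round trip unchanged
round_tripped = json.loads(json.dumps(params))
print(round_tripped == params)  # True

# Non-string keys do not: JSON coerces the integer key 1 to the string "1"
coerced = json.loads(json.dumps({1: 0.5}))
print(coerced)  # {'1': 0.5}
```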

Tuple[T1, T2, ...]

This handles fixed-size heterogeneous data.

Here is an example of using Tuple[T1, T2, ...] for multi-output models:

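from typing import Dict, List, Tuple

from clarifai.runners.models.model_class import ModelClass
from clarifai.runners.utils.data_types import Text


class MyModel(ModelClass):

    @ModelClass.method
    def analyze_document(
        self,
        doc: List[Text]
    ) -> Tuple[List[Text], Dict[str, float]]:
        """Return keywords and sentiment scores"""
        # Placeholder logic: echo the documents and report how many were received
        return (doc, {"docs": float(len(doc))})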
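Because a tuple has a fixed size, a caller can unpack each position into its own variable. Here is a standalone sketch of that pattern using plain Python types; the analyze function and its keyword rule are hypothetical and stand in for a real model method:

```python
from typing import Dict, List, Tuple


def analyze(docs: List[str]) -> Tuple[List[str], Dict[str, float]]:
    """Return (keywords, stats): two values of different types in fixed positions."""
    # Hypothetical keyword rule: unique lowercase words longer than four characters
    keywords = sorted({word.lower() for doc in docs for word in doc.split() if len(word) > 4})
    stats = {"docs": float(len(docs))}
    return keywords, stats


# Position 0 is always the keyword list, position 1 is always the stats dict
keywords, stats = analyze(["Great product", "Terrible service"])
print(keywords)  # ['great', 'product', 'service', 'terrible']
print(stats)     # {'docs': 2.0}
```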

Custom Structured Types with NamedFields

The NamedFields class enables creation of custom structured data types for handling complex inputs and outputs. This is particularly useful for models requiring multi-field data or producing compound results.

Here is an example of using NamedFields to define a custom document metadata type:

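from typing import List

from clarifai.runners.models.model_class import ModelClass
from clarifai.runners.utils.data_types import NamedFields, Text

# Custom structured type describing a document
DocumentMetadata = NamedFields(
    author=str,
    title=str,
    page_count=int,
    keywords=List[str]
)


class MyModel(ModelClass):

    @ModelClass.method
    def process_document(
        self,
        content: Text,
        metadata: DocumentMetadata
    ) -> NamedFields(
        summary=Text,
        sentiment=float,
        topics=List[str]):
        ...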

Here is an example of streaming complex structured data using Stream[NamedFields]:

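from clarifai.runners.models.model_class import ModelClass
# Assumption: Stream is exported from the same module as NamedFields
from clarifai.runners.utils.data_types import NamedFields, Stream


class RealTimeAnalytics(ModelClass):

    @ModelClass.method
    def monitor_sensors(
        self,
        sensor_stream: Stream[NamedFields(
            temperature=float,
            pressure=float,
            timestamp=float
        )]) -> Stream[NamedFields(
            status=str,
            anomaly_score=float
        )]:
        # Yield one status per reading as it arrives
        for reading in sensor_stream:
            yield self._analyze_reading(reading)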

Here is a client usage example:

sensor_data = [
    {"temperature": 25.6, "pressure": 1013, "timestamp": 1625097600},
    {"temperature": 26.1, "pressure": 1012, "timestamp": 1625097610},
    {"temperature": 27.5, "pressure": 1011, "timestamp": 1625097620}
]

for status in model.monitor_sensors(iter(sensor_data)):
    if status.anomaly_score > 0.9:
        # Stop at the first reading flagged as anomalous
        print("Anomaly detected")
        break
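The streaming pattern itself can be tried without a deployed model. Here is a minimal sketch with plain Python generators, showing that results are produced one reading at a time and that stopping early leaves later readings unread. The names sensor_source and monitor, the scoring rule, and the sample readings are all hypothetical; none of this is Clarifai's implementation.

```python
from typing import Dict, Iterable, Iterator, List

consumed: List[float] = []  # timestamps pulled from the source so far


def sensor_source(readings: List[Dict[str, float]]) -> Iterator[Dict[str, float]]:
    """Hypothetical input stream that records each reading it hands out."""
    for reading in readings:
        consumed.append(reading["timestamp"])
        yield reading


def monitor(stream: Iterable[Dict[str, float]], limit: float = 27.0) -> Iterator[Dict[str, object]]:
    """Yield one status per reading, as soon as that reading is read."""
    for reading in stream:
        # Hypothetical scoring rule: degrees above the temperature limit
        score = max(0.0, reading["temperature"] - limit)
        yield {"status": "alert" if score > 0 else "ok", "anomaly_score": score}


readings = [
    {"temperature": 25.6, "timestamp": 1.0},
    {"temperature": 27.5, "timestamp": 2.0},
    {"temperature": 26.1, "timestamp": 3.0},
]

first_alert = None
for index, status in enumerate(monitor(sensor_source(readings))):
    if status["status"] == "alert":
        first_alert = index
        break  # stop consuming; the third reading is never pulled

print(first_alert)    # 1
print(len(consumed))  # 2
```

Only two of the three readings are consumed, because the generator is advanced only when the loop asks for the next status.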