
Release 12.3

Release Date: April 8th, 2026




Agent Skills

New Feature: Released Clarifai Skills
  • Clarifai Skills are specialized prompt templates that transform AI coding assistants — Claude Code, Cursor, Codex, and more — into Clarifai platform experts.
  • Learn more in the Clarifai Skills documentation.

Model Training

New Feature: Added model pipeline training from templates
  • You can now train models using pipeline templates, enabling a streamlined, configuration-driven training workflow.
Deprecation: Deprecated legacy model training (Triton + Kubeflow)
  • Legacy model training methods using Triton and Kubeflow have been deprecated.
  • Note: Transfer Learn training remains available.

Request Routing

Improvement: Improved how Clarifai routes prediction requests for optimal performance
  • Added KV cache affinity to route requests to replicas with relevant cache state.
  • Added session-aware routing to keep user requests on the same replica.
  • Reduced cold starts with automatic pre-warming of popular instances.
  • Added prediction caching for identical input + model + version combinations.
  • Learn more in the request routing documentation.
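The prediction cache described above keys on the combination of input, model, and version. The following is a minimal sketch of how such a cache key could be derived; the function and parameter names are illustrative, not Clarifai's actual implementation:

```python
import hashlib

def prediction_cache_key(input_bytes: bytes, model_id: str, version_id: str) -> str:
    """Derive a deterministic cache key from input + model + version.

    Identical combinations hash to the same key, so a repeated request
    can be answered from cache instead of being routed to a replica.
    """
    h = hashlib.sha256()
    for part in (input_bytes, model_id.encode(), version_id.encode()):
        # Length-prefix each field so different field splits can't collide.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()
```

Because the key covers the model version as well, publishing a new version naturally invalidates cached results for the old one.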

UI Updates

Improvement: Updated Compute List and View pages
  • Refreshed the List and View pages for deployments, nodepools, and clusters with improved layouts and information display.
Improvement: Updated the log viewer component
  • Improved the log viewer UI for better readability and navigation of model and runner logs.
Improvement: Updated the Home experience
  • Refreshed the Clarifai platform Home page with an improved experience for navigating resources and getting started.
Improvement: Updated the model page UI
  • Redesigned model page with an improved layout and user experience.

Python SDK

Model Serving & Deployment

New Feature: Added clarifai model deploy command and simplified clarifai model init
  • New clarifai model deploy command with multi-cloud GPU discovery and a zero-prompt deployment flow.
  • Simplified config.yaml structure for model initialization.
Improvement: Smart resource reuse and private-by-default for clarifai model serve
  • Model serve now reuses existing resources when available instead of creating new ones.
  • Served models are private by default.
Improvement: Added --keep flag to clarifai model serve
  • Use --keep to preserve the build directory after serving, useful for debugging and inspecting build artifacts.
Improvement: Local Runner is now public by default
  • Models launched via the local runner are now publicly accessible by default, removing the need to manually set visibility.

Model Runner

New Feature: Added VLLMOpenAIModelClass
  • New VLLMOpenAIModelClass parent class with built-in cancellation support and health probes for vLLM-backed models.
Improvement: Optimized model runner memory and latency
  • Reduced memory footprint and improved response latency in the model runner.
  • Reduced overhead in SSE (Server-Sent Events) streaming.
Improvement: Auto-detect and clamp max_tokens
  • The runner now automatically detects the backend's max_seq_len and clamps max_tokens to that value, preventing out-of-range errors.
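The clamping behavior above can be sketched as follows. This is a simplified illustration, assuming the backend's max_seq_len bounds prompt and completion tokens combined; the function name and signature are hypothetical, not the runner's actual API:

```python
def clamp_max_tokens(requested_max_tokens: int, prompt_tokens: int, max_seq_len: int) -> int:
    """Clamp a requested completion budget to what the backend can serve.

    If max_seq_len bounds prompt + completion combined, the effective
    ceiling for newly generated tokens is max_seq_len - prompt_tokens.
    Returning the clamped value (rather than rejecting the request)
    avoids out-of-range errors from the backend.
    """
    ceiling = max(0, max_seq_len - prompt_tokens)
    return min(requested_max_tokens, ceiling)
```

For example, a request asking for 4096 completion tokens against a 2048-token backend with a 1000-token prompt would be clamped to 1048 tokens instead of failing.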

Bug Fixes

Bug Fix: Fixed reasoning-model token tracking and streaming in the agentic class
  • Fixed token tracking for reasoning models to correctly account for reasoning tokens.
  • Fixed event-loop safety, streaming, and tool call passthrough in the agentic class.
Bug Fix: Fixed user/app context conflicts in the CLI
  • Resolved conflicts between user_id and app_id when using named contexts in CLI commands.
Bug Fix: Fixed clarifai model init directory handling
  • clarifai model init now correctly updates an existing model directory instead of creating a subdirectory.