Release 11.9

Release Date: October 13th, 2025

New Feature	Improvement	Bug Fix	Enterprise Only

Reasoning Engine

Status	Change	Details
	Clarifai has launched a Reasoning Engine optimized for agentic AI inference	In independent benchmarks conducted by Artificial Analysis on GPT-OSS 120B, the Clarifai Reasoning Engine set new records on standard GPUs: 544 tokens/sec throughput, 3.6s time-to-first-token, and $0.16 per million tokens. Learn more here.
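To put the quoted benchmark figures in perspective, the rough arithmetic can be sketched as below. This is only a back-of-the-envelope estimate: the function name `estimate_request` is hypothetical, and it assumes that throughput stays constant after the first token and that the quoted price applies equally to input and output tokens, neither of which is stated in the release note.

```python
# Figures quoted in the release note (Artificial Analysis, GPT-OSS 120B).
THROUGHPUT_TOKENS_PER_SEC = 544.0
TIME_TO_FIRST_TOKEN_SEC = 3.6
PRICE_USD_PER_MILLION_TOKENS = 0.16


def estimate_request(output_tokens: int, input_tokens: int = 0) -> tuple[float, float]:
    """Return (approx_seconds, approx_usd) for a single request.

    Assumptions (not from the release note): throughput is constant once
    generation starts, and one blended price covers input and output tokens.
    """
    seconds = TIME_TO_FIRST_TOKEN_SEC + output_tokens / THROUGHPUT_TOKENS_PER_SEC
    usd = (input_tokens + output_tokens) * PRICE_USD_PER_MILLION_TOKENS / 1_000_000
    return seconds, usd


if __name__ == "__main__":
    seconds, usd = estimate_request(output_tokens=5440)
    print(f"~{seconds:.1f} s, ~${usd:.6f}")
```

Under these assumptions, a response of a few thousand tokens takes on the order of ten seconds beyond the time-to-first-token and costs a small fraction of a cent; real workloads will vary with prompt length, concurrency, and pricing terms.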

Toolkits

Status	Change	Details
	Added support for toolkits	Added support for initializing models with the `vLLM`, `LMStudio`, and `Hugging Face` toolkits for local runners.

Published Models

Status	Change	Details
	Published new models	Published Qwen3-Next-80B-A3B-Thinking, an 80B-parameter, sparsely activated, reasoning-optimized LLM that delivers near-flagship performance on complex reasoning tasks with extreme efficiency in training and ultra-long context inference (up to 256K tokens). Published Qwen3-30B-A3B-Instruct-2507, which improves comprehension, coding, multilingual knowledge, and user alignment, and supports 256K long-context handling. Published Qwen3-30B-A3B-Thinking-2507, an enhanced version with significantly improved reasoning, general capabilities, user alignment, and long-context understanding.

B200s and GH200s

Status	Change	Details
	Added new Vultr cloud instances	Introduced competitively priced B200 instances operating from Seattle and GH200 instances powered by Vultr.

Platform

Status	Change	Details
	Made some platform improvements	Added short role descriptions in the Invite Member modal to clarify each member’s permissions when inviting them to an organization or modifying their access. In the Control Center, charts for computer vision models (displaying predictions) and for LLMs (displaying tokens) are now presented separately, each with its own usage and cost information.
	Fixed some bugs	Fixed an issue that prevented the signup/login modal from appearing on the Compute page (clarifai.com/compute).

Python SDK

Status	Change	Details
	Improved the Python SDK (learn more here)	Introduced a new `patch_version` method in the Model class and integrated it into local runner workflows. Improved logging by highlighting example code scripts printed during local runner workflows. Changed the default local development model type from `text-to-text` to `any-to-any`. Fixed an issue with converting gRPC response enums to integers during runner creation. Fixed a `TypeError` when parsing checkpoint size from an environment variable. Enabled health probe support, allowing `ModelClass` implementations to define liveness/readiness checks. Improved interactive pipeline initialization with user prompts that replace placeholder TODO values. Implemented Git registry metadata capture during model upload with model-scoped change detection. Local runner now automatically uses the latest local-dev model version. Reduced friction by continuing to leverage a single prebuilt AMD base image. Updated type hints and docstring descriptions across all major files in the `clarifai/client` folder for better code quality, maintainability, and developer experience. Improved overall Model CLI UX with consolidated flags, clearer help text, and better error surfacing. Updated the `clarifai model predict` CLI to align with recent Pythonic model changes. Updated the local runner default API base URL. Refined logging in model and pipeline step builders for clearer diagnostics. Added pagination support to pipeline log monitoring, so that entries beyond the first 50 are also returned.
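The pagination change to pipeline log monitoring follows a common pattern: keep requesting pages until a short page signals the end. The sketch below illustrates that pattern only; it is not the Clarifai SDK's implementation. `iter_all_entries`, `fetch_page`, and `fake_fetch_page` are hypothetical names, and the only detail taken from the release note is the page size of 50.

```python
from typing import Callable, Iterator, List


def iter_all_entries(
    fetch_page: Callable[[int, int], List[str]],
    per_page: int = 50,
) -> Iterator[str]:
    """Yield every entry by walking pages until a short (or empty) page appears.

    `fetch_page(page, per_page)` is a hypothetical callable standing in for
    whatever request actually returns one page of log entries.
    """
    page = 1
    while True:
        entries = fetch_page(page, per_page)
        yield from entries
        if len(entries) < per_page:
            break  # last page reached
        page += 1


def fake_fetch_page(page: int, per_page: int) -> List[str]:
    """In-memory stand-in for a log backend holding 120 entries."""
    logs = [f"log line {i}" for i in range(120)]
    start = (page - 1) * per_page
    return logs[start:start + per_page]


all_logs = list(iter_all_entries(fake_fetch_page))
```

With a single unpaginated request only the first 50 entries would be seen; walking the pages collects all 120 entries from the stand-in backend.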
