Ant Group's Ling 2.6 Flash scores 26 on the Artificial Analysis Intelligence Index, a 10-point jump from Ling-flash-2.0. It is one of the few recent open weights releases focused on non-reasoning capabilities, and it targets a reasonable cost-to-intelligence ratio.
Ling 2.6 Flash is a non-reasoning model from Ant Group's @TheInclusionAI lab. Ant Group's model family comprises three series: Ling (non-reasoning), Ring (reasoning), and Ming (multimodal). Ling-flash-2.0 was the previous flash-tier non-reasoning model. Ling 2.6 Flash is expected to be open weights shortly after release, but as of today the weights have not been released on Hugging Face.
Key takeaways:
➤ At 104B total parameters with 7.4B active parameters, Ling 2.6 Flash (26) scores close to GPT-5.4 nano (Non-reasoning, 24) and Gemma 4 26B A4B (Non-reasoning, 27), both models with comparable active parameter counts. However, at 18 points behind GLM-5.1 (Non-reasoning, 44), a gap remains to frontier non-reasoning open weights models.
➤ Ling 2.6 Flash is comparatively token efficient, using ~15M output tokens to run the Intelligence Index. This is comparable to Gemma 4 26B A4B (~14M) but a fraction of Qwen3.5 9B (~78M). Compared to models in a similar intelligence tier, Ling 2.6 Flash represents a reasonable efficiency tradeoff, which lowers cost on larger workloads. At a price of $0.10 / million input tokens and $0.30 / million output tokens, Ling 2.6 Flash costs only ~$23 to run the full Artificial Analysis Intelligence Index.
➤ Gains from Ling-flash-2.0 were driven mostly by improvements in agentic capabilities and instruction following. τ²-Bench jumped from 21% to 86% (+65 points), IFBench from 34% to 57% (+23 points), and GDPval-AA Elo from 425 to 783 (+84%). Conversely, GPQA Diamond fell from 66% to 59% (-6 points) and SciCode from 29% to 27% (-2 points).
➤ AA-Omniscience performance is at -66 with 15% accuracy and 96% hallucination rate. This is consistent with the model's small 7.4B active parameter count. Knowledge recall benefits from larger parameter counts, and sub-10B active-parameter models systematically underperform on this metric.
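The gains quoted above mix two kinds of change: percentage-scored benchmarks are reported in absolute points, while the GDPval-AA Elo gain is reported relative to its starting value. A minimal sketch of that arithmetic, using only the rounded figures from the post (the function and variable names are illustrative, not part of any Artificial Analysis tooling):

```python
# Before/after scores as quoted in the post (rounded values).
scores = {
    "tau2-Bench (%)": (21, 86),
    "IFBench (%)": (34, 57),
    "GDPval-AA (Elo)": (425, 783),
}


def point_change(before, after):
    """Absolute difference, as used for percentage-scored benchmarks."""
    return after - before


def relative_change(before, after):
    """Change as a percentage of the starting value, as used for Elo."""
    return (after - before) / before * 100


for name, (before, after) in scores.items():
    print(
        f"{name}: {point_change(before, after):+d} absolute, "
        f"{relative_change(before, after):+.0f}% relative"
    )
```

For the Elo row, the relative figure is the one that matches the post's +84%; for the two percentage-scored rows, the absolute figure matches the quoted point gains.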
Additional model details:
➤ Architecture: MoE, 104B total parameters, 7.4B active parameters
➤ Context window: 262K tokens (doubled from 128K for Ling-flash-2.0)
➤ Pricing: $0.10 / $0.30 per 1M input/output tokens (via Novita API)
➤ License: Weights not yet released
➤ Availability: Third-party API through @novita_labs
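As a rough illustration of how the listed pricing translates into run cost, the sketch below prices the ~15M output tokens quoted earlier. The post does not give the input token count for the Intelligence Index, so only the output portion is computed here; the helper name `run_cost` is hypothetical.

```python
# Prices from the post (Novita API), in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.30


def run_cost(input_tokens, output_tokens):
    """Total USD cost for a workload at the listed per-million-token prices."""
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_M
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    )


# Output tokens only: ~15M tokens, per the post. Input tokens are set to 0
# because the post does not state how many were used.
output_only = run_cost(0, 15_000_000)
print(f"Output-token cost: ${output_only:.2f}")
```

Under these figures the output tokens account for about $4.50, so most of the ~$23 total presumably comes from input tokens.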