Tag

#parameter efficiency

3 articles

Liquid AI Releases LFM2.5-8B-A1B: An On-Device MoE Model With 8.3B Total and 1.5B Active Parameters

This explainer covers the Mixture-of-Experts (MoE) architecture of Liquid AI's LFM2.5-8B-A1B model and how sparse activation, with about 1.5B of its 8.3B parameters active at a time, makes capable on-device AI practical.

May 28

Qwen3.6-27B beats much larger predecessor on most coding benchmarks

This article explains how Alibaba's Qwen3.6-27B model outperforms its much larger predecessor on most coding benchmarks, and what that result shows about gains in parameter efficiency and model optimization.

Apr 25

NVIDIA Releases Nemotron-Cascade 2: An Open 30B MoE with 3B Active Parameters, Delivering Better Reasoning and Strong Agentic Capabilities

This explainer covers NVIDIA's Nemotron-Cascade 2, an open 30B Mixture-of-Experts (MoE) model with 3B active parameters, and how activating only a fraction of its parameters can improve reasoning while keeping compute costs low.

Mar 20