JetBrains Releases Mellum2: A 12B MoE Model for Fast, Specialized Tasks in Multi-Model AI Pipelines

JetBrains has released Mellum2, a 12-billion-parameter mixture-of-experts (MoE) model trained on 10.6 trillion tokens and designed to accelerate specialized AI tasks in multi-model pipelines.

JetBrains, the software company best known for its integrated development environments (IDEs), has released Mellum2, a 12-billion-parameter mixture-of-experts (MoE) model designed to accelerate specialized AI tasks within multi-model pipelines. The model is available under the Apache 2.0 open-source license, extending the company's AI work for developers and enterprises.

Training and Performance

Mellum2 was trained on 10.6 trillion tokens. Unlike a dense model, which runs every input through all of its parameters, a mixture-of-experts model routes each input to a small subset of specialized sub-networks, called experts, so only part of the 12 billion parameters is active at any given step. This lowers the compute needed per token compared with a dense model of the same total size, which is the basis for the faster processing the model is designed to deliver without giving up output quality.
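The routing idea behind a mixture-of-experts layer can be illustrated with a small sketch. This is a generic top-k gating example, not Mellum2's actual implementation: the number of experts, the value of k, the router weights, and every function name here are assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    # Renormalize over the chosen experts only, so the weights sum to 1.
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))

def moe_layer(token, experts, router, k=2):
    """Run only the k selected experts on one token and mix their outputs."""
    routes = top_k_route(router(token), k)
    output = [0.0] * len(token)
    for idx, weight in routes:
        expert_out = experts[idx](token)  # unselected experts are never called
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output, [idx for idx, _ in routes]

# Hypothetical toy setup: 4 experts, each scaling its input by a different factor.
NUM_EXPERTS = 4

def make_expert(scale):
    return lambda token: [scale * x for x in token]

EXPERTS = [make_expert(i + 1) for i in range(NUM_EXPERTS)]

# Hypothetical fixed router weights: one row per expert.
ROUTER_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

def router(token):
    """Score each expert as a dot product between its row and the token."""
    return [sum(w * x for w, x in zip(row, token)) for row in ROUTER_WEIGHTS]

if __name__ == "__main__":
    out, used = moe_layer([2.0, 1.0], EXPERTS, router, k=2)
    print("experts used:", used)
    print("output:", out)
```

In this toy setup only two of the four experts run for each token, which is where the compute saving comes from. Production MoE layers apply the same idea to learned routers and feed-forward experts inside a transformer.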

Use Cases and Implications

The model is aimed at environments where developers combine multiple AI models in a single workflow. Its specialized architecture suits tasks such as code generation, automated testing, and intelligent debugging, all of which are increasingly important in modern software development. By executing faster with less computational overhead, Mellum2 could improve productivity for development teams that use AI-assisted tools.
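One way to picture a multi-model pipeline is a dispatcher that sends narrow, latency-sensitive tasks to a fast specialized model and everything else to a larger general model. The sketch below is a hypothetical illustration only: the task names, the endpoint names, and the stand-in model functions are all assumptions, and no real JetBrains or Mellum2 API is being called.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class ModelEndpoint:
    name: str
    run: Callable[[str], str]

def fast_specialist(prompt: str) -> str:
    # Stand-in for a call to a small, low-latency specialized model (hypothetical).
    return f"[fast] {prompt}"

def general_model(prompt: str) -> str:
    # Stand-in for a call to a larger general-purpose model (hypothetical).
    return f"[general] {prompt}"

FAST = ModelEndpoint("fast-specialist", fast_specialist)
GENERAL = ModelEndpoint("general", general_model)

# Hypothetical routing table: task types handled by the specialized model.
ROUTES: Dict[str, ModelEndpoint] = {
    "code_completion": FAST,
    "test_generation": FAST,
}

def dispatch(task_type: str, prompt: str) -> Tuple[str, str]:
    """Pick an endpoint by task type, falling back to the general model."""
    endpoint = ROUTES.get(task_type, GENERAL)
    return endpoint.name, endpoint.run(prompt)

if __name__ == "__main__":
    print(dispatch("code_completion", "def add(a, b):"))
    print(dispatch("architecture_review", "Review this service design"))
```

Keeping the routing table separate from the model calls means a new specialized model can be added by registering an endpoint, without changing the dispatch logic.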

With this release, JetBrains continues its work on integrating AI into software development tools. Because Mellum2 is open source, the broader developer community can inspect, adapt, and build on it, which could accelerate progress in AI-driven development practices.
