Tag
16 articles
Learn to implement and evaluate a hybrid MoE-diffusion model that demonstrates the performance benefits of converting autoregressive LLMs into diffusion models for improved inference speed.
Meta and Stanford researchers introduce the Fast Byte Latent Transformer, reducing inference memory bandwidth by over 50% without subword tokenization.
Learn how to set up and use FlashKDA, an open-source high-performance implementation of Kimi Delta Attention from Moonshot AI, for accelerating attention computation in large language models.
Learn to implement compressed sparse attention mechanisms that enable processing one-million-token context windows, similar to DeepSeek-V4's approach.
An open-source project called OpenMythos attempts to reconstruct Anthropic's Claude Mythos architecture from first principles, achieving 1.3B-level performance with only 770M parameters through advanced modeling techniques.
Learn how Parcae, a new AI architecture, helps language models become more efficient and powerful without needing to be twice the size. Understand how this breakthrough could make AI more sustainable and accessible.
This article explains how AI-driven content automation works in journalism and examines the labor implications of AI integration in newsrooms, as demonstrated by the ProPublica strike.
This article explains the technical aspects of embedding models and how Microsoft's Harrier model achieves superior multilingual performance while remaining compact and efficient.
Learn to analyze emotion-like representations in language models using transformer activation analysis, attention visualization, and behavioral pattern detection techniques.
Learn how Falcon Perception, a new AI system, combines image and language processing to better understand natural language prompts and find specific objects in images.
This explainer explores the concept of Artificial General Intelligence (AGI) and the view of OpenAI's Greg Brockman that GPT reasoning models are on a clear path toward achieving it, a trajectory he describes with the term 'line of sight'.
Explore the significance of Hugging Face's TRL v1.0, a unified framework for aligning large language models through post-training techniques like SFT, Reward Modeling, DPO, and GRPO.