Tag

#supervised learning

2 articles

A Coding Guide on LLM Post Training with TRL from Supervised Fine Tuning to DPO and GRPO Reasoning

Learn how to improve large language models using post-training techniques like Supervised Fine-Tuning, Reward Modeling, DPO, and GRPO with the TRL library.

May 148

NVIDIA AI Introduces PivotRL: A New AI Framework Achieving High Agentic Accuracy With 4x Fewer Rollout Turns Efficiently

Learn how NVIDIA's PivotRL framework combines supervised learning and reinforcement learning to reach high agentic accuracy with 4x fewer rollout turns.

Mar 2449