Tag
2 articles
Learn how to improve large language models with post-training techniques such as Supervised Fine-Tuning, Reward Modeling, DPO (Direct Preference Optimization), and GRPO (Group Relative Policy Optimization), using the TRL library.
Learn how NVIDIA's new PivotRL framework improves AI training efficiency by combining supervised learning and reinforcement learning to achieve better performance with fewer attempts.