Learn how to improve large language models with post-training techniques such as supervised fine-tuning (SFT), reward modeling, Direct Preference Optimization (DPO), and Group Relative Policy Optimization (GRPO), using the TRL library.