Researchers have built an end-to-end multimodal RLVR (reinforcement learning with verifiable rewards) pipeline on the TuringEnterprises/Open-MM-RL dataset, combining vision-language prompting, reward scoring, and export for GRPO (Group Relative Policy Optimization) training.
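To make the reward-scoring and GRPO-export stages concrete, here is a minimal sketch of how they could fit together. It is not the article's implementation: the exact-match reward, the record field names (`prompt`, `image`, `completions`, `rewards`, `advantages`), and the helper names are all hypothetical, and the dataset's real schema is not assumed. The advantage step uses the usual group-relative normalization (reward minus group mean, divided by group standard deviation); population standard deviation is an assumption here.

```python
import json
import re
from statistics import mean, pstdev


def normalize(text: str) -> str:
    """Lowercase, trim, and collapse whitespace so formatting noise does not affect scoring."""
    return re.sub(r"\s+", " ", text.strip().lower())


def verifiable_reward(completion: str, reference: str) -> float:
    """Hypothetical verifiable reward: 1.0 on normalized exact match, else 0.0."""
    return 1.0 if normalize(completion) == normalize(reference) else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one prompt's group of sampled completions."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, implementations differ
    return [(r - mu) / (sigma + eps) for r in rewards]


def to_grpo_record(prompt: str, image_path: str,
                   completions: list[str], reference: str) -> dict:
    """Build one export record; every field name here is illustrative only."""
    rewards = [verifiable_reward(c, reference) for c in completions]
    return {
        "prompt": prompt,
        "image": image_path,
        "completions": completions,
        "rewards": rewards,
        "advantages": group_relative_advantages(rewards),
    }


def export_jsonl(records: list[dict], path: str) -> None:
    """Write one JSON object per line, a common interchange format for trainers."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
```

With a group of four sampled answers where two match the reference, the two correct completions get positive advantages and the two incorrect ones get negative advantages of equal magnitude. When every completion in a group earns the same reward, all advantages are zero, so that group contributes no learning signal.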