TalkRL: The Reinforcement Learning Podcast

Arash Ahmadian on Rethinking RLHF

34 min • 25 March 2024

Arash Ahmadian is a researcher at Cohere and Cohere For AI focused on preference training of large language models. He's also a researcher at the Vector Institute of AI.

Featured Reference

Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker


Additional References
