What motivates cooperative inverse reinforcement learning? What can we gain from recontextualizing our safety efforts from the CIRL point of view? What possible role can pre-AGI systems play in amplifying normative processes?
Cooperative Inverse Reinforcement Learning with Dylan Hadfield-Menell is the eighth episode in the AI Alignment Podcast series, hosted by Lucas Perry and recorded at the Beneficial AGI 2019 conference in Puerto Rico. For those of you who are new, this series covers and explores the AI alignment problem across a wide variety of domains, reflecting the fundamentally interdisciplinary nature of AI alignment. Broadly, Lucas will speak with technical and non-technical researchers across areas such as machine learning, governance, ethics, philosophy, and psychology as they pertain to the project of creating beneficial AI. If this sounds interesting to you, we hope you will join the conversation by following us or subscribing to our podcasts on YouTube, SoundCloud, or your preferred podcast site/application.
In this podcast, Lucas spoke with Dylan Hadfield-Menell. Dylan is a fifth-year PhD student at UC Berkeley, advised by Anca Dragan, Pieter Abbeel, and Stuart Russell, where he focuses on technical AI alignment research.
Topics discussed in this episode include:
-How CIRL helps to clarify AI alignment and adjacent concepts
-The philosophy of science behind safety theorizing
-CIRL's role in the context of varying alignment methodologies
-Whether short-term AI can be used to amplify normative processes