AI Safety Fundamentals: Governance
This paper explains Anthropic’s constitutional AI approach, which is largely an extension of reinforcement learning from human feedback (RLHF), but with AI models replacing the human demonstrators and human evaluators.
A podcast by BlueDot Impact.
Learn more on the AI Safety Fundamentals website.