00:00:00 Intro. Bogdan's journey through the French system toward a PhD in AI. Inspiration from early DeepMind papers, research on LSTMs and other recurrent architectures.
00:05:29 Oxford postdoc between ML and Neuroscience, theory of mind. Turn towards safety. Influence of Nick Bostrom's 2014 book Superintelligence https://en.wikipedia.org/wiki/Superintelligence:_Paths,_Dangers,_Strategies.
00:07:29 AI acceleration. AlphaGo, Startcraft, BART, GPT-2. Language. Learning people's preferences.
00:10:24 The curious path of an independent AI safety researcher. 80,000 Hours https://80000hours.org/, the Alignment Forum https://www.alignmentforum.org/, LessWrong https://www.lesswrong.com/, effective altruism.
00:18:15 GPT: in-context learning, math.
00:23:47 GPT: planning and thinking. Planning in the real world (reinforcement learning) vs planning in a math proof, planning as problem solving.
00:27:29 GPT: chain of thought. "Let's think about this step by step."
00:31:47 GPT: lying? HAL from 2001: A Space Odyssey. Does GPT have the will to do something? Simulators, Bayesian inference, simulacra, autoregressivity. The surprising coherence of GPT-4. Playing personas.
00:43:38 GPT: reinforcement learning with human feedback. RLHF is like an anti-psychotic drug? Or cognitive behavioral therapy?
00:45:38 GPT: Vervaeke's dialog Turing test https://balazskegl.substack.com/p/gpt-4-in-conversation-with-itself.
00:52:36 AI Safety. The issue of timescale. The OpenAI initiative https://openai.com/blog/our-approach-to-ai-safety. Aligning by debating.
00:57:46 Direct alignment research; Bogdan's pessimism. The two-step approach: automate alignment research. Who will align the aligner AI?
01:04:11 Alignment by giving agency to AI. Embodiment. Let them confabulate, but confront them with reality.
01:12:09 Max Tegmark's waterfall metaphor. Munk debate on AI https://www.youtube.com/watch?v=144uOfr4SYA, Yoshua Bengio's interview https://www.youtube.com/watch?v=0RknkWgd6Ck.
01:22:21 Open source AI. George Hotz interview https://www.youtube.com/watch?v=dNrTrx42DGQ. Bogdan's counterargument: engineering a pandemic. Some tools make a few people very powerful.
01:28:15 Adversarial examples.
01:31:32 Bogdan's dreams and fears, where are we heading?
Hosted on Acast. See acast.com/privacy for more information.