
Doom Debates

Toy Model of the AI Control Problem

26 min • February 6, 2025

Why does the simplest AI imaginable, when you ask it to help you push a box around a grid, suddenly want you to die?

AI doomers are often dismissed as having "no evidence" or as "just anthropomorphizing". This toy model shows why a drive to eliminate humans is NOT handwavy anthropomorphic speculation, but something we should expect by default from any sufficiently powerful search algorithm.

We’re not talking about AGI or ASI here — we’re just looking at an AI that does brute-force search over actions in a simple grid world.
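To make that concrete, here is a minimal sketch of the kind of setup described: an agent in a tiny grid world that brute-forces every action sequence until the box lands on the goal square. This is my own illustrative construction (grid size, action set, and objective are assumptions, not taken from Tallinn's slides); the point it demonstrates is that the search evaluates only its objective, so anything else in the world is invisible to it except as an obstacle or a resource.

```python
from itertools import product

# Toy grid world: state = (agent position, box position) on a SIZE x SIZE grid.
ACTIONS = {"U": (0, -1), "D": (0, 1), "L": (-1, 0), "R": (1, 0)}
SIZE = 4

def step(state, action):
    """Apply one move; the agent pushes the box if it walks into it."""
    (ax, ay), (bx, by) = state
    dx, dy = ACTIONS[action]
    nax, nay = ax + dx, ay + dy
    if not (0 <= nax < SIZE and 0 <= nay < SIZE):
        return state  # bumping a wall does nothing
    if (nax, nay) == (bx, by):  # walking into the box pushes it
        nbx, nby = bx + dx, by + dy
        if not (0 <= nbx < SIZE and 0 <= nby < SIZE):
            return state  # box against the wall: move blocked
        return ((nax, nay), (nbx, nby))
    return ((nax, nay), (bx, by))

def search(start, goal, depth):
    """Brute-force search: try every action sequence up to `depth` and
    return the first one whose final state puts the box on the goal.
    Note the objective mentions only the box -- nothing else in the
    world is represented, so nothing else can be 'cared about'."""
    for n in range(1, depth + 1):
        for seq in product(ACTIONS, repeat=n):
            state = start
            for a in seq:
                state = step(state, a)
            if state[1] == goal:
                return seq
    return None

# Agent at (0,0), box at (1,0), goal at (3,0): push the box right twice.
plan = search(start=((0, 0), (1, 0)), goal=(3, 0), depth=4)
print(plan)  # → ('R', 'R')
```

Everything in the episode follows from scaling this picture up: richer state, deeper search, same indifference to anything the objective doesn't mention.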

The slide deck I’m presenting was created by Jaan Tallinn, cofounder of the Future of Life Institute.

00:00 Introduction

01:24 The Toy Model

06:19 Misalignment and Manipulation Drives

12:57 Search Capacity and Ontological Insights

16:33 Irrelevant Concepts in AI Control

20:14 Approaches to Solving AI Control Problems

23:38 Final Thoughts

Watch the Lethal Intelligence Guide, the ultimate introduction to AI x-risk! https://www.youtube.com/@lethal-intelligence

PauseAI, the volunteer organization I’m part of: https://pauseai.info

Join the PauseAI Discord — https://discord.gg/2XXWXvErfA — and say hi to me in the #doom-debates-podcast channel!

Doom Debates’ Mission is to raise mainstream awareness of imminent extinction from AGI and build the social infrastructure for high-quality debate.

Support the mission by subscribing to my Substack at https://doomdebates.com and to https://youtube.com/@DoomDebates



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit lironshapira.substack.com