Steve Omohundro joins the podcast to discuss Provably Safe Systems, a paper he co-authored with FLI President Max Tegmark. You can read the paper here: https://arxiv.org/pdf/2309.01933.pdf
Timestamps:
00:00 Provably safe AI systems
12:17 Alignment and evaluations
21:08 Proofs about language model behavior
27:11 Can we formalize safety?
30:29 Provable contracts
43:13 Digital replicas of actual systems
46:32 Proof-carrying code
56:25 Can language models think logically?
1:00:44 Can AI do proofs for us?
1:09:23 Hard to proof, easy to verify
1:14:31 Digital neuroscience
1:20:01 Risks of totalitarianism
1:22:29 Can we guarantee safety?
1:25:04 Real-world provable safety
1:29:29 Tamper-proof hardware
1:35:35 Mortal and throttled AI
1:39:23 Least-privilege guarantee
1:41:53 Basic AI drives
1:47:47 AI agency and world models
1:52:08 Self-improving AI
1:58:21 Is AI overhyped now?