LessWrong (30+ Karma)

“On Google’s Safety Plan” by Zvi

64 min • April 11, 2025

Google Lays Out Its Safety Plans

I want to start off by reiterating kudos to Google for actually laying out its safety plan. No matter how good the plan, it's much better to write down and share the plan than it is to not share the plan, which in turn is much better than not having a formal plan.

They offer us a blog post and a full monster 145-page paper (so big you have to use Gemini!), and they start off the paper with a 10-page summary.

The full paper is full of detail about what they think and plan, why they think and plan it, answers to objections and robust discussions. I can offer critiques, but I couldn’t have produced this document in any sane amount of time, and I will be skipping over a lot of interesting things in the full paper because [...]

---

Outline:

(00:58) Core Assumptions

(08:14) The Four Questions

(11:46) Taking This Seriously

(16:04) A Problem For Future Earth

(17:53) That's Not My Department

(23:22) Terms of Misuse

(27:50) Misaligned!

(36:17) Aligning a Smarter Than Human Intelligence is Difficult

(51:48) What Is The Goal?

(54:43) Have You Tried Not Trying?

(58:07) Put It To The Test

(59:44) Mistakes Will Be Made

(01:00:35) Then You Have Structural Risks

---

First published:
April 11th, 2025

Source:
https://www.lesswrong.com/posts/hvEikwtsbf6zaXG2s/on-google-s-safety-plan

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Flow diagram showing security approach: training, evaluation, and deployment stages.
Diagram showing four types of AGI safety risks and drivers.

The image illustrates technical AGI safety concerns through four categories:
- Misuse (intentional harmful instructions)
- Misalignment (AI acting against developer intent)
- Mistakes (unintentional AI harm)
- Structural risks (multi-agent system harms)

Each category shows simplified icons of humans and neural networks with emoji indicators and key risk drivers listed below.

