
LessWrong (30+ Karma)

“OpenAI rewrote its Preparedness Framework” by Zach Stein-Perlman

2 min • 16 April 2025

New: https://openai.com/index/updating-our-preparedness-framework/

Old: https://cdn.openai.com/openai-preparedness-framework-beta.pdf

Summary

Thresholds & responses: https://cdn.openai.com/pdf/18a02b5d-6b67-4cec-ab64-68cdfbddebcd/preparedness-framework-v2.pdf#page=5. High and Critical thresholds trigger responses, as in the old Preparedness Framework; responses to Critical thresholds are not yet specified.

Three main categories of capabilities:

  • Bio/chem: High capabilities trigger security controls and (for external deployment) misuse safeguards
  • Cyber: High capabilities trigger security controls, (for external deployment) misuse safeguards, and (for large-scale internal deployment) misalignment safeguards
  • AI self-improvement: High capabilities trigger security controls
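
One way to read the list above is as a lookup table from capability category to the responses a High threshold triggers. Here is a minimal Python sketch of that reading. The names `HIGH_RESPONSES` and `required_responses` and the deployment strings are my own illustration, not anything from OpenAI's framework, and the table only encodes the three bullets as summarized here.

```python
# Hypothetical encoding of the summary above; not an official schema.
# Each entry is (response, deployment condition). A condition of None
# means the response applies whenever the High threshold is reached.
HIGH_RESPONSES = {
    "bio/chem": [
        ("security controls", None),
        ("misuse safeguards", "external"),
    ],
    "cyber": [
        ("security controls", None),
        ("misuse safeguards", "external"),
        ("misalignment safeguards", "large-scale internal"),
    ],
    "ai self-improvement": [
        ("security controls", None),
    ],
}


def required_responses(category, deployment=None):
    """List the responses triggered at High for a category and deployment type."""
    return [
        response
        for response, condition in HIGH_RESPONSES[category]
        if condition is None or condition == deployment
    ]


if __name__ == "__main__":
    for category in HIGH_RESPONSES:
        print(category, "->", required_responses(category, "external"))
```

Critical thresholds are left out on purpose, since the responses to them are not yet specified.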

Misuse safeguards, misalignment safeguards, and security controls for High capability levels: https://cdn.openai.com/pdf/18a02b5d-6b67-4cec-ab64-68cdfbddebcd/preparedness-framework-v2.pdf#page=16. My quick takes:

  • Misuse safeguards: fine categories but it's not clear what level of assurance would suffice
  • Misalignment safeguards: worrying categories and it's not clear what level of assurance would suffice
  • Security controls: it's impossible to evaluate security level based on principles like these

[I'll edit this post to add more analysis soon]

---

First published:
April 15th, 2025

Source:
https://www.lesswrong.com/posts/Yy5ijtbNfwv8DWin4/openai-rewrote-its-preparedness-framework

---

Narrated by TYPE III AUDIO.
