LessWrong (30+ Karma)

“Anti-Slop Interventions?” by abramdemski

10 min • February 5, 2025

In his recent post arguing against AI Control research, John Wentworth contends that the median doom path runs through AI slop rather than through scheming. I find this plausible, and I believe it suggests a convergence of interests between AI capabilities research and AI alignment research.

Historically, there has been a lot of concern about differential progress amongst AI safety researchers (perhaps especially those I tend to talk to). Some research is labeled "capabilities" while other research is labeled "safety" (or, more often, "alignment"[1]). Most research is dual-use in practice (IE, it has both capability and safety implications), and on this view it should therefore be kept secret or disclosed carefully.

Recently, a colleague expressed concern that future AIs will read anything AI safety researchers publish now. Since it is uncertain, and perhaps even unlikely, that future AIs will be aligned, almost any information published now could be net harmful for the future.

I argued the [...]

---

Outline:

(02:42) AI Slop

(04:46) Coherence and Recursive Self-Improvement

(08:00) What's to be done?

The original text contained 10 footnotes, which were omitted from this narration.

---

First published:
February 4th, 2025

Source:
https://www.lesswrong.com/posts/PdtHzEb3cebnWCjoj/anti-slop-interventions

---

Narrated by TYPE III AUDIO.
