LessWrong (30+ Karma)

“AI Control May Increase Existential Risk” by Jan_Kulveit

3 min • March 11, 2025

Epistemic status: The following isn't an airtight argument, but mostly a guess about how things will play out.

Consider two broad possibilities:

I. In worlds where we are doing reasonably well on alignment, the AI control agenda does not have much impact.

II. In worlds where we are failing at alignment, AI control may primarily shift probability mass away from "moderately large warning shots" and towards "ineffective warning shots" and "existential catastrophe, full takeover".

The key heuristic is that the global system already has various mechanisms and feedback loops that resist takeover by a single agent (e.g., it is not easy to overthrow the Chinese government). In most cases where AI control would stop an unaligned AI, the counterfactual is that broader civilizational resistance would have stopped it anyway, but with the important side effect of producing a moderately sized warning shot.

I expect moderately sized warning shots to increase the chances humanity [...]

---

First published:
March 11th, 2025

Source:
https://www.lesswrong.com/posts/rZcyemEpBHgb2hqLP/ai-control-may-increase-existential-risk

---

Narrated by TYPE III AUDIO.
