Epistemic status: Uncertain in writing style, but reasonably confident in content. Want to come back to writing and alignment research, testing waters with this.
Current state and risk level
I think we're in a phase of AI→AGI→ASI development where rogue AI agents will start popping up quite soon.
Pretty much everyone has access to frontier LLMs/VLMs, there are options to run LLMs locally, and it's clear that there are people who are eager to "let them out"; Truth Terminal is one example of this. Also Pliny. The capabilities are just not there yet for this to pose a problem.
Or are they?
Thing is, we don't know.
There is a possibility that a coherent, self-inferencing, autonomous, rogue LLM-based agent is doing AI agent things right now, fully under the radar: consolidating power, acquiring compute for new training runs, and whatever else.
Is this possibility small? Sure, it seems [...]
---
Outline:
(00:22) Current state and risk level
(02:45) What can be done?
(05:23) Rogue agent threat barometer
(06:56) Does it even matter?
The original text contained 2 footnotes which were omitted from this narration.
---
First published:
March 23rd, 2025
Source:
https://www.lesswrong.com/posts/4J2dFyBb6H25taEKm/we-need-a-lot-more-rogue-agent-honeypots
Narrated by TYPE III AUDIO.