Epistemic status: You probably already know whether you want to read this kind of post, but in case you have not decided: my impression is that people are acting very confused about what we can conclude about scaling LLMs from the evidence, and I believe my mental model cuts through a lot of this confusion. I have tried to rebut what I believe to be misconceptions in a scattershot way, but will attempt to collect the whole picture here. I am a theoretical computer scientist and this is a theory. Soon I want to do some more serious empirical research around it, but be aware that, at the time of writing this post, most of my ideas about LLMs have not had the kind of careful, detailed contact with reality that I would like. If you're a good engineer (or just think I am dropping the ball [...]
The original text contained 5 footnotes which were omitted from this narration.
---
First published: February 13th, 2025
Source: https://www.lesswrong.com/posts/vvgND6aLjuDR6QzDF/my-model-of-what-is-going-on-with-llms
Narrated by TYPE III AUDIO.