“How I force LLMs to generate correct code” by claudio

11 min • 21 March 2025

In my daily work as a software consultant, I often deal with large pre-existing codebases. I use GitHub Copilot a lot. It's now basically indispensable, but I use it mostly for generating boilerplate code or figuring out how to use a third-party library.
As the code gets more logically nested, though, Copilot crumbles under the weight of complexity. It doesn't know how things should fit together in the project.

Other AI tools, like Cursor or Devin, are pretty good at quickly generating working prototypes, but they are not great at dealing with large existing codebases, and they have a very low success rate for my kind of daily work.
You find yourself in an endless loop of prompt tweaking, and at that point, I'd rather write the code myself with the occasional help of Copilot.

Professional coders know what code they want, we can define it [...]

---

Outline:

(02:52) How it works

(06:27) Which models work best

(07:39) Search algorithm

(09:08) Research

(09:45) A Note from the Author

---

First published:
March 21st, 2025

Source:
https://www.lesswrong.com/posts/WNd3Lima4qrQ3fJEN/how-i-force-llms-to-generate-correct-code

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Python code showing Lisp interpreter implementation with test cases and Unvibe runner.

Execution report showing test results tree and Python code snippets.

Diagram showing program clustering process with sampling and LLM generation steps.

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

