LessWrong (30+ Karma)

“A long list of concrete projects and open problems in evals” by Marius Hobbhahn

1 min • 23 March 2025

We made a long list of concrete projects and open problems in evals with 100+ suggestions!

https://docs.google.com/document/d/1gi32-HZozxVimNg5Mhvk4CvW4zq8J12rGmK_j2zxNEg/edit?usp=sharing

We hope that makes it easier for people to get started in the field and to coordinate on projects.

Over the last 4 months, we collected contributions from 20+ experts in the field, including people who work at Apollo, METR, Redwood, RAND, AISIs, frontier labs, SecureBio, AI Futures Project, many academics, and independent researchers (suggestions have not necessarily been made in an official capacity). The doc has comment access, and further well-intentioned contributions are welcome!

Here is a screenshot of the table of contents:

---

First published:
March 22nd, 2025

Source:
https://www.lesswrong.com/posts/LhnqegFoykcjaXCYH/a-long-list-of-concrete-projects-and-open-problems-in-evals

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Hierarchical menu structure showing various categories for AI evaluation and research.

