LessWrong posts by zvi

“Claude Sonnet 3.5.1 and Haiku 3.5” by Zvi

30 min • 24 October 2024

Anthropic has released an upgraded Claude Sonnet 3.5, and the new Claude Haiku 3.5.

They claim across-the-board improvements to Sonnet, and it has a new and rather huge ability accessible via the API: computer use. Nothing could possibly go wrong.

Claude Haiku 3.5 is also claimed as a major step forward for smaller models. They are saying that on many evaluations it has now caught up to Opus 3.

Missing from this chart is o1, which is in some ways not a fair comparison since it uses so much inference compute, but does greatly outperform everything here on the AIME and some other tasks.

METR: We conducted an independent pre-deployment assessment of the updated Claude 3.5 Sonnet model and will share our report soon.

We only have very early feedback so far, so it's hard to tell how much what I will be [...]

---

Outline:

(01:32) OK, Computer

(05:16) What Could Possibly Go Wrong

(11:33) The Quest for Lunch

(14:07) Aside: Someone Please Hire The Guy Who Names Playstations

(17:15) Coding

(18:10) Startups Get Their Periodic Reminder

(19:36) Live From Janus World

(26:19) Forgot about Opus

The original text contained 3 images which were described by AI.

---

First published:
October 24th, 2024

Source:
https://www.lesswrong.com/posts/jZigzT3GLZoFTATG4/claude-sonnet-3-5-1-and-haiku-3-5

---

Narrated by TYPE III AUDIO.

---

