About 1.5 hours ago, Anthropic released Claude 3.7 Sonnet, a hybrid reasoning model that interpolates between a normal LM and long chains of thought:
Today, we’re announcing Claude 3.7 Sonnet, our most intelligent model to date and the first hybrid reasoning model on the market. Claude 3.7 Sonnet can produce near-instant responses or extended, step-by-step thinking that is made visible to the user. API users also have fine-grained control over how long the model can think for.
They call this ability "extended thinking" (from their system card):
Claude 3.7 Sonnet introduces a new feature called "extended thinking" mode. In extended thinking mode, Claude produces a series of tokens which it can use to reason about a problem at length before giving its final answer. Claude was trained to do this via reinforcement learning, and it allows Claude to spend more time on questions which require [...]
---
Outline:
(01:22) Benchmark performance
(02:19) At least it's not named Claude 3.5 Sonnet
---
First published:
February 24th, 2025
Narrated by TYPE III AUDIO.