
S1: Simple Test-time Scaling

16 min • 9 February 2025

s1 ("simple test-time scaling") is an efficient approach to improving language model reasoning with minimal resources. A model is fine-tuned on a small, carefully curated dataset, s1K, and a technique called budget forcing controls how much compute it spends at test time. Budget forcing caps thinking by appending the end-of-thinking delimiter once a maximum number of thinking tokens is reached, and extends thinking by suppressing that delimiter and appending the word "Wait" when the model tries to stop before a minimum is reached. The s1-32B model, developed with this method, outperforms models such as o1-preview on competition math questions. Combining a curated dataset with a straightforward test-time technique yields strong reasoning performance and performance that scales with test-time compute.
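The budget forcing idea described above can be illustrated with a small sketch. This is not the s1 authors' implementation: the function `budget_forced_thinking`, the stand-in `toy_model`, the `</think>` delimiter string, and the way tokens are counted are all assumptions made for illustration, and a real setup would call an actual language model instead of the toy generator.

```python
from typing import Callable, List

END_THINK = "</think>"  # assumed end-of-thinking delimiter (hypothetical)
WAIT = "Wait"           # string appended to push the model to keep reasoning


def budget_forced_thinking(
    next_token: Callable[[List[str]], str],
    min_tokens: int,
    max_tokens: int,
) -> List[str]:
    """Collect thinking tokens under a min/max budget, ending with END_THINK."""
    if not 0 <= min_tokens <= max_tokens:
        raise ValueError("require 0 <= min_tokens <= max_tokens")

    tokens: List[str] = []
    while len(tokens) < max_tokens:
        tok = next_token(tokens)
        if tok == END_THINK:
            if len(tokens) >= min_tokens:
                break  # minimum budget met: let the model stop
            # Model tried to stop too early: drop the delimiter, append "Wait".
            tokens.append(WAIT)
        else:
            tokens.append(tok)

    # Either the model stopped on its own or the maximum was hit:
    # in both cases the thinking phase is closed with the delimiter.
    tokens.append(END_THINK)
    return tokens


def toy_model(context: List[str]) -> str:
    """Hypothetical stand-in for a model: tries to stop after 2 fresh steps."""
    since_wait = 0
    for tok in reversed(context):
        if tok == WAIT:
            break
        since_wait += 1
    return END_THINK if since_wait >= 2 else "step"


if __name__ == "__main__":
    print(budget_forced_thinking(toy_model, min_tokens=0, max_tokens=10))
    print(budget_forced_thinking(toy_model, min_tokens=5, max_tokens=10))
    print(budget_forced_thinking(toy_model, min_tokens=0, max_tokens=1))
```

With a minimum of 5, the toy model's early attempt to stop is replaced by "Wait" and it continues reasoning; with a maximum of 1, thinking is cut off after a single token. In this sketch the appended "Wait" counts toward the budget, which is a design choice of the example rather than something the episode description specifies.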

