Large Language Model (LLM) Talk

S1: Simple Test-time Scaling

16 min • 9 February 2025

"S1" stands for simple test-time scaling, an efficient approach to improving language model reasoning with minimal resources. A model is trained on a small, carefully curated dataset such as s1K, and budget forcing is used to control test-time compute. Budget forcing enforces a maximum number of thinking tokens by appending the end-of-thinking delimiter, which cuts reasoning short, and a minimum by suppressing that delimiter and appending the word "Wait", which prompts the model to keep reasoning. The s1-32B model, developed using this method, outperforms other models, including o1-preview, on competition math questions. Combining the curated dataset with this straightforward test-time technique yields strong reasoning performance and effective test-time scaling.
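
The budget forcing idea described above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the s1 authors' implementation: the `</think>` delimiter string, the `budget_forced_thinking` function, and the `toy_model` stand-in are all hypothetical names introduced here, and a real system would call an actual language model instead of the stub.

```python
from typing import Callable, List

END_OF_THINKING = "</think>"  # assumed delimiter; the real token depends on the model
WAIT = "Wait"


def budget_forced_thinking(
    next_token: Callable[[List[str], List[str]], str],
    prompt: List[str],
    min_tokens: int,
    max_tokens: int,
) -> List[str]:
    """Generate thinking tokens while enforcing a minimum and maximum budget."""
    thinking: List[str] = []
    while len(thinking) < max_tokens:
        token = next_token(prompt, thinking)
        if token == END_OF_THINKING:
            if len(thinking) >= min_tokens:
                break  # the model stopped on its own and the minimum is met
            # Minimum not met: suppress the delimiter and nudge the model on.
            thinking.append(WAIT)
            continue
        thinking.append(token)
    # Close the thinking phase, whether the model stopped or the maximum was hit.
    thinking.append(END_OF_THINKING)
    return thinking


def toy_model(prompt: List[str], thinking: List[str]) -> str:
    """Hypothetical stand-in for a model: two steps, then it tries to stop."""
    since_wait = 0
    for tok in reversed(thinking):
        if tok == WAIT:
            break
        since_wait += 1
    return "step" if since_wait < 2 else END_OF_THINKING


if __name__ == "__main__":
    print(budget_forced_thinking(toy_model, ["Q"], min_tokens=4, max_tokens=10))
```

With `min_tokens=4`, the stub's early attempt to stop is replaced by "Wait" and it reasons for two more steps. With a small `max_tokens`, the delimiter is appended as soon as the budget is used up. In this sketch "Wait" counts toward the thinking budget, which is a simplifying choice.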