Nathan Lambert is the author of the popular AI newsletter Interconnects. He is also a research scientist who leads post-training at the Allen Institute for Artificial Intelligence, a research organization funded by the estate of Paul Allen. This means that the organization can afford to train its own models—and it’s one of the only such organizations committed to doing so in an open manner. So Lambert is one of the few people with hands-on experience building cutting-edge LLMs who can talk freely about his work. In this December 17 conversation, Lambert walked us through the steps required to train a modern model and explained how the process is evolving. Note that this conversation was recorded before OpenAI announced its new o3 model later in the month.
Links mentioned during the interview:
The Allen Institute's Tülu 3 blog post
The Allen Institute's OLMo 2 model
The original paper that introduced RLHF
Nathan Lambert on OpenAI's reinforcement fine-tuning API