Patrick and Jacob sit down with Mike Conover, Staff Software Engineer at Databricks and Co-Creator of Databricks Dolly, the world’s first truly open instruction-tuned LLM, to discuss the magic behind Dolly, Alpaca, and other instruction-tuned LLMs, the unreasonable effectiveness of fine-tuning, how they got all Databricks employees to help them curate the Dolly dataset (hint: Google Forms), and more.
(0:00) - Intro
(5:54) - The birth of Dolly
(12:03) - Data curation at Databricks
(15:34) - Advice for building LLMs
(24:10) - The future of instruction-tuning datasets
(30:43) - UI innovation
(38:16) - The future of machine learning infrastructure
(42:05) - How SkipFlag would be different with the tools we have today
(47:01) - What Mike has learned since Dolly
With your co-hosts:
@jasoncwarner
- Former CTO GitHub, VP Eng Heroku & Canonical
@ericabrescia
- Former COO GitHub, Founder Bitnami (acq’d by VMware)
@patrickachase
- Partner at Redpoint, Former ML Engineer LinkedIn
@jacobeffron
- Partner at Redpoint, Former PM Flatiron Health