Large Language Model (LLM) Talk

"Deep Dive into LLMs like ChatGPT" - Andrej Karpathy's Tech Talk Learning

18 min • 15 February 2025

Andrej Karpathy's tech talk (YouTube) provides a comprehensive yet accessible overview of Large Language Models (LLMs) like ChatGPT. The talk details the process of building an LLM, including pre-training, data processing, and neural network training. Key stages include downloading and filtering internet text, tokenizing the text, and training neural networks to model relationships between tokens. The discussion covers the distinction between base models and assistants, highlighting how fine-tuning turns a base model into a conversational AI. It also addresses challenges such as hallucinations, along with mitigation strategies like knowledge-based refusal and tool use. The talk further explores reinforcement learning and the emergence of "thinking" in models.
