Large Language Model (LLM) Talk

LLaMA-1

14 min • 17 January 2025

LLaMA-1 is a family of large language models ranging from 7B to 65B parameters, trained exclusively on publicly available datasets. The models achieve competitive performance against other LLMs such as GPT-3, Chinchilla, and PaLM: the 13B model outperforms GPT-3 on most benchmarks despite being more than ten times smaller, and the 65B model is competitive with the best large language models. The episode covers LLaMA's training approach, architecture, and optimization, along with its evaluations on common-sense reasoning, question answering, reading comprehension, mathematical reasoning, code generation, and massive multitask language understanding, as well as analyses of its biases and toxicity. The models are intended to democratize access to and study of LLMs, with the smaller models able to run on a single GPU, and to serve as a basis for further research.
