Retrieval-augmented generation (RAG) enhances large language models (LLMs) by connecting them to external knowledge sources. An embedding model first converts both the user's query and the source documents into numerical vectors; a vector database then finds the documents whose vectors are closest to the query's, and the retrieved passages are passed to the LLM alongside the query for response generation. Grounding the model in factual, up-to-date information this way improves accuracy and reduces "hallucinations." RAG also builds user trust through source attribution: because the answer is tied to the retrieved documents, users can verify the information.
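The following is a minimal sketch of that retrieval-and-grounding loop, assuming the sentence-transformers library and the "all-MiniLM-L6-v2" embedding model; the toy document list, the `retrieve` helper, and the prompt template are illustrative choices, not part of any particular RAG framework. The final LLM call is omitted, since any chat model could consume the assembled prompt.

```python
# Minimal RAG sketch (assumes: pip install sentence-transformers numpy).
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy knowledge source; a real system would chunk and index full documents.
documents = [
    "RAG retrieves external documents to ground LLM answers.",
    "Embedding models map text to dense numerical vectors.",
    "Vector databases index embeddings for fast similarity search.",
]

# Assumed embedding model; any sentence-embedding model would work here.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Index step: embed every document once. A vector database would store
# these vectors and handle the similarity search at scale.
doc_vectors = model.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vector = model.encode([query], normalize_embeddings=True)[0]
    # With unit-length vectors, the dot product equals cosine similarity.
    scores = doc_vectors @ query_vector
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

query = "How does RAG reduce hallucinations?"
context = "\n".join(retrieve(query))

# Grounding step: the retrieved text goes into the prompt, so the LLM
# answers from the supplied sources rather than parametric memory alone.
prompt = (
    "Answer using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {query}"
)
print(prompt)
```

Normalizing the embeddings up front is a small design choice that turns similarity search into a plain dot product; production systems typically replace the brute-force `argsort` scan with an approximate-nearest-neighbor index inside the vector database.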