Agentic Horizons

RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

9 min • November 16, 2024

This episode explores RAPTOR, a tree-based retrieval system designed to enhance retrieval-augmented language models (RALMs). Traditional RALMs retrieve only short text chunks, so they struggle to understand large-scale discourse and to answer complex questions. RAPTOR addresses this limitation.

RAPTOR builds a multi-layered tree by recursively embedding, clustering, and summarizing text chunks, which lets it capture both the high-level and the low-level details of a document. The system offers two querying strategies, Tree Traversal and Collapsed Tree, to retrieve relevant information.

Experiments on question-answering datasets show RAPTOR consistently outperforming traditional methods such as BM25 and DPR, especially when combined with GPT-4. The recursive summarization and soft clustering significantly improve performance, particularly on complex, multi-step reasoning tasks. RAPTOR demonstrates the potential for enhanced retrieval by leveraging deeper document structure and thematic connections.
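
The build-and-query loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the paper's implementation. Every name here (`Node`, `embed`, `cosine`, `summarize`, `cluster`, `build_tree`, `collapsed_tree_search`) is hypothetical. A bag-of-words vector stands in for a real embedding model, greedy hard grouping stands in for soft clustering, and truncation stands in for an LLM-written summary. Only the Collapsed Tree strategy is shown, with a plain top-k cutoff chosen for simplicity.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    layer: int  # 0 = original chunk, higher = summary of summaries
    children: list = field(default_factory=list)


def embed(text):
    # Assumption: toy bag-of-words vector instead of a learned embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def summarize(texts, max_words=6):
    # Assumption: truncation stands in for an LLM-generated summary.
    return " / ".join(" ".join(t.split()[:max_words]) for t in texts)


def cluster(nodes, cluster_size):
    # Assumption: greedy hard grouping, a stand-in for soft clustering.
    remaining = list(nodes)
    clusters = []
    while remaining:
        seed = remaining.pop(0)
        seed_vec = embed(seed.text)
        remaining.sort(key=lambda n: cosine(seed_vec, embed(n.text)), reverse=True)
        clusters.append([seed] + remaining[: cluster_size - 1])
        remaining = remaining[cluster_size - 1:]
    return clusters


def build_tree(chunks, cluster_size=2):
    # Cluster and summarize layer by layer until a single root remains.
    if cluster_size < 2:
        raise ValueError("cluster_size must be at least 2")
    layer = [Node(text=c, layer=0) for c in chunks]
    all_nodes = list(layer)
    level = 0
    while len(layer) > 1:
        level += 1
        layer = [
            Node(summarize([n.text for n in group]), level, group)
            for group in cluster(layer, cluster_size)
        ]
        all_nodes.extend(layer)
    return all_nodes


def collapsed_tree_search(nodes, query, k=3):
    # Collapsed Tree: ignore layers and rank every node against the query.
    q = embed(query)
    return sorted(nodes, key=lambda n: cosine(q, embed(n.text)), reverse=True)[:k]


if __name__ == "__main__":
    docs = [
        "RAPTOR embeds text chunks",
        "RAPTOR clusters similar chunks",
        "BM25 ranks documents by term overlap",
        "DPR ranks documents with dense vectors",
    ]
    for hit in collapsed_tree_search(build_tree(docs), "how does BM25 rank documents"):
        print(hit.layer, hit.text)
```

Because the search runs over leaves and summary nodes together, a query can match either a specific chunk or a higher-level summary, which is the property the episode attributes to the tree structure.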


https://arxiv.org/pdf/2401.18059
