Applied Data Science and Artificial Intelligence
Summary RAG: A Multi-Format Document Retrieval System with Document-Level Summarization
Abstract
Retrieval-augmented generation (RAG) systems fragment documents into fixed-size chunks, losing context and poorly representing structured content such as tables, multi-column layouts, and hierarchical information. We propose Summary RAG, an architecture combining document-level summarization with selective full-content access. The system creates semantic summaries for document-level retrieval while preserving orig...