
SAGE: Benchmarking and Improving Retrieval for Deep Research Agents

Source: arXiv

Tiansheng Hu, Yilun Zhao, Canyu Zhang, Arman Cohan, Chen Zhao

cs.IR · cs.CL | Feb 5, 2026 | 265 views

One-line Summary

The SAGE benchmark shows that traditional BM25 outperforms LLM-based retrievers on scientific literature retrieval, and that augmenting documents with LLM-generated metadata and keywords further improves performance.

Plain-language Overview

Researchers are exploring how well systems based on large language models (LLMs) can retrieve scientific papers for answering complex questions. They created a benchmark called SAGE to test different retrieval systems and found that traditional keyword-based search (such as BM25) was more effective than the newer LLM-based methods. However, by enriching documents with additional metadata and keywords generated by LLMs, they were able to improve retrieval performance. This suggests that while LLM-based retrievers have potential, they currently need further refinement to compete with traditional methods.
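To make the augmentation idea concrete, here is a minimal sketch of BM25 scoring over a document both with and without LLM-generated keywords prepended. This is not the paper's actual pipeline: the BM25 implementation is a bare-bones stdlib version, and the keyword string stands in for whatever an LLM would actually generate.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    N = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / N
    # Document frequency of each term across the corpus.
    df = Counter()
    for d in docs_tokens:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(s)
    return scores

# Hypothetical example: the same abstract indexed plain vs. augmented
# with (made-up) LLM-generated keywords.
abstract = "we benchmark retrieval systems for deep research agents"
llm_keywords = "information retrieval bm25 benchmark scientific literature"

docs = [abstract.split(), (abstract + " " + llm_keywords).split()]
query = "scientific literature retrieval benchmark".split()

scores = bm25_scores(query, docs)
print(scores)  # the augmented copy matches more query terms
```

The augmented copy of the document scores higher because the injected keywords cover query terms the raw abstract never mentions, which is the intuition behind boosting a lexical retriever with LLM-generated metadata.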

Technical Details