DRAGON: Dynamic RAG Benchmark On News
This leaderboard compares RAG systems on generation and retrieval metrics across question types (simple, comparison, multi-hop, conditional, etc.).
Version 1.34.1, 600 questions generated from news sources, 03 July 2025
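As a rough illustration of comparing systems per question type, the sketch below averages one score per (system, question type) pair. The record fields (`system`, `question_type`, `score`), the system names, and the values are all hypothetical placeholders, not the leaderboard's actual schema or results.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_question_type(records):
    """Average a score for each (system, question_type) pair."""
    buckets = defaultdict(list)
    for record in records:
        key = (record["system"], record["question_type"])
        buckets[key].append(record["score"])
    return {key: mean(values) for key, values in buckets.items()}


# Made-up example records; field names and values are assumptions.
records = [
    {"system": "rag-a", "question_type": "simple", "score": 1.0},
    {"system": "rag-a", "question_type": "simple", "score": 0.5},
    {"system": "rag-a", "question_type": "multi-hop", "score": 0.25},
    {"system": "rag-b", "question_type": "simple", "score": 0.5},
]

table = mean_score_by_question_type(records)
for (system, question_type), value in sorted(table.items()):
    print(f"{system:8s} {question_type:10s} {value:.2f}")
```

Keeping the question type in the grouping key means a system that does well on simple questions but poorly on multi-hop ones shows up as two separate rows instead of one blended average.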
Generation Metrics
Retrieval Metrics
Citation
@misc{chernogorskii2025dragondynamicragbenchmark,
title={DRAGON: Dynamic RAG Benchmark On News},
author={Fedor Chernogorskii and Sergei Averkiev and Liliya Kudraleeva and Zaven Martirosian and Maria Tikhonova and Valentin Malykh and Alena Fenogenova},
year={2025},
eprint={2507.05713},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2507.05713},
}