Lost in the middle: How language models use long contexts

NF Liu, K Lin, J Hewitt, A Paranjape, M Bevilacqua, F Petroni, P Liang

Transactions of the Association for Computational Linguistics, 2024 (direct.mit.edu)

Abstract

While recent language models have the ability to take long contexts as input, relatively little is known about how well they use longer context. We analyze the performance of language models on two tasks that require identifying relevant information in their input contexts: multi-document question answering and key-value retrieval. We find that performance can degrade significantly when changing the position of relevant information, indicating that current language models do not robustly make use of information in long input contexts. In particular, we observe that performance is often highest when relevant information occurs at the beginning or end of the input context, and significantly degrades when models must access relevant information in the middle of long contexts, even for explicitly long-context models. Our analysis provides a better understanding of how language models use their input context and provides new evaluation protocols for future long-context language models.

MIT Press
