Stability Evaluation of Confidence Features Across Model and Data Variants in Large Language Models
Abstract
Large language models (LLMs) have demonstrated remarkable generative capabilities but often produce outputs of uncertain reliability. Although several features can be used to estimate confidence, little work has quantified the importance of these features or evaluated their stability across model and data settings. In this work, we present a comprehensive framework for estimating the ...