Revisiting Softmax for Uncertainty Approximation in Text Classification

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Revisiting Softmax for Uncertainty Approximation in Text Classification. / Holm, Andreas Nugaard; Wright, Dustin; Augenstein, Isabelle.

In: Information (Switzerland), Vol. 14, No. 7, 420, 2023.

Harvard

Holm, AN, Wright, D & Augenstein, I 2023, 'Revisiting Softmax for Uncertainty Approximation in Text Classification', Information (Switzerland), vol. 14, no. 7, 420. https://doi.org/10.3390/info14070420

APA

Holm, A. N., Wright, D., & Augenstein, I. (2023). Revisiting Softmax for Uncertainty Approximation in Text Classification. Information (Switzerland), 14(7), Article 420. https://doi.org/10.3390/info14070420

Vancouver

Holm AN, Wright D, Augenstein I. Revisiting Softmax for Uncertainty Approximation in Text Classification. Information (Switzerland). 2023;14(7):420. https://doi.org/10.3390/info14070420

Author

Holm, Andreas Nugaard ; Wright, Dustin ; Augenstein, Isabelle. / Revisiting Softmax for Uncertainty Approximation in Text Classification. In: Information (Switzerland). 2023 ; Vol. 14, No. 7.

Bibtex

@article{e43fe7e94b4a48ccb4c2afc9afa618d8,

title = "Revisiting Softmax for Uncertainty Approximation in Text Classification",

abstract = "Uncertainty approximation in text classification is an important area with applications in domain adaptation and interpretability. One of the most widely used uncertainty approximation methods is Monte Carlo (MC) dropout, which is computationally expensive as it requires multiple forward passes through the model. A cheaper alternative is to simply use a softmax based on a single forward pass without dropout to estimate model uncertainty. However, prior work has indicated that these predictions tend to be overconfident. In this paper, we perform a thorough empirical analysis of these methods on five datasets with two base neural architectures in order to identify the trade-offs between the two. We compare both softmax and an efficient version of MC dropout on their uncertainty approximations and downstream text classification performance, while weighing their runtime (cost) against performance (benefit). We find that, while MC dropout produces the best uncertainty approximations, using a simple softmax leads to competitive, and in some cases better, uncertainty estimation for text classification at a much lower computational cost, suggesting that softmax can in fact be a sufficient uncertainty estimate when computational resources are a concern.",

keywords = "efficiency, text classification, uncertainty quantification",

author = "Holm, {Andreas Nugaard} and Wright, Dustin and Augenstein, Isabelle",

note = "Publisher Copyright: {\textcopyright} 2023 by the authors.",

year = "2023",

doi = "10.3390/info14070420",

language = "English",

volume = "14",

journal = "Information (Switzerland)",

issn = "2078-2489",

publisher = "MDPI - Open Access Publishing",

number = "7",

}

RIS

TY - JOUR

T1 - Revisiting Softmax for Uncertainty Approximation in Text Classification

AU - Holm, Andreas Nugaard

AU - Wright, Dustin

AU - Augenstein, Isabelle

PY - 2023

Y1 - 2023

N2 - Uncertainty approximation in text classification is an important area with applications in domain adaptation and interpretability. One of the most widely used uncertainty approximation methods is Monte Carlo (MC) dropout, which is computationally expensive as it requires multiple forward passes through the model. A cheaper alternative is to simply use a softmax based on a single forward pass without dropout to estimate model uncertainty. However, prior work has indicated that these predictions tend to be overconfident. In this paper, we perform a thorough empirical analysis of these methods on five datasets with two base neural architectures in order to identify the trade-offs between the two. We compare both softmax and an efficient version of MC dropout on their uncertainty approximations and downstream text classification performance, while weighing their runtime (cost) against performance (benefit). We find that, while MC dropout produces the best uncertainty approximations, using a simple softmax leads to competitive, and in some cases better, uncertainty estimation for text classification at a much lower computational cost, suggesting that softmax can in fact be a sufficient uncertainty estimate when computational resources are a concern.

KW - efficiency

KW - text classification

KW - uncertainty quantification

UR - http://www.scopus.com/inward/record.url?scp=85166384773&partnerID=8YFLogxK

U2 - 10.3390/info14070420

DO - 10.3390/info14070420

M3 - Journal article

AN - SCOPUS:85166384773

VL - 14

JO - Information (Switzerland)

JF - Information (Switzerland)

SN - 2078-2489

IS - 7

M1 - 420

ER -

ID: 364498618

Department of Computer Science