Beyond Single Ground Truth: Reference Monism as Epistemic Injustice in ASR Evaluation

arXiv:2605.07084v1. Abstract: Automatic speech recognition (ASR) evaluation compares system output to ground truth transcripts, with Word Error Rate (WER) quantifying the distance between them. But ground truth transcripts are not discovered; they are produced by human annotators following conventions that encode normative assumptions about which speech features matter. Different conventions (verbatim, non-verbatim, legal) produce different transcripts of identical speech and…
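To make the dependence on transcription convention concrete, here is a minimal sketch of WER as word-level edit distance divided by reference length, scoring one hypothesis against two references. The function name `word_error_rate` and all example transcripts are hypothetical illustrations, not taken from the paper, and the lowercase-and-split normalization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping one row of the table at a time.
    # prev[j] is the distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical system output and two hypothetical references for the same speech.
hypothesis = "i think we should go"
references = {
    "verbatim": "um i i think we should go",    # keeps the filler and the repetition
    "non_verbatim": "i think we should go",     # cleaned-up convention
}

scores = {name: word_error_rate(ref, hypothesis) for name, ref in references.items()}
for name, score in scores.items():
    print(f"{name}: WER = {score:.3f}")
```

In this toy case the same output is error-free under the non-verbatim reference and has a WER of 2/7 under the verbatim one, so the score reflects the choice of convention as much as the system.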

cs.CL updates on arXiv.org · May 12
