Word Error Rate Estimation for Speech Recognition: e-WER

Authors Ahmed Ali, Steve Renals
Journal/Conference Name ACL 2018 7
Paper Abstract Measuring the performance of automatic speech recognition (ASR) systems requires manually transcribed data in order to compute the word error rate (WER), and producing such transcriptions is often time-consuming and expensive. In this paper, we propose a novel approach to estimating WER, called e-WER, which does not require a gold-standard transcription of the test set. Our e-WER framework uses a comprehensive set of features: ASR recognised text, character recognition results to complement the recognition output, and internal decoder features. We report results for two feature sets, black-box and glass-box, on 24 unseen Arabic broadcast programs. Our system achieves a WER root mean squared error (RMSE) of 16.9% across 1,400 sentences. The estimated overall WER (e-WER) was 25.3% for the three-hour test set, while the actual WER was 28.5%.
Date of publication 2018
Code Programming Language Jupyter Notebook
Comment
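
The abstract refers to two quantities: the WER of a hypothesis against a reference transcript, and the RMSE between estimated and actual WER values. As a rough illustration of the standard definitions only (not the authors' e-WER estimator, and not code from the linked notebook), a minimal Python sketch might look like the following. The function names `edit_distance`, `wer`, and `rmse` are hypothetical and chosen for this example.

```python
import math


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Standard WER: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def rmse(estimated, actual):
    """Root mean squared error between two equal-length lists of WER values."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

Computing `wer` requires a reference transcript, which is exactly the cost e-WER aims to avoid; in an evaluation like the one described, `rmse` would compare per-sentence estimates against WER values computed from references.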
