I have transcripts from different speech recognition engines for the same audio files, and I want to compare them. I would like to use Mozilla’s method of WER calculation, but I could not find the relevant file. Can someone please let me know if this is possible?
We don’t have standalone code for that, but you should be able to do it with DeepSpeech.py if you skip the training and validation steps, run only the test step, and initialize from an existing model instead of a checkpoint.
That would work for DeepSpeech transcripts. I’m looking to get the WER between a ref and a hyp text file. That way, one could compare various speech engines quickly, rather than transcribing each time.
I’m not sure what you mean by “ref” and “hyp”, but yes, I was mostly thinking of a use case where you need to compare DeepSpeech models (e.g., comparing your own variation of the model to a reference one).
I mean a reference text file containing the ground truth and a hypothesis text file containing the transcript.
WER is computed by calculate_report(): https://github.com/mozilla/DeepSpeech/blob/27444d67ec4da563aea8a42ae8daec6fe877378b/DeepSpeech.py#L756-L788
and wer() is defined here: https://github.com/mozilla/DeepSpeech/blob/27444d67ec4da563aea8a42ae8daec6fe877378b/util/text.py#L85-L97
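If you only need the WER between a ref and a hyp text file, a minimal standalone sketch could look like the following. This is not Mozilla’s code. It uses the usual definition of WER (word-level Levenshtein distance divided by the number of reference words). The function names, the one-utterance-per-line file layout, and the whitespace-only tokenization are all assumptions of mine, so the numbers may differ slightly from DeepSpeech’s report if the text is normalized differently.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two lists of words."""
    # prev[j] = distance between the ref words seen so far and hyp_words[:j]
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (ref word missing from hyp)
                curr[j - 1] + 1,     # insertion (extra word in hyp)
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """WER of a single utterance; words are split on whitespace (assumption)."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref_words, hyp.split()) / len(ref_words)


def corpus_wer(ref_lines, hyp_lines):
    """Total word edits divided by total reference words over all utterances."""
    if len(ref_lines) != len(hyp_lines):
        raise ValueError("ref and hyp must have the same number of lines")
    edits = 0
    words = 0
    for ref, hyp in zip(ref_lines, hyp_lines):
        ref_words = ref.split()
        edits += word_edit_distance(ref_words, hyp.split())
        words += len(ref_words)
    if words == 0:
        raise ValueError("reference contains no words")
    return edits / words


def wer_from_files(ref_path, hyp_path):
    """Hypothetical helper: assumes one utterance per line, same order in both files."""
    with open(ref_path, encoding="utf-8") as f:
        ref_lines = f.read().splitlines()
    with open(hyp_path, encoding="utf-8") as f:
        hyp_lines = f.read().splitlines()
    return corpus_wer(ref_lines, hyp_lines)
```

You would run wer_from_files("ref.txt", "hyp.txt") once per engine, with the same ref.txt each time. You probably also want to lowercase and strip punctuation in both files first, since engines differ a lot in how they format their output.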
I have transcripts from different speech recognition engines for the same audio files, and I want to compare them.
You can do so with https://github.com/Franck-Dernoncourt/ASR_benchmark. It doesn’t use Mozilla’s code for WER calculation, but hopefully it gives the same or similar numbers.