Get WER for entire test set

Hello team,

Is there a way to get the predicted transcripts for the entire test set, along with the WER? At present (v0.60) it shows the WER only for a few (~10) test transcripts.
Kindly guide.

The first line, before the samples are shown, is the WER/CER for the entire set.

Adding to Reuben's answer: I usually run it with just the test_files param and set

--test_output_file "/xxx/out.txt" \
--report_count 50 \

If you want to dig deeper, try the benchmarkstt repo on the out.txt file.
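For context, a complete evaluation call might look roughly like this (the paths are placeholders, and you'd add your usual alphabet/LM flags for your setup):

python3 DeepSpeech.py \
--test_files /path/to/test.csv \
--checkpoint_dir /path/to/checkpoints \
--test_output_file "/xxx/out.txt" \
--report_count 50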

@reuben: I meant that I want the predicted transcripts for the entire test set, with the WER for each transcript. Is there a way to get that? Kindly guide.

@othiele: Thank you, will try it out.

Either set report_count to the size of your test set, or check the output file; it lists results for all inputs.

Thank you @othiele, it worked. But the file is written in plain ASCII by default. I tried

iconv -f ASCII -t UTF-8 out.txt > "out_utf8.txt"

but it didn’t work. How did you handle it?

{
	"char_distance": 35,
	"word_length": 3,
	"wer": 2.6666666666666665,
	"char_length": 26,
	"loss": 208.03085327148438,
	"src": "beim vorliegenden gesch\u00e4ft",
	"word_distance": 8,
	"wav_filename": "/media/data/LTLab.lan/agarwal/german-speech-corpus/swiss_german/clips/35795.wav",
	"res": "die kollegen und kolleginnen wie marie den elfte",
	"cer": 1.3461538461538463
}

It’s this problem: https://stackoverflow.com/a/18337754/346048

I’ll make a PR for a fix. Alternatively you can load the file using Python and apply a fix locally so you don’t have to re-run the test epoch.
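Something like this should do it (a minimal sketch, assuming out.txt is a single JSON array of entries like the excerpt above). Note that iconv can't help here: the \u00e4 sequences are already valid ASCII, they are JSON escapes rather than a broken encoding.

import json

# Load the test report; json.load turns the \uXXXX escapes back into real characters.
with open("out.txt", "r", encoding="utf-8") as fin:
    samples = json.load(fin)

# Write it back without re-escaping non-ASCII characters.
with open("out_utf8.txt", "w", encoding="utf-8") as fout:
    json.dump(samples, fout, ensure_ascii=False, indent=4)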

@reuben:

In the test results, I see a problem. Some resulting transcripts are very short (1-2 words) and some are very long (15-20 words) for source transcripts of 5-8 words.

I tried changing the lm_alpha and lm_beta parameters, but without much success. Do you recommend anything to solve this problem?

Note: I am working on German data and have trained the model with ~1000 Hours of speech data.
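For reference, the kind of sweep I tried looks roughly like this (just a sketch; lm_alpha and lm_beta are the v0.6 flag names, and the paths are placeholders):

for alpha in 0.5 0.75 1.0; do
    python3 DeepSpeech.py \
        --test_files /path/to/test.csv \
        --checkpoint_dir /path/to/checkpoints \
        --lm_alpha $alpha \
        --lm_beta 1.85 \
        --test_output_file /xxx/out_alpha_${alpha}.txt
done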

@reuben, Kindly advise on it.

Others will be better qualified than I to comment, but I’d guess it’s worth looking at two areas:

1. Your dataset: what’s the audio quality like? And how about the transcription quality? Bear in mind that 0.70 was trained on maybe six times as much audio as your 1,000 hrs (just a back-of-the-envelope calculation based on the datasets mentioned on the release page under the Training Regimen section).

2. Your language model/scorer: how large a text corpus did you use to create it? Was it just the transcribed text from your audio dataset, or was it more comprehensive? Unless you’re targeting a narrow-vocabulary scenario (and it doesn’t sound like you are), you’ll likely want the biggest corpus you can manage, so that the model makes sensible predictions about sentence probabilities.

@reuben, Kindly advise on it.

Given it wasn’t that long after your earlier question and people had already helped you, it might be worth a little patience :slightly_smiling_face: People are often happy to help but they aren’t sitting there just waiting for your next question… :wink:

Anyway, I hope you get to the bottom of your issues with the transcriptions.
