DeepSpeech training results not reproducible

michelle.liman · April 20, 2021, 8:55am

I trained a DeepSpeech model twice with the same training set and configuration, but the two runs produced WERs that differed by 3%. I checked flags.py and saw that the random_seed flag is set to a fixed value by default.

Is there something that I’m missing, or are the results not reproducible? How can I ensure that I’m getting reliable results?

Your help would be much appreciated!

lissyx · April 20, 2021, 10:39am

It could come from any number of things. You need to be clearer about:

- the exact parameters you pass
- your OS / environment

For example, are you using automatic mixed precision? Did the cuDNN/CUDA minor versions change between runs?
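
To make those details easy to share, a small script along these lines could record them next to each run. This is only a sketch: `collect_run_info` and the list of environment variables are made up for illustration and are not part of DeepSpeech, and whether your TensorFlow build honours a variable such as `TF_CUDNN_DETERMINISTIC` is something to verify for your version.

```python
import json
import os
import platform
import sys

# Environment variables that commonly influence GPU / TensorFlow behaviour.
# This list is an assumption for illustration; extend it for your own setup.
ENV_VARS_OF_INTEREST = [
    "CUDA_VISIBLE_DEVICES",
    "TF_CUDNN_DETERMINISTIC",
    "PYTHONHASHSEED",
]


def collect_run_info(argv=None, environ=None):
    """Return a dict describing the command line and environment of a run."""
    argv = sys.argv if argv is None else argv
    environ = os.environ if environ is None else environ
    return {
        "command_line": list(argv),
        "python_version": platform.python_version(),
        "os": platform.platform(),
        # Unset variables are recorded as None so that they show up in a diff.
        "env": {name: environ.get(name) for name in ENV_VARS_OF_INTEREST},
    }


if __name__ == "__main__":
    # Print the record as JSON so two runs can be compared with a plain diff.
    print(json.dumps(collect_run_info(), indent=2, sort_keys=True))
```

Saving this output for both runs and diffing the two files shows quickly whether anything in the command line or environment changed between them.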

NanoNabla · April 20, 2021, 1:14pm

Maybe try longer training runs (bigger dataset, more epochs). The impact of nondeterministic effects should be smaller on longer runs.
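
One way to judge whether a gap like 3% is just run-to-run noise is to repeat the training a few times and look at the spread of the resulting WERs. A minimal sketch, assuming you have collected the WERs yourself; `summarize_wers` is a hypothetical helper, and the numbers below are placeholders, not results from any real run.

```python
import statistics


def summarize_wers(wers):
    """Return (mean, sample standard deviation, max - min) for a list of WERs."""
    if len(wers) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return statistics.mean(wers), statistics.stdev(wers), max(wers) - min(wers)


# Placeholder values only, NOT measurements from any real training run.
example_wers = [0.30, 0.33, 0.31, 0.32]
mean, stdev, spread = summarize_wers(example_wers)
print(f"mean={mean:.3f} stdev={stdev:.3f} range={spread:.3f}")
```

If the difference between two configurations is not clearly larger than this spread, a single pair of runs is not enough to tell them apart.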