I trained a DeepSpeech model with the same training set and configuration twice but obtained different WERs that deviated by 3%. I checked flags.py and saw that the random_seed flag has been set to a certain value by default.
Is there something that I’m missing, or are the results not reproducible? How can I ensure that I’m getting reliable results?
Your help would be much appreciated!
lissyx
It could come from any of a huge number of sources of randomness; you need to be clearer about:
- the exact parameters you pass
- OS / environment
For example: is automatic mixed precision enabled? Did the cuDNN/CUDA subversions change between runs?
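For anyone hitting the same issue: the `--random_seed` flag only seeds TensorFlow's graph-level RNG. Python's and NumPy's RNGs, plus nondeterministic GPU kernels, can still vary between runs. A minimal sketch of pinning the usual seed sources — `set_global_seeds` is a hypothetical helper, not part of DeepSpeech, and note that `TF_DETERMINISTIC_OPS` only takes effect on TensorFlow >= 2.1:

```python
import os
import random

import numpy as np

def set_global_seeds(seed: int) -> None:
    """Pin the RNG sources that commonly differ between training runs."""
    random.seed(seed)                              # Python stdlib RNG
    np.random.seed(seed)                           # NumPy RNG (data shuffling, augmentation)
    os.environ["PYTHONHASHSEED"] = str(seed)       # hash randomization (affects new interpreters)
    # Request deterministic GPU kernels where available (TensorFlow >= 2.1);
    # nondeterministic cuDNN algorithm selection is a common source of WER drift.
    os.environ["TF_DETERMINISTIC_OPS"] = "1"

# Call once at the top of the training script, before building the graph:
set_global_seeds(4568)
```

Even with all seeds pinned, some GPU ops (e.g. certain reduction kernels) have no deterministic implementation, so a small WER spread between runs can remain.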