Low loss but high WER on test set

I trained DeepSpeech v4 on a portion of the data from http://openslr.org/53 (33 hours in total, which I know is nowhere near enough for a good model). Training produces reasonably good loss values on the validation and test sets, but the WER on the test set is very high.

The training parameters were:

N_hidden = 2048 (I wanted to overfit to get an overview)
Dropout = 0.2
Learning Rate = 0.0001
Epochs = 50 (early stopping triggered after 15 epochs)
Beam width = 1024
Alphabet Size = 62
Train/Dev/Test Ratio = 80/10/10
** All other parameters were left at their default values (full invocation sketched below)
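
For reference, the run was a plain DeepSpeech.py invocation, roughly like the sketch below; flag names follow the 0.4-era trainer and the CSV/alphabet paths are placeholders, so treat this as a sketch rather than the exact command:

```python
# Rough sketch of the training run, wrapped in Python for convenience.
# Flag names per the 0.4-era DeepSpeech.py; all paths are placeholders.
import subprocess

subprocess.run([
    "python", "DeepSpeech.py",
    "--train_files", "data/train.csv",
    "--dev_files", "data/dev.csv",
    "--test_files", "data/test.csv",
    "--alphabet_config_path", "data/alphabet.txt",  # 62 symbols
    "--n_hidden", "2048",
    "--epoch", "50",                # spelled --epochs in later releases
    "--learning_rate", "0.0001",
    "--dropout_rate", "0.2",
    "--beam_width", "1024",
    "--checkpoint_dir", "checkpoints/",
], check=True)
```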

And model performance:

Train Loss = 13
Dev Loss = 24
Test Loss = 24
Test WER = 0.75
Test Edit Distance = 0.36
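
(For clarity on the metrics: WER is word-level edit distance normalized by the reference word count, while the edit distance above is character-level. A few substituted characters can corrupt many words, which is how a 0.36 character edit distance can coexist with a 0.75 WER. A minimal sketch of both computations on plain-text pairs:)

```python
def levenshtein(a, b):
    """Edit distance between two sequences (of words or characters)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (x != y)))   # substitution
        prev = cur
    return prev[-1]

def wer(ref, hyp):
    """Word error rate: word-level edit distance / reference word count."""
    r, h = ref.split(), hyp.split()
    return levenshtein(r, h) / len(r)

def cer(ref, hyp):
    """Character error rate: char-level edit distance / reference length."""
    return levenshtein(ref, hyp) / len(ref)

# Two wrong characters break two of four words:
print(wer("the cat sat down", "the cot sat dawn"))  # 0.5
print(cer("the cat sat down", "the cot sat dawn"))  # 0.125
```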

The dataset was split per speaker: each speaker's utterances were distributed 80/10/10 among the train, dev, and test sets (so every speaker appears in all three sets).
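
The split logic was roughly as follows; a minimal sketch assuming a list of (speaker, utterance) pairs, with the 80/10/10 cut applied inside each speaker:

```python
import random
from collections import defaultdict

def split_per_speaker(utterances, seed=42):
    """Distribute each speaker's utterances 80/10/10 over train/dev/test."""
    by_speaker = defaultdict(list)
    for speaker, utt in utterances:
        by_speaker[speaker].append(utt)

    rng = random.Random(seed)
    train, dev, test = [], [], []
    for utts in by_speaker.values():
        rng.shuffle(utts)
        n_train, n_dev = int(0.8 * len(utts)), int(0.1 * len(utts))
        train += utts[:n_train]
        dev += utts[n_train:n_train + n_dev]
        test += utts[n_train + n_dev:]
    return train, dev, test
```

Note that under this scheme every test speaker is also heard in training, so the high WER cannot be blamed on unfamiliar voices.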

Experiments on another dataset, http://openslr.org/37/, gave a much lower WER on the test set, even though it contains only 9 hours of speech. Using pitch/tempo/speed augmentation (sketched below) I was able to reach a minimum of 0.24 WER and 0.08 edit distance on the test set.
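
The augmentation was done offline with sox, along the lines of this sketch; it writes one pitch-, tempo-, and speed-perturbed copy per clip, and the perturbation ranges are placeholders, not the exact values I used:

```python
import random
import subprocess
from pathlib import Path

def augment(wav_dir, out_dir, seed=42):
    """Write pitch-, tempo-, and speed-perturbed copies of each wav via sox."""
    rng = random.Random(seed)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for wav in Path(wav_dir).glob("*.wav"):
        for name, effect in [
            ("pitch", ["pitch", str(rng.randint(-200, 200))]),    # cents
            ("tempo", ["tempo", f"{rng.uniform(0.9, 1.1):.2f}"]), # ratio
            ("speed", ["speed", f"{rng.uniform(0.9, 1.1):.2f}"]), # ratio
        ]:
            dst = out / f"{wav.stem}_{name}.wav"
            subprocess.run(["sox", str(wav), str(dst)] + effect, check=True)
```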

The parameters were all the same except N_hidden, which was 1024 for this experiment.

So, what might be causing the high WER on the test set? What can I investigate other than adding more data (training is prohibitively time-consuming on a larger dataset)? Thanks