Training on Voxforge dataset

Hi all, I am using DeepSpeech v0.6.1 and trying to train the model on the Voxforge dataset (downloaded using the import_voxforge.py script). These are the hyperparameters that I am using:

nohup ./DeepSpeech.py --use_cudnn_rnn=True --checkpoint_dir /d/deepspeech-dev/deepspeech-0.6.1-checkpoint/ --train_files /d/voxforge/voxforge-train.csv --dev_files /d/voxforge/voxforge-dev.csv --test_files /d/voxforge/voxforge-test.csv --train_batch_size 32 --dev_batch_size 32 --test_batch_size 32 --n_hidden 2048 --epochs 75 --dropout_rate 0.20 --learning_rate 0.00005 --lm_alpha 0.75 --lm_beta 1.85 --augmentation_freq_and_time_masking True --augmentation_pitch_and_tempo_masking True --export_dir /d/deepspeech-dev/exportmodel &

Using these parameters, I am getting a training loss of ~13.41 and a validation loss of ~23.74.

But the final inference results are as follows:

--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.478261, loss: 41.041939
- wav: file:///d/voxforge/test/anonymous-20090726-dqn/wav/b0178.wav
- src: "also i want information"
- res: "i saw i want to salvation"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.280000, loss: 43.730026
- wav: file:///d/voxforge/test/kockot-20130530-euq/wav/b0391.wav
- src: "at sea tuesday march seventeenth nineteen oh eight"
- res: "at sea tues the marsh event in its nineteenth or eight"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.553191, loss: 84.168678
- wav: file:///d/voxforge/test/anonymous-20110530-zqg/wav/a0545.wav
- src: "the italian rancho was a bachelor establishment"
- res: "the team the jews bachelor as to this"
--------------------------------------------------------------------------------
WER: 1.000000, CER: 0.615385, loss: 168.527725
- wav: file:///d/voxforge/test/anonymous-20110530-zqg/wav/a0542.wav
- src: "without a doubt some of them have dinner engagements"
- res: "to the have to engage in"
--------------------------------------------------------------------------------
WER: 0.909091, CER: 0.548387, loss: 115.628990
- wav: file:///d/voxforge/test/Alpsa-20100604-mzc/wav/ar-17.wav
- src: "you are coming of course i'm not certain said arthur undaunted"
- res: "most common of course i am a simple factors in dated"
--------------------------------------------------------------------------------
WER: 0.900000, CER: 0.666667, loss: 118.485672
- wav: file:///d/voxforge/test/anonymous-20090726-dqn/wav/b0181.wav
- src: "and you preferred a cannibal isle and a cartridge belt"
- res: "are you here i ever come of universe"
--------------------------------------------------------------------------------
WER: 0.888889, CER: 0.652174, loss: 90.288841
- wav: file:///d/voxforge/test/anonymous-20090726-dqn/wav/b0185.wav
- src: "your being wrecked here has been godsend to me"
- res: "you minihan here wolete"
--------------------------------------------------------------------------------
WER: 0.888889, CER: 0.488889, loss: 104.486588
- wav: file:///d/voxforge/test/anonymous-20090726-dqn/wav/b0186.wav
- src: "i can't go elsewhere by your your own account"
- res: "i can as a he you your as a co"
--------------------------------------------------------------------------------
WER: 0.875000, CER: 0.526316, loss: 59.387451
- wav: file:///d/voxforge/test/anonymous-20110530-zqg/wav/a0544.wav
- src: "he may anticipate the day of his death"
- res: "he indicated that he is the"
--------------------------------------------------------------------------------
WER: 0.875000, CER: 0.500000, loss: 73.614395
- wav: file:///d/voxforge/test/gilrim-20080120-vgs/wav/b0416.wav
- src: "i arose obediently and went down the beach"
- res: "i aroused of the temple and went on to"
--------------------------------------------------------------------------------
I Exporting the model...
I Models exported at /d/deepspeech-dev/exportmodel
-----------------------------------------------------------------------------------------------------------

It would be helpful if someone could share opinions on how I can achieve better results and decrease the loss. Thanks in advance!

You should not focus on the loss value only, but also on its behavior during training.

Those are the worst examples from the test set.

What’s your test set WER?
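To make the per-sample numbers in the report easier to read, here is a minimal sketch of how WER and CER can be computed from the `src` and `res` strings. It assumes plain Levenshtein distance divided by the reference length, at word level for WER and character level for CER; the function names are illustrative and are not taken from the DeepSpeech code base.

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(src, res):
    """Word error rate: word-level edit distance over reference word count."""
    words = src.split()
    return levenshtein(words, res.split()) / len(words)


def cer(src, res):
    """Character error rate: char-level edit distance over reference length."""
    return levenshtein(src, res) / len(src)


# First sample from the report above.
src = "also i want information"
res = "i saw i want to salvation"
print(f"WER: {wer(src, res):.6f}, CER: {cer(src, res):.6f}")
```

For this sample the word-level distance is 4 against 4 reference words, so the WER is 1.0 even though roughly half of the characters are correct. A short reference with a couple of inserted words is enough to reach 100% WER, which is why the ten worst samples say little about the overall test set WER.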

Hi lissyx, thanks for the reply. I have trained on the same dataset using the hyperparameters below:

nohup ./DeepSpeech.py --use_cudnn_rnn=True --checkpoint_dir /deepspeech-dev/deepspeech-0.6.1-checkpoint/ --train_files /voxforge/voxforge-train.csv --dev_files /voxforge/voxforge-dev.csv --test_files /voxforge/voxforge-test.csv --train_batch_size 16  --dev_batch_size 16 --test_batch_size 16 --n_hidden 2048 --epochs 75 --dropout_rate 0.20 --learning_rate 0.00001 --export_dir /deepspeech-dev/exportmodel &

I am getting a training loss of ~12.94, a validation loss of ~19.67, and a WER of 0.13341. It would be helpful if you could guide me on how I can achieve much better results than this.

As I already said, the values in themselves are not interesting; you need to look at how they evolve together during training.
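One way to look at how the two losses evolve together is to record them per epoch and check where the validation loss bottoms out while the training loss keeps falling. The sketch below assumes you have already collected the per-epoch values into two lists (for example by copying them from the training log). The helper name, the `patience` threshold, and the numbers in the usage example are all made up for illustration and do not come from this thread.

```python
def summarize_losses(train, dev, patience=3):
    """Summarize per-epoch training and validation losses.

    train, dev: lists of per-epoch loss values of equal length.
    patience:   hypothetical threshold, the number of epochs without a new
                best validation loss after which we flag likely overfitting.
    """
    best_epoch = min(range(len(dev)), key=dev.__getitem__)
    epochs_since_best = len(dev) - 1 - best_epoch
    return {
        "best_dev_epoch": best_epoch,
        "best_dev_loss": dev[best_epoch],
        "epochs_since_best": epochs_since_best,
        # Gap between validation and training loss at the last epoch.
        "final_gap": dev[-1] - train[-1],
        # Validation stalled while training loss kept dropping.
        "likely_overfitting": (epochs_since_best >= patience
                               and train[-1] < train[best_epoch]),
    }


# Illustrative values only.
train_losses = [60.0, 40.0, 25.0, 18.0, 15.0, 13.0, 12.0]
dev_losses = [65.0, 45.0, 30.0, 24.0, 25.0, 26.0, 27.0]
print(summarize_losses(train_losses, dev_losses))
```

If the validation loss stopped improving many epochs before the end while the training loss continued to drop, the later epochs are mostly fitting the training set, and stopping earlier or regularizing more may help more than tuning toward a lower final loss.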

13% WER on VoxForge, trained on top of the 0.6.1 English model? That’s not too bad.