Hello everyone,
I am trying to fine-tune the pretrained 0.6.0 checkpoints on TED-LIUM 3 (actually a specific subset of it, after some cleaning), but my validation loss stops decreasing after epoch 7 while my training loss keeps going down. Is this to be expected, or should I stop the training and change more hyperparameters? Thank you.
Training Snapshot:
Epoch 2 | Training | Elapsed Time: 3:16:10 | Steps: 113547 | Loss: 30.255176
Epoch 2 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.143296 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.143296 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-574425
Epoch 3 | Training | Elapsed Time: 3:14:47 | Steps: 113547 | Loss: 28.822134
Epoch 3 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 56.931689 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 56.931689 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-687972
Epoch 4 | Training | Elapsed Time: 3:13:50 | Steps: 113547 | Loss: 27.622577
Epoch 4 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 56.640776 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 56.640776 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-801519
Epoch 5 | Training | Elapsed Time: 3:13:55 | Steps: 113547 | Loss: 26.533800
Epoch 5 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 56.558053 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 56.558053 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-915066
Epoch 6 | Training | Elapsed Time: 3:13:31 | Steps: 113547 | Loss: 25.547271
Epoch 6 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 56.451978 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 56.451978 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-1028613
Epoch 7 | Training | Elapsed Time: 3:13:36 | Steps: 113547 | Loss: 24.656791
Epoch 7 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 56.751594 | Dataset: ../datasets/ted-dev.csv
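A quick way to see the plateau is to pull the validation losses out of the console output and look at the per-epoch improvement. This is only a sketch, assuming the output above was redirected to a file named training.log (a placeholder name); it is not something DeepSpeech does itself:

# Extract each validation loss and print how much it improved over the previous epoch.
grep 'Validation' training.log \
  | sed -E 's/.*Loss: ([0-9.]+).*/\1/' \
  | awk 'NR==1 {prev=$1; next} {printf "%.6f (improved by %.6f)\n", $1, prev - $1; prev=$1}'

In the snapshot above the improvement goes slightly negative at epoch 7, which is what my question is about.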
Thank you for the fast reply. If you don't mind me asking, should I stop and resume from the latest/best checkpoint saved here, or restart the entire training from the beginning?
Thanks again
lissyx
I’d restart from scratch, from the released checkpoint.
Sorry, I made a mistake in the post above: I was using learning rate 0.000001 with --noearly_stop. Should I reduce it further, or change a different parameter?
lissyx
@m.nasr From my experiments with transfer learning from English to Spanish, reducing the batch size also improves the WER. The lowest WER I got was 20%, using 2 epochs, dropout 0.26, learning rate 1e-6, batch size 11, and 600 hours of Spanish data.
You may want to test with a lower batch size and share your results; a sketch of the corresponding command is below.
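For reference, here is roughly how those settings map onto the 0.6.0 flags. This is only a sketch: the CSV paths and checkpoint directory are the ones already used in this thread, and batch size 11 / dropout 0.26 / learning rate 1e-6 / 2 epochs are the values from the Spanish experiment above, not values verified on TED-LIUM.

python3 DeepSpeech.py --n_hidden 2048 \
  --checkpoint_dir ../checkpoints/deepspeech-0.6.0-checkpoint \
  --train_files ../datasets/ted-train.csv \
  --dev_files ../datasets/ted-dev.csv \
  --test_files ../datasets/ted-test.csv \
  --learning_rate 0.000001 --dropout_rate 0.26 \
  --train_batch_size 11 --dev_batch_size 11 --test_batch_size 11 \
  --epochs 2 --use_cudnn_rnn=True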
@lissyx
Thank you so much for replying. It has been more or less the same: the first few epochs were faster by a small margin, but it has been steady since then. For reference, please see the full output here:
python3 DeepSpeech.py --n_hidden 2048 --checkpoint_dir ../checkpoints/deepspeech-0.6.0-checkpoint --train_files ../datasets/ted-train.csv --dev_files ../datasets/ted-dev.csv --test_files ../datasets/ted-test.csv --learning_rate 0.0000001 --export_tflite --export_dir export --lm_alpha 0.75 --lm_beta 1.85 --use_cudnn_rnn=True --noearly_stop
Epoch 0 | Training | Elapsed Time: 3:13:55 | Steps: 113547 | Loss: 41.318807
Epoch 0 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 66.726466 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 66.726466 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-347331
Epoch 1 | Training | Elapsed Time: 3:13:50 | Steps: 113547 | Loss: 37.794589
Epoch 1 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 63.394380 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 63.394380 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-460878
Epoch 2 | Training | Elapsed Time: 3:13:58 | Steps: 113547 | Loss: 36.374533
Epoch 2 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 61.814787 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 61.814787 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-574425
Epoch 3 | Training | Elapsed Time: 3:14:00 | Steps: 113547 | Loss: 35.468848
Epoch 3 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 61.019693 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 61.019693 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-687972
Epoch 4 | Training | Elapsed Time: 3:14:02 | Steps: 113547 | Loss: 34.813177
Epoch 4 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 60.532636 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 60.532636 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-801519
Epoch 5 | Training | Elapsed Time: 3:14:03 | Steps: 113547 | Loss: 34.283967
Epoch 5 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 60.032274 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 60.032274 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-915066
Epoch 6 | Training | Elapsed Time: 3:14:06 | Steps: 113547 | Loss: 33.831716
Epoch 6 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 59.746349 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 59.746349 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-1028613
Epoch 7 | Training | Elapsed Time: 3:14:00 | Steps: 113547 | Loss: 33.440893
Epoch 7 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 59.475887 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 59.475887 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-1142160
Epoch 8 | Training | Elapsed Time: 3:14:06 | Steps: 113547 | Loss: 33.102300
Epoch 8 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 59.257533 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 59.257533 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-1255707
Epoch 9 | Training | Elapsed Time: 3:21:10 | Steps: 113547 | Loss: 32.780655
Epoch 9 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 59.061049 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 59.061049 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-1369254
Epoch 10 | Training | Elapsed Time: 3:20:30 | Steps: 113547 | Loss: 32.487156
Epoch 10 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 58.880547 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 58.880547 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-1482801
Epoch 11 | Training | Elapsed Time: 3:13:58 | Steps: 113547 | Loss: 32.222437
Epoch 11 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 58.791987 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 58.791987 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-1596348
Epoch 12 | Training | Elapsed Time: 3:13:43 | Steps: 113547 | Loss: 31.967294
Epoch 12 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 58.645778 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 58.645778 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-1709895
Epoch 13 | Training | Elapsed Time: 3:14:25 | Steps: 113547 | Loss: 31.717820
Epoch 13 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 58.526424 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 58.526424 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-1823442
Epoch 14 | Training | Elapsed Time: 3:13:48 | Steps: 113547 | Loss: 31.481923
Epoch 14 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 58.491846 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 58.491846 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-1936989
Epoch 15 | Training | Elapsed Time: 3:13:57 | Steps: 113547 | Loss: 31.274614
Epoch 15 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 58.362835 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 58.362835 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-2050536
Epoch 16 | Training | Elapsed Time: 3:13:33 | Steps: 113547 | Loss: 31.055216
Epoch 16 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 58.280660 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 58.280660 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-2164083
Epoch 17 | Training | Elapsed Time: 3:13:33 | Steps: 113547 | Loss: 30.848452
Epoch 17 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 58.168590 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 58.168590 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-2277630
Epoch 18 | Training | Elapsed Time: 3:13:40 | Steps: 113547 | Loss: 30.655320
Epoch 18 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 58.056185 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 58.056185 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-2391177
Epoch 19 | Training | Elapsed Time: 3:13:32 | Steps: 113547 | Loss: 30.443127
Epoch 19 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 58.018187 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 58.018187 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-2504724
Epoch 20 | Training | Elapsed Time: 3:13:35 | Steps: 113547 | Loss: 30.253230
Epoch 20 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.941861 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.941861 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-2618271
Epoch 21 | Training | Elapsed Time: 3:13:32 | Steps: 113547 | Loss: 30.078165
Epoch 21 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.895592 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.895592 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-2731818
Epoch 22 | Training | Elapsed Time: 3:13:30 | Steps: 113547 | Loss: 29.900475
Epoch 22 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.818665 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.818665 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-2845365
Epoch 23 | Training | Elapsed Time: 3:13:44 | Steps: 113547 | Loss: 29.738825
Epoch 23 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.743074 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.743074 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-2958912
Epoch 24 | Training | Elapsed Time: 3:13:35 | Steps: 113547 | Loss: 29.581321
Epoch 24 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.708766 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.708766 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-3072459
Epoch 25 | Training | Elapsed Time: 3:13:42 | Steps: 113547 | Loss: 29.404826
Epoch 25 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.704890 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.704890 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-3186006
Epoch 26 | Training | Elapsed Time: 3:13:57 | Steps: 113547 | Loss: 29.261388
Epoch 26 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.636500 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.636500 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-3299553
Epoch 27 | Training | Elapsed Time: 3:14:00 | Steps: 113547 | Loss: 29.094743
Epoch 27 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.602169 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.602169 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-3413100
Epoch 28 | Training | Elapsed Time: 3:13:40 | Steps: 113547 | Loss: 28.973353
Epoch 28 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.594331 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.594331 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-3526647
Epoch 29 | Training | Elapsed Time: 3:13:32 | Steps: 113547 | Loss: 28.816902
Epoch 29 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.611178 | Dataset: ../datasets/ted-dev.csv
Epoch 30 | Training | Elapsed Time: 3:14:17 | Steps: 113547 | Loss: 28.664467
Epoch 30 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.526799 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.526799 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-3753741
Epoch 31 | Training | Elapsed Time: 3:21:00 | Steps: 113547 | Loss: 28.511221
Epoch 31 | Validation | Elapsed Time: 0:00:51 | Steps: 507 | Loss: 57.472085 | Dataset: ../datasets/ted-dev.csv
I Saved new best validating model with loss 57.472085 to: ../checkpoints/deepspeech-0.6.0-checkpoint/best_dev-3867288
I actually tried bigger batch sizes and increased the GPU capacity of the instance, and I got the same results: WER is 20% at best whatever I do. Maybe the TED-LIUM dataset won't do better?
I was hoping to improve accent coverage, and it seemed like a good option.
lissyx
I remember that with TED-LIUM 2 we would get something quite similar when training only on it.
Now, how are you running the test phase? It seems it is on TED's test set only?
I also ran it separately on the LibriSpeech test set and it gave me around 8% WER, but when I ran it against my own voice (I have a mixed American/British accent) I got significantly better accuracy after the TED training than with the officially available pre-trained model. I guess it makes sense on some level, given the greater variety of voices in TED-LIUM.
I am wondering what the results will be if I continue training on VoxForge. I am hoping to cover more accents. I will post more results as I go.
Please point me in the right direction if you think I can do this differently. Thank you so much.
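For anyone reading along: as far as I understand, in 0.6.0 only the phases whose file flags are set are run, so a test-only pass on a different set can be done by giving just --test_files and the checkpoint directory. A sketch, with the LibriSpeech test CSV path as a placeholder:

python3 DeepSpeech.py --n_hidden 2048 \
  --checkpoint_dir ../checkpoints/deepspeech-0.6.0-checkpoint \
  --test_files ../datasets/librivox-test-clean.csv \
  --lm_alpha 0.75 --lm_beta 1.85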
lissyx
Well, that seems to confirm a very good improvement; I'm not sure what is bothering you here. 8% on the LibriSpeech clean test set is quite good, and you say you have confirmed a live use case where it seriously improved on your own accent.
It'd be interesting if you could run the same confirmation tests after VoxForge.
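A rough sketch of what that follow-up could look like: import VoxForge with the importer that ships in the DeepSpeech repo, then continue training from the TED fine-tuned checkpoint. The CSV names below are assumptions about what the importer writes out (check its actual output), and the flag values are copied from earlier in the thread.

# Download and convert VoxForge (importer included in the DeepSpeech repo).
python3 bin/import_voxforge.py ../datasets/voxforge

# Continue from the existing checkpoint directory.
python3 DeepSpeech.py --n_hidden 2048 \
  --checkpoint_dir ../checkpoints/deepspeech-0.6.0-checkpoint \
  --train_files ../datasets/voxforge/voxforge-train.csv \
  --dev_files ../datasets/voxforge/voxforge-dev.csv \
  --test_files ../datasets/voxforge/voxforge-test.csv \
  --learning_rate 0.0000001 --use_cudnn_rnn=True --noearly_stop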