I am facing an issue while training on top of the existing model (starting from the 0.7.4 checkpoint) on the 50 GB Common Voice dataset. This is the command I am running:
python3 DeepSpeech.py --train_files …/cv-corpus-5.1-2020-06-22/en/clips/train.csv --dev_files …/cv-corpus-5.1-2020-06-22/en/clips/dev.csv --test_files …/cv-corpus-5.1-2020-06-22/en/clips/test.csv --train_batch_size 128 --test_batch_size 128 --dev_batch_size 128 --epochs 125 --n_hidden 2048 --learning_rate 0.0001 --dropout_rate 0.40 --export_dir …/exports/ --checkpoint_dir …/load_checkpoint/deepspeech-0.7.4-checkpoint/ --load_cudnn
The process starts fine, but it ends without any error after a certain number of steps. I don’t have any clue what is going wrong.
I have run the same process twice with slight modifications to learning_rate and epochs, with the rest of the parameters unchanged.
The first time it stopped after 183 steps, and the second time it stopped after 481 steps.
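Since the run ends silently, one way to narrow this down might be to record how the process actually exits. Below is a minimal sketch, not part of DeepSpeech: the helper names `describe_exit` and `run_and_report` and the log file name `train.log` are made up for illustration. It relies on the fact that, on POSIX systems, Python's `subprocess` reports a negative return code when the child was terminated by a signal. A SIGKILL there is often (though not always) a sign that the system killed the process for using too much memory.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a readable message."""
    if returncode == 0:
        return "exited normally (code 0)"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-returncode} ({name})"
    return f"exited with error code {returncode}"


def run_and_report(cmd, log_path="train.log"):
    """Run cmd, send stdout and stderr to log_path, and describe the exit."""
    # log_path is a hypothetical file name; pick whatever suits your setup.
    with open(log_path, "w") as log:
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return describe_exit(proc.returncode)
```

To use it, pass the training command as a list of arguments, for example `print(run_and_report(["python3", "DeepSpeech.py", ...]))` with the same flags as above, and then check both the printed message and the end of the log file.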
Please help if anyone has faced a similar issue.
Thanks in advance.