Hello,
I am using DeepSpeech version 0.7.4. I don't want to use transfer learning; I am training on the DeepSpeech English dataset, and these are my hyperparameters:
I have three questions now:
1) Each epoch takes very long and runs for too many steps (nearly a million), but in your documentation you never reach that step count. Is something wrong with my batch size? Is the default batch size 1?
2) When you wrote that the first phase for the 0.7.4 release used 120 epochs, did you really cover all of the training data in each epoch? I mean, you didn't cap the iteration count so that only part of the data was used?
lissyx
How much data do you have? You mention a “DeepSpeech English dataset”, but your command line points to some Common Voice English data, and there is no mention of the release.
I guess 12h on your GPU might be expected.
Yes, please read the docs and the output of --helpfull; it's all documented there.
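For reference, a minimal sketch of how to inspect those flags, assuming you run it from the root of a 0.7.x checkout:

```
# Prints every training flag with its description and default value,
# including --train_batch_size / --dev_batch_size (which default to 1).
python3 DeepSpeech.py --helpfull
```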
A higher batch size will speed up your training significantly, and you should set it for train/dev as high as possible without producing out-of-memory errors. With your GPU you should get to 4/8?
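As an illustration, a minimal sketch of such an invocation; the CSV paths are placeholders, and the batch size of 8 is just the upper end of the guess above, not a measured recommendation:

```
# Hypothetical 0.7.x training run; adjust the paths to your own CSVs.
# Raising the batch sizes from the default of 1 divides the number of
# optimizer steps per epoch by the same factor.
python3 DeepSpeech.py \
  --train_files clips/train.csv \
  --dev_files clips/dev.csv \
  --test_files clips/test.csv \
  --train_batch_size 8 \
  --dev_batch_size 8 \
  --test_batch_size 8 \
  --checkpoint_dir checkpoints/
```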
Training is really resource-intensive, so yes, this looks realistic.
Absolutely. If you are using batch size 1 and around 500-700 hours of input, this could be an OK number.
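To put rough numbers on that (the average clip length here is an assumption, not something stated in the thread): steps per epoch ≈ number of clips ÷ batch size. 500 hours at roughly 5 seconds per clip is about 360,000 clips, so batch size 1 gives about 360,000 steps per epoch, while batch size 8 would give about 45,000.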
> How much data do you have? You mention a “DeepSpeech English dataset”, but your command line points to some Common Voice English data, and there is no mention of the release.
> I guess 12h on your GPU might be expected.
Yes, you are right, that is Common Voice English.
> Yes, please read the docs and the output of --helpfull; it's all documented there.
If I increase the batch size, does it reduce training time?
As you mentioned in the release notes on GitHub, for 0.7.4 you trained for around 300 epochs in total. (That is more or less impossible on my hardware.)
When there are 500 hours of data (my own data), how many epochs do I need to reach acceptable accuracy and WER?