I am running DeepSpeech training on TED-LIUM 2 and Mozilla Common Voice.
In approximately 12 hours it reached:
Training of Epoch 0 - loss - 141.758459
I used batch size 8 for training, validation, and test data, with the -use_warpctc option; apart from these I am using the default options.
Approximately how long should my training take?
I am using 1 GeForce GTX 1080 GPU.
lissyx
How big is your overall dataset? TED-LIUM 2 + Common Voice? That would be ~1000 hours of audio?
It should be between 300 and 500 hours; I have used the validated sets only.
lissyx
I think the ballpark for ~500 hours on our cluster with 16 TITAN X GPUs is around 2 hours per epoch, so 12 hours for one epoch on your 1080 seems consistent.
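As a rough sanity check of that estimate (a sketch with illustrative numbers only — it assumes naive linear extrapolation from the cluster figure, which in practice is an upper bound because multi-GPU scaling is sub-linear):

```python
# Back-of-the-envelope check of the epoch-time estimate above.
# Assumption (illustrative): perfectly linear multi-GPU scaling.
# Real 16-GPU scaling is sub-linear due to synchronization and
# communication overhead, so a single card normally lands below
# this naive upper bound.

cluster_gpus = 16
cluster_epoch_hours = 2.0        # ~2 h/epoch on 16x TITAN X (from the reply)

# If scaling were perfectly linear, one comparable GPU would need:
naive_single_gpu_hours = cluster_gpus * cluster_epoch_hours  # 32 h

observed_single_gpu_hours = 12.0  # reported GTX 1080 time per epoch

print(f"naive linear upper bound: {naive_single_gpu_hours:.0f} h/epoch")
print(f"observed on one GTX 1080: {observed_single_gpu_hours:.0f} h/epoch")
```

The observed 12 hours sitting well under the 32-hour linear bound is the expected pattern: a single GPU avoids the inter-GPU synchronization cost, so "12 hours per epoch on one card" is in the same ballpark as "2 hours per epoch on 16 cards".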
How long would an epoch take on TED-LIUM 2 using the new NVIDIA Titan V? Or what would the speed-up be over the Titan X Pascal? Would the tensor cores help?