How can I estimate training time of deepspeech

(Sanjay Rao) #1

I am running DeepSpeech training on TED-LIUM 2 and Mozilla Common Voice.
In approximately 12 hours it reached:

  • Training of Epoch 0 - loss - 141.758459

I used a batch size of 8 for training, validation, and test data, with the -use_warpctc option. Apart from these, I am using the default options. Approximately how much time should my training take? I am using one GeForce GTX 1080 GPU.

(Lissyx) #2

How big is your overall dataset? TED-LIUM 2 + Common Voice? That would be ~1000 hours of audio?

(Sanjay Rao) #3

It should be between 300 and 500 hours. I have used the valid sets only.

(Lissyx) #4

I think the ballpark for ~500 hours on our cluster with 16 TITAN X GPUs is around 2 hours per epoch, so 12 hours per epoch on your 1080 seems consistent.

(Sanjay Rao) #5

What total training time can I expect? Any idea?

(Sanjay Rao) #6

Can you provide some hints about the total training time in my case? Thanks a lot.

(Lissyx) #7

I think I just did …

(Sanjay Rao) #8

Does that mean that on my hardware, completing 75 epochs will take ~38 days?

(Tilman Kamp) #9

Yes. Running one full epoch will give you a better estimate.
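For what it's worth, the back-of-the-envelope arithmetic behind that ~38-day figure is just epochs × time per epoch, assuming the ~12 hours observed for Epoch 0 stays constant for every epoch (a rough assumption, since per-epoch time can drift):

```python
# Rough training-time estimate, assuming a constant time per epoch.
HOURS_PER_EPOCH = 12   # ~12 h for Epoch 0 on one GTX 1080, as reported above
EPOCHS = 75            # planned number of epochs from the question

total_hours = HOURS_PER_EPOCH * EPOCHS
total_days = total_hours / 24

print(f"{total_hours} hours ≈ {total_days:.1f} days")  # 900 hours ≈ 37.5 days
```

Timing one complete epoch, as suggested, then plugging that number into HOURS_PER_EPOCH gives a much more reliable estimate than extrapolating from a partial run.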

(Adrian) #10

How long will an epoch take on TED-LIUM 2 when using the new NVIDIA Titan V? Or what would be the speedup over a Titan X Pascal? Will the tensor cores help?