How much does increasing the number of GPUs help?

Hi,

I’m trying to train different versions of DeepSpeech (using different random seeds) for my research project. Assuming I need to rent compute instances, I’m assessing which of these two paths I should take: (1) rent multiple instances, each with one GPU, and train one model on each GPU, or (2) rent a single instance with multiple GPUs and train one model at a time using all of them. In other words, if I use 4 GPUs, how much shorter will training be? The ideal case would be 0.25x the single-GPU training time.

Thanks for your time. :slight_smile:

Yes, the speedup is more or less linear in the number of GPUs, assuming you are able to feed them with data fast enough.
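
For a rough sense of what that means in wall-clock terms, here is a tiny back-of-the-envelope sketch. The numbers and the efficiency factor are made up, purely to illustrate the “assuming you can feed them” caveat:

```python
# Hypothetical numbers: expected wall-clock time with N GPUs under data-parallel
# training, with an efficiency factor standing in for input-pipeline and
# gradient-sync overhead.
single_gpu_hours = 40.0   # assumed time to train one model on 1 GPU
num_gpus = 4
efficiency = 0.9          # 1.0 = perfectly linear; real runs are usually a bit below

multi_gpu_hours = single_gpu_hours / (num_gpus * efficiency)
print(f"~{multi_gpu_hours:.1f} h per model on {num_gpus} GPUs "
      f"vs {single_gpu_hours:.1f} h on 1 GPU")
```

With perfect scaling you get exactly the 0.25x you mention; in practice the input pipeline and gradient synchronization eat a little of that.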

Wow! Thanks for your quick reply. I really appreciate it.

I just rely on the TensorFlow / DeepSpeech v0.4.1 code as-is, i.e., I did not add a single line of code for using more than one GPU.

TensorFlow deals with that; there is nothing we have to do on our side. You should work with a newer version than 0.4.1, though. It’s very old now.
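
For anyone curious what “TensorFlow deals with that” looks like in practice, the v0.4.x training script follows the usual TF1 “tower” pattern: one replica of the graph per GPU, gradients averaged before a single update. The sketch below only shows the general idea; the names (`build_loss`, `NUM_GPUS`, the dummy model) are illustrative and not taken from the DeepSpeech code base:

```python
# Minimal TF1-style multi-tower sketch: split the batch across GPUs, build one
# model replica per GPU with shared variables, average the gradients, apply once.
import tensorflow as tf

NUM_GPUS = 4

def build_loss(batch):
    # Stand-in for the real acoustic model + CTC loss.
    logits = tf.layers.dense(batch, 29)
    return tf.reduce_mean(tf.square(logits))

features = tf.placeholder(tf.float32, [None, 494])   # whole batch of features
optimizer = tf.train.AdamOptimizer(1e-4)

tower_grads = []
for gpu_id, split in enumerate(tf.split(features, NUM_GPUS)):
    with tf.device('/gpu:%d' % gpu_id), \
         tf.variable_scope('model', reuse=tf.AUTO_REUSE):
        loss = build_loss(split)
        tower_grads.append(optimizer.compute_gradients(loss))

# Average each variable's gradient across towers, then do a single update.
avg_grads = []
for grads_and_vars in zip(*tower_grads):
    grads = [g for g, _ in grads_and_vars]
    avg_grads.append((tf.reduce_mean(tf.stack(grads), axis=0),
                      grads_and_vars[0][1]))
train_op = optimizer.apply_gradients(avg_grads)
```

So from the user's side it is just a matter of how many GPUs are visible to the process; the batch is split across them automatically.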

Cool. Yeah, unfortunately I need to reproduce the results of a paper. :frowning: