The release notes mention that v0.8.2 was trained with 8 GPUs and a training batch size of 128.
So my question is: what was the actual batch size?
- Was it 128 per GPU, making an effective batch size of 128 × 8 = 1024?
- Or was it 128 across all 8 GPUs, i.e. a per-GPU batch size of 128 / 8 = 16?
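
For context on why the two readings differ, here is a minimal sketch assuming a standard PyTorch DDP setup (the actual v0.8.2 training script may differ); in DDP, the `batch_size` passed to each process's `DataLoader` is per GPU:

```python
world_size = 8  # number of GPUs / DDP processes

# Reading (1): 128 is the per-GPU batch size.
per_gpu = 128
effective = per_gpu * world_size
print(f"per GPU: {per_gpu}, effective: {effective}")  # per GPU: 128, effective: 1024

# Reading (2): 128 is the global (effective) batch size.
effective = 128
per_gpu = effective // world_size
print(f"per GPU: {per_gpu}, effective: {effective}")  # per GPU: 16, effective: 128
```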