Actual Batch size for DeepSpeech 0.8.2

The release notes mention that v0.8.2 was trained with 8 GPUs and a training batch size of 128.

So my question is: what was the actual batch size?

  1. Was it 128 per GPU, making an effective batch size of 128 * 8 = 1024?
  2. Or was it 128 across all 8 GPUs, making an effective batch size of 128 / 8 = 16 per GPU?
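For clarity, this is the arithmetic behind the two interpretations (a minimal sketch; the variable names are just for illustration, and the numbers come from the release notes):

```python
num_gpus = 8
batch_size_flag = 128  # value reported in the v0.8.2 release notes

# Interpretation 1: the flag is the per-GPU batch size
effective_batch_size = batch_size_flag * num_gpus   # 128 * 8 = 1024

# Interpretation 2: the flag is the global batch size split across GPUs
per_gpu_batch_size = batch_size_flag // num_gpus    # 128 / 8 = 16

print(effective_batch_size, per_gpu_batch_size)  # 1024 16
```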

If I remember correctly, that was per GPU, but search the forum; this question has come up before.

For your own experiments, start with a smaller number like 8 and increase it until you get an error in the first epoch. 32 or 64 are good values for something like a V100.
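As a rough sketch of that "increase until it fails" sweep, assuming the usual DeepSpeech.py training flags (--train_files, --dev_files, --train_batch_size, --epochs) and hypothetical CSV paths you would replace with your own:

```python
# Try progressively larger batch sizes for one epoch each and stop at the
# first failure (typically an out-of-memory error); the last successful
# value is a reasonable choice for your GPU.
import subprocess

for batch_size in [8, 16, 32, 64, 128]:
    result = subprocess.run([
        "python3", "DeepSpeech.py",
        "--train_files", "train.csv",   # hypothetical path
        "--dev_files", "dev.csv",       # hypothetical path
        "--train_batch_size", str(batch_size),
        "--epochs", "1",                # one epoch is enough to surface an OOM
    ])
    if result.returncode != 0:
        print(f"Batch size {batch_size} failed; fall back to the previous value.")
        break
    print(f"Batch size {batch_size} completed one epoch successfully.")
```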

Yes, the first one: an algorithmic batch size of 1024.

Alright! Thanks for the help, @othiele and @reuben.