Batch size does have an impact on final model accuracy though, as discussed e.g. here.
A batch size of 1 tends to produce the most accurate model, but training takes the longest to complete.
This is also consistent with my experiments training DeepSpeech with different batch sizes on the same data: smaller batch sizes yield, on average, lower loss on test data.
@kdavis
How large a dev set do you recommend? If I have 1000 samples with batch size 100 in my training set, how many samples should be in the dev set, and what batch size should it use? Thank you.
I recommend having a dev set that’s a “statistically sound” sample when compared to the size of your training set.
To calculate how many clips to use in a dev set I use the sample size calculator with a population size equal to the number of training clips, a confidence level of 99%, and a margin of error of 1%. For example, for 2 million training clips this gives a dev set size of 16504 clips.
For smaller training sets it’s much harder to get a statistically sound sample, as the dev set size and the training set size end up being almost equal. For example, for 1000 training clips the dev set size should be 944.
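The calculation above can be sketched in code. This is a minimal illustration using Cochran’s sample size formula with a finite population correction, assuming p = 0.5 (worst case) and a z-score of 2.58 for the 99% confidence level (an approximation some online calculators use; the exact value depends on the calculator). With these assumptions it reproduces both figures quoted above:

```python
import math

def dev_set_size(population, z=2.58, margin=0.01, p=0.5):
    """Cochran's sample size formula with finite population correction.

    population -- number of training clips
    z          -- z-score for the confidence level (2.58 ~ 99%)
    margin     -- margin of error (0.01 = 1%)
    p          -- assumed population proportion (0.5 is the worst case)
    """
    n0 = z**2 * p * (1 - p) / margin**2          # infinite-population sample size
    return math.ceil(n0 / (1 + (n0 - 1) / population))  # finite correction

print(dev_set_size(2_000_000))  # → 16504
print(dev_set_size(1_000))      # → 944
```

Note how the correction term dominates for small populations: for 1000 clips the required sample is nearly the whole set, which is exactly the problem described above.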