About batch size, dev loss and scorer

Hey there!

We started seriously training DeepSpeech models and I have collected some questions about training.

  1. When I started training on our initial data (a couple of hours) I could use a higher batch size of 24, but as we progress and get more data I find myself having to lower the training batch size every time we add a new hour to the data, otherwise I run out of GPU memory. How can that be? More data should not lead to more memory consumption per batch (unless some new files are notoriously long, but in our case it seems pretty even). Is this normal, or is there something I am missing? Maybe it is related to augmentation?

  2. I have always wondered about the loss pattern during training. At the beginning of an epoch it is low, and then it steadily increases. How is the loss calculated there?

(as you can see, we are doing model fine-tuning)

  3. For training the acoustic model you still have to give it the scorer. Is the training actually influenced by the scorer? Or is it just used for dev set evaluation and finding the best-performing model?

  4. Which files do you recommend running LM optimization on (the lm_optimizer script)? Just the test set files? Train + dev + test? What is your intuition about that?

Thank you very much :slight_smile:

That depends on your data itself. The batch size is limited by your GPU memory and by the longest sample in each batch.

As far as I recall (@reuben can correct me), we order the batches by sample length, so an epoch starts with the shortest samples, for which the loss is lower; the loss then climbs as the longer samples come in.
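
If you want to see that ordering for yourself, here is a rough pandas sketch (assuming the usual DeepSpeech CSV columns wav_filename / wav_filesize / transcript and 16 kHz 16-bit mono WAVs; the path and batch size are just placeholder examples). Sorting by file size is a decent proxy for sorting by duration, which is why the first batches of an epoch hold the shortest clips:

```python
import pandas as pd

BATCH_SIZE = 24  # example value, not a recommendation

df = pd.read_csv("train.csv")  # hypothetical path to your train CSV
df = df.sort_values("wav_filesize").reset_index(drop=True)

# Rough duration estimate: strip the 44-byte WAV header,
# 16000 samples/s * 2 bytes/sample = 32000 bytes per second.
df["approx_seconds"] = (df["wav_filesize"] - 44) / 32000.0

for start in range(0, len(df), BATCH_SIZE):
    batch = df.iloc[start:start + BATCH_SIZE]
    print(f"batch {start // BATCH_SIZE}: "
          f"longest clip ≈ {batch['approx_seconds'].max():.1f}s")
```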

No. As documented, the scorer is only used during the test step, not even during validation. So your training is NOT influenced by it; it only affects the WER/CER that get computed.

A cleaning step that drops all files above a certain duration / transcript length should help here.
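
Something like this rough sketch would do as a cleaning step (same assumptions about the CSV layout and audio format as above; the 10 second and 300 character cutoffs are arbitrary examples, tune them to your data):

```python
import pandas as pd

MAX_SECONDS = 10.0   # arbitrary example cutoff
MAX_CHARS = 300      # arbitrary example cutoff

df = pd.read_csv("train.csv")                  # hypothetical path
seconds = (df["wav_filesize"] - 44) / 32000.0  # rough duration estimate

keep = df[(seconds <= MAX_SECONDS) & (df["transcript"].str.len() <= MAX_CHARS)]
keep.to_csv("train_filtered.csv", index=False)
print(f"dropped {len(df) - len(keep)} of {len(df)} clips")
```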

I used the dev set, because it's smaller than the train set. Optimization was quite slow; I think it took about 2 days in my runs. Don't use the test set, or you might end up reporting a better WER than your model really achieves.
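
For what it's worth, the search itself boils down to tuning lm_alpha and lm_beta against the WER of whatever set you point it at; as far as I know lm_optimizer.py uses Optuna for that under the hood. Here is a rough sketch of the shape of it. compute_dev_wer is a hypothetical placeholder (the toy formula only exists so the snippet runs end to end); in reality it would decode your dev set with the scorer configured for those weights and return the resulting WER:

```python
import optuna

def compute_dev_wer(lm_alpha: float, lm_beta: float) -> float:
    # Placeholder: decode the dev set with the scorer set to
    # (lm_alpha, lm_beta) and return its WER. The toy formula below
    # (arbitrary optimum at alpha=1.0, beta=2.0) just makes the
    # sketch executable.
    return (lm_alpha - 1.0) ** 2 + (lm_beta - 2.0) ** 2 + 0.15

def objective(trial: optuna.Trial) -> float:
    lm_alpha = trial.suggest_float("lm_alpha", 0.0, 5.0)
    lm_beta = trial.suggest_float("lm_beta", 0.0, 5.0)
    return compute_dev_wer(lm_alpha, lm_beta)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100)
print(study.best_params, study.best_value)
```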
