Loading/feeding data order effect

I was checking the code. In util/feeding.py, the data is sorted in ascending order by wav_filesize, so shorter audio clips are processed first. I have a large dataset of audio clips with durations from 0.5 to 24 seconds. How could the order of the data affect the training loss?
I am now trying a run with descending order and have already noticed that the loss is better, but maybe the initial random weights are the reason.

Ascending/train (2 epochs):
image

Descending/train (1 epoch):
image

What is the idea of the sorting anyway?

I think this is a type of curriculum learning: start with the easier samples and move on to the more difficult ones later in training. This type of training often leads to better results, e.g. see the curriculum learning paper.
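To make the two orderings concrete, here is a minimal sketch of sorting samples by file size. It is not the actual code from util/feeding.py. The `samples` list, its file names and sizes, and the `order_by_size` helper are all made up for illustration; only the `wav_filesize` field name comes from the discussion above.

```python
# Hypothetical rows; in practice these would come from the training data listing.
samples = [
    {"wav_filename": "c.wav", "wav_filesize": 768000},
    {"wav_filename": "a.wav", "wav_filesize": 16000},
    {"wav_filename": "b.wav", "wav_filesize": 240000},
]


def order_by_size(rows, descending=False):
    """Return a new list sorted by wav_filesize.

    Ascending order puts the smallest (shortest) files first;
    descending=True reverses that.
    """
    return sorted(rows, key=lambda row: row["wav_filesize"], reverse=descending)


ascending = [row["wav_filename"] for row in order_by_size(samples)]
descending = [row["wav_filename"] for row in order_by_size(samples, descending=True)]

print(ascending)   # ['a.wav', 'b.wav', 'c.wav']
print(descending)  # ['c.wav', 'b.wav', 'a.wav']
```

With ascending order the model sees the short clips at the start of every epoch and the long ones at the end, which is the "easy first" schedule described above.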

Good to know, thank you.

I have this plot of the training loss. What I noticed is that the loss decreases nicely until it starts processing the long wavs, at which point it starts increasing (see around step 13k). The first epoch finished around step 25k, and the same thing happens in the second epoch.
Does that mean that long wavs are reducing the accuracy of my model, or is this normal?

My audio clips range from 0.5 to 24 seconds.

@yv001, looking forward to your suggestions.

This looks correct to me. For longer wavs, the network has lower accuracy. The important thing is that, overall, the curve ends up lower after each epoch, so the network is learning.
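One way to check the "lower after each epoch" point without eyeballing the plot is to average the loss per epoch and compare the averages. The sketch below assumes you have the per-step training losses in a plain list and know the number of steps per epoch. The loss values and the helper names `epoch_means` and `is_improving` are invented for illustration.

```python
def epoch_means(losses, steps_per_epoch):
    """Average the per-step losses within each epoch (a trailing partial epoch is included)."""
    means = []
    for start in range(0, len(losses), steps_per_epoch):
        chunk = losses[start:start + steps_per_epoch]
        means.append(sum(chunk) / len(chunk))
    return means


def is_improving(means):
    """True if every epoch's mean loss is lower than the previous epoch's."""
    return all(later < earlier for earlier, later in zip(means, means[1:]))


# Made-up losses for two epochs of four steps each: within an epoch the loss
# drops on short clips and rises again once the long clips are reached.
losses = [90.0, 60.0, 40.0, 70.0, 80.0, 50.0, 30.0, 60.0]

means = epoch_means(losses, steps_per_epoch=4)
print(means)                # [65.0, 55.0]
print(is_improving(means))  # True
```

The rise inside each epoch does not matter for this check. What matters is whether the per-epoch average keeps going down.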
