Hope everyone is doing well.
I had 3 lakh (300,000) audio clips, of which I used 70% for training, 15% for validation, and the remaining 15% for testing. After training on DeepSpeech 0.4.1, the results on the test set were WER 0.32, CER 0.11, and loss 6.
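To make the split concrete, here is a minimal sketch of a 70/15/15 split of this kind. It is only an illustration: `split_dataset`, the clip file names, and the seed are hypothetical and not part of DeepSpeech, and it assumes the clips are available as a plain list of paths.

```python
import random

def split_dataset(items, train_pct=70, dev_pct=15, seed=42):
    """Shuffle once with a fixed seed, then cut into train/dev/test.

    Percentages are integers so the cut points are computed with
    integer arithmetic; whatever is left over goes to the test set.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed -> reproducible split
    n_train = len(items) * train_pct // 100
    n_dev = len(items) * dev_pct // 100
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]
    return train, dev, test

# Hypothetical file names standing in for the 3 lakh clips.
clips = [f"clip_{i:06d}.wav" for i in range(300_000)]
train, dev, test = split_dataset(clips)
print(len(train), len(dev), len(test))
```

Because the seed is fixed, rerunning this on the same list gives the same three subsets.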
After that, I increased my dataset to 6 lakh (600,000) clips.
What I want to know is: should I rebuild my test set by taking 10% of the 6 lakh clips and check whether the model does better on it than the previous model did on the old test set, or is that the wrong way to evaluate?
Or should I keep the test set the same while I keep growing the training set, and always test on that same set?
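To show what I mean by the second option, here is a rough sketch, not a claim that it is the right approach. `grow_training_set` and the file names are made up for illustration, and it assumes clips can be identified by their path.

```python
def grow_training_set(train, dev, test, new_clips):
    """Add new clips to the training set only.

    The dev and test sets stay frozen; any new clip that already
    appears in dev, test, or train is skipped so nothing leaks.
    """
    held_out = set(dev) | set(test)
    seen = set(train)
    grown = list(train)
    for clip in new_clips:
        if clip in held_out or clip in seen:
            continue  # skip held-out clips and duplicates
        grown.append(clip)
        seen.add(clip)
    return grown

# Hypothetical tiny example.
train = ["a.wav", "b.wav"]
dev = ["c.wav"]
test = ["d.wav"]
new_clips = ["d.wav", "e.wav", "b.wav", "f.wav", "e.wav"]

bigger_train = grow_training_set(train, dev, test, new_clips)
print(bigger_train)
```

With this setup the old and new models would both be scored on the same `test` list, so their WER and CER numbers refer to the same clips.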
How can I tell whether the model is actually improving as I train it on more audio files? What is the right approach?