Got it! Very helpful, thanks!
And what were your machine specs? Did you use a GPU?
You needed 500h because you had to transfer from English to Spanish, but for my accent problem I don't think I need that much data, since I'm staying in English.
Meaning only the output layer? Or the output layer and the last layer?
I just retrained my model on the CV dataset (14 epochs, --drop_source_layers 2, lr 0.00001) and the result is worse than before retraining (WER went from 48% before to 56% after). That suggests more retraining does indeed make it worse…
Next try: 1 epoch with the same parameters.
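
For anyone comparing the WER numbers above, here is a minimal sketch of how WER is usually defined: word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. This is not the evaluation code of the training tool; the function name `wer` and the example sentences are made up for illustration.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Illustrative helper (hypothetical name), not taken from any toolkit.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion against 4 reference words.
score = wer("a b c d", "a x c")
print(f"WER = {score:.0%}")
```

Because the denominator is the reference length, WER can exceed 100% when the hypothesis contains many inserted words, so a jump from 48% to 56% means roughly 8 more errors per 100 reference words.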