lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
“Too bad” and “a little” are not really helpful. Please share proper figures; you might be misled by individual examples.
In my opinion, you need to share more context on what you have tested. Fine-tuning requires work: maybe your learning rate is too high, maybe you need to tune the dropout, maybe you need to train for more or fewer epochs. Have you looked at the train/dev loss evolution?
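For example, something along these lines (a sketch of a v0.6.1-style fine-tuning run; the CSV paths and the learning rate / dropout / epochs values are placeholders to tune, not recommendations):

```bash
# Fine-tuning sketch: all paths and hyperparameter values below are
# placeholders. Watch the train/dev loss printed each epoch to decide
# whether to lower the learning rate or adjust dropout/epochs.
python3 DeepSpeech.py \
  --n_hidden 2048 \
  --checkpoint_dir deepspeech-0.6.1-checkpoint/ \
  --train_files my-train.csv \
  --dev_files my-dev.csv \
  --test_files my-test.csv \
  --epochs 3 \
  --learning_rate 0.0001 \
  --dropout_rate 0.15
```

If the dev loss starts climbing while the train loss keeps dropping, you are overfitting: fewer epochs, a lower learning rate, or more dropout are the usual knobs.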
Hi @othiele
I have shared the link to the YouTube playlist. I used the transcripts provided by the channel and did some pre-processing.
And I don’t find 0.29 bad for 30 hours from YouTube.
Yeah, I also did some hyperparameter tuning and got a WER of 0.20 (please refer to my previous comment).
But the problem is that it disturbed the previous weights.
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
So, on your YouTube-based test set, you have 52.5% WER and 31.1% CER before fine-tuning, and 20.4% WER / 10.5% CER after fine-tuning with ~30 h of data?
That does indeed look like a very nice improvement.
Well, that’s going to be the issue you have to work on.
I suspect you want more than just those validation and test sets if you want to avoid degrading quality on the previous data. Otherwise, it makes sense that the new training optimizes for the new data.
Any suggestions from your side would be great.
> I suspect you want more than just those validation and test sets if you want to avoid degrading quality on the previous data.
Sorry, I didn’t get this point. Can you please elaborate?
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
I don’t have your dataset, so I can’t do that work for you. I’ve already shared suggestions.
I don’t see how to say it otherwise: you are fine-tuning with only one validation set, so your network is getting optimized for that one alone. That’s also why it regressed on the previous dataset.
I guess, to put it plainly, you could go for an even lower learning rate, or put more of the original English examples into the validation set, so as to alter the original weights a little less.
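Concretely, something like this (a sketch; `DeepSpeech.py` accepts comma-separated lists of CSVs for the `--*_files` flags, and the `librispeech-*.csv` / `youtube-*.csv` names here are placeholders for whatever original-domain and new-domain data you have):

```bash
# Keep dev/test covering both distributions, so early stopping and the
# final evaluation also reflect quality on the original data, not just
# the new YouTube data.
python3 DeepSpeech.py \
  --checkpoint_dir deepspeech-0.6.1-checkpoint/ \
  --train_files youtube-train.csv \
  --dev_files youtube-dev.csv,librispeech-dev.csv \
  --test_files youtube-test.csv,librispeech-test.csv \
  --epochs 3 \
  --learning_rate 0.000001
```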
One thing about transfer learning on master: the checkpoints from v0.6.1 are not compatible with master, due to a bugfix in the MFCC computation code. They will still load, but give bad results. Make sure you don’t mix those up.
lissyx
((slow to reply) [NOT PROVIDING SUPPORT])
True, I forgot about that. Are you referring to the upper frequency limit? Maybe in this case it’s hackable by removing it. Being able to use up-to-date transfer learning from the 0.6.1 model would likely bring more good than harm.
Haha… Thanks for the suggestion.
I will try to add more American-accented audio and will post an update here for other people who might face this problem.
- Do you have a patch that addresses the incompatibility between the 0.6.1 checkpoints and the master branch with respect to transfer learning?
- What is the difference between fine-tuning with a lower learning rate (0.000001) and using transfer learning for Indian-accented English on top of the 0.6.1 checkpoint? Is one approach better than the other at producing a lower WER in the resulting model?
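To make the comparison concrete, here is roughly what the two options look like on the command line (a sketch; the flag names in (b) follow the master-branch transfer-learning documentation and should be verified against your checkout, and all paths are placeholders):

```bash
# (a) Fine-tuning: same alphabet, continue training all weights from
#     the release checkpoint at a very low learning rate.
python3 DeepSpeech.py \
  --checkpoint_dir deepspeech-0.6.1-checkpoint/ \
  --train_files indian-train.csv \
  --dev_files indian-dev.csv \
  --test_files indian-test.csv \
  --learning_rate 0.000001

# (b) Transfer learning on master: drop and re-initialize the last
#     layer(s), which also allows a different alphabet file. Note the
#     MFCC caveat above: this needs a master-compatible checkpoint,
#     not the v0.6.1 one.
python3 DeepSpeech.py \
  --load_checkpoint_dir master-compatible-checkpoint/ \
  --save_checkpoint_dir indian-checkpoint/ \
  --drop_source_layers 1 \
  --alphabet_config_path my-alphabet.txt \
  --train_files indian-train.csv \
  --dev_files indian-dev.csv \
  --test_files indian-test.csv
```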
@josh_meyer, could you please point me to the English checkpoints that are compatible with TransferLearning2? I want to use a different alphabet file for training the model.
@lissyx: Is TransferLearning2 also part of v0.6.0? I can only find checkpoints for v0.6.0 or earlier.