Transfer learning between different languages

Hi! I want to train a model to recognize Spanish. Since there is not much data available, I have thought of using transfer learning from English. I have a couple of questions.

  • I have seen several transfer learning branches, but I am not sure how to use them. Are there any docs about them?
  • Is it really worth trying transfer learning from English to Spanish? I do not know if the sounds are close enough to be helpful.
  • If the transfer learning branches are not what I am looking for, I have thought of removing the last layer and adding a new one. Has anybody done that before, or is there some kind of guide? I am not sure how to do it within DeepSpeech, since there is a lot of code.

Thanks a lot for your help
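For anyone wondering what "removing the last layer and adding a new one" amounts to, here is a minimal sketch in plain Python. It is not DeepSpeech code: the checkpoint is modelled as a dict of weight matrices, and the layer names, the helper `reinit_last_layers`, and the output sizes (29 and 34) are all hypothetical, chosen only for illustration. The idea is that the earlier layers are reused unchanged, while the last layers are re-initialized, with the final one resized to the new alphabet.

```python
import random


def reinit_last_layers(checkpoint, layer_order, n_drop, new_output_size, seed=0):
    """Reuse all but the last `n_drop` layers; re-initialize the dropped ones.

    `checkpoint` maps a layer name to a weight matrix stored as a list of rows
    (rows = input width, columns = output width). Hypothetical helper, for
    illustration only.
    """
    if not 0 < n_drop <= len(layer_order):
        raise ValueError("n_drop must be between 1 and the number of layers")
    rng = random.Random(seed)
    kept, dropped = layer_order[:-n_drop], layer_order[-n_drop:]

    # Layers that are kept carry over the source-language weights as-is.
    new_checkpoint = {name: checkpoint[name] for name in kept}

    for i, name in enumerate(dropped):
        rows = len(checkpoint[name])  # input width stays the same
        is_output = i == len(dropped) - 1
        # Only the final layer changes width, to match the new alphabet.
        cols = new_output_size if is_output else len(checkpoint[name][0])
        new_checkpoint[name] = [
            [rng.uniform(-0.05, 0.05) for _ in range(cols)] for _ in range(rows)
        ]
    # The dropped layers are the ones that have to be trained from scratch.
    return new_checkpoint, dropped


# Toy "English" checkpoint with made-up layer names and sizes.
english = {
    "hidden_1": [[0.1] * 4 for _ in range(3)],
    "hidden_2": [[0.2] * 4 for _ in range(4)],
    "output": [[0.3] * 29 for _ in range(4)],
}
order = ["hidden_1", "hidden_2", "output"]
spanish, to_train = reinit_last_layers(english, order, n_drop=1, new_output_size=34)
print(to_train, len(spanish["output"][0]))
```

The output layer has to be replaced in any case when the target alphabet differs from the source one (Spanish adds characters such as ñ and accented vowels), because its width depends on the number of characters.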

I guess @josh_meyer should be able to help specifically with that, but he’s traveling right now.

@daniel.cruzado I was thinking of transfer learning from English to Arabic. Is it worth trying?

Has anyone tried this? I’m also trying to train a Spanish model.

I tried transfer learning from English to German. After removing the last 2 layers and fine-tuning with a learning rate of 0.0001, I was able to bring the WER down from 11.7% to 9.4%.
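To put those numbers in context, here is a small sketch of how word error rate is commonly computed (word-level edit distance divided by the number of reference words), plus the relative improvement implied by the figures above. The function names are my own, and this is the textbook per-sentence definition, not necessarily how any particular toolkit aggregates WER over a test set.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before, after):
    """Fraction by which an error rate dropped, relative to where it started."""
    return (before - after) / before


# One wrong word out of four reference words.
print(word_error_rate("hola como estas hoy", "hola como esta hoy"))
# Going from 11.7% to 9.4% WER is a 2.3-point absolute drop.
print(round(relative_reduction(11.7, 9.4), 3))
```

So the drop from 11.7% to 9.4% is 2.3 points in absolute terms, or roughly a 20% relative reduction.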

@reyxuan here’s my result for Spanish using transfer learning: “Non native english with transfer learning from V0.5.1 Model, right branch, method and discussion”

@carlfm01 I don’t get what you did there. Did you train with English and test with Spanish?

@Jendker Thank you very much. I’ll try that. How many hours of audio did you use?

I had about 500 hours. I did not try any data augmentation, though; maybe that would help further.

Sorry for the confusion. No: starting from the English model, I did transfer learning to Spanish using 500h of Spanish. Both training and testing used Spanish.

Perfect! I’ll train a model from scratch first and then try the transfer learning method. How did you get 500h of Spanish? Using data augmentation?

No, using data from VoxForge, OpenSLR, LibriVox, and private speech.

You can download my 120h of clean Spanish here: https://www.kaggle.com/carlfm01/120h-spanish-speech/
It is from LibriVox, under public domain :)

I’d already downloaded it. Thanks! I mixed it with these datasets:

Be careful with the tedx dataset: it has a lot of grammar mistakes. Also, caito did not converge for me.
About the crowdsourced data, I think it needs silence trimming.
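In case it helps, here is a rough sketch of what silence trimming can look like: a simple peak-amplitude threshold applied frame by frame to raw 16-bit samples. The function name, the threshold of 500, and the 160-sample frame (10 ms, assuming 16 kHz audio) are assumptions for illustration; real recordings usually need the threshold tuned, and dedicated audio tools do this more robustly.

```python
def trim_silence(samples, threshold=500, frame_len=160):
    """Drop leading and trailing frames whose peak amplitude is below threshold.

    `samples` is a list of integer PCM samples. Silence in the middle of the
    clip is kept; only the edges are trimmed.
    """
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    # Indices of frames that contain at least one sample at or above threshold.
    loud = [idx for idx, frame in enumerate(frames)
            if max(abs(s) for s in frame) >= threshold]
    if not loud:
        return []  # the whole clip is below the threshold
    start = loud[0] * frame_len
    end = min((loud[-1] + 1) * frame_len, len(samples))
    return samples[start:end]


# Two silent frames, one loud frame, two silent frames.
clip = [0] * 320 + [1000] * 160 + [0] * 320
trimmed = trim_silence(clip)
print(len(clip), len(trimmed))
```

Trimming only the edges keeps pauses between words intact, which matters because the transcript still has to line up with the audio.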

I’ve already trimmed the audio. I’ll check what you said about tedx!