Transfer learning between different languages

daniel.cruzado · March 11, 2019, 4:51pm

Hi! I want to train a model to recognize Spanish. Since there is not much data available, I have thought of using transfer learning from English. I have a couple of questions.

I have seen several transfer learning branches, but I am not sure how to use them. Are there any docs about them?
Is it really worth trying transfer learning from English to Spanish? I do not know if the sounds are close enough to be helpful.
If the transfer learning branches are not what I am looking for, I have thought of removing the last layer and adding a new one. Has anybody done that before, or is there some kind of guide? I am not sure how to do it within DeepSpeech, since there is a lot of code.

Thanks a lot for your help

lissyx · March 11, 2019, 4:54pm

I guess @josh_meyer should be able to help with that specifically, but he’s traveling right now.

SamahZaro · August 26, 2019, 12:59pm

@daniel.cruzado I was thinking of transfer learning from English to Arabic. Is it worth trying?

reyxuan · September 6, 2019, 12:16pm

Has anyone tried this? I’m also trying to train a Spanish model.

Jendker · September 6, 2019, 4:08pm

I tried transfer learning from English to German. After dropping the last 2 layers and fine-tuning with a learning rate of 0.0001, I was able to bring the WER down from 11.7% to 9.4%.
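
For readers unfamiliar with the metric quoted above: WER (word error rate) is the word-level edit distance between the reference transcript and the model's output, divided by the number of reference words. The function below is a minimal sketch of that definition, not the evaluation code DeepSpeech itself uses; the function name and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences only: one substituted word out of three.
print(word_error_rate("el gato negro", "el pato negro"))
```

On that scale, going from 11.7% to 9.4% is a relative reduction of roughly 20%.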

carlfm01 · September 6, 2019, 8:00pm

@reyxuan here’s my result for Spanish using transfer learning: “Non native english with transfer learning from V0.5.1 Model, right branch, method and discussion”

reyxuan · September 10, 2019, 11:48am

@carlfm01 I don’t get what you did there. Did you train with English and test with Spanish?

reyxuan · September 10, 2019, 11:48am

@Jendker Thank you very much. I’ll try that. How many hours of audio did you use?

Jendker · September 10, 2019, 3:15pm

I had about 500 hours, did not try any data augmentation though, maybe that would additionally help.

carlfm01 · September 10, 2019, 4:07pm

Sorry for the confusion. No, I started from the English model and did transfer learning to Spanish using 500h of Spanish audio. Both training and testing used Spanish.

reyxuan · September 11, 2019, 8:04am

Perfect! I’ll try training a model from scratch first and then the transfer learning method. How did you get 500h of Spanish? Using data augmentation?

carlfm01 · September 12, 2019, 5:04pm

No, using data from VoxForge, OpenSLR, LibriVox, and private speech.

You can download my 120h of clean Spanish here: 120h Spanish Speech | Kaggle
It comes from LibriVox and is in the public domain.

reyxuan · September 13, 2019, 7:53am

I’d already downloaded it. Thanks! I mixed it with these datasets:

https://www.caito.de/data/Training/stt_tts/
https://www.openslr.org/67/
https://www.openslr.org/61/
https://www.openslr.org/39/
https://www.openslr.org/71/
https://www.openslr.org/72/
https://www.openslr.org/73/
https://www.openslr.org/74/
https://www.openslr.org/75/
apart from Common Voice. Those are ~180h.
I’ll add Data Augmentation once I learn how to use DeepSpeech (visualization, correct hyperparams…) and then try Transfer Learning.

carlfm01 · September 13, 2019, 5:13pm

Be careful with the TEDx dataset: it has a lot of grammar mistakes. The caito dataset did not converge for me.
About the crowdsourced data, I think it needs silence trimming.
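
As a rough illustration of what silence trimming means here, the sketch below drops leading and trailing samples whose amplitude stays under a threshold. It assumes the audio is already decoded into a list of floats in the range [-1, 1]; the function name and the 0.01 threshold are arbitrary choices for this example. Real pipelines usually work on frame energy or a voice activity detector rather than on single samples.

```python
from typing import List


def trim_silence(samples: List[float], threshold: float = 0.01) -> List[float]:
    """Remove leading and trailing samples quieter than `threshold`.

    Quiet samples between the first and last loud sample are kept,
    so pauses inside the utterance are preserved.
    """
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []  # the whole clip is below the threshold
    return samples[loud[0]: loud[-1] + 1]


# Hypothetical clip: near-silence at both ends, speech in the middle.
clip = [0.0, 0.001, 0.5, 0.0, -0.3, 0.002, 0.0]
print(trim_silence(clip))
```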

reyxuan · September 16, 2019, 6:13am

I’ve already trimmed the audio. I’ll check what you said about tedx!

Karsten · November 1, 2019, 8:02pm

Did you substitute ä=>ae and so on? If not, how were you able to restore the weights from the English model, given that the German alphabet has more letters and therefore more nodes in the last layer?
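
The substitution asked about above could look like the sketch below, which rewrites transcripts so that they only use characters already present in the English alphabet. The mapping and the function name are assumptions for illustration; nobody in this thread confirmed using this approach.

```python
# Hypothetical mapping: extend it for any other characters that are
# missing from the alphabet of the source (English) model.
UMLAUT_MAP = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def transliterate(text: str) -> str:
    """Lowercase a transcript and replace German-specific characters."""
    text = text.lower()
    return "".join(UMLAUT_MAP.get(ch, ch) for ch in text)


print(transliterate("Mädchen"))
print(transliterate("Größe"))
```

The trade-off is that the model can then never output ä, ö, ü or ß directly, so the original spelling would have to be restored in a later step if it is needed.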

Jendker · November 1, 2019, 10:25pm

On the transfer-learning2 branch there is a parameter to specify how many of the last layers should be dropped. If you specify at least 1, you should be fine.

Karsten · November 2, 2019, 10:36am

I am aware of this parameter. I tried to load the English model with drop_source_layers = 2, but it fails when restoring the weights because the number of nodes in the last layer does not match (due to the German alphabet being bigger). Have you had a different experience?

Jendker · November 3, 2019, 4:23pm

That’s interesting. My German alphabet has 3 umlaut characters, but I did not experience any problems with transfer learning. I’ll have a closer look and try to report my findings tomorrow.

Karsten · November 3, 2019, 4:38pm

Thanks, that would be very helpful. I suspect that the drop_source_layers parameter only drops the weights after they have been initially loaded, but the initial loading doesn’t work if the network (or the alphabet, in this case) differs.

This is the concrete error when using the drop_source_layers flag:

deepspeech_asr_1 | E InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
deepspeech_asr_1 | E
deepspeech_asr_1 | E Assign requires shapes of both tensors to match. lhs shape= [2048,33] rhs shape= [2048,29]
deepspeech_asr_1 | E [[node save/Assign_32 (defined at DeepSpeech.py:448) ]]
deepspeech_asr_1 | E
deepspeech_asr_1 | E The checkpoint in /model/model.v0.5.1 does not match the shapes of the model. Did you change alphabet.txt or the --n_hidden parameter between train runs using the same checkpoint dir? Try moving or removing the contents of /model/model.v0.5.1.
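
The traceback says the output weights in the current graph have shape [2048,33] while the checkpoint holds [2048,29], a difference of 4 output units. A saver that tries to restore every variable therefore fails before any layer can be dropped. One general way around this kind of mismatch is to restore only the variables whose shapes agree and freshly initialize the rest. The sketch below shows that selection step on plain dictionaries; it is not the code of the transfer-learning2 branch, and the variable names and the [2048,2048] shape are hypothetical (only the two output-layer shapes come from the error message).

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def split_restorable(
    checkpoint: Dict[str, Shape], graph: Dict[str, Shape]
) -> Tuple[List[str], List[str]]:
    """Split graph variables into (restore, reinitialize).

    A variable is restored only if the checkpoint contains a variable
    with the same name and exactly the same shape.
    """
    restore: List[str] = []
    reinit: List[str] = []
    for name, shape in graph.items():
        if checkpoint.get(name) == shape:
            restore.append(name)
        else:
            reinit.append(name)
    return restore, reinit


# Hypothetical variable names; 29 and 33 are the output sizes from the log.
english_checkpoint = {
    "layer_5/weights": (2048, 2048),
    "layer_6/weights": (2048, 29),
    "layer_6/bias": (29,),
}
german_graph = {
    "layer_5/weights": (2048, 2048),
    "layer_6/weights": (2048, 33),
    "layer_6/bias": (33,),
}

restore, reinit = split_restorable(english_checkpoint, german_graph)
print("restore from checkpoint:", restore)
print("initialize from scratch:", reinit)
```

With a split like this, only the output layer starts from random weights, which matches the idea of dropping at least one source layer before training on the new alphabet.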
