Fine-Tuning the Language Model

Hi,
I would like to fine-tune the language model to decode my domain-specific vocabulary.
I referred to the README at https://github.com/mozilla/DeepSpeech/blob/master/data/lm/README.md

I have domain-specific text of about 100,000 (1 lakh) sentences, which is much smaller than the Common Voice and LibriSpeech texts, so it would not modify the language model much.

I want my ASR model to be able to decode generic English as well as my domain-specific vocabulary.

Is there any way to assign different weights to the generic English texts and to my domain-specific texts while training the language model?

Thanks.
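
To make the question concrete: the crude version of what I have in mind is to repeat the domain-specific text several times relative to the generic text before training a single KenLM model, along these lines (a rough sketch; the file names and the repetition factor are placeholders, not anything from the README):

```python
# Rough sketch: upweight a small domain corpus by repeating it before
# training one KenLM model over the combined text.
# "generic.txt", "domain.txt", and DOMAIN_WEIGHT are placeholders.

DOMAIN_WEIGHT = 10  # assumed repetition factor for the domain text

with open("combined.txt", "w", encoding="utf-8") as out:
    # Write the generic English corpus once.
    with open("generic.txt", encoding="utf-8") as generic:
        for line in generic:
            out.write(line)
    # Repeat the domain corpus to inflate its n-gram counts.
    with open("domain.txt", encoding="utf-8") as domain:
        domain_lines = domain.readlines()
    for _ in range(DOMAIN_WEIGHT):
        out.writelines(domain_lines)

# Afterwards, train as in the README, e.g.:
#   lmplz --order 5 < combined.txt > lm.arpa
#   build_binary lm.arpa lm.binary
```

Repetition only inflates raw n-gram counts, though, so it feels like a blunt instrument compared to proper corpus weighting or interpolation.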

Something like “Allow use of several decoders (language models) with a single model in the API” (mozilla/DeepSpeech Issue #1678 on GitHub)?

Hi,
Yeah, something similar. Is there any reference I can look into for building this?
Ideally, I would want a single model trained using both texts.
Thanks.

Check the issue; we don’t support that yet.

@lissyx Yes, I saw the issue. Do you have any references I can look into so that I can build it myself?
Thanks.

I don’t understand your question: what do you want to build yourself?

Build the language model with different weights for the two sets of texts (generic English and domain-specific). Any references/pointers for this would be appreciated.
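
For what it’s worth, since the decoder currently takes a single LM (per the issue above), one direction for experimenting outside DeepSpeech is to train two KenLM models separately and mix their scores at query time. A rough sketch using the kenlm Python bindings (the model paths and the weight LAMBDA are placeholders):

```python
# Rough sketch: log-linear interpolation of two separately trained KenLM
# models. "generic.arpa", "domain.arpa", and LAMBDA are placeholders.
import kenlm

generic_lm = kenlm.Model("generic.arpa")
domain_lm = kenlm.Model("domain.arpa")

LAMBDA = 0.3  # assumed weight for the domain model


def interpolated_score(sentence: str) -> float:
    # kenlm's score() returns the log10 probability of the full sentence;
    # mix the two models' scores log-linearly.
    return (1 - LAMBDA) * generic_lm.score(sentence) + LAMBDA * domain_lm.score(sentence)


print(interpolated_score("an example domain sentence"))
```

Wiring something like this into the DeepSpeech decoder itself would still require the work tracked in Issue #1678.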