I’ve been reading the forums and other resources online, looking for different ways to alter the language model behind DeepSpeech.
The data/lm/README.md
seems pretty straightforward on how to train a language model using KenLM, including how to update the language model with additional vocabulary.
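For reference, once the ARPA/binary model is built per that README, it can be sanity-checked from Python with the kenlm package. This is just a sketch; "lm.binary" is a placeholder for wherever your model actually lives:

```python
import kenlm

# Load the binary model produced by KenLM's lmplz / build_binary steps
# ("lm.binary" is a placeholder path).
model = kenlm.Model("lm.binary")

# score() returns the total log10 probability of the sentence;
# bos/eos add begin/end-of-sentence context like the decoder would.
print(model.score("the quick brown fox", bos=True, eos=True))

# Per-token breakdown: (log10 prob, n-gram length used, is_oov) for each
# word, handy for checking whether new vocab actually made it into the LM.
for prob, ngram_len, oov in model.full_scores("my new vocab word"):
    print(prob, ngram_len, oov)
```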
I’ve also seen posts such as:
Fine tuning the language model
and
TUTORIAL : How I trained a specific french model to control my robot
which walk through examples of different ways of using the language model.
But I was wondering if anyone has tried another kind of language model, i.e. one not built with KenLM, such as BERT?
I’ve been looking at BERT lately (a state-of-the-art language model achieving the best results on many language tasks) and was wondering how it would fare behind the DeepSpeech acoustic model.
I’m aware it’s probably not a straightforward swap, since BERT is a masked model rather than the left-to-right scorer the CTC beam-search decoder expects, but I plan on spending a few more days trying to figure out if it’s possible.
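One way I could imagine wiring it up (purely a sketch, nothing I’ve run against DeepSpeech): rather than plugging BERT into the decoder itself, use it to rescore the decoder’s N-best transcripts with a pseudo-log-likelihood, masking one token at a time. The model name, the scoring trick, and the example N-best list below are all my assumptions:

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

# Assumed setup: plain pre-trained BERT from Hugging Face transformers.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Score a candidate transcript by masking each token in turn and
    summing BERT's log-probability of the original token (a common
    'pseudo-log-likelihood' trick, not a true sentence probability)."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        log_probs = torch.log_softmax(logits, dim=-1)
        total += log_probs[ids[i]].item()
    return total

# Hypothetical usage: rescore an N-best list from the acoustic model's
# decoder and keep the candidate BERT finds most fluent.
nbest = ["i scream for ice cream", "eye scream four ice cream"]
print(max(nbest, key=pseudo_log_likelihood))
```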
If anyone has any input or has tried something similar, I’d love to hear about it.