Hello,
I’d like to improve WER for my use cases by adding domain-specific phrases and names (e.g., for a technical text, words and phrases like “Node.js”, “GPU”, “JavaScript”).
The rest of the text is general English, so I’d like to leverage the existing models.
The first thing I’d like to try is to use output_graph.pb as-is and adapt only the language model. So far I have only seen the language model in its binary form in the DeepSpeech repos, not in ARPA format or as the original text file from which the model was generated — are these available somewhere? Alternatively, has anyone tried to extend the binary language model?
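For reference, here is a minimal sketch of the approach I have in mind, assuming the original training text (or a reasonable substitute corpus) can be obtained: append the domain phrases to the corpus, oversample them so their n-grams get meaningful probability mass, and then rebuild the ARPA and binary files with the standard KenLM tools. All file names below are hypothetical placeholders.

```python
# Sketch: augment an LM training corpus with domain-specific phrases.
# The KenLM retraining steps (lmplz / build_binary) run outside this
# script; see the comments at the bottom. File names are placeholders.

def augment_corpus(base_lines, domain_lines, repeat=100):
    """Append the domain phrases, repeated `repeat` times, so their
    n-grams are not drowned out by the much larger base corpus."""
    return list(base_lines) + list(domain_lines) * repeat

if __name__ == "__main__":
    base = ["this is general english text"]
    domain = ["install node.js on the gpu server", "javascript runtime"]
    merged = augment_corpus(base, domain, repeat=3)
    with open("lm_corpus.txt", "w") as f:
        f.write("\n".join(merged) + "\n")
    # Then, with KenLM built:
    #   lmplz -o 5 < lm_corpus.txt > lm.arpa
    #   build_binary lm.arpa lm.binary
```

The `repeat` factor is just one knob; interpolating a small domain LM with the general LM would be another option if the tooling supports it.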
If anyone can see an easier way to customize the inference with a list of domain specific words/phrases, I’d appreciate your ideas.