I came across a pre-trained model for DeepSpeech 0.7.4 and want to adapt it to my use case. I use it with a speech assistant, so I created a list of all the words the assistant recognizes and called it alphabet.txt.
The guide mentions generating a KenLM model. Do I have to create my own, and if so, what kind of input does it need? Or should I stick with the provided one and only create the scorer with my alphabet.txt?
Thank you in advance
lissyx
Yes, but it's still not clear to me. I already have an lm.binary that was created from a large text corpus. My question is whether it makes more sense to reuse it or, for example, to generate a file with all the possible sentences that can occur instead. Or to use a combination of the text corpus and the voice assistant's sentences.
lissyx
Please define your problem; the correct handling will be obvious once you state exactly what you are trying to achieve.
I have a custom voice assistant with several skills installed, so there are various intents that can be triggered. I want to improve the accuracy of the speech-to-text so that the natural language processing has an easier time when there is ambiguity.
I've installed a custom model that was trained on several speech corpora, including Mozilla Common Voice. Since this model is for general use, I want to adapt it to my use case so that it becomes more accurate.
I already created a list of all the words that can occur when interacting with my voice assistant. I'm not sure whether I also need to write out all the possible sentences that can occur in order to generate my own lm.binary, or whether using the provided lm.binary is better in this case.
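To make the "all possible sentences" idea concrete, here is a minimal sketch of how such a file could be generated from intent templates. The intents, templates, slot values, and the output file name are all made up for illustration and are not taken from any real skill. It assumes the language model tooling wants plain text with one sentence per line, which is how I understand the KenLM input.

```python
from itertools import product
from string import Formatter

# Hypothetical intent templates and slot values; replace with your assistant's own.
TEMPLATES = {
    "set_timer": ["set a timer for {number} {unit}", "start a {number} {unit} timer"],
    "lights": ["turn {state} the {room} light"],
}
SLOTS = {
    "number": ["one", "five", "ten"],
    "unit": ["minutes", "seconds"],
    "state": ["on", "off"],
    "room": ["kitchen", "bedroom"],
}


def expand(template, slots):
    """Yield every sentence obtained by filling the template's slots."""
    # Collect slot names in order of appearance, without duplicates.
    names = list(dict.fromkeys(
        field for _, field, _, _ in Formatter().parse(template) if field
    ))
    for combo in product(*(slots[name] for name in names)):
        yield template.format(**dict(zip(names, combo)))


def build_corpus(templates, slots):
    """Return the sorted, de-duplicated, lower-cased sentences for all intents."""
    sentences = set()
    for intent_templates in templates.values():
        for template in intent_templates:
            sentences.update(s.lower() for s in expand(template, slots))
    return sorted(sentences)


def vocabulary(sentences):
    """Return the sorted set of words used in the sentences."""
    return sorted({word for sentence in sentences for word in sentence.split()})


def write_corpus(sentences, path):
    """Write one sentence per line as UTF-8 text."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(sentences) + "\n")


if __name__ == "__main__":
    corpus = build_corpus(TEMPLATES, SLOTS)
    write_corpus(corpus, "assistant_sentences.txt")  # hypothetical file name
    print(len(corpus), "sentences,", len(vocabulary(corpus)), "distinct words")
```

The `vocabulary` helper is only there so the generated sentences can be compared against the word list I already have. Whether this file should replace the big text corpus or be combined with it is exactly the part I am unsure about.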