Training for a specific purpose with specific vocabulary

Hi everyone

I’m currently working on a project that needs speech-to-text tools to transcribe agile Scrum meetings. I tried Google STT and Amazon Transcribe, and neither is accurate enough in this context, for two reasons: there are a lot of technical or domain-specific words, and the spoken language is French (mixed with English words from Scrum terminology). Now I’m wondering: can DeepSpeech be trained on recordings that mix all these terms so that it transcribes them correctly? And is it possible to run this training on a platform like AWS EC2, since my own computer is not powerful enough for it?

Thank you in advance for your answers

Yes, if you have representative data you can fine-tune a model for that. You can also experiment with a language model built for your specific vocabulary.

You need to use GPU-backed instances, but it should work.

Any idea how many hours of audio I should collect to train the model correctly?

Is that what I’m currently looking at, or is there something else I didn’t understand?

As much as possible.
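To keep track of how much audio has been collected so far, a small helper along these lines may be useful. It is only a sketch, and it assumes the recordings are uncompressed PCM WAV files stored under a single folder; the function name `total_hours` and the folder layout are hypothetical.

```python
import wave
from pathlib import Path


def total_hours(folder):
    """Sum the duration of every .wav file under `folder`, in hours.

    Assumption: files are PCM WAV, which the standard `wave` module can read.
    """
    seconds = 0.0
    for path in Path(folder).rglob("*.wav"):
        with wave.open(str(path), "rb") as wav:
            # duration = number of frames / frames per second
            seconds += wav.getnframes() / wav.getframerate()
    return seconds / 3600.0
```

Calling `total_hours("recordings")` on a hypothetical `recordings` folder returns the collected amount as a float, which makes it easy to see how the dataset grows over time.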

https://deepspeech.readthedocs.io/en/v0.7.3/Scorer.html?highlight=language%20model#building-your-own-scorer
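As I understand the page linked above, the scorer is built from a plain-text corpus, so the meeting vocabulary first has to be turned into clean text lines. Below is a minimal sketch of that preparation step only. The character set in `ALLOWED` is an assumption and has to match whatever alphabet the acoustic model uses; `normalize_line` and `build_corpus` are hypothetical helper names, not part of DeepSpeech.

```python
import re
import unicodedata

# Assumed alphabet: lowercase Latin letters, common French accented
# letters, apostrophe and space. Adjust it to the model's real alphabet.
ALLOWED = set("abcdefghijklmnopqrstuvwxyzàâäçéèêëîïôöùûüÿœæ' ")


def normalize_line(text):
    """Lowercase a sentence and keep only characters from ALLOWED."""
    text = unicodedata.normalize("NFC", text).lower()
    text = text.replace("’", "'")  # unify curly and straight apostrophes
    # Anything outside the alphabet (punctuation, digits, ...) becomes a space.
    text = "".join(ch if ch in ALLOWED else " " for ch in text)
    return re.sub(r"\s+", " ", text).strip()


def build_corpus(lines):
    """Return normalized, non-empty sentences, one per list item."""
    corpus = []
    for line in lines:
        normalized = normalize_line(line)
        if normalized:
            corpus.append(normalized)
    return corpus
```

Writing the result of `build_corpus` to a file with one sentence per line gives a text corpus that mixes the French sentences and the English Scrum terms. Note that digits are simply dropped here, so numbers should be spelled out beforehand if they matter.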

Thanks, I’m going to try both this and training my own model.
