Training for a specific purpose with specific vocabulary

tsaya · June 12, 2020, 12:56pm

Hi everyone

I’m currently working on a project that needs speech-to-text tools to transcribe agile Scrum meetings. I tried Google STT and Amazon Transcribe, and neither is accurate enough in this context, for two reasons: there are a lot of technical or domain-specific words, and the spoken language is French (with English words taken from Scrum terminology). Now I’m wondering: can DeepSpeech be trained on recordings that mix all these terms so that it transcribes them correctly? And is it possible to run this training on a platform like AWS EC2, since my own computer is not powerful enough?

Thank you in advance for your answers

lissyx · June 12, 2020, 12:59pm

Yes, if you have representative data you can fine-tune a model on it. You can also experiment with a domain-specific language model.

You need to use GPU-backed instances, but it should work.
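
For the fine-tuning route, the recordings and their transcripts have to be listed in CSV files. A minimal sketch of building one, assuming the `wav_filename,wav_filesize,transcript` columns that DeepSpeech's training importers produce; the helper name `build_manifest` and the lowercasing step are illustrative choices, not part of DeepSpeech:

```python
import csv
import os


def build_manifest(clips, csv_path):
    """Write a training CSV from (wav_path, transcript) pairs.

    Hypothetical helper: returns the number of clips written.
    """
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Header assumed from DeepSpeech's importer output.
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        count = 0
        for wav_path, transcript in clips:
            # File size in bytes; transcript trimmed and lowercased
            # (assumption: the alphabet only contains lowercase letters).
            writer.writerow([
                wav_path,
                os.path.getsize(wav_path),
                transcript.strip().lower(),
            ])
            count += 1
    return count
```

You would typically call this once each for the train, dev and test splits, with transcripts that contain the French and Scrum vocabulary exactly as it is spoken in the meetings.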

tsaya · June 12, 2020, 1:06pm

Any idea how many hours of audio I should collect to train the model correctly?

That’s what I’m trying to figure out at the moment, or is there something else I didn’t understand?

lissyx · June 12, 2020, 1:11pm

As much as possible.

https://deepspeech.readthedocs.io/en/v0.7.3/Scorer.html?highlight=language%20model#building-your-own-scorer
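
A custom scorer is built from a plain-text corpus, so the first step is to collect sentences that use your Scrum vocabulary and normalize them. A rough sketch, assuming one sentence per line and a French alphabet like the `ALLOWED` set below; both `ALLOWED` and the function names are assumptions for illustration and should be matched to the alphabet file of the model you actually train:

```python
import re
import unicodedata

# Assumed alphabet: lowercase French letters, apostrophe, hyphen, space.
ALLOWED = set("abcdefghijklmnopqrstuvwxyzàâäçéèêëîïôöùûüÿœæ' -")


def normalize_line(line):
    """Lowercase a sentence and drop characters outside ALLOWED."""
    text = unicodedata.normalize("NFC", line).lower()
    text = text.replace("’", "'")  # unify typographic apostrophes
    # Replace anything outside the assumed alphabet with a space.
    text = "".join(ch if ch in ALLOWED else " " for ch in text)
    # Collapse runs of whitespace left behind by the replacement.
    return re.sub(r"\s+", " ", text).strip()


def build_corpus(lines):
    """Return the normalized, non-empty sentences, one per entry."""
    corpus = []
    for line in lines:
        normalized = normalize_line(line)
        if normalized:
            corpus.append(normalized)
    return corpus
```

Writing the result to a text file, one entry per line, gives you the input that the scorer-building steps on the linked page start from. Note that digits are dropped here, so numbers should be spelled out beforehand.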

tsaya · June 12, 2020, 1:14pm

Thanks, I’m going to try both this and training my own model.