As of now, is DeepSpeech viable for real-world applications?

By acoustic vs. language model, do you mean trying transcription without the language model? If so, we did that accidentally in older versions of DeepSpeech Frontend, and the results were poor. That was before we added VAD, though.
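For anyone who wants to reproduce that comparison on a current release, here's a minimal sketch using the DeepSpeech 0.9 Python API. The model/scorer file names and the WAV path are placeholders, and the older Frontend predates this API, so treat it as illustrative only:

```python
# pip install deepspeech numpy  (API shape is from DeepSpeech 0.7+)
import wave
import numpy as np
from deepspeech import Model

MODEL_PATH = 'deepspeech-0.9.3-models.pbmm'      # placeholder paths
SCORER_PATH = 'deepspeech-0.9.3-models.scorer'

with wave.open('utterance.wav', 'rb') as w:      # expects 16 kHz, 16-bit mono
    audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

model = Model(MODEL_PATH)
print('acoustic model only:', model.stt(audio))  # beam search, no LM scoring

model.enableExternalScorer(SCORER_PATH)          # enable the KenLM scorer
print('with language model:', model.stt(audio))
```

The first `stt` call decodes with the acoustic model alone; enabling the scorer is what brings the language model into the beam search.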

Manually breaking apart the audio doesn't produce these incorrect transcriptions. My working theory (corroborated by others on the #machinelearning IRC channel) is that WebRTCVAD is a touch aggressive in slicing up audio, so the beginning and end of a word can sometimes get partially chopped off, and DeepSpeech then can't make sense of the malformed audio it's given.
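If that theory is right, one mitigation is to pad each VAD segment with a little surrounding audio before handing it to DeepSpeech. Here's a rough sketch with py-webrtcvad, assuming 16 kHz, 16-bit mono PCM; the padding scheme and parameter values are my own guesses, not what the frontend actually does:

```python
import collections
import webrtcvad

SAMPLE_RATE = 16000                      # webrtcvad supports 8/16/32/48 kHz
FRAME_MS = 30                            # frames must be 10, 20, or 30 ms
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2   # 16-bit mono samples

def padded_segments(pcm, aggressiveness=1, pad_frames=8):
    """Yield speech segments from raw PCM, keeping pad_frames of context
    on both sides so word onsets/offsets aren't clipped by the VAD."""
    vad = webrtcvad.Vad(aggressiveness)  # 0 = least aggressive, 3 = most
    frames = [pcm[i:i + FRAME_BYTES]
              for i in range(0, len(pcm) - FRAME_BYTES + 1, FRAME_BYTES)]
    ring = collections.deque(maxlen=pad_frames)  # recent non-speech frames
    voiced, tail = [], 0
    for frame in frames:
        if vad.is_speech(frame, SAMPLE_RATE):
            if not voiced:
                voiced.extend(ring)      # prepend leading context
                ring.clear()
            voiced.append(frame)
            tail = 0
        elif voiced and tail < pad_frames:
            voiced.append(frame)         # keep a short tail after speech
            tail += 1
        else:
            if voiced:
                yield b''.join(voiced)
                voiced, tail = [], 0
            ring.append(frame)
    if voiced:
        yield b''.join(voiced)
```

Lowering the aggressiveness and widening `pad_frames` trade segment purity against the risk of clipping word boundaries, so both are worth tuning per deployment.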