The noise reduction component in the pre-transcription processing

dan.bmh · October 30, 2020, 6:25pm

Thanks for sharing your approaches. They motivated me to run a benchmark for comparison.

In my experiments, frequency filtering had only a very small impact. Noise reduction (with rnnoise) helped a lot, but it can also lower the accuracy in quieter environments.
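
For readers unfamiliar with the term, frequency filtering here means something like a band-pass filter applied to the audio before it reaches the model. Below is a rough sketch in plain Python, not the code used in the benchmark. It cascades two second-order (biquad) filters built from the widely used Audio EQ Cookbook formulas. The function names, the 16 kHz sample rate, and the 200 Hz / 4000 Hz cutoffs are assumptions chosen for illustration only.

```python
import math

def biquad_coeffs(kind, cutoff_hz, sample_rate, q=0.7071):
    """Normalized biquad coefficients (Audio EQ Cookbook style)."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)
    if kind == "lowpass":
        b0 = (1.0 - cos_w0) / 2.0
        b1 = 1.0 - cos_w0
    elif kind == "highpass":
        b0 = (1.0 + cos_w0) / 2.0
        b1 = -(1.0 + cos_w0)
    else:
        raise ValueError("kind must be 'lowpass' or 'highpass'")
    b2 = b0
    a0 = 1.0 + alpha
    a1 = -2.0 * cos_w0
    a2 = 1.0 - alpha
    # Divide by a0 so the filter loop does not have to.
    return (b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0)

def biquad_filter(samples, coeffs):
    """Apply one biquad section (direct form I) to a sequence of floats."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def bandpass(samples, sample_rate=16000, low_hz=200.0, high_hz=4000.0):
    """High-pass at low_hz, then low-pass at high_hz (assumed cutoffs)."""
    hp = biquad_coeffs("highpass", low_hz, sample_rate)
    lp = biquad_coeffs("lowpass", high_hz, sample_rate)
    return biquad_filter(biquad_filter(samples, hp), lp)
```

With these assumed cutoffs, low-frequency hum and high-frequency hiss are attenuated while most of the speech band passes through. Suitable cutoffs depend on the microphone and the environment.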

Note that the benchmark did not test transcription accuracy directly, because I’m doing an additional step afterwards (Speech → Text → Intent+Slots).
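
To make that setup concrete, here is a minimal sketch of how such an end-to-end comparison could be structured. It is not the actual benchmark code. `compare_preprocessing`, `preprocessors`, and `recognize_intent` are hypothetical names, and the sketch scores only the intent, not the slots.

```python
def compare_preprocessing(samples, preprocessors, recognize_intent):
    """Return {variant name: intent accuracy} for each preprocessing variant.

    samples:          list of (audio, expected_intent) pairs
    preprocessors:    dict mapping a variant name to a function audio -> audio
    recognize_intent: hypothetical callable wrapping Speech -> Text -> Intent
    """
    results = {}
    for name, preprocess in preprocessors.items():
        correct = sum(
            1
            for audio, expected in samples
            if recognize_intent(preprocess(audio)) == expected
        )
        results[name] = correct / len(samples) if samples else 0.0
    return results
```

Each variant (no preprocessing, frequency filtering, noise reduction) would be passed in as one entry of `preprocessors`, so that all of them are scored on the same recordings.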

The benchmark code can be found here

Update: The new model version (0.9) has much better accuracy in noisy environments due to the noise augmentation used in training. Extra noise reduction now decreases the accuracy, while frequency filtering increases it a little in very noisy environments.
